Skip to content

bpo-47009: Streamline list.append for the common case #31864

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 1, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 18 additions & 0 deletions Include/internal/pycore_list.h
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,24 @@ struct _Py_list_state {

#define _PyList_ITEMS(op) (_PyList_CAST(op)->ob_item)

extern int
_PyList_AppendTakeRefListResize(PyListObject *self, PyObject *newitem);

static inline int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for an inline function.
Let the LTO pass decide if it wants to inline it.

_PyList_AppendTakeRef(PyListObject *self, PyObject *newitem)
{
assert(self != NULL && newitem != NULL);
assert(PyList_Check(self));
Py_ssize_t len = PyList_GET_SIZE(self);
Py_ssize_t allocated = self->allocated;
assert((size_t)len + 1 < PY_SSIZE_T_MAX);
if (allocated > len) {
PyList_SET_ITEM(self, len, newitem);
Py_SET_SIZE(self, len + 1);
return 0;
}
return _PyList_AppendTakeRefListResize(self, newitem);
}

#ifdef __cplusplus
}
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Improved the performance of :meth:`list.append()` and list comprehensions by optimizing for the common case, where no resize is needed. Patch by Dennis Sweeney.
36 changes: 18 additions & 18 deletions Objects/listobject.c
Original file line number Diff line number Diff line change
Expand Up @@ -301,26 +301,27 @@ PyList_Insert(PyObject *op, Py_ssize_t where, PyObject *newitem)
return ins1((PyListObject *)op, where, newitem);
}

static int
app1(PyListObject *self, PyObject *v)
/* internal, used by _PyList_AppendTakeRef */
int
_PyList_AppendTakeRefListResize(PyListObject *self, PyObject *newitem)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the extra function?
Converting app1 to _PyList_AppendTakeRef would seem to be enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried something like that and the results are on the bpo issue, but it seems there is a significant benefit to having the separate as-small-as-possible function that always gets inlined as opposed to a slightly longer version that we let the compiler figure out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably the compiler isn't inlining the function because it isn't hot enough.
PRECALL_NO_KW_LIST_APPEND and LIST_APPEND are only 0.3% of all instructions executed, so maybe it is best not to inline.

Do you have benchmark numbers for the standard suite?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, IMO there is a relative lack of list comprehensions in the pyperformance suite.

{
Py_ssize_t n = PyList_GET_SIZE(self);

assert (v != NULL);
assert((size_t)n + 1 < PY_SSIZE_T_MAX);
if (list_resize(self, n+1) < 0)
Py_ssize_t len = PyList_GET_SIZE(self);
assert(self->allocated == -1 || self->allocated == len);
if (list_resize(self, len + 1) < 0) {
Py_DECREF(newitem);
return -1;

Py_INCREF(v);
PyList_SET_ITEM(self, n, v);
}
PyList_SET_ITEM(self, len, newitem);
return 0;
}

int
PyList_Append(PyObject *op, PyObject *newitem)
{
if (PyList_Check(op) && (newitem != NULL))
return app1((PyListObject *)op, newitem);
if (PyList_Check(op) && (newitem != NULL)) {
Py_INCREF(newitem);
return _PyList_AppendTakeRef((PyListObject *)op, newitem);
}
PyErr_BadInternalCall();
return -1;
}
Expand Down Expand Up @@ -844,9 +845,10 @@ static PyObject *
list_append(PyListObject *self, PyObject *object)
/*[clinic end generated code: output=7c096003a29c0eae input=43a3fe48a7066e91]*/
{
if (app1(self, object) == 0)
Py_RETURN_NONE;
return NULL;
if (_PyList_AppendTakeRef(self, Py_NewRef(object)) < 0) {
return NULL;
}
Py_RETURN_NONE;
}

/*[clinic input]
Expand Down Expand Up @@ -963,9 +965,7 @@ list_extend(PyListObject *self, PyObject *iterable)
Py_SET_SIZE(self, Py_SIZE(self) + 1);
}
else {
int status = app1(self, item);
Py_DECREF(item); /* append creates a new ref */
if (status < 0)
if (_PyList_AppendTakeRef(self, item) < 0)
goto error;
}
}
Expand Down
13 changes: 4 additions & 9 deletions Python/ceval.c
Original file line number Diff line number Diff line change
Expand Up @@ -2213,10 +2213,7 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
TARGET(LIST_APPEND) {
PyObject *v = POP();
PyObject *list = PEEK(oparg);
int err;
err = PyList_Append(list, v);
Py_DECREF(v);
if (err != 0)
if (_PyList_AppendTakeRef((PyListObject *)list, v) < 0)
goto error;
PREDICT(JUMP_BACKWARD_QUICK);
DISPATCH();
Expand Down Expand Up @@ -5044,14 +5041,12 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
DEOPT_IF(!PyList_Check(list), PRECALL);
STAT_INC(PRECALL, hit);
SKIP_CALL();
PyObject *arg = TOP();
int err = PyList_Append(list, arg);
if (err) {
PyObject *arg = POP();
if (_PyList_AppendTakeRef((PyListObject *)list, arg) < 0) {
goto error;
}
Py_DECREF(arg);
Py_DECREF(list);
STACK_SHRINK(2);
STACK_SHRINK(1);
Py_INCREF(Py_None);
SET_TOP(Py_None);
Py_DECREF(callable);
Expand Down