Skip to content

bpo-28280: Make PyMapping_Keys(), PyMapping_Values() and PyMapping_Items() always return a list #3840

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Oct 8, 2017

Conversation

orenmn
Copy link
Contributor

@orenmn orenmn commented Sep 30, 2017

In addition:

  • Add get_mapping_keys(), get_mapping_values() and get_mapping_items() to _testcapi, so that we would be able to test PyMapping_Keys(), PyMapping_Values() and PyMapping_Items().
  • Add tests to test_capi to verify that each of these functions returns the appropriate list, and that it raises the appropriate error in case of a bad argument.
  • Update the docs accordingly.

https://bugs.python.org/issue28280

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add also an entry in What's New. This will help people to revise their code and fix errors in old versions of Python.

On success, return a list or tuple of the keys in object *o*. On failure,
return *NULL*.
On success, return a list of the keys in object *o*. On failure, return
*NULL*.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a "versionchanged" directive.

This is needed for users that write a code that can be used with older Python.

@@ -272,6 +272,50 @@ class C(): pass
self.assertIn(b'MemoryError 2 20', out)
self.assertIn(b'MemoryError 3 30', out)

def test_mapping_keys_values_items(self):
class Mapping1(dict):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need to test this class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its purpose is to test cases in which the methods already return a list, i.e. when PyList_CheckExact(meth_output) is true.

@@ -0,0 +1,2 @@
Make `PyMapping_Keys()`, `PyMapping_Values()` and `PyMapping_Items()` always
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention that it was possible to return a tuple.

PyErr_Format(PyExc_TypeError,
"%.200s.%s() must return an iterable",
Py_TYPE(o)->tp_name,
meth_id->string);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use meth_id->string and the %U format. The result of formatting is a PyUnicode object.

Copy link
Contributor Author

@orenmn orenmn Oct 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think i understand your comment..
meth_id->string is of type const char*, so when i use %U, it causes a crash.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a redundant call of PyUnicode_DecodeUTF8Stateful() inside of PyUnicode_FromFormatV called from PyErr_Format(). It can be avoided by passing a PyUnicode object (meth_id->string) and using the %U format.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, you suggest something like:

            PyErr_Format(PyExc_TypeError,
                         "%.200s.%U() returned a non-iterable (type %.200s)",
                         Py_TYPE(o)->tp_name,
                         PyUnicode_FromString(meth_id->string),
                         Py_TYPE(meth_output)->tp_name);

But then i would also have to handle refcounts, and the case in which PyUnicode_FromString() failed. Do you think it's worth it?

Copy link
Member

@serhiy-storchaka serhiy-storchaka Oct 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I suggest to use meth_id->object which is a cached PyUnicode_FromString(meth_id->string).

Copy link
Member

@serhiy-storchaka serhiy-storchaka Oct 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I just have noticed typos in my previous comments! Of course I meant meth_id->object rather of meth_id->string!

if (it == NULL) {
if (PyErr_ExceptionMatches(PyExc_TypeError)) {
PyErr_Format(PyExc_TypeError,
"%.200s.%s() must return an iterable",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an actual type of meth_output.

@@ -2147,55 +2147,72 @@ PyMapping_HasKey(PyObject *o, PyObject *key)
return 0;
}

static PyObject *
method_output_as_list(PyObject *o, _Py_Identifier *meth_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment that this is mainly a copy of PySequence_Fast.

@@ -336,6 +336,10 @@ Build and C API Changes
is now of type ``const char *`` rather of ``char *``. (Contributed by Serhiy
Storchaka in :issue:`28769`.)

* The result of :c:func:`PyMapping_Keys`, :c:func:`PyMapping_Values` and
:c:func:`PyMapping_Items` is now always a list, rather than a list or a
tuple. (Contributed by Serhiy Storchaka and Oren Milman in :issue:`28280`.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to mention my name. I'm just a reviewer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are the author of the issue in bpo, but as you wish.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one who opened an issue is not the one who fixed it.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving this PR, but since I have proposed this change I can miss flaws of it. I want to get the approve from other core developers.

if (meth_output == NULL || PyList_CheckExact(meth_output)) {
return meth_output;
}
it = PyObject_GetIter(meth_output);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just calling list(obj)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PySequence_List() few lines below is just a C API equivalent of calling list(). But first we want to produce better error message if keys() returns non-iterable. It is in the current code, and removing it will simplify the code, but introduce a regression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to pass meth_output to PySequence_List() rather than iter(meth_output). list_extend() has a fast-path for tuple.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't matter. keys() returning tuple should be very rare. I don't know any example.

Calling PyObject_GetIter() on an iterator is cheap, it should return the iterator itself. But calling PyObject_GetIter() on a general iterable creates a new iterator object. This is not so cheap. This will add an overhead (and possible side effects) in most common cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list(iter(obj)) is not exactly the same than list(obj). My point on performance was the cost of PySequence_List(iterator) vs PySequence_List(tuple).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a special case for tuples:

if (PyTuple_CheckExact(meth_output)) {
    result = PySequence_List(meth_output);
    Py_DECREF(meth_output);
    return result;
}

But I think this is a premature optimization. Do you know any class that return keys() as a tuple?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, it's fine. I don't care much. The current code is ok.

@serhiy-storchaka serhiy-storchaka merged commit 0ccc0f6 into python:master Oct 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants