Skip to content

gh-136660: fix UB in dictobject memcpy when there are 0 entries #136662

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

marler8997
Copy link

@marler8997 marler8997 commented Jul 14, 2025

@python-cla-bot
Copy link

python-cla-bot bot commented Jul 14, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

@bedevere-app
Copy link

bedevere-app bot commented Jul 14, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@@ -2095,7 +2095,8 @@ dictresize(PyInterpreterState *interp, PyDictObject *mp,
if (unicode) { // combined unicode -> combined unicode
PyDictUnicodeEntry *newentries = DK_UNICODE_ENTRIES(newkeys);
if (oldkeys->dk_nentries == numentries && mp->ma_keys->dk_kind == DICT_KEYS_UNICODE) {
memcpy(newentries, oldentries, numentries * sizeof(PyDictUnicodeEntry));
// avoid memcpy on 0 entries to avoid UB on oldentries
if (numentries > 0) memcpy(newentries, oldentries, numentries * sizeof(PyDictUnicodeEntry));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please conform to PEP 7, our C code style guide.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated, apologies for missing this

@bedevere-app
Copy link

bedevere-app bot commented Jul 14, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@methane
Copy link
Member

methane commented Jul 15, 2025

This is already fixed.
#127568

@methane methane closed this Jul 15, 2025
@zbynekwinkler
Copy link

IIUC this is about memcpy with zero size, while #127568 was about misaligned reads.

@methane
Copy link
Member

methane commented Jul 15, 2025

See #136660
UBSan reports misaligned read, but not 0-byte memcpy.

Is 0-byte memcpy undefined behavior?

@marler8997
Copy link
Author

This one was also a misaligned read. The other way to solve this problem would have been to ensure oldentries was initialized to an aligned address before reaching this codepath which seems to be the final fix in #127568. It also looks like the initial way that issue was going to be solved was to add a (const void*) cast to oldentries, which also would have fixed this issue.

So, it looks like the same issue, I was just building version 3.11.5 before the fix. I'll close the issue since it appears to have been fixed, if I run into it again with the fix I'll open the issue again.

@picnixz
Copy link
Member

picnixz commented Jul 15, 2025

Is 0-byte memcpy undefined behavior?

No it's a well-defined behavior. At least according to C99 §7.21.1/2:

Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values, as described in 7.1.4. On such a call, a function that locates a character finds no occurrence, a function that compares two character sequences returns zero, and a function that copies characters copies zero characters.

It also looks like the initial way that issue was going to be solved was to add a (const void*) cast to oldentries, which also would have fixed this issue.

This (my) fix was not really correct as it simply suppresses the compiler warning (also, I'm not sure about the void* alignment guarantees).

@marler8997
Copy link
Author

This (my) fix was not really correct as it simply suppresses the compiler warning (also, I'm not sure about the void* alignment guarantees).

My understanding is this UB was a combination of the unaligned address and memcpy. If you pass an unaligned address to a function and it's never used, it's well-defined, however, if you pass it to memcpy it becomes undefined behavior, even if the size is 0. memcpy is "special". Since void* has no alignment, casting the pointer to a void* before it gets passed to memcpy should also avoid the UB. This also matches how ubsan behaves, it no longer triggers if you add the void pointer cast.

@andrewrk
Copy link

UndefinedBehaviorSanitizer is not expected to produce false positives. If you see one, look again; most likely it is a true positive!

https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#issue-suppression

@picnixz
Copy link
Member

picnixz commented Jul 18, 2025

AFAIR, casting to void * to prevent the UBSan to complain still incorrect. The C standards say:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.

EDIT: though in this case it says "object type", a void * is not a a pointer to an object type. I however think that because the prototype of memcpy takes a void *, then casting to a void * made the alignment check disappear.

EDIT 2: Note that the pointers compare equal but are not necessarily the same (alignment may have changed)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants