Unify recognition of message digest names for _hashlib and _hmac #131876

Open
@picnixz

Feature or enhancement

When calling hmac.new(key, digestmod=HASH), the HASH can be:

  • a named algorithm recognized by hashlib.new (e.g., sha256);
  • a digest constructor (e.g., hashlib.openssl_sha256);
  • an object supporting PEP-247.
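
As a rough illustration of the three forms, here is a minimal sketch. The key and message are placeholder values, and `pep247_like` is a hypothetical stand-in for a PEP-247 module (an object exposing `new()` and `digest_size`), built from `hashlib.sha256` only for the sake of the example:

```python
import hashlib
import hmac
import types

key = b"secret-key"        # placeholder value
msg = b"message to sign"   # placeholder value

# 1. Named algorithm, i.e. any name suitable for hashlib.new().
by_name = hmac.new(key, msg, digestmod="sha256")

# 2. Digest constructor: a callable returning a new hash object.
by_constructor = hmac.new(key, msg, digestmod=hashlib.sha256)

# 3. PEP-247-style object (hypothetical stand-in): anything with a
#    new([data]) function; hmac calls digestmod.new(...) in this case.
pep247_like = types.SimpleNamespace(new=hashlib.sha256, digest_size=32)
by_module = hmac.new(key, msg, digestmod=pep247_like)

# All three describe the same hash function, so the MACs should match.
same = by_name.hexdigest() == by_constructor.hexdigest() == by_module.hexdigest()
```

Which backend ends up computing the MAC in each case depends on the build and on the Python version, so the sketch only compares the resulting digests.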

In #130157, I only added support for named algorithms, since that is all HACL* supports. When OpenSSL HMAC is used instead of HACL* HMAC, determining which hash function to use from HASH is left to the hashlib C implementation, which in turn delegates this task to OpenSSL based on NIDs.

I'm creating this issue so that we can brainstorm and decide how to make HACL* HMAC also able to detect objects supporting PEP-247. I plan to first update the HMAC documentation, which currently reads:

Return a new hmac object. key is a bytes or bytearray object giving the secret key. If msg is present, the method call update(msg) is made. digestmod is the digest name, digest constructor or module for the HMAC object to use. It may be any name suitable to hashlib.new(). Despite its argument position, it is required.

As you can see, the terms "digest constructor" and "module" are not well-defined, so I first plan to explain these two. In a second phase, I plan to extract the code in _hashopenssl.c responsible for determining whether a name is known into a separate module, so that it can later be shared with the HMAC C implementation. The idea is to ease a possible future transition where we would drop OpenSSL and rely entirely on HACL* instead, both for speed and security (this is neither planned nor decided yet, and would likely require a PEP).

Some questions we need to address before letting HACL* HMAC support non-named algorithms:

  • Should we restrict digest constructors to HACL* ones only or not?
  • If not, should we consider hashlib.openssl_sha256 equivalent to using HACL* SHA-256?
  • If not again, should we fall back to a generic implementation of HMAC which directly calls that callable?

Currently, passing a digest constructor to the (OpenSSL) C implementation does not mean that the constructor itself is used. It is only used to recover the algorithm name, so it can be regarded as an alias.
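
To make the "alias" idea concrete, here is a minimal Python sketch of what a shared name resolver could look like. `resolve_digest_name` is a hypothetical helper, not an existing API, and it recovers the name by instantiating the callable and reading its `name` attribute, which is an assumption for illustration and not how the C implementation identifies constructors:

```python
import hashlib
import hmac


def resolve_digest_name(digestmod):
    """Hypothetical helper: map a digestmod value to an algorithm name."""
    if isinstance(digestmod, str):
        name = digestmod.lower()
    elif callable(digestmod):
        # Treat the constructor as an alias: build an empty hash object
        # and read the algorithm name it reports.
        name = digestmod().name
    else:
        raise TypeError("expected an algorithm name or a digest constructor")
    if name not in hashlib.algorithms_available:
        raise ValueError(f"unknown digest name: {name!r}")
    return name


# Once resolved, the name alone decides which hash function is used.
resolved = resolve_digest_name(hashlib.sha256)
mac = hmac.new(b"key", b"msg", digestmod=resolved)
```

With such a resolver, a name and its constructor end up on the same code path, which is the behavior described above for the OpenSSL backend.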

OTOH, in the Python implementation of HMAC, any callable is considered a digest constructor and will be used as is. In some sense, it's a way to have HMAC implemented using an arbitrary hash function (HMAC is designed as such).
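
For reference, a generic HMAC over an arbitrary digest constructor can be sketched in a few lines following RFC 2104. `generic_hmac` is a hypothetical name and this is not the code in hmac.py. It assumes the constructor returns objects with `update()` and `digest()`, and falls back to a 64-byte block size when `block_size` is missing:

```python
import hashlib
import hmac


def generic_hmac(key, msg, digest_cons):
    """Sketch of RFC 2104 HMAC using any callable as the hash constructor."""
    inner = digest_cons()
    outer = digest_cons()
    # Assumption: default to 64 bytes if the object exposes no block size.
    block_size = getattr(inner, "block_size", 64)
    if len(key) > block_size:
        # Keys longer than one block are hashed first.
        key = digest_cons(key).digest()
    key = key.ljust(block_size, b"\0")
    inner.update(bytes(b ^ 0x36 for b in key))  # key XOR ipad
    outer.update(bytes(b ^ 0x5C for b in key))  # key XOR opad
    inner.update(msg)
    outer.update(inner.digest())
    return outer.digest()


def my_sha256(data=b""):
    # An arbitrary Python callable acting as a digest constructor.
    return hashlib.sha256(data)


expected = hmac.new(b"key", b"msg", digestmod="sha256").digest()
via_sketch = generic_hmac(b"key", b"msg", my_sha256)
via_hmac = hmac.new(b"key", b"msg", digestmod=my_sha256).digest()
```

Here `my_sha256` is an ordinary Python function, so it is called as is and is not reduced to an algorithm name.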

Labels: extension-modules (C modules in the Modules dir), type-refactor (Code refactoring (with no changes in behavior))