Parallel signing #1468

Open · wants to merge 9 commits into main
Conversation

@jku (Member) commented Jul 15, 2025

This uses threads to sign multiple artifacts in parallel. The problem being solved is that rekor v2 only responds to an entry request after the entry has been included in the log and the inclusion proof is available. When entry requests are made in parallel, they all get responses in the next "inclusion batch" (instead of one response per batch), which speeds up multi-artifact signing considerably. A practical example of the use case is CPython releases, which sign ~10 artifacts.

There are multiple changes:

  • Refactor signing so the "threaded method" is as small as possible -- the thread pool code is from Ramon
  • Avoid the non-thread-safe print() in the threaded method
  • Avoid using a requests Session from multiple threads. It's not 100% certain this is required, as the thread safety of Session is unclear -- it used to be unsafe but the situation has improved -- so I tried to be conservative and use single-use Sessions, since doing so does not seem like a major cost for the sigstore client. With Refactor: replace requests with urllib3 #1040 we can likely get rid of this.
  • Refactor CLI output so that it remains in logical order even with threads
  • Add test that signs multiple artifacts

Signer.sign_artifact() and Signer.sign_dsse() are now expected to be thread safe.
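The overall shape of the change can be sketched with a thread pool. This is a simplified illustration rather than the PR's actual code: `_sign_one` and `sign_parallel` are hypothetical names, with `_sign_one` standing in for the real per-artifact call to `Signer.sign_artifact()`.

```python
from concurrent.futures import ThreadPoolExecutor

def _sign_one(path: str) -> str:
    # Hypothetical stand-in for Signer.sign_artifact(): in the real
    # client this submits the entry request and blocks until rekor
    # responds with the inclusion proof.
    return f"{path}.sigstore.json"

def sign_parallel(paths: list[str], max_workers: int = 10) -> list[str]:
    # All entry requests are in flight at once, so they can land in
    # the same inclusion batch; map() preserves input order, which
    # keeps the eventual CLI output in a deterministic order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(_sign_one, paths))
```

Because `pool.map()` yields results in input order, the caller can print all results after the pool has drained instead of printing from worker threads.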

ramonpetgrave64 and others added 5 commits July 10, 2025 12:07
This is (especially) useful with rekor 2 where the service only
responds after log inclusion: We would prefer to get all signatures
in the same inclusion batch.

The change still affects both rekor v1 and rekor v2.

This commit is a rebase of a bunch of Ramon's commits.

Signed-off-by: Ramon Petgrave <[email protected]>
Signed-off-by: Jussi Kukkonen <[email protected]>
* Session thread safety is ambiguous: it may now be safe, but the
  situation is unclear
* Use a single Session per create_entry() call: this may have a little
  more overhead but seems like the safest option
* Note that other RekorLog (v1) methods may not be thread safe: I only
  reviewed the create_entry() flow

Signed-off-by: Jussi Kukkonen <[email protected]>
It's a little unclear if Session is now thread safe or not: avoid reuse
just in case

Signed-off-by: Jussi Kukkonen <[email protected]>
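The single-use-Session idea can be sketched like this (names are hypothetical; the real change lives in the Rekor client's create_entry() flow): each call opens its own requests.Session and closes it when done, so no Session object is ever shared across threads.

```python
import requests

def create_entry(url: str, payload: dict) -> dict:
    # A fresh, single-use Session per call: slightly more connection
    # overhead, but no mutable Session state shared between threads.
    with requests.Session() as session:
        resp = session.post(url, json=payload, timeout=30)
        resp.raise_for_status()
        return resp.json()
```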
* Separate the thread method so it's easier to see potential safety issues
* Using print() with the same file object is generally not thread safe:
  avoid it in the threaded method

The output remains effectively the same except:
* The b64 encoded signature is no longer printed to terminal
* Some print()s are now logger.info(): e.g.
    Transparency log entry created at index: 5562
* Other print()s happen in a batch now, after the signing has finished

Signed-off-by: Jussi Kukkonen <[email protected]>
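The output refactor described in this commit can be illustrated as follows (a sketch with invented names, not the PR's code): the threaded method only returns data and emits thread-safe logging, while print() runs on the main thread, in input order, after all signing has finished.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger("sign")

def _threaded_sign(path: str) -> str:
    # Safe inside a worker thread: the logging module serializes
    # handler access with an internal lock, unlike bare print().
    logger.info("Transparency log entry created for %s", path)
    return f"{path}.sigstore.json"

def sign_and_report(paths: list[str]) -> list[str]:
    with ThreadPoolExecutor(max_workers=10) as pool:
        results = list(pool.map(_threaded_sign, paths))
    # User-facing output happens in one batch, on the main thread,
    # in the same logical order as the inputs.
    for result in results:
        print(result)
    return results
```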
@jku jku force-pushed the parallel-signing branch from 74c94d4 to 9d6455e Compare July 15, 2025 12:56
@jku (Member, Author) commented Jul 16, 2025

The included test only signs a few artifacts at a time, but I have run this on more pathological cases:

    touch a b c d e f g h i j k l m n o p q r s t u v x y z
    time sigstore sign --overwrite a b c d e f g h i j k l m n o p q r s t u v x y z

Runtime on current main branch

  • rekor1: 20 seconds
  • rekor2: 63 seconds

Runtime on parallel-sign branch

  • rekor1: 7 seconds (reduction of 65%)
  • rekor2: 12 seconds (reduction of 80%)

The time differences here are not that big, but I think that's mostly because staging rekor 2 is currently configured with small, quick batches: with longer batching times the rekor2 difference will only grow.

@jku jku marked this pull request as ready for review July 16, 2025 11:24
Signed-off-by: Jussi Kukkonen <[email protected]>
@woodruffw added the component:cli (CLI components) and component:signing (Core signing functionality) labels on Jul 17, 2025
Make every RekorLog have a Session of its own by default.
This means RekorClient no longer needs to manage one.

Signed-off-by: Jussi Kukkonen <[email protected]>
@jku (Member, Author) commented Jul 21, 2025

The time differences here are not that big, but I think that's mostly because staging rekor 2 is currently configured with small, quick batches: with longer batching times the rekor2 difference will only grow.

Colleen confirmed this for me: the Tessera checkpoint interval is currently configured at 2 seconds in staging; this might get bumped up to 10 seconds.

The effect of this on the 25-artifact rekor2 test (with the current 10-thread limit) is:

  • with the current staging Tessera configuration (2 second checkpoint interval)
    • sigstore-python main waits for 25 checkpoints: ~49 secs
    • parallel-signing branch waits for 3 checkpoints: ~5 secs
  • with the possible staging Tessera configuration (10 second checkpoint interval)
    • sigstore-python main waits for 25 checkpoints: ~240 secs
    • parallel-signing branch waits for 3 checkpoints: ~25 secs

The timings for this branch depend on the max number of threads: the current 10 threads mean 25 artifacts split into 3 checkpoints.

The takeaways to me are:

  • we can/should increase the max number of threads as far as is reasonable
  • if there are more than max_threads artifacts to sign, we could warn the user that signing might take a while -- signing max_threads artifacts takes ~0.5 checkpoint intervals on average, but even one more artifact takes ~1.5 checkpoint intervals on average
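The arithmetic behind that warning threshold can be written down explicitly. This is a back-of-envelope model, not code from the PR: with up to max_threads requests per inclusion batch, n artifacts need ceil(n / max_threads) checkpoints, and since requests land on average halfway through an interval, the expected wait is roughly batches minus 0.5 intervals.

```python
import math

def expected_wait_intervals(num_artifacts: int, max_threads: int = 10) -> float:
    # ceil(n / max_threads) inclusion batches; the first batch costs
    # ~0.5 checkpoint intervals on average, each further batch ~1 more.
    batches = math.ceil(num_artifacts / max_threads)
    return batches - 0.5
```

With a 2-second interval this model reproduces the measurements above: 25 artifacts and 10 threads give 3 batches, i.e. ~2.5 intervals or ~5 seconds, versus ~24.5 intervals (~49 seconds) when each artifact waits for its own checkpoint.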
