Skip to content

fix: reinitialize locks after fork to prevent deadlocks in child processes #4626

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dshivashankar1994
Copy link

Description

This commit adds post-fork reinitialization of threading locks across multiple components in the OpenTelemetry Python SDK and API. It ensures that threading.Lock() instances are safely reinitialized in child processes after a fork(), preventing potential deadlocks and undefined behavior.

Details

  • Introduced usage of register_at_fork(after_in_child=...) from the os module to reinitialize thread locks.
  • Used weakref.WeakMethod() to safely refer to bound instance methods in register_at_fork.
  • Added _at_fork_reinit() methods to classes using threading locks and registered them to run in child processes post-fork.
  • Applied this to all usages of Lock, RLock

Rationale

Forked child processes inherit thread state from the parent, including the internal state of locks. This can cause deadlocks or runtime errors if a lock was held at the time of the fork. By reinitializing locks using the register_at_fork mechanism, we ensure child processes start with clean lock states.

This is especially relevant for WSGI servers and environments that use pre-fork models (e.g., gunicorn, uWSGI), where instrumentation and telemetry components may misbehave without this precaution.

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Test A

Does This PR Require a Contrib Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

…esses

Summary
This commit adds post-fork reinitialization of threading locks across multiple components in the OpenTelemetry Python SDK and API. It ensures that threading.Lock() instances are safely reinitialized in child processes after a fork(), preventing potential deadlocks and undefined behavior.

Details
Introduced usage of register_at_fork(after_in_child=...) from the os module to reinitialize thread locks.

Used weakref.WeakMethod() to safely refer to bound instance methods in register_at_fork.

Added _at_fork_reinit() methods to classes using threading locks and registered them to run in child processes post-fork.

Applied this to all usages of Lock, RLock

Rationale
Forked child processes inherit thread state from the parent, including the internal state of locks. This can cause deadlocks or runtime errors if a lock was held at the time of the fork. By reinitializing locks using the register_at_fork mechanism, we ensure child processes start with clean lock states.

This is especially relevant for WSGI servers and environments that use pre-fork models (e.g., gunicorn, uWSGI), where instrumentation and telemetry components may misbehave without this precaution.
@dshivashankar1994 dshivashankar1994 requested a review from a team as a code owner June 9, 2025 09:16
Copy link

CLA Not Signed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant