The futex facility returned an unexpected error code. #93228

Closed
@gyakovlev

Gentoo users and developers report rustc failing periodically with this error:
The futex facility returned an unexpected error code.

example:

Compiling serde_json v1.0.64
     Running `rustc --crate-name build_script_build --edition=2018 /var/tmp/portage/dev-util/maturin-0.10.6/work/cargo_home/gentoo/serde_json-1.0.64/build.rs --error-format=json --json=diagnostic-rendered-ansi --crate-type bin --emit=dep-info,link -C embed-bitcode=no -C debug-assertions=off --cfg 'feature="default"' --cfg 'feature="std"' --cfg 'feature="unbounded_depth"' -C metadata=d4ad0344b57abcdb -C extra-filename=-d4ad0344b57abcdb --out-dir /var/tmp/portage/dev-util/maturin-0.10.6/work/maturin-0.10.6/target/release/build/serde_json-d4ad0344b57abcdb -L dependency=/var/tmp/portage/dev-util/maturin-0.10.6/work/maturin-0.10.6/target/release/deps --cap-lints allow`
The futex facility returned an unexpected error code.
error: could not compile `serde_json`

It happens to different crates, seemingly at random and at different times.
It is not clearly reproducible, and it is not clear what triggers it.

It is most often seen on our tinderbox (a package build testing machine) with random packages.
It happens with user-built rustc but also with the rust-bin package, which uses the official standalone installers published on the downloads page https://forge.rust-lang.org/infra/other-installation-methods.html#standalone
Retrying the build usually works fine.

The error seems to come from glibc:

https://github.com/bminor/glibc/blob/e8d52b64a54ba9ed7778ca9ce1f084eb5808f8d1/sysdeps/nptl/futex-internal.h#L82

/* Calls __libc_fatal with an error message.  Convenience function for
   concrete implementations of the futex interface.  */
static __always_inline __attribute__ ((__noreturn__)) void
futex_fatal_error (void)
{
  __libc_fatal ("The futex facility returned an unexpected error code.\n");
}



....

static __always_inline int
futex_wait (unsigned int *futex_word, unsigned int expected, int private)
{
  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
  switch (err)
    {
    case 0:
    case -EAGAIN:
    case -EINTR:
      return -err;

    case -ETIMEDOUT: /* Cannot have happened as we provided no timeout.  */
    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
		     being normalized.  Must have been caused by a glibc or
		     application bug.  */
    case -ENOSYS: /* Must have been caused by a glibc bug.  */
    /* No other errors are documented at this time.  */
    default:
      futex_fatal_error ();
    }
}
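To make the control flow above easier to see, here is a minimal sketch that mirrors the switch in futex_wait. The helper name futex_wait_error_is_fatal is hypothetical and not part of glibc; it only classifies the negated errno values the same way the quoted code does, under the assumption that the quoted snippet is the code path that aborts.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper (not part of glibc): mirrors the switch in the
   futex_wait snippet quoted above.  `err` is the negated errno value
   returned by the low-level futex call, e.g. -EAGAIN.  */
static bool
futex_wait_error_is_fatal (int err)
{
  switch (err)
    {
    case 0:
    case -EAGAIN:
    case -EINTR:
      /* Tolerated: futex_wait returns -err to its caller.  */
      return false;

    default:
      /* Everything else, documented or not, reaches
         futex_fatal_error () in the quoted code.  */
      return true;
    }
}
```

Read this way, any error code outside 0, EAGAIN and EINTR ends in the fatal message, so the message alone does not say which code the kernel actually returned.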

Unfortunately, I was never able to reproduce it myself.

Any ideas what it could be, what is happening, and how to catch it properly?
Maybe it is a configuration issue specific to users' systems?
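One possible way to see which error code the kernel hands back on an affected machine is to issue the futex system call directly, bypassing the glibc wrapper that aborts. The sketch below is only an illustration: the helper name probe_futex_wait is hypothetical, it assumes a Linux target where SYS_futex is available, and it exercises FUTEX_WAIT_PRIVATE only, which may or may not be the operation that fails in the reported builds.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical diagnostic helper: performs a raw FUTEX_WAIT on `word`
   with no timeout and returns 0 on success or the errno value set by
   the kernel, instead of aborting as the glibc wrapper would.  */
static int
probe_futex_wait (uint32_t *word, uint32_t expected)
{
  assert (word != NULL);
  long rc = syscall (SYS_futex, word, FUTEX_WAIT_PRIVATE, expected,
                     NULL, NULL, 0);
  return rc == 0 ? 0 : errno;
}
```

Calling the helper with an `expected` value that differs from the current contents of `*word` returns immediately with EAGAIN on a healthy system, so it does not block. Any other value returned from such a call would be a hint about the environment.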

Some example downstream bug reports follow. They contain build logs, but the logs all look the same: just the futex error for a random crate.

some bugs:

this one is very recent:

    Labels

    A-atomic (Area: Atomics, barriers, and sync primitives)
    C-bug (Category: This is a bug.)
    E-needs-mcve (Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example)
    O-linux (Operating system: Linux)
    T-libs (Relevant to the library team, which will review and decide on the PR/issue.)