Skip to content

otelconf: skip flaky exporter tests on windows #7469

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
Jun 25, 2025

Conversation

bacherfl
Copy link
Contributor

@bacherfl bacherfl commented Jun 16, 2025

This PR disables the test execution of the exporter tests due to flakiness occurring on windows

Related to #7446

Copy link

codecov bot commented Jun 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.5%. Comparing base (e6f4bb9) to head (8c6136e).
Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@          Coverage Diff          @@
##            main   #7469   +/-   ##
=====================================
  Coverage   81.5%   81.5%           
=====================================
  Files        198     198           
  Lines      17963   17963           
=====================================
  Hits       14647   14647           
  Misses      2917    2917           
  Partials     399     399           
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

bacherfl added 3 commits June 16, 2025 08:54
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
@pellared
Copy link
Member

FYI, we never had similar problems with https://github.com/signalfx/splunk-otel-go/blob/main/distro/otel_test.go where we do not even need to call assert.EventuallyWithT

Signed-off-by: Florian Bacher <[email protected]>
@bacherfl
Copy link
Contributor Author

FYI, we never had similar problems with https://github.com/signalfx/splunk-otel-go/blob/main/distro/otel_test.go where we do not even need to call assert.EventuallyWithT

thanks for the link - it's rather odd why this issue appears only here - apart from the server being used for traces/metrics/logs simultaneously (e.g. https://github.com/signalfx/splunk-otel-go/blob/af5cdb19b988e6ad035a4401defe63c34febc7db/distro/otel_test.go#L852-L855) i don't really see anything that is fundamentally different from the tests here. Do you think there could be some concurrency issues in the net.Listen function when finding a free port for the test server to run, i.e. it could happen that the same port is used for two or more tests at the same time?

@pellared
Copy link
Member

Do you think there could be some concurrency issues in the net.Listen function when finding a free port for the test server to run, i.e. it could happen that the same port is used for two or more tests at the same time?

No, I doubt it.

@pellared
Copy link
Member

Right now, I see you do not handle error when serving/shutting down.
Maybe you would get some more information on the issue if you do something like this:

https://github.com/signalfx/splunk-otel-go/blob/af5cdb19b988e6ad035a4401defe63c34febc7db/distro/otel_test.go#L857-L866

@pellared
Copy link
Member

pellared commented Jun 16, 2025

Is it always failing on windows-latest? 😆

@bacherfl
Copy link
Contributor Author

Is it always failing on windows-latest? 😆

i had a look at previous fails and all of them seem to be on windows, yes

bacherfl added 11 commits June 16, 2025 13:02
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
@bacherfl
Copy link
Contributor Author

@pellared I have to admit I'm running out of ideas as to why the test fails randomly - should we skip this test for now (maybe at least for windows) to not cause any troubles for other PRs until we have figured out the root cause?

@pellared
Copy link
Member

should we skip this test for now (maybe at least for windows)

I think we should do it as a temporary workaround.
We would still need to keep the issue open and also add a comment in code to reference the issue.

E.g. we could do

if runtime.GOOS == "windows" {
	// TODO (#7446): Fix the flakiness on Windows.
	t.Skip("Test is flaky on Windows.")
}

What do you think?

@bacherfl
Copy link
Contributor Author

sounds good - I will add that, and undo the changes to the timeout values, as they seemingly did not have any effect. I will keep the added error handling though

@pellared pellared added the Skip Changelog Allow PR to succeed without requiring an addition to the CHANGELOG label Jun 24, 2025
Co-authored-by: Robert Pająk <[email protected]>
@bacherfl bacherfl marked this pull request as ready for review June 24, 2025 08:19
@bacherfl bacherfl requested a review from a team as a code owner June 24, 2025 08:19
@pellared
Copy link
Member

Can you please adjust the PR title?

@bacherfl bacherfl changed the title otelconf: adjust timeout settings for flaky exporter tests otelconf: skip flaky exporter tests on windows Jun 24, 2025
@dmathieu dmathieu merged commit 9f780ae into open-telemetry:main Jun 25, 2025
29 checks passed
@pellared pellared added this to the v1.37.0 milestone Jun 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Skip Changelog Allow PR to succeed without requiring an addition to the CHANGELOG
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants