KAFKA-18845: Fix flaky QuorumControllerTest#testUncleanShutdownBrokerElrEnabled #19240
…ElrEnabled Signed-off-by: PoAn Yang <[email protected]>
@@ -376,10 +375,12 @@ public void testElrEnabledByDefault() throws Throwable {
}
}

@Flaky("KAFKA-18845")
@FrankYang0529 we need to keep the annotation here until the test proves itself to no longer be flaky. I have updated the wiki https://cwiki.apache.org/confluence/display/KAFKA/Flaky+Test+Management to reflect this
Thanks for the reminder. I added Flaky back and will remove it after we have sufficient data.
setListeners(listeners));
brokerEpochs.put(brokerToBeTheLeader, replyLeader.get().epoch());
partition = active.replicationControl().getPartition(topicIdFoo, 0);
assertArrayEquals(new int[]{brokerToBeTheLeader}, partition.elr, partition.toString());
How about a real message instead of just the partition if the assertion fails?
Added a more descriptive message for it. Thanks.
Signed-off-by: PoAn Yang <[email protected]>
LGTM
…ownBrokerElrEnabled (#19403) It has been around two weeks since PR #19240, which fixed QuorumControllerTest#testUncleanShutdownBrokerElrEnabled, was merged. There have been no flaky results after 2025/03/21, which is enough evidence that the flakiness is fixed, so the Flaky tag can be removed. Reviewers: Chia-Ping Tsai <[email protected]>
There are two root causes:
… brokerToBeTheLeader, we didn't wait for the result. This means that when we send a heartbeat to unfence the broker, the request may carry a stale broker epoch. [0]
… brokerToBeTheLeader cannot be elected as a new leader. [1]
[0]
kafka/metadata/src/test/java/org/apache/kafka/controller/QuorumControllerTest.java
Lines 484 to 497 in a5325e0
[1]
kafka/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
Lines 2470 to 2477 in a5325e0
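To illustrate the first root cause, here is a minimal sketch of how skipping the wait on a registration reply can leave a test holding a stale broker epoch. It is not the QuorumControllerTest code: ToyController, register, heartbeat, and heartbeatAfterReRegistration are hypothetical names for this example only, and the toy controller is synchronous, so it models only the stale-epoch effect and not the actual timing race.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class StaleEpochSketch {
    /** Hypothetical toy controller: every registration assigns the broker a new epoch. */
    static final class ToyController {
        private final Map<Integer, Long> currentEpochs = new HashMap<>();
        private long nextEpoch = 0;

        synchronized CompletableFuture<Long> register(int brokerId) {
            long epoch = ++nextEpoch;
            currentEpochs.put(brokerId, epoch);
            return CompletableFuture.completedFuture(epoch);
        }

        /** Accepts the heartbeat only if it carries the broker's current epoch. */
        synchronized boolean heartbeat(int brokerId, long epoch) {
            return Long.valueOf(epoch).equals(currentEpochs.get(brokerId));
        }
    }

    /** Registers a broker twice, then sends a heartbeat with the epoch the test has recorded. */
    static boolean heartbeatAfterReRegistration(boolean waitForReply) {
        ToyController controller = new ToyController();
        Map<Integer, Long> brokerEpochs = new HashMap<>();
        int broker = 1;

        // First registration: the test records the returned epoch.
        brokerEpochs.put(broker, controller.register(broker).join());

        // Second registration: the controller moves the broker to a newer epoch.
        CompletableFuture<Long> reply = controller.register(broker);
        if (waitForReply) {
            // Waiting on the reply and storing its epoch keeps the test's view current.
            brokerEpochs.put(broker, reply.join());
        }

        // Without the wait above, this heartbeat still carries the first epoch.
        return controller.heartbeat(broker, brokerEpochs.get(broker));
    }

    public static void main(String[] args) {
        System.out.println(heartbeatAfterReRegistration(false)); // prints "false"
        System.out.println(heartbeatAfterReRegistration(true));  // prints "true"
    }
}
```

The second call corresponds to the pattern in the reviewed diff, where the epoch from replyLeader.get() is stored in brokerEpochs before any heartbeat is sent.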
Reviewers: David Arthur [email protected]