Skip to content

[fix][test] Fix flaky ConnectionTimeoutTest by correcting latch count#25320

Open
merlimat wants to merge 1 commit intoapache:masterfrom
merlimat:fix-connection-timeout-test
Open

[fix][test] Fix flaky ConnectionTimeoutTest by correcting latch count#25320
merlimat wants to merge 1 commit intoapache:masterfrom
merlimat:fix-connection-timeout-test

Conversation

@merlimat
Copy link
Contributor

Summary

  • Fix ConnectionTimeoutTest.testLowTimeout which was flaky due to incorrect CountDownLatch count
  • The latch was initialized with backlogSize + 1 but only backlogSize connections could succeed (the extra connection hangs because the server backlog is full), so the latch never reached zero
  • Changed the latch count to backlogSize so the test proceeds once all possible connections are established
  • Added debug logging to help diagnose future issues

Test plan

  • Run ConnectionTimeoutTest.testLowTimeout — should complete in seconds instead of timing out after 5 minutes

Documentation

  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:


This change is the fix for #15068 regression

The CountDownLatch was initialized with `backlogSize + 1` but only
`backlogSize` connections could succeed (the extra one hangs because
the server backlog is full). This meant the latch never reached zero,
causing the test to hang until the 5-minute timeout.

Changed the latch count to `backlogSize` and added logging to help
diagnose future issues.
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Mar 14, 2026
@codecov-commenter
Copy link

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.70%. Comparing base (ce5bc13) to head (d93dee3).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #25320      +/-   ##
============================================
- Coverage     72.73%   72.70%   -0.03%     
+ Complexity    34671    33843     -828     
============================================
  Files          1967     1954      -13     
  Lines        156326   154715    -1611     
  Branches      17812    17704     -108     
============================================
- Hits         113697   112487    -1210     
+ Misses        33552    33176     -376     
+ Partials       9077     9052      -25     
Flag Coverage Δ
inttests 25.77% <ø> (+0.02%) ⬆️
systests 22.43% <ø> (+0.06%) ⬆️
unittests 73.69% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 127 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

merlimat added a commit to merlimat/pulsar that referenced this pull request Mar 14, 2026
Cherry-pick of apache#25320. The latch count was backlogSize+1 but
only backlogSize connections can succeed, leaving the latch never reaching
zero and the test relying on timing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/test doc-not-needed Your PR changes do not impact docs ready-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants