Skip to content

fix(tool): add retry to api compare tests#6568

Merged
hanabi1224 merged 7 commits intomainfrom
hm/retry-api-compare-tests
Feb 10, 2026
Merged

fix(tool): add retry to api compare tests#6568
hanabi1224 merged 7 commits intomainfrom
hm/retry-api-compare-tests

Conversation

@hanabi1224
Copy link
Contributor

@hanabi1224 hanabi1224 commented Feb 9, 2026

Summary of changes

Changes introduced in this pull request:

Reference issue to close (if applicable)

Closes #6563

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • New Features

    • Configurable per-test retry option for Compare (default: 2).
  • Improvements

    • Automatic per-test retries with exponential backoff and refined success evaluation.
    • More robust concurrent test execution with early exit when no tests remain; reporting and data dumps preserved.
  • Chores

    • Test environment image updated and startup wait removed for faster orchestration.

@hanabi1224 hanabi1224 added the RPC requires calibnet RPC checks to run on CI label Feb 9, 2026
@hanabi1224 hanabi1224 marked this pull request as ready for review February 9, 2026 10:49
@hanabi1224 hanabi1224 requested a review from a team as a code owner February 9, 2026 10:49
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and removed request for a team February 9, 2026 10:49
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 9, 2026

Walkthrough

Adds a configurable per-test retry mechanism to API compare: ApiCommands::Compare now includes n_retries: usize (Clap-exposed, default 2). run_tests is updated to spawn per-test tasks via tokio::task::JoinSet, apply exponential-backoff retries, and use a new evaluate_test_success helper for retry decisions.

Changes

Cohort / File(s) Summary
CLI Configuration
src/tool/subcommands/api_cmd.rs
Added n_retries: usize field to ApiCommands::Compare (Clap-exposed, default 2) and forwarded it into the compare test runner call.
Test Retry Logic
src/tool/subcommands/api_cmd/api_compare_tests.rs
Replaced FuturesUnordered with tokio::task::JoinSet; added per-test retry loop with exponential backoff; introduced evaluate_test_success helper; updated run_tests(..., n_retries) signatures and wired retry-aware success into reporting and dumping.
Imports / Helpers
src/tool/subcommands/api_cmd/api_compare_tests.rs
Added use tokio::task::JoinSet; and a new internal function evaluate_test_success.
Test infra config
scripts/tests/api_compare/.env, scripts/tests/api_compare/docker-compose.yml
Updated LOTUS_IMAGE tag and removed a wait loop after wallet import in docker-compose; no code-signature changes.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI (Compare cmd)
  participant Runner as TestRunner (run_tests)
  participant JoinSet as JoinSet (per-test tasks)
  participant Node as External Node (Forest/Lotus)
  participant Reporter as Reporter/Dump

  CLI->>Runner: invoke run_tests(n_retries)
  Runner->>JoinSet: spawn task per test
  loop per-test retry loop
    JoinSet->>Node: send RPC request
    Node-->>JoinSet: response (valid/timeout/rejected)
    JoinSet->>JoinSet: evaluate_test_success(response, criteria)
    alt success
      JoinSet-->>Runner: return success result
    else failure and retries remain
      JoinSet->>JoinSet: sleep (exponential backoff)
      JoinSet->>Node: retry RPC request
    else final failure
      JoinSet-->>Runner: return final failure result
    end
  end
  Runner->>Reporter: update report / optional dump per result
  Reporter-->>Runner: ack
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Possibly related PRs

  • chore: fix GetActorEventsRaw tests #6319 — touches src/tool/subcommands/api_cmd/api_compare_tests.rs and modifies evaluation logic for rejected test results; likely related to retry/evaluation behavior changes.

Suggested reviewers

  • LesnyRumcajs
  • akaladarshi
  • sudo-shashank
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding retry functionality to API compare tests.
Linked Issues check ✅ Passed Code changes implement the backoff retry mechanism for test execution as specified in issue #6563.
Out of Scope Changes check ✅ Passed All changes directly support the retry mechanism objective; environment and configuration updates align with the core implementation.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch hm/retry-api-compare-tests

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/tool/subcommands/api_cmd/api_compare_tests.rs (1)

2994-2994: Consider wrapping test_criteria_overrides in Arc to avoid per-task cloning.

Currently, the vector is cloned for each spawned task. Wrapping it in Arc<[TestCriteriaOverride]> before the loop would share the allocation:

♻️ Suggested improvement
+    let test_criteria_overrides: Arc<[TestCriteriaOverride]> = test_criteria_overrides.into();
     ...
     for test in tests.into_iter().unique_by(...) {
         ...
-        let test_criteria_overrides = test_criteria_overrides.to_vec();
+        let test_criteria_overrides = test_criteria_overrides.clone();

This is a minor optimization since the vector is typically small.


Comment @coderabbitai help to get the list of available commands and usage tips.

@hanabi1224 hanabi1224 added the RPC requires calibnet RPC checks to run on CI label Feb 9, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/tool/subcommands/api_cmd/api_compare_tests.rs`:
- Line 3021: The loop using while let Some(Ok((success, test, test_result))) =
futures.next().await swallows task JoinError by only matching Ok and exiting on
the first Err; change the loop to match Some(res) and explicitly handle
Err(join_err) by propagating it with context (e.g., using anyhow::Context or
map_err and a descriptive message) so failed test tasks return an error instead
of terminating silently, then continue processing or return the propagated error
as appropriate from the surrounding function that manages futures.

@codecov
Copy link

codecov bot commented Feb 9, 2026

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.34%. Comparing base (837ed77) to head (dd69e13).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/tool/subcommands/api_cmd.rs 0.00% 2 Missing ⚠️
Additional details and impacted files
Files with missing lines Coverage Δ
src/tool/subcommands/api_cmd.rs 0.00% <0.00%> (ø)

... and 5 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 837ed77...dd69e13. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/tool/subcommands/api_cmd/api_compare_tests.rs (1)

2920-2939: ⚠️ Potential issue | 🟡 Minor

Add a doc comment for public run_tests (document n_retries).
Public functions should be documented, and the new retry parameter’s semantics (retries vs total attempts) deserve a brief note.

✏️ Suggested doc comment
+/// Run API comparison tests.
+///
+/// `n_retries` is the number of retry attempts per test (total attempts = n_retries + 1).
 pub(super) async fn run_tests(
As per coding guidelines "Document all public functions and structs with doc comments".
🧹 Nitpick comments (1)
src/tool/subcommands/api_cmd/api_compare_tests.rs (1)

2991-3011: Release the semaphore permit before backoff sleeps to preserve concurrency.
Holding the permit across the sleep throttles unrelated tests; drop it before sleeping so only active attempts consume slots.

♻️ Suggested change
             let test_result = test.run(&forest, &lotus).await;
             let success = evaluate_test_success(&test_result, &test, &test_criteria_overrides);
             if success || n_retries_left == 0 {
                 return (success, test, test_result);
             }
+            drop(_permit);
             // Sleep before each retry
             tokio::time::sleep(Duration::from_secs(backoff_secs)).await;

@hanabi1224 hanabi1224 added this pull request to the merge queue Feb 10, 2026
Merged via the queue into main with commit 0e6c181 Feb 10, 2026
34 checks passed
@hanabi1224 hanabi1224 deleted the hm/retry-api-compare-tests branch February 10, 2026 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

RPC requires calibnet RPC checks to run on CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add backoff retry mechanism for head tipset synchronization between Forest and Lotus

2 participants