Skip to content

fix(server): set session to error state when sendTurn fails#1960

Open
sachssem wants to merge 1 commit intopingdotgg:mainfrom
sachssem:fix/reset-session-on-turn-start-failure
Open

fix(server): set session to error state when sendTurn fails#1960
sachssem wants to merge 1 commit intopingdotgg:mainfrom
sachssem:fix/reset-session-on-turn-start-failure

Conversation

@sachssem
Copy link
Copy Markdown

@sachssem sachssem commented Apr 12, 2026

What Changed

When sendTurnForThread fails inside ProviderCommandReactor, the catchCause handler now also calls setThreadSession to transition the session into a terminal state, in addition to the existing failure activity.

  • If an active provider binding exists: sets status: "error" (preserves binding for recovery).
  • If no binding exists (e.g. provider mismatch before session start, or a previously stopped session): sets status: "stopped" to avoid creating a synthetic session without a backing ProviderService directory entry.

Both paths set lastError so the client can detect the failure.

Single-file production change in ProviderCommandReactor.ts. Four regression tests added covering existing-session, null-session, stopped-session failure paths, and recovery after a provider-switch error.

Why

After the client dispatches thread.turn.start, the HTTP call succeeds so turnStartSucceeded = true. The client then waits for a server-side signal via hasServerAcknowledgedLocalDispatch, which checks for phase === "running", threadError, or a session snapshot change.

When sendTurn fails server-side, the existing handler only appended a failure activity — it never updated the session. If the session was already bound and unchanged, none of the acknowledgment conditions were met: phase stayed "ready", threadError was null, and the session snapshot was identical. This left isSendBusy permanently stuck, the send button disabled, and "Working for …" hanging until Cmd+R.

Previous attempts to fix this (#1348, #1393, #1542) added client-side workarounds. This change addresses the root cause on the server by ensuring the session always transitions to a detectable state on failure.

Validation

  • bun fmt — clean
  • bun lint — clean
  • 26/26 ProviderCommandReactor tests pass (4 new)
  • Full orchestration + provider suite: 300/300 pass

Checklist

  • This PR is small and focused
  • I explained what changed and why

Note

Medium Risk
Changes error-handling in ProviderCommandReactor to mutate persisted session state on turn-start failures, which can affect client behavior and session reuse/recovery paths. Covered by new regression tests across existing-session and no-session scenarios, reducing risk but still touching orchestration state transitions.

Overview
Ensures thread.turn-start-requested failures no longer only append a provider.turn.start.failed activity: the reactor now also calls setThreadSession to move the thread session into a detectable terminal state and populate lastError.

On failure, sessions with an existing non-stopped binding are marked status: "error" (preserving binding for recovery), while cases with no real binding (including previously stopped sessions) are marked status: "stopped" to avoid creating a synthetic bound session. Tests are updated/added to assert these statuses, lastError, and that a subsequent correct-provider turn can recover after a provider-switch failure.

Reviewed by Cursor Bugbot for commit aee8e69. Bugbot is set up for automated code reviews on this repo. Configure here.

Note

Set thread session to error state when sendTurn fails in ProviderCommandReactor

  • When a thread.turn.start fails, the reactor now updates the thread session: if an active session existed it transitions to status: 'error', otherwise it creates/updates a session with status: 'stopped'.
  • In both cases, activeTurnId is cleared and lastError is set to 'Provider turn start failed: …', alongside the existing provider.turn.start.failed activity.
  • New and updated tests in ProviderCommandReactor.test.ts cover the error, stopped, and recovery scenarios.
  • Behavioral Change: previously a failed turn start left the session in an ambiguous state; it now always reflects the failure explicitly.

Macroscope summarized aee8e69.

When sendTurn fails server-side after the client dispatch succeeds,
the catchCause handler only appended a failure activity without
updating the thread session. This left the client unable to detect
the failure: isSendBusy stayed true and the "Working for …" indicator
hung indefinitely because hasServerAcknowledgedLocalDispatch never
observed a phase, error, or snapshot change.

After appending the failure activity, transition the thread session
to reflect the failure. When an active provider binding exists, set
status to "error" so the binding is preserved for recovery. When no
binding exists (e.g. provider mismatch before session start, or a
previously stopped session), set status to "stopped" to avoid
creating a synthetic session that looks bound without a backing
ProviderService directory entry.

Both paths set lastError so the client observes threadError, resets
localDispatch, and re-enables the composer.

Adds regression tests for the existing-session, null-session, and
stopped-session failure paths, plus a recovery test confirming that
a subsequent turn succeeds after a provider-switch error.
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 12, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 069ae7c7-d689-477a-87a3-89bd89940caa

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot added size:M 30-99 changed lines (additions + deletions). vouch:unvouched PR author is not yet trusted in the VOUCHED list. labels Apr 12, 2026
@macroscopeapp
Copy link
Copy Markdown
Contributor

macroscopeapp bot commented Apr 12, 2026

Approvability

Verdict: Approved

This is a focused bug fix that properly transitions session state to 'error' when sendTurn fails, fixing a hanging UI indicator. The code change is self-contained within an existing error handler, and 85% of the diff is comprehensive test coverage for the new behavior.

You can customize Macroscope's approvability policy. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:M 30-99 changed lines (additions + deletions). vouch:unvouched PR author is not yet trusted in the VOUCHED list.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant