fix(browse): skip parent-process watchdog in headed and tunnel modes#869
Open
zzyyfff wants to merge 5 commits intogarrytan:mainfrom
Open
fix(browse): skip parent-process watchdog in headed and tunnel modes#869zzyyfff wants to merge 5 commits intogarrytan:mainfrom
zzyyfff wants to merge 5 commits intogarrytan:mainfrom
Conversation
The watchdog from garrytan#808 polls BROWSE_PARENT_PID every 15s and calls shutdown() when the parent is gone. But BROWSE_PARENT_PID is set to the CLI's process.pid, which exits within milliseconds of spawning the server. In headed mode this kills the browser window after exactly 15s on every platform (reported as garrytan#867). Three guards, mirroring the idle timer: 1. Skip in headed mode (user controls the browser lifetime) 2. Skip in tunnel mode (remote agents reconnect sporadically) 3. Skip if lastActivity is within WATCHDOG_GRACE_MS (30s), meaning a client sent a command recently through a new CLI invocation Headless mode is unchanged in practice: the CLI auto-restarts the server on the next call, so the watchdog just acts as faster garbage collection for idle headless servers. The grace period prevents unnecessary cold-start restarts during rapid command sequences. Adds 6 source-level regression tests verifying the guards exist and are ordered correctly relative to the kill/shutdown calls. Fixes garrytan#867 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ommands The `domains` variable is defined in `handlePairAgent()` but was referenced in the connect and disconnect command handlers, causing `ReferenceError: domains is not defined` and preventing headed mode from launching. Looks like a merge artifact from the pair-agent body format being copied into unrelated request bodies. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rom ESRCH Two issues found during review: 1. The watchdog setInterval handle was never stored, so shutdown() could not clear it. The callback could fire during teardown against partially-torn-down state. Now stored as watchdogInterval and cleared alongside the other intervals in shutdown(). 2. The catch block treated all process.kill(pid, 0) errors as "parent is dead." But EPERM means the process exists, we just lack permission to signal it. Now checks err.code and skips shutdown for EPERM, only acting on ESRCH (process not found). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Restore newtab exclusion from tab ownership check (server.ts:954). Scoped tokens with own-only tabPolicy were getting 403 on newtab because the ownership check ran before the newtab handler. Confirmed by Claude structured review, Claude adversarial, and Codex. - Flip EPERM→ESRCH logic in watchdog catch block. Only treat ESRCH (process doesn't exist) as "parent gone". All other error codes (EPERM, EINVAL, platform-specific) are treated as "parent alive". - Extract shouldSuppressAutoShutdown() shared guard. Both idle timer and watchdog now call the same function, eliminating guard duplication and ensuring future connection modes only need one update. - Restore BROWSE_PARENT_PID=0 for pair-agent connect subprocess. Prevents env variable leak from parent shell into the server, which could cause premature termination during headed handoff. - Restore newtab and BROWSE_PARENT_PID=0 regression tests in server-auth.test.ts. - Expand watchdog.test.ts: 14 tests covering source-level guards, shared function structure, behavioral logic extraction (ESRCH-only, grace period direction, headed/tunnel suppression), and shutdown cleanup. Replaces 6 pattern-only tests. - Add CHANGELOG entry for v0.15.15.1, bump VERSION. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Merge origin/main into fix/watchdog-headed-crash. Conflicts in CHANGELOG.md (our improved entry supersedes main's) and server-auth.test.ts (keep all 3 regression tests: domains, BROWSE_PARENT_PID=0, and newtab). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The parent-process watchdog from #808 (shipped in v0.15.13.0) crashes headed Chromium after exactly 15 seconds on every platform.
BROWSE_PARENT_PIDis set to the CLI'sprocess.pid, which exits within milliseconds of spawning the server. On the first 15s poll, the watchdog finds the parent dead and callsshutdown(). The idle timer correctly skips headed mode (line 730); the watchdog did not.shouldSuppressAutoShutdown()guard used by both idle timer and watchdog:lastActivityis within 30s grace period (prevents unnecessary cold-start restarts during rapid headless CLI calls)Additional fixes from cross-model review (Claude + Codex)
command !== 'newtab'guard in the tab ownership check was accidentally removed. Scoped tokens withown-onlytabPolicy would get 403 on newtab, breaking multi-agent tab isolation. Guard restored with regression test.ESRCHas "parent gone". All other error codes (EPERM,EINVAL, platform-specific) are treated as "parent alive", preventing false shutdowns on containers and unusual platforms.BROWSE_PARENT_PID=0to prevent the parent shell's value from leaking into the new server and causing premature termination during the headed handoff.shouldSuppressAutoShutdown()replaces duplicate headed/tunnel checks in both the idle timer and watchdog. Future connection modes only need one update.Test plan
$B connectbrowser stays open past 15sReview coverage
/codex challenge/review(Claude structured)/review(Claude adversarial subagent)Cross-model agreement: All three sources independently identified the newtab regression as the highest-priority fix.
Fixes #867
Co-Authored-By: Claude Sonnet 4.6 noreply@anthropic.com