@trevwilson
Contributor

Summary

  • Fixes console window appearing on Windows when the worker service spawns Claude subprocess via Agent SDK
  • Uses spawnClaudeCodeProcess option to wrap spawn with windowsHide: true

Relevant Issues

Fixes #304 (the visible 'claude' titled console window). PR #309 may prevent other utility windows from showing, but applying its fixes locally showed that they don't prevent the 'claude' window, which persists for the duration of every response. Still investigating the causes of the visible 'node' and 'uvx' windows.

Details

On Windows, the Claude Agent SDK spawns a visible console window when calling query() for observation processing. This creates a poor UX for background services.

The fix uses spawnClaudeCodeProcess (found in SDK type definitions) to provide a custom spawn function that passes windowsHide: true to Node's spawn(). This is a platform-safe option (no-op on non-Windows).
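
A minimal sketch of the wrapper described above, assuming the SDK hands the custom spawn function an object carrying the command, arguments, working directory, and environment. The SpawnRequest interface and both function names are hypothetical and only illustrate the idea; the real option type is whatever the SDK's type definitions declare.

```typescript
import { spawn } from "node:child_process";
import type { ChildProcess, SpawnOptions } from "node:child_process";

// Hypothetical shape of the argument the SDK passes to a custom spawn function.
// The actual type is declared in the SDK's .d.ts and may differ.
interface SpawnRequest {
  command: string;
  args: string[];
  cwd?: string;
  env?: NodeJS.ProcessEnv;
}

// Build Node spawn options with the console window suppressed.
// windowsHide has no effect on macOS/Linux, so no platform check is needed.
function hiddenSpawnOptions(req: SpawnRequest): SpawnOptions {
  return {
    cwd: req.cwd,
    env: req.env,
    windowsHide: true,
  };
}

// A custom spawn function of the kind passed as spawnClaudeCodeProcess.
function spawnHidden(req: SpawnRequest): ChildProcess {
  return spawn(req.command, req.args, hiddenSpawnOptions(req));
}
```

Passing only the fields Node's spawn() understands, rather than spreading the whole request into the options object, keeps unrelated keys such as command and args out of the spawn options.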

API Stability Note

spawnClaudeCodeProcess is typed but not documented in the SDK's web documentation. We're relying on it as an escape hatch with the understanding that:

  1. Failure mode is benign - If Anthropic removes or changes this option:

    • TypeScript will error at build time (we'll know)
    • At runtime, unknown options are silently ignored → falls back to SDK's internal spawn → console window reappears (same as before this fix)
    • No crashes or data loss
  2. Upstream issue filed - See anthropics/claude-agent-sdk-typescript#103 requesting they expose windowsHide as a first-class option. If adopted, we could replace spawnClaudeCodeProcess with a stable API (or remove the workaround entirely if they default to true)

  3. Maintainer discretion - This fix improves Windows UX today at the cost of depending on an undocumented API. If the maintainer prefers to wait for an upstream fix, this PR can be closed.

Test plan

  • Verified on Windows 11 - console window no longer appears during SDK subprocess execution
  • Build succeeds
  • Worker restarts successfully
  • Verify no regression on macOS/Linux (option is ignored on non-Windows)

🤖 Generated with Claude Code

On Windows, the Claude Agent SDK spawns a visible console window when
calling query(). Use the spawnClaudeCodeProcess option to wrap spawn()
with windowsHide: true, preventing the window from appearing.

The spawnClaudeCodeProcess option is declared in the SDK types at:
@anthropic-ai/claude-agent-sdk/entrypoints/agentSdkTypes.d.ts

Fixes console window appearing on every user message for Windows users.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@thedotmack
Owner

I posted this to our discord:

@here anyone interested in working on the windows background windows bug? there's a possible solution in the PRs but I don't want to merge until someone else on windows validates it. https://github.com/thedotmack/claude-mem/pull/315

I will merge this ASAP, but after five attempts at fixing this issue while only being able to properly test on Mac / Linux, I'm going to wait until someone else can validate this fix as well.

This seems like the actual real solution. windowsHide should have worked before, but you seem to have detailed insight into how it works from inside the API code, so I think this should work! And I'm sure it works on your machine. Just getting it double-checked before merge.

Thanks! :)

@ToxMox

ToxMox commented Dec 15, 2025

I worked on my own fix for the blank consoles and resolved it for myself locally but decided to test this PR. Here is my result:

PR #315 Windows Testing Feedback

Summary

Tested PR #315 on Windows 11. The windowsHide: true approach does not work on Windows when combined with detached: true. Blank terminal windows still appear.

Root Cause

This is a known Node.js limitation documented in Node.js issue #21825:

windowsHide: true is ignored when detached: true is also set.

The behavior appears to persist in Bun as well, presumably because Bun mirrors Node.js process-spawning semantics.

Tested Code (PR #315)

// ProcessManager.ts - Does NOT hide window on Windows
const child = spawn(bunPath, [script], {
  detached: true,
  stdio: ['ignore', 'pipe', 'pipe'],
  env: { ...process.env, CLAUDE_MEM_WORKER_PORT: String(port) },
  cwd: MARKETPLACE_ROOT,
  ...(isWindows && { windowsHide: true })  // <-- Ignored when detached: true
});

Working Solution

Use PowerShell Start-Process -WindowStyle Hidden on Windows:

// ProcessManager.ts - Working fix
private static async startWithBun(script: string, logFile: string, port: number): Promise<{ success: boolean; pid?: number; error?: string }> {
  const bunPath = getBunPath();
  if (!bunPath) {
    return {
      success: false,
      error: 'Bun is required but not found in PATH or common installation paths. Install from https://bun.sh'
    };
  }

  try {
    const isWindows = process.platform === 'win32';

    if (isWindows) {
      // Windows: Use PowerShell Start-Process with -WindowStyle Hidden
      // This properly hides the console window (Node.js bug #21825 workaround)
      const envVars = `$env:CLAUDE_MEM_WORKER_PORT='${port}'`;
      const psCommand = `${envVars}; Start-Process -FilePath '${bunPath}' -ArgumentList '${script}' -WorkingDirectory '${MARKETPLACE_ROOT}' -WindowStyle Hidden -PassThru | Select-Object -ExpandProperty Id`;

      const result = spawnSync('powershell', ['-Command', psCommand], {
        stdio: 'pipe',
        timeout: 10000,
        windowsHide: true
      });

      if (result.status !== 0) {
        return {
          success: false,
          error: `PowerShell spawn failed: ${result.stderr?.toString() || 'Unknown error'}`
        };
      }

      const pid = parseInt(result.stdout.toString().trim(), 10);
      if (isNaN(pid)) {
        return { success: false, error: 'Failed to get PID from PowerShell' };
      }

      // Write PID file
      this.writePidFile({
        pid,
        port,
        startedAt: new Date().toISOString(),
        version: process.env.npm_package_version || 'unknown'
      });

      return this.waitForHealth(pid, port);
    }

    // Unix: Standard spawn works fine
    const child = spawn(bunPath, [script], {
      detached: true,
      stdio: ['ignore', 'pipe', 'pipe'],
      env: { ...process.env, CLAUDE_MEM_WORKER_PORT: String(port) },
      cwd: MARKETPLACE_ROOT
    });

    // Write logs
    const logStream = createWriteStream(logFile, { flags: 'a' });
    child.stdout?.pipe(logStream);
    child.stderr?.pipe(logStream);

    child.unref();

    if (!child.pid) {
      return { success: false, error: 'Failed to get PID from spawned process' };
    }

    // Write PID file
    this.writePidFile({
      pid: child.pid,
      port,
      startedAt: new Date().toISOString(),
      version: process.env.npm_package_version || 'unknown'
    });

    return this.waitForHealth(child.pid, port);
  } catch (error) {
    return {
      success: false,
      error: error instanceof Error ? error.message : String(error)
    };
  }
}
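
The PowerShell command above interpolates paths directly into single-quoted strings, which would break on a path containing a single quote, and a script path containing spaces would be split into several arguments by Start-Process. A hedged sketch of how the command string could be built more defensively follows; psQuote and buildHiddenStartCommand are hypothetical helper names, not part of the posted fix.

```typescript
// Quote a value as a PowerShell single-quoted string literal.
// Inside single quotes, PowerShell reads '' as one literal quote character.
function psQuote(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Build the Start-Process command that launches a hidden worker and prints its PID.
function buildHiddenStartCommand(
  exe: string,
  script: string,
  cwd: string,
  port: number,
): string {
  // Embed double quotes around the script path so a path with spaces
  // reaches the child process as a single argument.
  const argument = psQuote(`"${script}"`);
  return [
    `$env:CLAUDE_MEM_WORKER_PORT=${psQuote(String(port))}`,
    `Start-Process -FilePath ${psQuote(exe)} -ArgumentList ${argument} ` +
      `-WorkingDirectory ${psQuote(cwd)} -WindowStyle Hidden -PassThru ` +
      `| Select-Object -ExpandProperty Id`,
  ].join("; ");
}
```

The resulting string would be passed to spawnSync('powershell', ['-Command', ...]) exactly as in the code above.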

SDK Subprocess Window (SDKAgent.ts)

The same issue applies to the SDK subprocess spawn. The spawnClaudeCodeProcess option in PR #315:

spawnClaudeCodeProcess: (opts) => spawn(opts.command, opts.args, { ...opts, windowsHide: true })

This may also need the PowerShell approach, or alternatively use windowsHide: true without detached: true if detachment isn't required for the SDK subprocess.
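
The choice between the two approaches can be written down as a small decision function. This is only a sketch of the reasoning in this comment, based on the behavior observed during this testing; chooseWindowStrategy and the strategy names are hypothetical.

```typescript
// Which mechanism to use for keeping a child process window hidden.
type WindowStrategy = "plain" | "windowsHide" | "powershell";

// Encodes the rule of thumb from the testing above:
// - non-Windows platforms need nothing special
// - on Windows, windowsHide is only relied on when the child is not detached
// - a detached child on Windows falls back to PowerShell Start-Process
function chooseWindowStrategy(platform: string, needsDetached: boolean): WindowStrategy {
  if (platform !== "win32") return "plain";
  return needsDetached ? "powershell" : "windowsHide";
}
```

A caller would typically pass process.platform as the first argument.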

Additional Finding: SDK Subprocess Hangs

During testing, I also discovered that the SDK subprocess can hang indefinitely. When this happens:

  1. The abort() call from the AbortController doesn't terminate the subprocess
  2. The for await loop blocks forever
  3. Observation processing stops

Fix: Added a watchdog timer that kills child processes before calling abort:

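import { execSync } from 'child_process';

// Kill the direct children of this process so a hung SDK subprocess cannot block the queue.
function killChildProcesses(): void {
  const isWindows = process.platform === 'win32';

  try {
    if (isWindows) {
      // Note: wmic is deprecated and may be missing on recent Windows 11 builds;
      // in that case this call throws and is swallowed by the catch below.
      execSync(`wmic process where "ParentProcessId=${process.pid}" delete`, {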
        stdio: 'ignore',
        windowsHide: true,
        timeout: 5000
      });
    } else {
      execSync(`pkill -P ${process.pid}`, {
        stdio: 'ignore',
        timeout: 5000
      });
    }
  } catch (error) {
    // Ignore - child may already be dead
  }
}

// In the watchdog timeout handler:
const resetWatchdog = () => {
  if (watchdogTimer) clearTimeout(watchdogTimer);
  watchdogTimer = setTimeout(() => {
    logger.error('SDK', 'Query timeout - no response received, killing children and aborting');
    killChildProcesses();  // Kill subprocess FIRST
    session.abortController.abort();  // Then signal abort
  }, SDK_QUERY_TIMEOUT_MS);  // 2 minutes
};
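
The same watchdog pattern can be packaged as a small reusable class. This is a sketch under the assumption that the caller resets the timer on every message received from the SDK stream; the Watchdog class and its method names are hypothetical and not taken from the fix above.

```typescript
type TimerHandle = ReturnType<typeof setTimeout>;

// Fires onTimeout if reset() is not called again within timeoutMs.
class Watchdog {
  private timer: TimerHandle | undefined = undefined;
  private readonly timeoutMs: number;
  private readonly onTimeout: () => void;

  constructor(timeoutMs: number, onTimeout: () => void) {
    this.timeoutMs = timeoutMs;
    this.onTimeout = onTimeout;
  }

  // Call whenever the monitored stream makes progress.
  reset(): void {
    this.stop();
    this.timer = setTimeout(() => {
      this.timer = undefined;
      this.onTimeout();
    }, this.timeoutMs);
  }

  // Call once the stream finishes so the timer cannot fire afterwards.
  stop(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
  }

  // True while a timeout is pending.
  get armed(): boolean {
    return this.timer !== undefined;
  }
}
```

In the for await loop, reset() would be called before the loop and on each message, with stop() in a finally block; the onTimeout callback would kill the child processes and then call abort(), in that order.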

Bun Zombie Socket Issue on Windows

When the worker process terminates on Windows, Bun leaves TCP sockets in LISTEN state. The port remains bound even though no process owns it. This happens regardless of termination method (process.exit(), external kill, Ctrl+C).

Symptoms:

  • Get-NetTCPConnection -LocalPort 37777 shows LISTEN state
  • OwningProcess is 0 or points to a dead PID
  • New worker cannot bind to the port: EADDRINUSE
  • Only a system reboot clears the zombie

Related Bun Issues:

Workarounds:

  1. Reboot - Only guaranteed way to clear zombie sockets
  2. Switch to Node.js - Run the worker under Node instead of Bun (no zombie issue)

Recommendation for claude-mem: Consider switching the worker runtime from Bun to Node.js for Windows stability, or accept that users may need to reboot to clear zombie ports.
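
To detect the EADDRINUSE symptom before starting a new worker, the port can be probed by briefly binding to it. This is a minimal sketch using Node's net module; isPortFree is a hypothetical helper, and the 127.0.0.1 default host is an assumption about where the worker listens.

```typescript
import { createServer } from "node:net";

// Resolves true if the port can be bound on the given host,
// false if binding fails with EADDRINUSE, and rejects on any other error.
function isPortFree(port: number, host: string = "127.0.0.1"): Promise<boolean> {
  return new Promise((resolve, reject) => {
    const server = createServer();
    server.once("error", (err: NodeJS.ErrnoException) => {
      if (err.code === "EADDRINUSE") resolve(false);
      else reject(err);
    });
    server.once("listening", () => {
      // Release the port again before reporting it as free.
      server.close(() => resolve(true));
    });
    server.listen(port, host);
  });
}
```

A probe like this only reports the symptom. It cannot clear a zombie socket, so the workarounds above still apply when it returns false and no live process owns the port.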

Test Environment

  • Windows 11 Pro
  • Bun 1.3.4
  • Node.js 22.x
  • Claude Code CLI

Recommendation

  1. Use PowerShell Start-Process -WindowStyle Hidden for worker spawn on Windows
  2. Consider whether SDK subprocess needs detached: true - if not, windowsHide: true alone may work
  3. Add watchdog + child process killing for SDK subprocess timeout recovery

@trevwilson
Contributor Author


Sounds good! Looking back at #304 it mentions multiple 'claude' windows spawning, whereas I was consistently just seeing one that persisted for the entirety of the time claude was processing and responding to a message. So there very well could be environment/config differences that make this not the complete fix, and more verification is absolutely welcome.

Here's my environment details:

| Component | Version |
|---|---|
| OS | Windows 11 Home (Build 26100 / 24H2) |
| Node.js | v25.2.1 |
| Bun | 1.3.4 |
| Claude Code | 2.0.69 (CLI) |

@ToxMox

ToxMox commented Dec 15, 2025

Now that I think about it, it's possible the PR actually solves the popup issue, but the fix also needs to be applied to the SDK subprocess spawn (spawnClaudeCodeProcess). I'm testing with subagents running, so maybe the worker now hides properly but the subagents don't yet. I'll test and report back.

@ToxMox

ToxMox commented Dec 15, 2025

Yeah, so trying this PR's fixes with the subagent stuff didn't fix my popup issues. Both the worker and agent spawns generate the blank terminals, so my PowerShell fixes seem to be needed for me. I'm currently working on a PR that fixes the popups, switches the worker from Bun to Node to stop the zombie ports, adds recovery for stuck messages in the queue, and adds UI components to the web interface for managing the queue. I've never submitted a PR to a public repo before, so we'll see how that goes lol. I'll probably submit it tomorrow after I've thoroughly tested everything.

@trevwilson
Contributor Author

trevwilson commented Dec 15, 2025

@ToxMox Yeah I think you're right. I don't use any subagents, so I wasn't triggering worker startup on a regular basis. This PR only affects the windows coming from the Agent SDK calls.

Edit: Finally got a chance to test and confirm that the powershell changes from above fix the issue I was targeting, without the need for changing the SDKAgent call. Closing this PR and would suggest going with ToxMox's more robust solution which hides the 'uvx' and 'node' windows, as well as at least 1 flashing cmd window that was coming from orphan cleanup attempts.

@trevwilson trevwilson closed this Dec 15, 2025
ToxMox added a commit to ToxMox/claude-mem that referenced this pull request Dec 15, 2025
WINDOWS FIXES (addresses issues from PR thedotmack#315):

Zombie Socket Issue:
- Switch worker from Bun to Node.js runtime
- Bun left TCP sockets in LISTEN state after process termination on Windows
- Users couldn't restart workers without rebooting the system
- Node.js properly releases sockets on exit

Blank Terminal Popups:
- Use PowerShell Start-Process with -WindowStyle Hidden on Windows
- Node.js bug #21825: windowsHide:true is ignored when detached:true
- PowerShell workaround properly hides background worker windows

SQLite Compatibility:
- Add better-sqlite3 compatibility layer for Node.js
- Replaces bun:sqlite which is Bun-specific
- Same API surface, works on both runtimes

QUEUE MONITORING UI:

Background: The message queue could hang indefinitely when SDK subprocesses
stalled during observation processing. AbortController.abort() didn't
reliably terminate stuck subprocesses, causing queue deadlock.

Solution:
- Real-time queue drawer showing pending/processing/failed messages
- Agent activity status indicator per session
- Self-healing: auto-reset stuck processing messages when no active agent
- Auto-restart SDK agent generator after self-healing
- Batch message completion tracking (fixes SDK message batching)
- Manual retry/abort controls and session recovery button
- Watchdog service for crash recovery
- Debug API endpoint (/api/debug/agent-log) for diagnostics
- Persistent message queue with SQLite (survives crashes)

Includes design docs in docs/plans/.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
ToxMox added a commit to ToxMox/claude-mem that referenced this pull request Dec 16, 2025
ToxMox added a commit to ToxMox/claude-mem that referenced this pull request Dec 16, 2025
Development

Successfully merging this pull request may close these issues.

Multiple visible console windows popping up (Windows 11, claude code in terminal)
