Skip to content

Surface silent parser content-capture gaps at ingest time #59

@willwashburn

Description

@willwashburn

Context

Discovered while investigating why ~99.7% of sessions in a local ledger fell into `burn waste`'s even-split attribution mode (see #58). Root cause: both `parseCodexSessionIncremental` and `parseOpencodeSessionIncremental` returned `content: []` unconditionally — a `TODO(#33-followup)` that went unaddressed for months while callers assumed content capture was universal. Nothing in ingest, no test, and no waste/summary command surfaced the gap; the symptom was a low-key footnote (`"39486/39587 sessions used even-split (no content sidecar)"`) that reads as an edge case, not a 99.7% degradation.

#58 fixes the parsers, but the shape of the bug — a parser silently returning empty content while advertising `contentMode === 'full'` — can recur for the next adapter (OpenHands, Crush, Kimi Code, etc., any of #27#36). We should detect and report it at ingest.

Proposed behavior

In `packages/cli/src/ingest.ts`, during each adapter's per-session ingest, check the post-parse result. If:

  1. `contentMode` is `'full'`, AND
  2. The session produced at least one turn that contains at least one tool call, AND
  3. The parser returned zero `ContentRecord`s of kind `'tool_result'` for that session

…then emit a structured warning, at most once per adapter per `burn` invocation (to avoid flooding stderr on a first ingest of a 40k-session backlog):

```
[burn] warning: codex parser produced 0 tool_result records for 47 sessions
with 312 tool calls. Content capture may not be implemented for this
adapter, so burn waste will fall back to even-split attribution. See #33.
```

Details:

  • Count sessions affected and tool calls observed during the walk; emit the single warning at the end of that adapter's ingest, not per-session.
  • Suppression: keep it to one line per adapter per invocation. Don't re-warn on an incremental ingest where no new tool-bearing sessions appeared.
  • Message must be actionable: name the adapter, point at Design: content sidecar store with retention and opt-out #33, and say what the downstream impact is (even-split attribution in `burn waste`).
  • Exit code unchanged (0). This is a warning, not a failure.

Why "tool_result" specifically

Other content kinds (`text`, `thinking`, `tool_use`) are nice-to-have and not currently consumed by any command except `quality.ts`'s assistant-text give-up detection. `tool_result` is the load-bearing kind for attribution. Scoping the warning to its absence keeps it actionable — firing on every missing `thinking` record would be noise, since many sessions legitimately have no reasoning.

What this is not

  • Not a hard error; `contentMode === 'hash-only'` or `'off'` is intentional and must stay silent.
  • Not per-session — a session with 0 tool calls legitimately produces 0 `tool_result` records even under `full`.
  • Not a retroactive check on already-ingested sessions; only on the current invocation's parse output. Historical diagnosis belongs in `burn diagnose`.

Implementation sketch

  1. Add `countToolCallsWithoutResults(turns, content): number` helper in `ingest.ts` (or move to `@relayburn/reader` if it wants to live with the types).
  2. In each of `ingestClaudeInto` / `ingestCodexInto` / `ingestOpencodeInto`, accumulate:
    • `affectedSessions: number`
    • `orphanToolCalls: number` (tool calls in committed turns without a matching `tool_result` ContentRecord)
  3. After the loop, if `contentMode === 'full'` and `affectedSessions > 0`, emit one formatted warning line.

Aggregation can also be surfaced in `burn diagnose` as a permanent state report rather than a per-invocation warning. Both layers are useful; consider shipping the warning first and then wiring the aggregate count into `diagnose` as a separate follow-up.

Acceptance

  • `pnpm dev:cli claude` (or equivalent) prints the warning on a ledger where codex/opencode have tool calls but no `tool_result` records.
  • No warning fires when `RELAYBURN_CONTENT_STORE=hash-only` or `off`.
  • No warning when an adapter's sessions have zero tool calls (e.g. chat-only sessions).
  • Warning text names the adapter, the session/tool-call counts, and references Design: content sidecar store with retention and opt-out #33.
  • Regression test under `packages/cli/src/ingest.test.ts` (or new `diagnose` tests) using a synthetic adapter whose "parser" returns `{ turns: [...tool calls...], content: [] }`.

Out of scope

  • A pluggable capability registry ("adapter X supports content kinds Y"). That's overkill until we have ≥ 4 adapters with mixed support. For now, count-based inference at ingest time is sufficient.
  • Failing the ingest outright. The purpose is visibility, not gating.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions