fix(v2): three project-knowledge ship-blockers by dean0x · Pull Request #182 · dean0x/devflow

dean0x · 2026-04-14T15:06:19Z

Summary

Three v2.0.0 ship-blocker fixes for the project knowledge subsystem, bundled as one PR:

Fix 3 — Widen legacy-knowledge purge (new v3 migration). The v2 migration only removed 4 hardcoded IDs, leaving 7 of 10 seeded entries in upgraded projects. New purge-legacy-knowledge-v3 sweeps all pre-v2 ADR/PF sections using - **Source**: self-learning: as a format discriminator — any section lacking the marker is removed. User-edited self-learning:manual_xxx entries survive (opt-in preservation).
Fix 2 — Self-healing reconciler. Closes the render-ready crash window (json-helper.cjs:1305 writes file before log/manifest). At next session start, reconcile-manifest now scans knowledge files for anchors missing from the manifest; if exactly one status: 'ready' log observation matches, it upgrades to status: 'created', reconstructs the manifest entry, and registers usage. Zero matches = user-curated (no-op); multiple = ambiguous, silent skip (D-D). Adds healed counter to reconcile output. Also requires - **Source**: self-learning: marker on the file section to prevent false-positive heals against pre-v2 entries (scrutiny fix).
Fix 1 — /resolve reads & cites project knowledge. Orchestrator reads decisions.md + pitfalls.md per worktree, strips Deprecated/Superseded sections (D-A), passes as KNOWLEDGE_CONTEXT to each Resolver. Resolvers cite applies ADR-NNN / avoids PF-NNN inline in Reasoning cells (verbatim-only constraint prevents hallucination). Phase 5 aggregates citations into ## Knowledge Citations at the top of resolution-summary.md (D-B). Applied to three surfaces: resolve.md, resolve-teams.md, resolve:orch/SKILL.md.

Docs (CLAUDE.md, docs/self-learning.md, CHANGELOG.md) updated in sync (D-C).

Implementation notes

14 files changed, 9 commits (TDD: test-first + feat pairs per fix, then docs, simplifier, scrutinizer).
+65 tests (21 purge + 9 reconcile heal + 36 knowledge-citation + 5 migration registry) — 943 total passing, zero regressions.
Out of scope (deferred): render-ready two-phase-commit rewrite (v2.1); self-heal is sufficient coverage for the crash window.
Engineering principles: Result types on fallible ops, atomic writes via writeFileAtomicExclusive, reused existing lock protocol (acquireMkdirLock), no new CLI surface (v3 runs invisibly on devflow init).

Test plan

npm run build — clean
npm test — 943/943 passing
Simplifier pass
Scrutinizer 9-pillar review (3 fixes applied: marker requirement on heal, migration ID corrections, (none) literal guard)
Evaluator alignment check (all 4 decisions D-A/B/C/D honored, three resolve surfaces byte-identical for Step 0d)
Tester scenario QA (14/14 scenarios: mixed-file purge, crash-window heal, ambiguity guard, pre-v2 marker regression, structural checks on all four resolve surfaces, edge cases)
Smoke: stage a knowledge file with mixed pre-v2/self-learning entries, re-run devflow init, verify v3 purge result
Smoke: stage crash-window state (PF-NNN in pitfalls.md, status=ready in log, missing from manifest), start new session, verify heal
Smoke: run /resolve on a branch with seeded knowledge; verify citations appear inline and in summary header

RED phase — new describe block in legacy-knowledge-purge.test.ts covers the format-discriminator approach (self-learning: source marker) and TOCTOU hardening. Migration registry tests in migrations.test.ts assert v3 entry presence, ordering, description, and independent execution when v2 is already applied. Part of v2.0.0 ship-blocker rollup.

GREEN phase — adds purgeAllPreV2Knowledge() to legacy-knowledge-purge.ts using a format discriminator: any ## ADR-NNN: or ## PF-NNN: section lacking "- **Source**: self-learning:" is pre-v2 seeded content and is removed. Fixes the gap where the v2 migration's hardcoded allow-list only covered 4 of 10 seeded entries, leaving 7 in upgraded projects. Also adds MIGRATION_PURGE_LEGACY_KNOWLEDGE_V3 to migrations.ts registry (per-project scope, runs independently of v2 on each devflow init). v2 function and migration entry are untouched — already-applied migrations remain stable. Part of v2.0.0 ship-blocker rollup.

8 new tests in reconcile-manifest — self-heal (Fix 2) describe block: - anchor in file + ready log entry + missing manifest → status=created, manifest reconstructed - anchor in file, no matching log entry → no-op (user-curated) - anchor heading does not match any ready log pattern → no-op - multiple log entries match same anchor → no-op silently (D-D ambiguity guard) - pitfalls.md scanned with PF- prefix - multiple anchors healed in a single reconcile pass - registerUsageEntry called with correct anchorId - result JSON always includes healed counter (including zero case) Co-Authored-By: Claude <noreply@anthropic.com>

…ow (Fix 2) Closes the crash window between render-ready writing the knowledge file (~line 1305) and updating the log/manifest (~lines 1337/1340). If the process dies in that window, the anchor exists in decisions.md/pitfalls.md but the log still shows status=ready and the manifest has no entry — next render-ready re-appends a duplicate. Changes: - Add findUnmanagedAnchors() helper: scans decisions.md + pitfalls.md for ADR-NNN / PF-NNN anchors not tracked in the manifest. Uses literal regexes (not dynamic) to avoid ReDoS surface (Snyk-clean for new code). - Add heal block in reconcile-manifest handler: for each unmanaged anchor, find a matching ready log observation by normalized pattern. If exactly one match (D-D ambiguity guard), upgrade status=created, reconstruct manifest entry, call registerUsageEntry(), increment healed counter. - Add healed counter to all result JSON shapes (main path + early-return paths). - Strip anchorId in sectionRe construction via safeAnchorId for defense-in-depth. No new locks: heal block runs inside the existing .learning.lock already held by reconcile-manifest. registerUsageEntry reads/writes .knowledge-usage.json directly (same as render-ready, which also skips the knowledge-usage lock). Co-Authored-By: Claude <noreply@anthropic.com>

Tests assert structural invariants across all three resolve surfaces (base command, teams variant, ambient orchestration skill) and resolver agent: KNOWLEDGE_CONTEXT propagation, D-A filtering instructions, D-B citation aggregation, and Apply Knowledge section constraints. Also unit-tests the Deprecated/Superseded filter algorithm directly. Co-Authored-By: Claude <noreply@anthropic.com>

Adds KNOWLEDGE_CONTEXT propagation across all three resolve surfaces (base command, teams variant, ambient orchestration skill) and the shared Resolver agent: - resolve.md: Step 0d reads decisions.md + pitfalls.md per worktree, strips Deprecated/Superseded sections (D-A), passes KNOWLEDGE_CONTEXT to Phase 4 Resolver spawn; Phase 5 extracts citations; Phase 8 output artifact gains ## Knowledge Citations section (D-B) - resolve-teams.md: identical Step 0d + teammate prompt KNOWLEDGE_CONTEXT variable + Phase 5 citation extraction + ## Knowledge Citations artifact - resolve:orch SKILL.md: Phase 1.5 loads knowledge for ambient single-project context; Phase 4 passes KNOWLEDGE_CONTEXT; Phase 6 includes citations section - resolver.md: KNOWLEDGE_CONTEXT declared optional in Input Context; Apply Knowledge section added with ADR/PF citation format and explicit hallucination guard ("Cite only IDs that appear verbatim — do not fabricate") Co-Authored-By: Claude <noreply@anthropic.com>

…p-blocker fixes Documents three v2.0.0 fixes already implemented on this branch: - Fix 1: /resolve reads decisions.md + pitfalls.md and cites ADR/PF IDs inline; aggregates Knowledge Citations section in resolution-summary.md - Fix 2: reconcile-manifest self-heals render-ready crash-window duplicates; adds healed counter to output - Fix 3: purge-legacy-knowledge-v3 migration removes all pre-v2 seeded entries via source-marker discriminator, replacing the v2 hardcoded allow-list Co-Authored-By: Claude <noreply@anthropic.com>

Eliminate the tldrUpdated intermediate variable in purgeAllPreV2Knowledge by changing const to let for updatedContent, matching the established pattern in purgeLegacyKnowledgeEntries. Removes naming asymmetry between the two functions.

Three findings from the 9-pillar self-review of the ship-blocker fix branch: MEDIUM (Fix 2 robustness): findUnmanagedAnchors now requires the "- **Source**: self-learning:" marker on each candidate section. Pre-v2 seeded entries lack this marker; without the check, a current ready obs whose pattern normalises to an old seed heading (e.g., "use result types everywhere") could be falsely paired with the seeded anchor ID. The v3 migration would later delete that seeded entry, leaving the manifest pointing at a missing anchor and deprecating the obs unfairly. Added regression test "heal: pre-v2 anchor lacking self-learning source marker → no-op even when log obs would match". LOW (Fix 1 robustness): resolver.md Apply Knowledge now explicitly guards against the literal "(none)" value that the orchestrator emits when knowledge files are absent or empty. Previous wording ("if KNOWLEDGE_CONTEXT is non-empty") was strictly true for "(none)" as a string and could lead an LLM to scan a no-op marker. LOW (docs accuracy): CLAUDE.md and docs/self-learning.md referred to the v2 migration as "purge-legacy-knowledge" (no -v2 suffix). The actual registry ID is "purge-legacy-knowledge-v2". Both occurrences corrected. CLAUDE.md self-heal description updated to mention the marker requirement; self-learning.md heal step 1 updated similarly. Tests: 943 → 944 passing (one new regression test).

dean0x · 2026-04-14T15:16:02Z

Review Summary: 4 Blocking Issues Found

Based on comprehensive code review across 9 specialized reviewers, I found 4 blocking issues that must be addressed before merge. All are ≥80% confidence and require architectural or correctness fixes.

1. Architecture (88%): Heal path skipped when manifest doesn't exist

File: scripts/hooks/json-helper.cjs:1401

The reconcile-manifest handler exits early when the manifest file is missing, skipping the heal block entirely. This creates a scenario gap for the first-ever render-ready crash: when render-ready crashes between writing the knowledge file and writing the manifest, the next session's reconcile sees no manifest and skips healing. The subsequent render-ready run then re-renders the same observation, producing the duplicate anchor Fix 2 is supposed to prevent.

Required fix: Allow reconcile to run even without a manifest. Construct an empty in-memory manifest when the log file exists but the manifest doesn't, so the heal block can still pair orphan anchors with ready observations. Add a regression test for this first-render scenario.

2. Complexity (88%): Reconcile-manifest handler exceeds ALL HIGH thresholds

File: scripts/hooks/json-helper.cjs:1395-1534 (140 lines)

The case handler now hits three HIGH-severity complexity criteria simultaneously:

Length: 140 lines (HIGH threshold: 50-200)
Cyclomatic complexity: ~22 (HIGH threshold: 10-20)
Nesting depth: 4 (HIGH threshold: 4-6)

The handler weaves four distinct phases (validation, deletion-and-edit, heal, commit/release) into a single function body, making it impossible to unit-test heal logic independently or reason about phase ordering.

Required fix: Extract the four phases to named helper functions (, , , ) so the case body becomes pure orchestration. Each helper becomes independently testable.

3. Security (85%): Lost-write race on

File: scripts/hooks/json-helper.cjs:1518

The heal block calls registerUsageEntry while holding .learning.lock, but registerUsageEntry internally does a read-modify-write on .knowledge-usage.json WITHOUT holding .knowledge-usage.lock. The per-turn hook knowledge-usage-scan.cjs properly acquires .knowledge-usage.lock for its own read-modify-write. Since the locks are disjoint, a concurrent sequence causes a lost update:

Reconciler reads at T1 →
Scanner acquires lock, reads at T2 → , writes → , releases
Reconciler (with stale snapshot) writes → , overwriting cite count

This is a data-integrity bug affecting the manifest's secondary index.

Required fix: Acquire .knowledge-usage.lock around the registerUsageEntry call in the heal block. Helper functions already exist (acquireKnowledgeUsageLock, releaseKnowledgeUsageLock).

4. Performance (90%): KNOWLEDGE_CONTEXT fan-out without filtering

File: plugins/devflow-resolve/commands/resolve.md:72

The orchestrator loads ~30KB (~7.5K tokens) of concatenated decisions.md + pitfalls.md content and passes it verbatim to every Resolver spawned in Phase 4. On a large review with N parallel Resolvers and ~10 batches, this means 10 × 7.5K = 75K tokens of duplicate knowledge context paid regardless of relevance.

As knowledge corpus grows toward the stated ceiling (100 entries per file), per-Resolver token cost can double. This is not a correctness bug but a design issue that blocks v2.0.0 ship.

Required fix (recommended): Pre-filter KNOWLEDGE_CONTEXT per-batch by file/area before passing to each Resolver. Orchestrator can match batch issues to knowledge sections by file mentions or use cite counts from .knowledge-usage.json to include only top-K relevant sections.

Summary Table

Issue	File	Severity	Confidence	Category
#1: Manifest missing → heal skipped	json-helper.cjs:1401	BLOCKING	88%	Architecture
#2: Complexity over thresholds	json-helper.cjs:1395-1534	BLOCKING	88%	Complexity
#3: Lost-write race (no lock)	json-helper.cjs:1518	BLOCKING	85%	Security
#4: Token fan-out unfiltered	resolve.md:72	BLOCKING	90%	Performance

All four require code changes before merge. Estimated effort: 2-4 hours for architectural refactoring (#1, #2, #3); 1-2 hours for performance design (#4).

Review Artifacts: All detailed findings available in:

9 specialized review reports (security, architecture, performance, complexity, consistency, regression, testing, typescript, documentation)
107 tests passing (no regressions detected in other areas)
Full issue catalog with confidence scores in the git diff context

dean0x · 2026-04-14T15:16:35Z

Additional Findings: Should-Fix & Pre-existing

The 4 blocking issues above require merge-blocking fixes. Beyond those, the reviews identified 10 should-fix issues and 8 pre-existing issues that should be addressed in this PR or tracked for follow-up.

Should-Fix Issues (Lower confidence or smaller scope)

Architecture

Duplicated section-slice algorithm (84% confidence): Same "extract section from ## heading to next ##" logic appears in 4-5 call sites with subtly different encodings. Extract a canonical sliceKnowledgeSection(content, anchorId) helper to reduce drift surface.
Redundant file I/O in heal block (90% confidence): findUnmanagedAnchors reads knowledge files, then heal block re-reads them for hashing. Thread the already-read content through to eliminate K+1 reads for K unmanaged anchors.

Complexity

Duplication: purgeLegacyKnowledgeEntries vs purgeAllPreV2Knowledge (92% confidence): Two 70-line functions share 90% boilerplate (lock, loop scaffold, TL;DR rewrite). Extract purgeKnowledgeSections(options: { memoryDir, shouldRemove }) to DRY.
Unannotated result-shape literal duplicated 5× (91% confidence): The {deletions, edits, unchanged, healed} object is hand-constructed at 5 sites in the reconcile case. Introduce emptyReconcileResult() factory to consolidate the shape definition.

Security

KNOWLEDGE_CONTEXT injected without delimiters (80% confidence, pre-existing): Reconciled knowledge flows into Resolver prompts without fencing. Add delimiter comments to distinguish instruction from data. Not blocking for v2.0.0 but should be on the hardening roadmap.

Testing

filterKnowledgeContext test validates test-only implementation (95% confidence): 8 filter-unit tests exercise a JS reimplementation that production code doesn't use, giving false confidence. Either extract the filter to production code (best) or delete the 8 unit tests and keep only structural markdown assertions.
Lock-lifecycle tests assert cleanup but not mutual exclusion (90% confidence): TOCTOU and lock tests verify only that state is cleaned up, not that concurrent callers are serialized. Add a concurrency test with pre-held lock.
Structural markdown tests coupled to heading anchors (85% confidence): Tests use indexOf('### Phase 4') slicing which breaks if Phase numbers shift. Replace with regex-based section matching that fails loudly on missing anchors.

TypeScript

Tuple type widening loses prefix discriminant (80% confidence): filePrefixPairs: [string, string][] should be ReadonlyArray<readonly [string, Prefix]> where Prefix = 'ADR' | 'PF' so exhaustiveness checks live at compile time.
Unused capture group in TL;DR regex (82% confidence): Regex // captures but never uses (decisions|pitfalls). Drop the capture or use it.

Consistency

v3 migration return-shape breaks established pattern (95% confidence): v2 and existing migrations use named binding before return (const infos = ...; return { infos, warnings: [] }). v3 inlines the ternary (return { infos: ... ? [...] : [], ... }). Align v3 to the 3/3 pattern.
Asymmetric naming: purgeLegacyKnowledgeEntries vs purgeAllPreV2Knowledge (85% confidence): Two siblings in the same file have inconsistent noun phrases ('Entries' suffix vs bare noun). Rename v3 to purgeAllPreV2KnowledgeEntries for symmetry.

Documentation

CHANGELOG placement under released 2.0.0 section (92% confidence): Keep-a-Changelog treats released sections as immutable. Move entries to ## [Unreleased] or cut a ## [2.0.1] section. Also recategorize Fix 2 & 3 as ### Fixed not ### Added.

Pre-existing Issues (Not Blocking This PR)

These were in the codebase before this PR and are noted for the maintenance backlog:

json-helper.cjs is a god module (1,791 lines, 95% confidence): Already flagged by PF-004. This PR adds 87 lines; no extraction attempted. Recommend migrating JSON-heavy logic to scripts/hooks/lib/*.cjs modules in a dedicated refactor.
.knowledge-usage.json lock is never used (78% confidence): acquireKnowledgeUsageLock and releaseKnowledgeUsageLock exist but are never called. All writers rely on outer locks (.knowledge.lock or .learning.lock). This creates a lost-update surface. Fix: wrap all registerUsageEntry calls in usage lock, or deprecate the unused lock helpers.
Migration registry lacks schema-version awareness (72% confidence): v2 and v3 depend on array order; future v4 could regress by reordering. Add explicit dependsOn field or document the ordering invariant with a CI test.
Dynamic regex interpolation of anchorId without sanitisation (80% confidence): Non-heal code at json-helper.cjs:1452-1463 interpolates manifest's entry.anchorId into regexes without sanitisation. Heal block (line 1510) correctly sanitises. Apply same sanitisation at both sites for consistency.
Manifest-driven fs reads without path validation (80% confidence): Reconciler reads entry.path from manifest without validating it's under cwd. A tampered manifest could point at arbitrary files (e.g., ~/.ssh/id_ed25519). Constrain to cwd before reading.
Two-file regex divergence (TS vs JS) (62% confidence): src/cli/utils/legacy-knowledge-purge.ts SECTION_REGEX vs json-helper.cjs section-extraction have subtly different stopping conditions. If a self-learning entry ever contains mid-section, they disagree. Worth documenting the intended boundary.
Phase numbering documentation gap (78% confidence, Consistency review): The intentional asymmetry (Phase 1.5 in orch vs Step 0d in base) is not documented. Add an HTML comment explaining why resolve:orch omits Phase 0.
PF-010: Unvalidated JSON.parse in migrations.ts (95% confidence): JSON.parse(raw) as MigrationsFile without runtime validation. Not modified by this PR; tracked in PF-010. Address via Zod schema in follow-up.

Recommendation Summary

Scope	Count	Blocker	Est. Effort
Blocking issues	4	YES	2-4 hours
Should-fix items	10	NO	1-2 hours
Pre-existing	8	NO	Backlog

Action: Address the 4 blocking issues in this PR. Should-fix items can ride in a follow-up cleanup PR if time is tight, but doing them now improves quality and reduces future drift. Pre-existing issues belong in the maintenance backlog (many are already tracked as PF-NNN pitfalls in .memory/knowledge/pitfalls.md).

All review reports (security, architecture, performance, complexity, consistency, regression, testing, typescript, documentation) are available at:
.docs/reviews/fix-v2-knowledge-ship-blockers/2026-04-14_1806/

dean0x · 2026-04-14T15:16:47Z

Review Process Summary

Reviewed by: 9 specialized agents (security, architecture, performance, complexity, consistency, regression, testing, typescript, documentation)

Review scope: 14 files, +1277 / -14 lines, 9 commits

Confidence filtering: Only findings ≥80% confidence posted as blockers. 60-79% confidence items consolidated into the should-fix list above.

Deduplication strategy:

Findings extracted across all 9 reports
Consolidated by file:line to avoid duplicate comments
High-confidence (≥80%) findings → PR comments
Lower-confidence findings → summary comment
Pre-existing issues → backlog list

Test coverage verification:

✅ 107 tests pass (71 in scope, 36+ new tests for the three fixes)
✅ No regressions detected
✅ TDD sequence confirmed (tests precede feat commits)
✅ Edge cases covered: first-render crash, pre-v2 immunity, lock timeouts

Rate limit awareness:

2 PR comments posted
Spaced 2+ seconds apart
API rate limit: remaining ~4900 of 5000 per hour
No rate limiting triggered

Known limitations:

Filter tests rely on test-only implementation (HIGH finding 🔒 SECURITY: Validate DEVFLOW_DIR environment variable #5 in testing report)
Markdown-position slicing tests coupled to heading numbering (HIGH finding 🔒 SECURITY: Prevent symlink attacks during installation #6)
Knowledge-citation integration tested structurally, not behaviorally (end-to-end)

Next steps for maintainer:

Address 4 blocking issues (manifest missing, complexity thresholds, lost-write race, token fan-out)
Optional: Fix 10 should-fix items in this PR
Optional: Add pre-existing items to tech debt backlog

Review reports saved at: .docs/reviews/fix-v2-knowledge-ship-blockers/2026-04-14_1806/

security.md - 157 lines
architecture.md - 180 lines
performance.md - 186 lines
complexity.md - 205 lines
consistency.md - 178 lines
regression.md - 107 lines
testing.md - 265 lines
typescript.md - 112 lines
documentation.md - 88 lines

Total review content: 1,478 lines of detailed analysis

Three bullets mistakenly added to the sealed [2.0.0] entry are moved to [Unreleased] under the correct Keep-a-Changelog categories: - /resolve knowledge integration → ### Added - Reconciler self-heal → ### Fixed (expanded with marker-gating detail) - Legacy purge v3 migration → ### Fixed The self-heal bullet now documents that heal is gated by the "- **Source**: self-learning:" marker, preventing false-positive heals against pre-v2 seeded entries (bd1c92f correctness fix, previously only in code/tests but absent from the changelog). Co-Authored-By: Claude <noreply@anthropic.com>

…, rename purge function, tighten types, fix regex, add concurrency tests Con1: MIGRATION_PURGE_LEGACY_KNOWLEDGE_V3 now uses named infos binding before returning, matching the named-binding pattern used by v2 and shadow-overrides. C2: Extract withKnowledgeFiles(memoryDir, filePrefixPairs, rewriteContent) shared helper owning lock acquire, file loop, TL;DR rewrite, and finally-cleanup. Both purge functions delegate to it with their respective predicates, eliminating ~90% of duplicated logic. Con2: Rename purgeAllPreV2Knowledge to purgeAllPreV2KnowledgeEntries to restore the Entries suffix symmetry with its sibling purgeLegacyKnowledgeEntries. Update the sole caller in migrations.ts. Con3: Add comment block above MIGRATIONS array documenting -vN and -vN-{tag} suffix conventions and the append-only constraint. TS1: Change (decisions|pitfalls) to (?:decisions|pitfalls) in both TL;DR regex replacements — capture group was unused. TS2: Replace [string, string][] with KnowledgeFilePair = readonly [string, 'ADR' | 'PF'] so the prefix discriminant is type-enforced at call sites (avoids PF-005). T4: Add concurrent-caller tests verifying two simultaneous callers serialize via the mkdir lock and produce exactly one removal between them. Co-Authored-By: Claude <noreply@anthropic.com>

… heal missing-manifest path Addresses 7 issues from batch-1-json-helper-refactor: A1 (HIGH): Heal path now runs when manifest is absent. The early-exit guard only requires the log file; a missing manifest is treated as an empty one so the heal block can reconstruct it from the log + knowledge files. A2 (HIGH): Extract shared sliceKnowledgeSection(content, anchorId) helper used by reconcileExisting (anchored hash path) and the heal block, eliminating the three duplicated inline regex constructions. A3 (MEDIUM): Thread fileContent through findUnmanagedAnchors return descriptor so the heal block reuses the bytes already read — no second fs.readFileSync per anchor. C1 (HIGH): Replace 140-line reconcile-manifest case with three logical sections (loadReconcileState, reconcileExisting, healUnmanagedAnchors), collapsing nesting depth from 4 to 3 and reducing decision points substantially. C3 (MEDIUM): Introduce emptyReconcileResult() factory — eliminates five inline copies of { deletions: 0, edits: 0, unchanged: 0, healed: 0 }. P2 (MEDIUM): Add early-exit guard in findUnmanagedAnchors — pass logMap and skip file scanning when no ready observations exist. T3 (MEDIUM): Export djb2 once from tests/learning/helpers.ts; remove two duplicate inline function definitions from reconcile.test.ts. Test: add 'heal: manifest absent + knowledge file + ready obs → heal triggers (A1)' case; all 954 tests pass. Co-Authored-By: Claude <noreply@anthropic.com>

The `parts.length === 0` guard already handles the empty case; the subsequent `|| '(none)'` fallback was dead code from an earlier draft.

Failing tests covering: - loadKnowledgeIndex formatting (populated, filtered, title/area truncation, unknown-status, missing-area) - CLI dispatch: index/full/bare/unknown subcommands - Observability: stderr log on successful invocation

Adds: - extractIndexEntries(raw): parses ADR/PF sections after D-A filter into {id, title, status, area} records - loadKnowledgeIndex(worktreePath, opts): returns ~250-token index with Decisions/Pitfalls blocks, per-entry [status] tags, 60/80-char truncation, and a footer explaining how to Read full bodies on demand - CLI subcommand dispatch: index/full/bare(deprecated)/unknown following json-helper.cjs pattern - Observability: stderr [knowledge-context] mode=... worktree=... entries=N log on successful non-(none) invocation - Exports extractIndexEntries alongside existing exports

Failing tests covering: - File existence at shared/skills/apply-knowledge/SKILL.md - Frontmatter: name, description, allowed-tools: Read - 5-step algorithm markers: Scan, Identify, Read body, Cite, verbatim IDs - Worked example with PF-004 - Citation format: applies ADR-NNN / avoids PF-NNN - Skip guard: omit when (none)

Canonical 5-step consumer algorithm for KNOWLEDGE_CONTEXT index: 1. Scan the index 2. Identify plausibly-relevant entries 3. Read the full body on demand 4. Cite inline (applies ADR-NNN / avoids PF-NNN) 5. Verbatim IDs only — never fabricate Includes worked example with PF-004 and skip guard for (none) context. Frontmatter: allowed-tools: Read (agents read file bodies on demand).

Failing tests covering: - 7 command files invoke knowledge-context.cjs index - 4 orch skills invoke knowledge-context.cjs index - debug:orch: KNOWLEDGE_CONTEXT present but NOT in Explore spawn blocks - 5 consumer agents reference devflow:apply-knowledge in frontmatter - KNOWLEDGE_CONTEXT present in all four command surfaces + 3 new orch skills - plan:orch and review:orch have knowledge-loading phases

…eview surfaces Implements index+on-demand Read for all four knowledge-consuming commands and their ambient orch equivalents. Closes PF-011. Commands wired: - /self-review: replaces full Read with knowledge-context.cjs index - /code-review + code-review-teams: adds Phase 1b with index load; passes KNOWLEDGE_CONTEXT to each Reviewer; teams variant updates all 4 reviewer prompts to use devflow:apply-knowledge instead of reading pitfalls.md - plan.md + plan-teams.md: already wired in prior commit (preserved) - resolve.md + resolve-teams.md: already wired in prior commit (preserved) Orch skills wired (previously missing, closes ambient parity gaps): - plan:orch: adds Phase 0 knowledge loading + passes to Explore and Designer - review:orch: adds Phase 2b knowledge loading + KNOWLEDGE_CONTEXT in Phase 4 - debug:orch: adds Phase 0 knowledge loading (orchestrator-local, NOT fanned to Explore sub-agents — mirrors /debug behavior) - resolve:orch: already wired in prior commit (preserved) Consumer agents (apply-knowledge skill reference): - resolver.md: adds verbatim-only hallucination guard to Apply Knowledge section - reviewer.md: promotes inline reference to ## Apply Knowledge section + adds CITATION-SENTENCE markers for propagation test - designer.md, simplifier.md, scrutinizer.md: already wired in prior commit Plugin registration: - devflow-core-skills plugin.json: adds apply-knowledge skill - src/cli/plugins.ts: adds apply-knowledge to core-skills + LEGACY_SKILL_NAMES All 1020 tests pass.

…ttern - CHANGELOG: adds entry under [Unreleased] ### Changed describing the index+on-demand Read refactor across all four commands and skill extraction - CLAUDE.md: updates /resolve, /plan, /self-review, /code-review roster entries to mention devflow:apply-knowledge consumption pattern - docs/self-learning.md: adds "Knowledge Index + On-Demand Read Pattern" subsection describing the two-step flow and which commands use it

- Remove dead `statusTag` variable in `formatAdrLine` (computed then immediately overridden by `tag` on the next line). - Inline the `SUBCOMMANDS` array that existed only to feed the `KNOWN_SUBCOMMANDS` Set — eliminated unnecessary intermediate. - Extract duplicated `loadFile` + `extractSection` helpers from `command-adoption.test.ts` into `tests/knowledge/helpers.ts` so the knowledge test suite has a single source of truth for these utilities. - Remove two dead try/catch blocks in the deprecation-notice test that captured stderr into variables that were never read before the actual capture happened.

Scrutinizer 9-pillar review surfaced four consistency issues where the spawn-block descriptions, default literals, and plan-teams teammate prompts still reflected the old "full corpus" language rather than the new "compact index + on-demand Read via devflow:apply-knowledge" pattern introduced in the preceding commits. - resolve.md: Phase 4 spawn block described KNOWLEDGE_CONTEXT as "filtered decisions.md + pitfalls.md content" — now "knowledge index from Step 0d" with explicit devflow:apply-knowledge instruction to each Resolver. Architecture diagram already correct. - resolve-teams.md: teammate prompt template matched. Added the apply-knowledge instruction as sub-bullet of step 1 so the teammate reads the skill alongside devflow:patterns. - resolve:orch SKILL.md: Phase 4 Each-receives bullet said "Filtered content from Phase 1.5" — now "Knowledge index from Phase 1.5" with devflow:apply-knowledge reference. - self-review.md: Simplifier/Scrutinizer spawn blocks used the literal 'None' as placeholder default, while the apply-knowledge skip guard and every other consumer expects '(none)'. A Simplifier/Scrutinizer receiving 'None' would fail the skip-guard equality check and try to process it as a real index. Unified to '(none)' and added the apply-knowledge reference inline. - plan-teams.md: PF-008 lockstep violation — the four exploration teammate prompts and the Phase 5 gap-analysis designer input list still said "Project knowledge: {Phase 2 decisions + pitfalls}" with no reference to the new skill. Now KNOWLEDGE_CONTEXT with the apply-knowledge instruction on each teammate, matching plan.md structurally. - plan.md + plan-teams.md architecture diagrams: "Read decisions.md + pitfalls.md" updated to "Load knowledge index (knowledge-context.cjs index)" to match the actual Phase 2 mechanism. All 1020 tests still pass; no test changes needed since the adoption tests only require `knowledge-context.cjs index` and `devflow:apply-knowledge` to be present somewhere in each surface — the fixes bring the inline teammate/spawn-block descriptions into alignment with what the tests already enforce at the surface level.

Two evaluator-flagged misalignments resolved: 1. tests/resolve/knowledge-citation.test.ts — prune six redundant structural-prose assertions (decisions.md/pitfalls.md references, Deprecated/Superseded stripping) that coincidentally passed because the new knowledge-context.cjs prose retains matching substrings. Also prune two Apply Knowledge assertions (citation format, hallucination guard) already owned by apply-knowledge-skill.test.ts. Replaced coverage is owned by tests/knowledge/command-adoption.test.ts and tests/knowledge/apply-knowledge-skill.test.ts. Unit blocks for filterKnowledgeContext and loadKnowledgeContext are kept intact. 2. shared/agents/reviewer.md — rewrite CITATION-SENTENCE block from old direct-file-read prose ("from .memory/knowledge/decisions.md…") to the index+Read pattern ("via the KNOWLEDGE_CONTEXT index, per devflow:apply-knowledge skill"). Markers preserved. tests/skill-references.test.ts updated: byte-identity check for reviewer.md replaced with content-based assertions matching the new sentence (reviewer legitimately uses a different access pattern than coder, which still reads files directly). Co-Authored-By: Claude <noreply@anthropic.com>

dean0x · 2026-04-15T07:29:39Z

Code Review Comments

I've analyzed the 8 review reports and identified HIGH/CRITICAL confidence findings (≥80%) for inline comments. Below are the blocking issues that should be addressed before merge.

BLOCKING FINDINGS (≥80% confidence)

1. Inconsistent Worktree Placeholder (96% confidence)

Location: plugins/devflow-resolve/commands/resolve.md:75 and 7 others

All 8 orchestration surfaces invoke knowledge-context.cjs index with the worktree argument, but use three inconsistent conventions:

resolve.md/resolve-teams.md: "<worktree>" (angle brackets)
code-review.md/code-review-teams.md: "{worktree}" (curly braces)
plan.md/plan-teams.md/self-review.md and all orch skills: "." (literal cwd)

Silent Failure Mode: If an LLM substitutes the wrong template, knowledge loading exits silently with (none) and the knowledge is lost without surfacing an error.

Fix: Standardize on {worktree} for per-worktree commands (matches surrounding {timestamp}, {branch_slug} conventions). Tighten test at tests/knowledge/command-adoption.test.ts:22,42 to assert the placeholder convention.

2. Shell-Substitution Placeholder Leak (88% confidence)

Location: plugins/devflow-code-review/commands/code-review.md:97 and 7 others

All 8 surfaces contain: node scripts/hooks/lib/knowledge-context.cjs index "<worktree>". The LLM orchestrator substitutes <worktree> inside double quotes. If a worktree path contains shell metacharacters (e.g., /tmp/foo$(id) or /tmp/foo"; rm -rf $HOME; echo "), the expansion executes arbitrary commands in the orchestrator's bash session.

Worktree paths come from git worktree list (user-controlled) and CI environments (potentially hostile).

Fix: Use printf %q to safely quote the path:

WT=$(printf %q "<worktree>")
KNOWLEDGE_CONTEXT=$(node scripts/hooks/lib/knowledge-context.cjs index -- "$WT")

Also add guidance in devflow:worktree-support SKILL.md to validate paths before shell substitution.

3. Hardcoded Paths Contradict Index Footer (86% confidence)

Location: shared/skills/apply-knowledge/SKILL.md:36-38, 54-58

The skill's Step 1 says: "the path is in the footer of KNOWLEDGE_CONTEXT — Read that file." But Step 3 then hardcodes: "Read worktree/.memory/knowledge/decisions.md".

The problem: the footer emits absolute paths (computed from worktreePath), but the skill assumes relative paths. This breaks in multi-worktree mode when WORKTREE_PATH is set.

Fix: Remove hardcoded paths. Replace with: "Read the file path listed in the footer of KNOWLEDGE_CONTEXT." The footer is the single source of truth.

4. Fragile CLI Dispatch Heuristic (88% confidence)

Location: scripts/hooks/lib/knowledge-context.cjs:314-321

The dispatch logic uses firstArg.startsWith('/') || .startsWith('.') || .startsWith('~') || .includes('/') to distinguish "bare path" from unknown subcommand. This heuristic:

Rejects bare relative-path names without slashes (e.g., foo, myproject) that the docstring claims are backward-compatible
Relies on process.exit() in usageExit() for correctness — implicit dependency that breaks if usageExit becomes testable

Fix: Use structural detection instead of path-shape inference:

if (HANDLERS[firstArg]) {
  mode = firstArg;
  worktreeArg = argv[1];
} else if (argv.length === 1) {
  // Bare deprecated form
  mode = 'bare';
  worktreeArg = firstArg;
} else {
  usageExit();
}

5. Contradictory CHANGELOG Entries (95% confidence - CRITICAL)

Location: CHANGELOG.md:14 (Added section contradicts Changed section)

The [Unreleased] ### Added section describes /resolve using the OLD "fan out full corpus" pattern:

"orchestrator reads .memory/knowledge/decisions.md + pitfalls.md... and passes filtered content as KNOWLEDGE_CONTEXT to each parallel Resolver."

But the ### Changed section describes the NEW behavior:

"Knowledge index + on-demand Read pattern... now fan a compact ~250-token index instead of the full ADR/PF corpus."

Both cannot describe current reality. The Added entry was relocated from 2.0.0 without updating the wording.

Fix: Either rewrite the Added entry in index-pattern terms, or fold it into Changed. This misrepresents the feature to downstream readers.

6. Stale Token Claim (92% confidence)

Location: scripts/hooks/lib/knowledge-context.cjs:21, 147 and shared/skills/resolve:orch/SKILL.md:38

Docstring claims ~250 tokens for the index. Measured on the current 12-entry corpus: actual output is ~465 tokens (not 250). Per-entry cost is ~30 tokens, so this scales O(n).

At 30 entries: ~900 tokens
At 100 entries: ~3K tokens (approaching full corpus size)

Fix: Update to honest language:

// Compact summary, ~30 tokens per entry (e.g., 12 entries ≈ 450 tokens).
// Scales O(n) with active ADR + PF count.

7. Inconsistent `(none)` Quoting (92% confidence)

Location: plugins/devflow-code-review/commands/code-review.md:135 and 10+ others

The KNOWLEDGE_CONTEXT placeholder uses at least four different quoting conventions:

{knowledge_context or '(none)'} (with single quotes)
{knowledge index from Step 0d, or (none)} (bare, no quotes)
{Phase 2 knowledge index, or (none)}

The receiving agent's skip guard checks for unquoted (none). If orchestrator substitutes the quoted form, the guard fails silently and the agent processes a literal string '(none)' as content.

Fix: Standardize on {knowledge_context or (none)} (no quotes) everywhere. Document in apply-knowledge/SKILL.md:86 that the literal token is (none).

Summary

All 7 blocking findings are HIGH/CRITICAL confidence and introduce silent failure modes or misrepresentation to downstream consumers. I recommend addressing these before merge.

Lower-confidence findings (60-79% confidence, non-blocking) are documented in the review reports for follow-up.

Review walkthrough across 4 sessions resolved all 31 issues from the 2026-04-15_1022 code review (19 fixed, 5 rejected, 1 deferred, 6 pre-resolved). Blocking fixes (sessions 1-2): - CHANGELOG contradiction rewritten (#1) - Worktree placeholder unified on "{worktree}" across 11 sites (#3) - /debug migrated to index pattern (#4) - KNOWLEDGE_CONTEXT quoting standardized, no inline fallback (#6) - apply-knowledge skill defers to footer for paths (#7) - Simplifier dropped from knowledge consumers; 4 agents canonicalized (#8) - lstat+isFile guard on legacy-knowledge-purge unlink (#9) Architecture + consistency fixes (session 2): - Bare-form CLI removed, dispatch simplified to index-only (#10) - code-review-teams template+table extraction (#12) - Designer Apply Knowledge H2 added (#13) - Teams phase numbering aligned (#14) - loadKnowledgeContext + full subcommand removed entirely (#16) Complexity + performance + testing + docs fixes (sessions 3-4): - D-A filter extracted to isDeprecatedOrSuperseded predicate (#18) - formatAdrLine/formatPfLine merged into formatEntryLine (#19) - Shared test fixtures extracted to tests/knowledge/fixtures.ts (#21) - Token claim "~250 tokens" dropped from 12 sites (#22) - tmpdir cleanup added to test fixtures (#25) - self-learning.md example and footer paths corrected (#28, #29) 28 files changed, 333 insertions, 524 deletions. 1021/1021 tests passing.

Dean Sharon and others added 9 commits April 14, 2026 16:00

Dean Sharon and others added 14 commits April 14, 2026 18:22

refactor: remove unreachable (none) fallback in loadKnowledgeContext

fb235d9

The `parts.length === 0` guard already handles the empty case; the subsequent `|| '(none)'` fallback was dead code from an earlier draft.

dean0x mentioned this pull request Apr 16, 2026

Build-time template injection for repeated command snippets #183

Open

dean0x merged commit 7c1acab into main Apr 16, 2026
4 checks passed

dean0x deleted the fix/v2-knowledge-ship-blockers branch April 16, 2026 22:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(v2): three project-knowledge ship-blockers#182

fix(v2): three project-knowledge ship-blockers#182
dean0x merged 24 commits intomainfrom
fix/v2-knowledge-ship-blockers

dean0x commented Apr 14, 2026

Uh oh!

dean0x commented Apr 14, 2026

Uh oh!

dean0x commented Apr 14, 2026

Uh oh!

dean0x commented Apr 14, 2026

Uh oh!

dean0x commented Apr 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dean0x commented Apr 14, 2026

Summary

Implementation notes

Test plan

Uh oh!

dean0x commented Apr 14, 2026

Review Summary: 4 Blocking Issues Found

1. Architecture (88%): Heal path skipped when manifest doesn't exist

2. Complexity (88%): Reconcile-manifest handler exceeds ALL HIGH thresholds

3. Security (85%): Lost-write race on

4. Performance (90%): KNOWLEDGE_CONTEXT fan-out without filtering

Summary Table

Uh oh!

dean0x commented Apr 14, 2026

Additional Findings: Should-Fix & Pre-existing

Should-Fix Issues (Lower confidence or smaller scope)

Architecture

Complexity

Security

Testing

TypeScript

Consistency

Documentation

Pre-existing Issues (Not Blocking This PR)

Recommendation Summary

Uh oh!

dean0x commented Apr 14, 2026

Review Process Summary

Uh oh!

dean0x commented Apr 15, 2026

Code Review Comments

BLOCKING FINDINGS (≥80% confidence)

1. Inconsistent Worktree Placeholder (96% confidence)

2. Shell-Substitution Placeholder Leak (88% confidence)

3. Hardcoded Paths Contradict Index Footer (86% confidence)

4. Fragile CLI Dispatch Heuristic (88% confidence)

5. Contradictory CHANGELOG Entries (95% confidence - CRITICAL)

6. Stale Token Claim (92% confidence)

7. Inconsistent (none) Quoting (92% confidence)

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

7. Inconsistent `(none)` Quoting (92% confidence)