optave · carlos-alm · Mar 2, 2026 · greptile-apps · Mar 2, 2026 · greptile-apps
diff --git a/docs/roadmap/BACKLOG.md b/docs/roadmap/BACKLOG.md
@@ -1,6 +1,6 @@
 # Codegraph Feature Backlog
 
-**Last updated:** 2026-03-02
+**Last updated:** 2026-03-01
 **Source:** Features derived from [COMPETITIVE_ANALYSIS.md](../../generated/COMPETITIVE_ANALYSIS.md) and internal roadmap discussions.
 
 ---
@@ -31,18 +31,18 @@ Non-breaking, ordered by problem-fit:
 | 1 | ~~Dead code detection~~ | ~~Find symbols with zero incoming edges (excluding entry points and exports). Agents constantly ask "is this used?" — the graph already has the data, we just need to surface it. Inspired by narsil-mcp, axon, codexray, CKB.~~ | Analysis | ~~Agents stop wasting tokens investigating dead code; developers get actionable cleanup lists without external tools~~ | ✓ | ✓ | 4 | No | **DONE** — Delivered as part of node classification (ID 4). `codegraph roles --role dead -T` lists all symbols with zero fan-in that aren't exported. |
 | 2 | ~~Shortest path A→B~~ | ~~BFS/Dijkstra on the existing edges table to find how symbol A reaches symbol B. We have `fn` for single-node chains but no A→B pathfinding. Inspired by codexray, arbor.~~ | Navigation | ~~Agents can answer "how does this function reach that one?" in one call instead of manually tracing chains~~ | ✓ | ✓ | 4 | No | **DONE** — `codegraph path <from> <to>` command with `--reverse`, `--max-depth`, `--kinds` options. BFS pathfinding on the edges table. `symbol_path` MCP tool. |
 | 12 | ~~Execution flow tracing~~ | ~~Framework-aware entry point detection (Express routes, CLI commands, event handlers) + BFS flow tracing from entry to leaf. Inspired by axon, GitNexus, code-context-mcp.~~ | Navigation | ~~Agents can answer "what happens when a user hits POST /login?" by tracing the full execution path in one query~~ | ✓ | ✓ | 4 | No | **DONE** — `codegraph flow` command with entry point detection and BFS flow tracing. MCP tools `flow` and `entry_points` added. Merged in PR #118. |
-| 16 | ~~Branch structural diff~~ | ~~Compare code structure between two branches using git worktrees. Show added/removed/changed symbols and their impact. Inspired by axon.~~ | Analysis | ~~Teams can review structural impact of feature branches before merge; agents get branch-aware context~~ | ✓ | ✓ | 4 | No | **DONE** — `codegraph branch-compare <base> <target>` builds separate graphs for each ref using git worktrees, diffs at the symbol level (added/removed/changed), and traces transitive caller impact via BFS. Supports `--json`, `--format mermaid`, `--depth`, `--no-tests`, `--engine`. `branchCompareData` and `branchCompareMermaid` exported from programmatic API. `branch_compare` MCP tool. |
-| 20 | ~~Streaming / chunked results~~ | ~~Support streaming output for large query results so MCP clients and programmatic consumers can process incrementally.~~ | Embeddability | ~~Large codebases don't blow up agent context windows; consumers process results as they arrive instead of waiting for the full payload~~ | ✓ | ✓ | 4 | No | **DONE** — Universal pagination (`limit`/`offset` + `_pagination` metadata) on all 21 MCP tools with per-tool defaults in `MCP_DEFAULTS` and hard cap of 1000. NDJSON streaming (`--ndjson`/`--limit`/`--offset`) on ~14 CLI commands via shared `printNdjson` helper. Generator APIs (`iterListFunctions`, `iterRoles`, `iterWhere`, `iterComplexity`) using `better-sqlite3` `.iterate()` for memory-efficient streaming. PR #207. |
+| 16 | ~~Branch structural diff~~ | ~~Compare code structure between two branches using git worktrees. Show added/removed/changed symbols and their impact. Inspired by axon.~~ | Analysis | ~~Teams can review structural impact of feature branches before merge; agents get branch-aware context~~ | ✓ | ✓ | 4 | No | **DONE** — `src/branch-compare.js` module with `branchCompareData`, `branchCompareMermaid`, `branchCompare`. CLI: `codegraph branch-compare <base> <target>` with `--depth`, `-T`, `-j`, `-f` options. MCP: `branch_compare` tool. Git worktree isolation, symbol-level diff, transitive impact analysis. |
+| 20 | Streaming / chunked results | Support streaming output for large query results so MCP clients and programmatic consumers can process incrementally. | Embeddability | Large codebases don't blow up agent context windows; consumers process results as they arrive instead of waiting for the full payload | ✓ | ✓ | 4 | No |
 | 27 | Composite audit command | Single `codegraph audit <file-or-function>` that combines `explain`, `fn-impact`, and code health metrics into one structured report per function. Core version uses graph data; enhanced version includes Phase 4.4 `risk_score`/`complexity_notes`/`side_effects` when available. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) Gauntlet phase. | Orchestration | Each sub-agent in a multi-agent swarm gets everything it needs to assess a function in one call instead of 3-4 — directly reduces token waste and round-trips | ✓ | ✓ | 4 | No |
 | 28 | Batch querying | Accept a list of targets (file or JSON) and return all query results in one JSON payload. Applies to `audit`, `fn-impact`, `context`, and other per-symbol commands. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) swarm pattern. | Orchestration | A swarm of 20+ agents auditing different files can be fed from a single orchestrator call instead of N sequential invocations — reduces overhead and enables parallel dispatch | ✓ | ✓ | 4 | No |
 | 29 | Triage priority queue | Single `codegraph triage` command that merges `map` connectivity, `hotspots` fan-in/fan-out, node roles, and optionally git churn + `risk_score` into one ranked audit queue. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) RECON phase. | Orchestration | Orchestrating agent gets a single prioritized list of what to audit first — replaces manual synthesis of 3+ commands, saves RECON phase from burning tokens on orientation | ✓ | ✓ | 4 | No |
 | 32 | MCP orchestration tools | Expose `audit`, `triage`, and `check` as MCP tools alongside existing tools. Enables multi-agent orchestrators (Claude Code agent teams, custom MCP clients) to run the full Titan Paradigm loop through the MCP protocol without CLI overhead. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md). | Embeddability | Agents query the graph through MCP with zero CLI overhead — fewer tokens, faster round-trips, native integration with AI agent frameworks | ✓ | ✓ | 4 | No |
-| 5 | ~~TF-IDF lightweight search~~ | ~~SQLite FTS5 + TF-IDF as a middle tier (~50MB) between "no search" and full transformer embeddings (~500MB). Provides decent keyword search with near-zero overhead. Inspired by codexray.~~ | Search | ~~Users get useful search without the 500MB embedding model download; faster startup for small projects~~ | ✓ | ✓ | 3 | No | **DONE** — Subsumed by ID 15 (Hybrid BM25 + semantic search). FTS5 full-text index (`fts_index` table) populated during `codegraph embed`. `--mode keyword` provides BM25-only search with zero embedding overhead. PR #198. |
+| 5 | TF-IDF lightweight search | SQLite FTS5 + TF-IDF as a middle tier (~50MB) between "no search" and full transformer embeddings (~500MB). Provides decent keyword search with near-zero overhead. Inspired by codexray. | Search | Users get useful search without the 500MB embedding model download; faster startup for small projects | ✓ | ✓ | 3 | No |
 | 13 | Architecture boundary rules | User-defined rules for allowed/forbidden dependencies between modules (e.g., "controllers must not import from other controllers"). Violations flagged in `diff-impact` and CI. Inspired by codegraph-rust, stratify. | Architecture | Prevents architectural decay in CI; agents are warned before introducing forbidden cross-module dependencies | ✓ | ✓ | 3 | No |
-| 15 | ~~Hybrid BM25 + semantic search~~ | ~~Combine BM25 keyword matching with embedding-based semantic search using Reciprocal Rank Fusion. Better recall than either approach alone. Inspired by GitNexus, claude-context-local.~~ | Search | ~~Search results improve dramatically — keyword matches catch exact names, embeddings catch conceptual matches, RRF merges both~~ | ✓ | ✓ | 3 | No | **DONE** — FTS5 full-text index alongside embeddings for BM25 keyword search. `ftsSearchData()` for keyword-only, `hybridSearchData()` for RRF fusion. `search` command defaults to `--mode hybrid`, with `semantic` and `keyword` alternatives. MCP `semantic_search` tool gains `mode` parameter. Graceful fallback on older DBs. Zero new deps — FTS5 ships with better-sqlite3. PR #198. |
-| 18 | ~~CODEOWNERS integration~~ | ~~Map graph nodes to CODEOWNERS entries. Show who owns each function, surface ownership boundaries in `diff-impact`. Inspired by CKB.~~ | Developer Experience | ~~`diff-impact` tells agents which teams to notify; ownership-aware impact analysis reduces missed reviews~~ | ✓ | ✓ | 3 | No | **DONE** — `src/owners.js` module with CODEOWNERS parser, matcher, and data functions. `codegraph owners [target]` CLI command with `--owner`, `--boundary`, `-f`, `-k`, `-T`, `-j` options. `code_owners` MCP tool. Integrated into `diff-impact` (affected owners + suggested reviewers). No new deps — glob patterns via ~30-line `patternToRegex`. PR #195. |
+| 15 | Hybrid BM25 + semantic search | Combine BM25 keyword matching with embedding-based semantic search using Reciprocal Rank Fusion. Better recall than either approach alone. Inspired by GitNexus, claude-context-local. | Search | Search results improve dramatically — keyword matches catch exact names, embeddings catch conceptual matches, RRF merges both | ✓ | ✓ | 3 | No |
+| 18 | CODEOWNERS integration | Map graph nodes to CODEOWNERS entries. Show who owns each function, surface ownership boundaries in `diff-impact`. Inspired by CKB. | Developer Experience | `diff-impact` tells agents which teams to notify; ownership-aware impact analysis reduces missed reviews | ✓ | ✓ | 3 | No |
 | 22 | ~~Manifesto-driven pass/fail~~ | ~~User-defined rule engine with custom thresholds (e.g. "cognitive > 15 = fail", "cyclomatic > 10 = fail", "imports > 10 = decompose"). Outputs pass/fail per function/file. Generalizes ID 13 (boundary rules) into a generic rule system.~~ | Analysis | ~~Enables autonomous multi-agent audit workflows (GAUNTLET pattern); CI integration for code health gates with configurable thresholds~~ | ✓ | ✓ | 3 | No | **DONE** — `codegraph manifesto` with 9 configurable rules (cognitive, cyclomatic, nesting, MI, Halstead volume/effort/bugs, fan-in, fan-out). Warn/fail thresholds via `.codegraphrc.json`. Exit code 1 on any fail-level breach — CI gate ready. PR #138. |
-| 31 | ~~Graph snapshots~~ | ~~`codegraph snapshot save <name>` / `codegraph snapshot restore <name>` for lightweight SQLite DB backup and restore. Enables orchestrators to checkpoint before refactoring passes and instantly rollback without rebuilding. After Phase 4, also preserves embeddings and semantic metadata. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) STATE MACHINE phase.~~ | Orchestration | ~~Multi-agent workflows get instant rollback without re-running expensive builds or LLM calls — orchestrator checkpoints before each pass and restores on failure~~ | ✓ | ✓ | 3 | No | **DONE** — `codegraph snapshot` subcommand group with `save`, `restore`, `list`, `delete`. Uses SQLite `VACUUM INTO` for atomic, WAL-free snapshots in `.codegraph/snapshots/`. All 6 functions exported via programmatic API. PR #192. |
+| 31 | Graph snapshots | `codegraph snapshot save <name>` / `codegraph snapshot restore <name>` for lightweight SQLite DB backup and restore. Enables orchestrators to checkpoint before refactoring passes and instantly rollback without rebuilding. After Phase 4, also preserves embeddings and semantic metadata. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) STATE MACHINE phase. | Orchestration | Multi-agent workflows get instant rollback without re-running expensive builds or LLM calls — orchestrator checkpoints before each pass and restores on failure | ✓ | ✓ | 3 | No |
 | 30 | Change validation predicates | `codegraph check --staged` with configurable predicates: `--no-new-cycles`, `--max-blast-radius N`, `--no-signature-changes`, `--no-boundary-violations`. Returns exit code 0/1 for CI gates and state machines. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) STATE MACHINE phase. | CI | Automated rollback triggers without parsing JSON — orchestrators and CI pipelines get first-class pass/fail signals for blast radius, cycles, and contract changes | ✓ | ✓ | 3 | No |
 | 6 | ~~Formal code health metrics~~ | ~~Cyclomatic complexity, Maintainability Index, and Halstead metrics per function — we already parse the AST, the data is there. Inspired by code-health-meter (published in ACM TOSEM 2025).~~ | Analysis | ~~Agents can prioritize refactoring targets; `hotspots` becomes richer with quantitative health scores per function~~ | ✓ | ✓ | 2 | No | **DONE** — `codegraph complexity` provides cognitive, cyclomatic, nesting depth, Halstead (volume, difficulty, effort, bugs), and Maintainability Index per function. `--health` for full Halstead view, `--sort mi` to rank by MI, `--above-threshold` for flagged functions. `function_complexity` DB table. `complexity` MCP tool. PR #130 + #139. |
 | 11 | ~~Community detection~~ | ~~Leiden/Louvain algorithm to discover natural module boundaries vs actual file organization. Reveals which symbols are tightly coupled and whether the directory structure matches. Inspired by axon, GitNexus, CodeGraphMCPServer.~~ | Intelligence | ~~Surfaces architectural drift — when directory structure no longer matches actual dependency clusters; guides refactoring~~ | ✓ | ✓ | 2 | No | **DONE** — `codegraph communities` with Louvain algorithm. File-level and `--functions` function-level detection. `--drift` for drift analysis (split/merge candidates). `--resolution` tunable. `communities` MCP tool. PR #133/#134. |

diff --git a/src/owners.js b/src/owners.js
@@ -1,7 +1,6 @@
 import fs from 'node:fs';
 import path from 'node:path';
 import { findDbPath, openReadonlyOrFail } from './db.js';
-
 import { isTestFile } from './queries.js';
 
 // ─── CODEOWNERS Parsing ──────────────────────────────────────────────

diff --git a/tests/integration/owners.test.js b/tests/integration/owners.test.js
@@ -2,7 +2,6 @@ import fs from 'node:fs';
 import os from 'node:os';
 import path from 'node:path';
 import Database from 'better-sqlite3';
-
 import { afterAll, beforeAll, describe, expect, test } from 'vitest';
 import { initSchema } from '../../src/db.js';
 import { ownersData, ownersForFiles } from '../../src/owners.js';

diff --git a/tests/unit/owners.test.js b/tests/unit/owners.test.js
@@ -2,7 +2,6 @@ import fs from 'node:fs';
 import os from 'node:os';
 import path from 'node:path';
 import { afterEach, beforeEach, describe, expect, it } from 'vitest';
-
 import {
   matchOwners,
   parseCodeowners,