fix: classify test-only-called symbols as 'test-only' instead of 'dead' by carlos-alm · Pull Request #497 · optave/codegraph

carlos-alm · 2026-03-17T11:08:35Z

Summary

Closes #488

When -T is active, symbols whose only callers are in test files are now classified as test-only instead of dead
Added test-only to VALID_ROLES so it can be queried with --role test-only
classifyRoles checks a new testOnlyFanIn property: if fanIn === 0 and !isExported but testOnlyFanIn > 0, assigns test-only role
rolesData in the analysis layer identifies test-only-called symbols by querying all calls edges and grouping callers by target

Test plan

codegraph roles --role dead -T no longer includes symbols only called from tests
codegraph roles --role test-only -T shows those symbols instead
Unit tests for classifier with testOnlyFanIn property
Integration tests with fixture symbol called only from test file

…d' (#488) Impact: 4 functions changed, 4 affected

claude · 2026-03-17T11:08:48Z

Claude finished @carlos-alm's task —— View job

Root Cause Analysis ✅

Problem Being Solved: When using -T (exclude tests), symbols that are ONLY called from test files were incorrectly classified as dead instead of having a distinct test-only role. This provides better insight into code that serves test purposes vs truly unused code.

Why This Approach: The solution adds a new test-only role and implements runtime reclassification logic that:

Identifies symbols whose callers are exclusively in test files
When -T is active, reclassifies these from their original role to test-only
Maintains existing behavior when -T is not used

Risk Assessment: Low risk - additive change that preserves existing behavior when -T is not used.

Phase 2: Configuration Change Detection ✅

No High-Risk Configuration Changes Found

No changes to biome.json, eslint, jest, or other linting/testing configs
No timeout increases or coverage threshold reductions
No TypeScript strict mode flags disabled
Changes are purely functional enhancements

Phase 3: Test Integrity ✅

Tests Are Strengthened, Not Weakened

New test cases added: 25 lines in roles.test.js, 34 lines in roles.test.js, 9 lines in unit/roles.test.js
No assertions weakened: All tests validate the new test-only classification works correctly
No workarounds: No eslint-disable or .skip calls introduced
Integration test coverage: Comprehensive fixture with testHelper function called only from test file
Tests verify both positive case (symbols get test-only role) and negative case (truly dead symbols stay dead)

Phase 4: Code Quality ✅

Implementation Quality is Good

SQL injection protection: Uses prepared statements consistently (db.prepare())
Performance: Efficient query using JOINs, minimal extra queries
Memory safety: Proper resource cleanup with try/finally blocks
Logic correctness: Proper differentiation between truly dead vs test-only called symbols
API contract preservation: Backward-compatible, no breaking changes to CLI or MCP tools
Edge case handling: Handles mixed callers (both test and non-test) correctly

Minor concerns:

Complex SQL query in _findTestOnlyCalledIds but well-documented and necessary for performance
Runtime reclassification adds slight overhead but only when -T is used

Phase 5: Backlog Compliance

Zero-dep: ✅

No new runtime dependencies added. Uses existing tree-sitter/SQLite infrastructure.

Foundation-aligned: ✅

Fully aligned with FOUNDATION.md principles:

P1 (always-current): Maintains real-time graph accuracy
P4 (zero-cost core): Optional role classification, doesn't affect core parsing
P5 (embeddable-first): Preserves programmatic API, adds MCP tool compatibility
P8 (planning/developing focus): Helps agents distinguish test utilities from dead code

Problem-fit: 4/5

Directly addresses the core problem: "AI coding assistants waste tokens re-orienting in large codebases"

Prevents agents from flagging test utilities as dead code
Provides clearer architectural insight (test-only vs truly unused)
Reduces false positives in dead code analysis
Minor deduction: somewhat narrow use case (only with -T flag)

Breaking: No

Purely additive change:

New test-only role added to VALID_ROLES
CLI maintains backward compatibility
MCP tools updated but maintain existing contracts
Database role classification is runtime-only, no schema changes

Tier: 1 (Zero-dep ✓ + Foundation-aligned ✓)

Critical Concerns

None. This is a well-implemented feature that solves a real problem without introducing risks.

Final Recommendation

Rating: ⭐⭐⭐⭐☆ (4/5)
Action: APPROVE
Reasoning: High-quality implementation that directly improves agent accuracy when analyzing codebases with the -T flag. Comprehensive test coverage, no security concerns, maintains backward compatibility, and follows established patterns in the codebase. The only minor limitation is the narrow scope (test-only classification only applies with -T flag), but this is by design and appropriate.

Excellent work on:

Comprehensive test coverage for both unit and integration scenarios
Clear separation between runtime reclassification and stored classification
Efficient SQL implementation with proper JOIN usage
Maintaining complete backward compatibility

greptile-apps · 2026-03-17T11:13:12Z

Greptile Summary

This PR introduces a test-only role for symbols whose only callers live in test files, so they no longer pollute the dead role list when running with -T. The reclassification is done at query time in rolesData via _findTestOnlyCalledIds, which scans all calls edges and groups callers by target to identify symbols with exclusively test-file callers. VALID_ROLES is extended, CLI/MCP descriptions are updated, and integration + unit tests cover the main scenarios.

Key changes:

rolesData runs a post-query reclassification pass whenever noTests is true or filterRole === 'test-only', correctly moving affected symbols from their stored role to test-only before applying the final role filter.
_findTestOnlyCalledIds uses a \0-separated composite key (fixing the previous : collision) and emits a warn() when no test-file nodes are found in the graph, giving users a clear signal that the graph was built with -T.
classifyRoles in src/graph/classifiers/roles.js adds a testOnlyFanIn parameter and branch, but classifyNodeRoles in structure.js never computes or supplies this value. The build-time classifier therefore always emits dead (never test-only) and the 'test-only': 0 counter in the build summary will always stay at zero — the testOnlyFanIn branch is dead code from the production call path.
There is no integration test for rolesData({ role: 'test-only' }) without noTests: true, leaving the fixed "silent empty result" code path untested.

Confidence Score: 3/5

The user-facing feature works correctly via query-time reclassification, but a significant inconsistency exists between the new classifier API and what the build pipeline actually computes.
The test-only classification is functionally correct for end users (reclassification in rolesData works). However, the testOnlyFanIn property added to classifyRoles is dead code in the real build path because classifyNodeRoles never computes it — meaning the build-time role summary always shows test-only=0, the 'test-only': 0 initializer is never incremented, and the classifier's new branch is an API that no production caller uses. This inconsistency between the build-time and query-time classification model warrants attention before merging.
src/features/structure.js (dead testOnlyFanIn code path) and tests/integration/roles.test.js (missing coverage for --role test-only without -T).

Important Files Changed

Filename	Overview
src/domain/analysis/roles.js	Core query-time reclassification logic; correctly identifies test-only symbols via `_findTestOnlyCalledIds` using null-byte key separator, test-node guard, and post-query filtering. Logic is sound but the `classifyNodeRoles` call site never feeds `testOnlyFanIn` into the build-time classifier, so the DB always stores `dead` (never `test-only`).
src/features/structure.js	Adds `'test-only': 0` to the build-time summary and classifier input, but never computes `testOnlyFanIn`, making the new classifier branch dead code. The summary counter for `test-only` will always stay at 0 after every build.
src/graph/classifiers/roles.js	Adds `testOnlyFanIn` support to the pure classifier; logic is correct but the branch is dead code when called via `classifyNodeRoles` since that call site never supplies `testOnlyFanIn`.
tests/integration/roles.test.js	Good integration coverage for the `noTests + test-only` path; missing a test for `--role test-only` without `-T`, which is a distinct supported code path introduced in this PR.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["codegraph roles --role X -T"] --> B["rolesData()"]
    B --> C{"filterRole === 'test-only'?"}
    C -- yes --> D["SQL: fetch ALL symbols\n(no role filter)"]
    C -- no --> E["SQL: WHERE role = filterRole"]
    D --> F{"noTests?"}
    E --> F
    F -- yes --> G["Filter out test-file rows"]
    F -- no --> H["Keep all rows"]
    G --> I{"noTests OR filterRole === 'test-only'?"}
    H --> I
    I -- yes --> J["_findTestOnlyCalledIds(db)\n(scan all 'calls' edges)"]
    J --> K["Group callers by target:\nhasTestCaller / hasNonTestCaller"]
    K --> L["Return Set of keys where\nhasTestCaller && !hasNonTestCaller"]
    L --> M["Reclassify matching rows\nto role='test-only'"]
    I -- no --> N["Skip reclassification"]
    M --> O{"filterRole set?"}
    N --> O
    O -- yes --> P["rows.filter(r => r.role === filterRole)"]
    O -- no --> Q["Return all rows"]
    P --> R["Return {count, summary, symbols}"]
    Q --> R

    subgraph "Build time (classifyNodeRoles)"
    S["SQL: fan_in = ALL callers\n(includes test callers)"] --> T["classifyRoles(nodes)\n⚠️ testOnlyFanIn never passed"]
    T --> U["Stores role in DB\n(never stores 'test-only')"]
    end

Comments Outside Diff (1)

src/features/structure.js, line 354-361 (link)

testOnlyFanIn never computed for build-time classifier

classifyNodeRoles builds classifierInput with fanIn equal to the full fan-in count (including test-file callers), and never computes or provides testOnlyFanIn. As a result, the new branch in classifyRoles:
```
role = node.testOnlyFanIn > 0 ? 'test-only' : 'dead';
```
is always dead code from this call path: node.testOnlyFanIn is always undefined, and undefined > 0 is false, so the classifier always falls through to 'dead'. This means:
1. No symbol is ever stored as 'test-only' in the DB after a build — the role summary returned by classifyNodeRoles (which is also emitted to the debug log in build-structure.js) always shows test-only=0, even when test-only symbols exist.
2. The 'test-only': 0 initializer added to the summary object in structure.js will never be incremented.
The test-only classification currently only works via the post-query reclassification in rolesData. If the intent is to also persist test-only roles into the DB at build time (consistent with all other roles), classifyNodeRoles would need to separately compute non-test fan-in and test-only fan-in per node before passing them to the classifier. For example:
```
// Compute test-only fan-in separately
const testOnlyFanInRows = db.prepare(`
  SELECT e.target_id, COUNT(*) AS cnt
  FROM edges e
  JOIN nodes caller ON e.source_id = caller.id
  WHERE e.kind = 'calls' AND isTestFile(caller.file)  -- or a pattern filter
  GROUP BY e.target_id
`).all();
const testOnlyFanInMap = new Map(testOnlyFanInRows.map(r => [r.target_id, r.cnt]));

const classifierInput = rows.map((r) => ({
  id: String(r.id),
  name: r.name,
  fanIn: r.fan_in - (testOnlyFanInMap.get(r.id) ?? 0),  // non-test fan-in
  fanOut: r.fan_out,
  isExported: exportedIds.has(r.id),
  testOnlyFanIn: testOnlyFanInMap.get(r.id) ?? 0,
}));
```
If test-only is intentionally only a query-time reclassification (never persisted to DB), the testOnlyFanIn parameter and branch should be removed from classifyRoles to avoid the misleading implication, and the 'test-only': 0 initializer in the classifyNodeRoles summary should be removed too.

_{Last reviewed commit: 25a2582}

greptile-apps · 2026-03-17T11:13:19Z

src/domain/analysis/roles.js

+    if (filterRole && filterRole !== 'test-only') {
      conditions.push('role = ?');
      params.push(filterRole);
+    } else if (filterRole === 'test-only') {
+      // test-only is not stored in DB; we need all symbols to reclassify
+      // Fetch everything and filter after reclassification
+      conditions.length = 1; // keep only 'role IS NOT NULL'
+      params.length = 0;
    }


--role test-only silently returns empty without -T

When filterRole === 'test-only' but noTests is false, the code fetches all symbols (correct), skips the reclassification block (because if (noTests) is false), then filters to role === 'test-only' — which is never stored in the database. The result is always an empty set with no warning.

Users who run codegraph roles --role test-only without -T will get count: 0, symbols: [] and have no indication that the flag is required for this role to work. Consider either:

Returning an explanatory message/warning when this combination is detected

Or simply applying the reclassification regardless of noTests when filterRole === 'test-only'

Fixed in f4c8176 — the reclassification block now runs whenever \ OR \ is true, so \ without \ no longer returns an empty set silently.

greptile-apps · 2026-03-17T11:13:20Z

src/domain/analysis/roles.js

+    } else if (filterRole === 'test-only') {
+      // test-only is not stored in DB; we need all symbols to reclassify
+      // Fetch everything and filter after reclassification
+      conditions.length = 1; // keep only 'role IS NOT NULL'
+      params.length = 0;


Redundant array reset

At the point the else if (filterRole === 'test-only') branch is entered, conditions still only has its initial 'role IS NOT NULL' element and params is still empty — neither has been modified by the preceding if branch. The resets (conditions.length = 1; params.length = 0;) are no-ops and the comment "keep only 'role IS NOT NULL'" is therefore misleading.

Suggested change

} else if (filterRole === 'test-only') {

// test-only is not stored in DB; we need all symbols to reclassify

// Fetch everything and filter after reclassification

conditions.length = 1; // keep only 'role IS NOT NULL'

params.length = 0;

} else if (filterRole === 'test-only') {

// test-only is not stored in DB; fetch everything and filter after reclassification

Fixed in 0f430ab — removed the redundant conditions.length = 1; params.length = 0; lines since nothing had modified these arrays at that point.

greptile-apps · 2026-03-17T11:13:21Z

src/domain/analysis/roles.js

+function _findTestOnlyCalledIds(db) {
+  // Get all non-test symbols that have at least one caller
+  const rows = db
+    .prepare(
+      `SELECT target.name, target.file, target.line,
+              caller.file AS caller_file
+       FROM edges e
+       JOIN nodes target ON e.target_id = target.id
+       JOIN nodes caller ON e.source_id = caller.id
+       WHERE e.kind = 'calls'
+         AND target.kind NOT IN ('file', 'directory')`,
+    )
+    .all();
+
+  // Group callers by target symbol
+  const callersByTarget = new Map();
+  for (const r of rows) {
+    const key = `${r.name}:${r.file}:${r.line}`;
+    if (!callersByTarget.has(key))
+      callersByTarget.set(key, { hasTestCaller: false, hasNonTestCaller: false });
+    const entry = callersByTarget.get(key);
+    if (isTestFile(r.caller_file)) {
+      entry.hasTestCaller = true;
+    } else {
+      entry.hasNonTestCaller = true;
+    }
+  }
+
+  // Return keys where ALL callers are in test files
+  const result = new Set();
+  for (const [key, entry] of callersByTarget) {
+    if (entry.hasTestCaller && !entry.hasNonTestCaller) {
+      result.add(key);
+    }
+  }
+  return result;


Key separator : collides with route: / event: / command: prefixes

The composite key ${r.name}:${r.file}:${r.line} uses : as a separator, but FRAMEWORK_ENTRY_PREFIXES already defines symbol names that contain : (e.g. route:/api/users, event:user.created). A symbol route:api in file users.js at line 5 produces the key route:api:users.js:5, which is the same key as a symbol named route in the hypothetical file api:users.js at line 5.

While in practice file paths on Linux/macOS rarely contain :, framework entry names do contain colons and the two-level split is ambiguous. Using a separator that doesn't appear in any of the three fields (e.g. \0, or switching to a JSON/object keyed map) would eliminate the ambiguity.

Fixed in f199a68 — switched the composite key separator from : to \0 (null character) which cannot appear in symbol names or file paths, avoiding collisions with framework entry prefixes like route: or event:.

Previously, `--role test-only` without `-T` silently returned empty because the reclassification block only ran when `noTests` was true. Now it also runs when the user explicitly requests the test-only role. Impact: 1 functions changed, 0 affected

The `conditions.length = 1; params.length = 0;` lines were no-ops since nothing had modified these arrays at that point in the code. Impact: 1 functions changed, 0 affected

The composite key `name:file:line` used `:` as separator which could collide with framework entry prefixes like `route:` or `event:`. Switch to `\0` (null character) which cannot appear in names or paths. Impact: 2 functions changed, 1 affected

carlos-alm · 2026-03-17T11:23:53Z

@greptileai

greptile-apps · 2026-03-17T11:29:36Z

src/domain/analysis/roles.js

+function _findTestOnlyCalledIds(db) {
+  // Get all non-test symbols that have at least one caller
+  const rows = db
+    .prepare(
+      `SELECT target.name, target.file, target.line,
+              caller.file AS caller_file
+       FROM edges e
+       JOIN nodes target ON e.target_id = target.id
+       JOIN nodes caller ON e.source_id = caller.id
+       WHERE e.kind = 'calls'
+         AND target.kind NOT IN ('file', 'directory')`,
+    )
+    .all();


Silent empty result if graph was built without test files

_findTestOnlyCalledIds queries caller.file from the nodes table. If the user ran codegraph build -T (excluding test files), test file nodes were never inserted into the DB, so the query returns no rows with test callers, the function returns an empty Set, and --role test-only -T silently returns count: 0, symbols: [] — indistinguishable from "no test-only symbols exist."

There's no guard or warning to surface this precondition. Consider checking whether any test-file nodes exist in the DB and emitting a warning when none are found:

function _findTestOnlyCalledIds(db) { const hasTestNodes = db .prepare(`SELECT 1 FROM nodes WHERE file LIKE '%.test.%' OR file LIKE '%spec%' LIMIT 1`) .get(); if (!hasTestNodes) { // Graph was built without test files — can't determine test-only callers return new Set(); } // ... rest of the function }

Or at minimum document the precondition in the JSDoc above.

Fixed in 5d6d64c — added an early check in _findTestOnlyCalledIds that queries all file nodes and tests whether any match isTestFile. If none are found (e.g. the graph was built with -T), we emit a warning via warn() and return an empty Set immediately, so the user knows why test-only detection returned zero results.

Fixed — added a guard at the top of _findTestOnlyCalledIds that checks for test-file nodes in the DB using isTestFile(). When none are found (e.g. graph built with -T), it emits a warning via warn() and returns an empty Set instead of silently returning nothing.

Add a guard in _findTestOnlyCalledIds that checks whether any test-file nodes exist in the DB before querying. When absent (e.g. graph built with -T), emits a warning instead of silently returning empty results. Impact: 1 functions changed, 3 affected

Impact: 1 functions changed, 1 affected

carlos-alm · 2026-03-17T11:38:59Z

@greptileai

…/review-497 # Conflicts: # src/domain/analysis/roles.js Impact: 1 functions changed, 1 affected

carlos-alm · 2026-03-17T11:40:22Z

@greptileai

carlos-alm · 2026-03-17T12:06:39Z

Addressed Greptile's outside-diff P1 about testOnlyFanIn dead code in the build-time classifier:

The testOnlyFanIn branch in classifyRoles() was never wired up — removed it in 711a897. test-only classification is intentionally a query-time operation in domain/analysis/roles.js, which reclassifies when noTests is active. The build-time classifier uses total fan-in (including test callers), so a node with test-only callers has fanIn > 0 and gets its normal role. The test suite validates this design.

@greptileai

fix: classify test-only-called symbols as 'test-only' instead of 'dea…

a196f3c

…d' (#488) Impact: 4 functions changed, 4 affected

greptile-apps bot reviewed Mar 17, 2026

View reviewed changes

carlos-alm added 3 commits March 17, 2026 05:21

fix: remove redundant array reset in rolesData

0f430ab

The `conditions.length = 1; params.length = 0;` lines were no-ops since nothing had modified these arrays at that point in the code. Impact: 1 functions changed, 0 affected

Merge branch 'main' into fix/488-test-only-role

43215be

greptile-apps bot reviewed Mar 17, 2026

View reviewed changes

carlos-alm added 2 commits March 17, 2026 05:38

fix: warn when test-only detection finds no test-file nodes in graph

5d6d64c

Impact: 1 functions changed, 1 affected

Merge remote-tracking branch 'origin/fix/488-test-only-role' into fix…

25a2582

…/review-497 # Conflicts: # src/domain/analysis/roles.js Impact: 1 functions changed, 1 affected

Merge branch 'main' into fix/488-test-only-role

da486c4

carlos-alm merged commit ccc71c9 into main Mar 17, 2026
13 checks passed

carlos-alm deleted the fix/488-test-only-role branch March 17, 2026 11:57

github-actions bot locked and limited conversation to collaborators Mar 17, 2026

Conversation

carlos-alm commented Mar 17, 2026

Summary

Test plan

Uh oh!

claude bot commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Root Cause Analysis ✅

Phase 2: Configuration Change Detection ✅

Phase 3: Test Integrity ✅

Phase 4: Code Quality ✅

Phase 5: Backlog Compliance

Zero-dep: ✅

Foundation-aligned: ✅

Problem-fit: 4/5

Breaking: No

Tier: 1 (Zero-dep ✓ + Foundation-aligned ✓)

Critical Concerns

Final Recommendation

Uh oh!

greptile-apps bot commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Flowchart

Comments Outside Diff (1)

Uh oh!

greptile-apps bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm commented Mar 17, 2026

Uh oh!

greptile-apps bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm commented Mar 17, 2026

Uh oh!

carlos-alm commented Mar 17, 2026

Uh oh!

Uh oh!

carlos-alm commented Mar 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude bot commented Mar 17, 2026 •

edited

Loading

greptile-apps bot commented Mar 17, 2026 •

edited

Loading