feat: Repository abstraction with InMemoryRepository (3.13) #444
carlos-alm merged 13 commits into main from feat/testing-pyramid
Conversation
…pyramid (3.13)
Introduce a Repository base class, InMemoryRepository (Maps-backed, no SQLite), and SqliteRepository (delegates to existing fn(db, ...) functions). Includes a fluent TestRepoBuilder fixture factory and 105 new tests (51 unit + 54 parity) verifying behavioral equivalence between the two implementations. Existing code and tests remain unchanged.
Impact: 139 functions changed, 6 affected
Claude finished @carlos-alm's task — PR Review: Repository Abstraction (3.13)
Phase 1: Root Cause Analysis ✅
Problem Being Solved: This PR addresses testing pyramid concerns by introducing a repository abstraction that enables fast in-memory testing without a SQLite dependency. The current codebase likely has slow tests that require full SQLite database setup, making unit testing cumbersome.
Why This Approach:
Risk Assessment: Low risk - this is a pure additive abstraction with no breaking changes to existing APIs.
Phase 2: Configuration Change Detection ✅
No Configuration Changes: ✅ No changes to biome.json, package.json, vitest.config.js, or other config files that could be silencing warnings/errors. This is purely additive code.
Phase 3: Test Integrity Assessment ✅
Test Quality: ✅ EXCELLENT test integrity
Test Design Quality:
Phase 4: Code Quality Review ✅
Architecture & Design Quality
✅ Excellent Architecture:
Implementation Quality
✅ High-Quality Code:
Security Analysis
✅ No Security Issues:
Potential Issues
Minor observations (non-blocking):
Phase 5: Backlog Compliance Rating ✅
Backlog Compliance Assessment
Zero-dep: ✅ Pass - No new runtime dependencies added. Builds entirely on existing JavaScript Maps/Sets and existing SQLite functions.
Foundation-aligned: ✅ Pass - Fully aligned with FOUNDATION.md principles:
Problem-fit: 3/5 - Good fit. While this doesn't directly address AI token waste/hallucination/blast-radius issues, it enables better testing of the core functionality that does solve those problems. Testing pyramid improvements are foundation-level work that makes the core problem-solving code more maintainable.
Breaking: No - Purely additive. All existing APIs are unchanged; the new Repository interfaces are additive.
Tier Classification: Tier 1 (Zero-dep ✅ + Foundation-aligned ✅)
Root Cause Analysis
Problem Being Solved: Testing pyramid improvement - enables fast, isolated unit tests for codegraph query logic without SQLite overhead
Backlog Compliance
Critical Concerns
None. This is well-executed foundational testing infrastructure.
Final Recommendation
Greptile Summary
This PR introduces a clean Repository abstraction. Key changes:
One remaining divergence:
Confidence Score: 4/5
Important Files Changed
Class Diagram
%%{init: {'theme': 'neutral'}}%%
classDiagram
class Repository {
<<abstract>>
+findNodeById(id)
+findNodesByFile(file)
+findFileNodes(fileLike)
+findNodesWithFanIn(namePattern, opts)
+countNodes()
+countEdges()
+countFiles()
+getNodeId(name, kind, file, line)
+getFunctionNodeId(name, file, line)
+bulkNodeIdsByFile(file)
+findNodeChildren(parentId)
+findNodesByScope(scopeName, opts)
+findNodeByQualifiedName(qualifiedName, opts)
+listFunctionNodes(opts)
+iterateFunctionNodes(opts)
+findNodesForTriage(opts)
+findCallees(nodeId)
+findCallers(nodeId)
+findDistinctCallers(nodeId)
+findAllOutgoingEdges(nodeId)
+findAllIncomingEdges(nodeId)
+findCalleeNames(nodeId)
+findCallerNames(nodeId)
+findImportTargets(nodeId)
+findImportSources(nodeId)
+findImportDependents(nodeId)
+findCrossFileCallTargets(file)
+countCrossFileCallers(nodeId, file)
+getClassHierarchy(classNodeId)
+findIntraFileCallEdges(file)
+getCallableNodes()
+getCallEdges()
+getFileNodesAll()
+getImportEdges()
+hasCfgTables()
+hasEmbeddings()
+hasDataflowTable()
+getComplexityForNode(nodeId)
}
class InMemoryRepository {
-nodes Map
-edges Map
-complexity Map
-nextNodeId int
-nextEdgeId int
+addNode(attrs) int
+addEdge(attrs) int
+addComplexity(nodeId, metrics)
-escapeLike(s) string
-likeToRegex(pattern) RegExp
-computeFanIn() Map
-iterateFunctionNodesImpl(opts) Iterator
}
class SqliteRepository {
-db Database
+db Database
}
class TestRepoBuilder {
-pending object
+fn(name, file, line, extra) TestRepoBuilder
+method(name, file, line, extra) TestRepoBuilder
+cls(name, file, line, extra) TestRepoBuilder
+file(filePath) TestRepoBuilder
+node(name, kind, file, line, extra) TestRepoBuilder
+calls(sourceName, targetName) TestRepoBuilder
+imports(sourceName, targetName) TestRepoBuilder
+extends(sourceName, targetName) TestRepoBuilder
+edge(sourceName, targetName, kind) TestRepoBuilder
+complexity(name, metrics) TestRepoBuilder
+build() object
}
Repository <|-- InMemoryRepository
Repository <|-- SqliteRepository
TestRepoBuilder ..> InMemoryRepository : creates
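The fluent TestRepoBuilder shown in the diagram chains node and edge declarations and resolves name-based edges to numeric IDs in build(). A minimal stand-alone sketch of that pattern (MiniRepoBuilder and its internals are illustrative stand-ins, not the project's actual code, which lives in tests/helpers/fixtures.js):

```js
// Illustrative mini version of the fluent-builder pattern. The real
// TestRepoBuilder supports many more node/edge kinds and builds an
// InMemoryRepository instead of a plain object.
class MiniRepoBuilder {
  #pending = { nodes: [], edges: [] };

  fn(name, file, line) {
    this.#pending.nodes.push({ name, kind: 'function', file, line });
    return this; // returning `this` is what makes the chain fluent
  }

  calls(sourceName, targetName) {
    this.#pending.edges.push({ sourceName, targetName, kind: 'calls' });
    return this;
  }

  build() {
    // Resolve name-based edge declarations to numeric node IDs.
    const ids = new Map();
    this.#pending.nodes.forEach((n, i) => ids.set(n.name, i + 1));
    const edges = this.#pending.edges.map((e) => {
      if (!ids.has(e.sourceName)) throw new Error(`Unknown source node: ${e.sourceName}`);
      if (!ids.has(e.targetName)) throw new Error(`Unknown target node: ${e.targetName}`);
      return { source_id: ids.get(e.sourceName), target_id: ids.get(e.targetName), kind: e.kind };
    });
    return { ids, edges };
  }
}

const { ids, edges } = new MiniRepoBuilder()
  .fn('main', 'src/app.js', 1)
  .fn('helper', 'src/util.js', 10)
  .calls('main', 'helper')
  .build();

console.log(edges); // [{ source_id: 1, target_id: 2, kind: 'calls' }]
```

Because every declaration method returns `this`, a whole test graph reads as one expression, which is what makes the fixture factory ergonomic.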
Last reviewed commit: 1f5b03b
```js
findNodesForTriage(opts = {}) {
  const kindsToUse = opts.kind ? [opts.kind] : ['function', 'method', 'class'];
  let nodes = [...this.#nodes.values()].filter((n) => kindsToUse.includes(n.kind));

  if (opts.noTests) {
    nodes = nodes.filter((n) => !n.file.includes('.test.') && !n.file.includes('.spec.'));
  }
  if (opts.file) {
    const fileRe = likeToRegex(`%${opts.file}%`);
    nodes = nodes.filter((n) => fileRe.test(n.file));
  }
  if (opts.role) {
    nodes = nodes.filter((n) => n.role === opts.role);
  }

  const fanInMap = this.#computeFanIn();
  return nodes
    .sort((a, b) => a.file.localeCompare(b.file) || a.line - b.line)
    .map((n) => {
      const cx = this.#complexity.get(n.id);
      return {
        id: n.id,
        name: n.name,
        kind: n.kind,
        file: n.file,
        line: n.line,
        end_line: n.end_line,
        role: n.role,
        fan_in: fanInMap.get(n.id) ?? 0,
        cognitive: cx?.cognitive ?? 0,
        mi: cx?.maintainability_index ?? 0,
        cyclomatic: cx?.cyclomatic ?? 0,
        max_nesting: cx?.max_nesting ?? 0,
        churn: 0, // no co-change data in-memory
      };
    });
}
```
Missing input validation for kind and role
The SQLite version in nodes.js:39-47 validates opts.kind against EVERY_SYMBOL_KIND and opts.role against VALID_ROLES, throwing a ConfigError for invalid values. The in-memory version silently accepts any value, which means invalid inputs will silently return empty results instead of throwing. This behavioral divergence will cause the parity tests to miss bugs where callers pass incorrect kind/role strings.
Suggested change:

```js
findNodesForTriage(opts = {}) {
  if (opts.kind && !['function', 'method', 'class', 'interface', 'type', 'struct', 'enum', 'trait', 'record', 'module', 'parameter', 'property', 'constant'].includes(opts.kind)) {
    throw new Error(`Invalid kind: ${opts.kind}`);
  }
  if (opts.role && !['entry', 'core', 'utility', 'adapter', 'dead', 'leaf'].includes(opts.role)) {
    throw new Error(`Invalid role: ${opts.role}`);
  }
  const kindsToUse = opts.kind ? [opts.kind] : ['function', 'method', 'class'];
  let nodes = [...this.#nodes.values()].filter((n) => kindsToUse.includes(n.kind));
  if (opts.noTests) {
    nodes = nodes.filter((n) => !n.file.includes('.test.') && !n.file.includes('.spec.'));
  }
  if (opts.file) {
    const fileRe = likeToRegex(`%${opts.file}%`);
    nodes = nodes.filter((n) => fileRe.test(n.file));
  }
  if (opts.role) {
    nodes = nodes.filter((n) => n.role === opts.role);
  }
  const fanInMap = this.#computeFanIn();
  return nodes
    .sort((a, b) => a.file.localeCompare(b.file) || a.line - b.line)
    .map((n) => {
      const cx = this.#complexity.get(n.id);
      return {
        id: n.id,
        name: n.name,
        kind: n.kind,
        file: n.file,
        line: n.line,
        end_line: n.end_line,
        role: n.role,
        fan_in: fanInMap.get(n.id) ?? 0,
        cognitive: cx?.cognitive ?? 0,
        mi: cx?.maintainability_index ?? 0,
        cyclomatic: cx?.cyclomatic ?? 0,
        max_nesting: cx?.max_nesting ?? 0,
        churn: 0, // no co-change data in-memory
      };
    });
}
```
Fixed — added kind/role validation matching the SQLite version. InMemoryRepository.findNodesForTriage now imports ConfigError, EVERY_SYMBOL_KIND, and VALID_ROLES from kinds.js and throws ConfigError for invalid values. Also added parity tests for both invalid kind and invalid role.
Fixed in 2c9c5fc — added kind/role validation with ConfigError, matching SqliteRepository behavior.
Already addressed — validation for kind against EVERY_SYMBOL_KIND and role against VALID_ROLES was added in a previous push (lines 237-246). Parity tests for both invalid-kind and invalid-role throws are also in place.
```js
return results.sort((a, b) => {
  const callerA = [...this.#nodes.values()].find(
    (n) => n.name === a.caller_name && n.file === file,
  );
  const callerB = [...this.#nodes.values()].find(
    (n) => n.name === b.caller_name && n.file === file,
  );
  return (callerA?.line ?? 0) - (callerB?.line ?? 0);
```
Sort comparator spreads nodes Map per comparison
Each call to the sort comparator spreads this.#nodes into a new array and does a linear .find() on it. With m nodes and r results, this is O(r * log(r) * m) work. Since the caller/callee nodes are already resolved in the loop above (lines 447-448), consider building a name→line lookup map before sorting to avoid the repeated scans.
Suggested change:

```js
return results.sort((a, b) => {
  const nodesArr = [...this.#nodes.values()];
  const callerA = nodesArr.find(
    (n) => n.name === a.caller_name && n.file === file,
  );
  const callerB = nodesArr.find(
    (n) => n.name === b.caller_name && n.file === file,
  );
```
A more efficient approach would be to build the lookup once before sorting:

```js
const lineByName = new Map();
for (const n of this.#nodes.values()) {
  if (n.file === file) lineByName.set(n.name, n.line);
}
return results.sort((a, b) => (lineByName.get(a.caller_name) ?? 0) - (lineByName.get(b.caller_name) ?? 0));
```
Fixed — replaced the per-comparison [...this.#nodes.values()].find() with a pre-built lineByName Map. Now O(n) to build the map + O(r log r) to sort, instead of O(r * log(r) * n).
Fixed in 2c9c5fc — findIntraFileCallEdges now builds a lineByName lookup Map once before sorting instead of spreading the nodes Map per comparison.
Already addressed — findIntraFileCallEdges builds a lineByName lookup map once before sorting (lines 474-480), avoiding the repeated spread+find per comparator call.
tests/helpers/fixtures.js
Outdated
```js
 */
class TestRepoBuilder {
  #pending = { nodes: [], edges: [], complexity: [] };
  #nameIndex = new Map(); // name → { kind, file, line, ... }
```
#nameIndex is written but never read
#nameIndex is populated in #addNode (line 131) but is never used anywhere in the class. It appears to be dead code — possibly left over from an earlier iteration where it was used for edge resolution in build() (which now uses the ids Map instead).
Suggested change (delete this line):

```js
#nameIndex = new Map(); // name → { kind, file, line, ... }
```
Fixed — removed the dead #nameIndex field and its .set() call in #addNode. It was indeed leftover from an earlier iteration.
Fixed in 2c9c5fc — removed dead #nameIndex field from TestRepoBuilder.
Already removed — #nameIndex no longer exists in the current version of fixtures.js.
- Add kind/role validation to InMemoryRepository.findNodesForTriage matching SqliteRepository behavior (throws ConfigError)
- Optimize findIntraFileCallEdges sort with a pre-built lookup map instead of spreading the nodes Map per comparison
- Remove dead #nameIndex field from TestRepoBuilder
- Add validation parity tests for invalid kind/role

Impact: 5 functions changed, 0 affected
```js
for (const n of this.#pending.nodes) {
  const id = repo.addNode(n);
  ids.set(n.name, id);
}
```
Duplicate node names silently corrupt edge resolution
ids.set(n.name, id) uses the node name as the sole key. If a test graph contains two nodes with the same name (e.g. fn('render', 'components/A.js', 1) and fn('render', 'components/B.js', 1)), the second insertion silently overwrites the first. Any subsequent .calls(…, 'render') will then wire to the wrong node with no error thrown, producing a graph that looks valid but has incorrect edges.
The Error paths a few lines down (Unknown source node) only fire when a name is missing entirely — they offer no protection against the overwrite case.
Consider throwing on a detected duplicate during build():
```js
for (const n of this.#pending.nodes) {
  if (ids.has(n.name)) {
    throw new Error(`Duplicate node name "${n.name}" — use unique names or disambiguate with name@file:line`);
  }
  const id = repo.addNode(n);
  ids.set(n.name, id);
}
```

Alternatively, the documentation for fn / node / etc. should prominently warn that names must be unique within a single builder chain.
Fixed in 1ac56af — build() now throws an Error if a duplicate node name is detected: Duplicate node name: "X" (already assigned ID N). This prevents silent edge corruption when the same name appears twice.
Already addressed — build() checks ids.has(n.name) before calling repo.addNode(n), throwing on duplicates. See lines 105-109 in the current fixtures.js.
Impact: 2 functions changed, 0 affected
…graph into feat/testing-pyramid
```js
findCallers(nodeId) {
  const results = [];
  for (const e of this.#edges.values()) {
    if (e.target_id === nodeId && e.kind === 'calls') {
      const n = this.#nodes.get(e.source_id);
      if (n) results.push({ id: n.id, name: n.name, kind: n.kind, file: n.file, line: n.line });
    }
  }
  return results;
}
```
findCallers doesn't deduplicate while findCallees does
findCallees (lines 276–295) uses a seen Set to suppress duplicate entries when the same target_id appears across multiple edges. findCallers has no equivalent guard. If two calls edges from the same source to the same target exist in the graph (which addEdge does not prevent), findCallees(source) returns the target once, while findCallers(target) returns the source twice.
This asymmetry is not covered by any unit or parity test — every test uses single edges per source-target pair, so neither a SQLite SELECT nor the in-memory scan ever exercises the duplicate case. If the underlying SQLite query for findCallers also returns duplicates in this scenario the implementations are consistent; if the SQL uses DISTINCT, the in-memory version diverges silently.
At minimum, a parity test with a duplicate edge would confirm the intended semantics:
```js
it('findCallers with duplicate edges', () => {
  // Add two edges from the same caller to the same target
  repo.addEdge({ source_id: ids.bar, target_id: ids.foo, kind: 'calls' });
  const callers = repo.findCallers(ids.foo);
  // Assert expected count (1 or 3 depending on intended semantics)
});
```
By design — the SQLite findCallers (edges.js:48-50) also has no DISTINCT, so the in-memory version correctly matches. findDistinctCallers is the deduplicated variant used for impact analysis BFS. The asymmetry with findCallees is pre-existing behavior that both implementations match on.
Not a divergence — the SQLite findCallers query also does not use DISTINCT (it's a plain SELECT). Both implementations return duplicates if duplicate edges exist. For deduplicated results, callers should use findDistinctCallers, which exists for exactly this purpose.
```js
for (const n of this.#pending.nodes) {
  const id = repo.addNode(n);
  if (ids.has(n.name)) {
    throw new Error(
      `Duplicate node name: "${n.name}" — use unique names or qualify with file path`,
    );
  }
  ids.set(n.name, id);
}
```
Duplicate node added to repo before the check fires
repo.addNode(n) is called on line 105 before the ids.has(n.name) check on line 106. When a duplicate name is encountered, the duplicate node is inserted into repo first, and only then is the error thrown. Since build() propagates the error and the partially-built repo is never returned, there is no observable bug — but it's cleaner to check before adding:
Suggested change:

```js
for (const n of this.#pending.nodes) {
  if (ids.has(n.name)) {
    throw new Error(
      `Duplicate node name: "${n.name}" — use unique names or qualify with file path`,
    );
  }
  const id = repo.addNode(n);
  ids.set(n.name, id);
}
```
Fixed in 31ebeac — moved the duplicate-name check before repo.addNode() so the repo is never mutated when a duplicate is detected.
Already fixed — the duplicate check now runs before addNode. See lines 105-109 in the current fixtures.js.
Move the duplicate-name guard before repo.addNode() so the repo is not mutated when a duplicate is detected.

Impact: 2 functions changed, 0 affected
```js
if (opts.file) {
  const fileRe = likeToRegex(`%${opts.file}%`);
  nodes = nodes.filter((n) => fileRe.test(n.file));
}
```
opts.file LIKE escaping divergence vs SQLite
findNodesByScope in nodes.js (line 276) uses escapeLike() + ESCAPE '\\' before building the LIKE pattern, so % and _ in opts.file are treated as literal characters. The in-memory version passes opts.file directly into likeToRegex(`%${opts.file}%`), which converts every _ to a regex . (any character) and every % to .*.
Concrete divergence: findNodesByScope('SomeClass', { file: '__tests__' }) in SQLite correctly finds files containing the literal string __tests__, while the in-memory version matches files containing any two characters, then tests, then any two characters — returning false positives. The same applies at line 200 in findNodeByQualifiedName (lines 211–212).
Fix: apply the same escapeLike treatment before building the regex:

```js
// after: import the likeToRegex helper, OR add an escapeLike helper
const escapedFile = opts.file.replace(/[%_\\]/g, '\\$&'); // mirror escapeLike()
const fileRe = likeToRegex(`%${escapedFile}%`);
```

This brings the in-memory filtering in line with the SQL ESCAPE '\\' behaviour.
The same divergence exists in findNodeByQualifiedName at src/db/repository/in-memory-repository.js:211.
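To make the divergence concrete, here is a sketch of LIKE-to-regex conversion with and without escaping. Both escapeLike and likeToRegex are reimplemented here from the behavior described in this thread; the project's actual helpers may differ in detail.

```js
// escapeLike: backslash-escape LIKE metacharacters so they match literally.
function escapeLike(s) {
  return s.replace(/[%_\\]/g, '\\$&');
}

// likeToRegex: translate a SQL LIKE pattern into an anchored, case-insensitive
// regex, honoring backslash escapes the way SQL's ESCAPE '\' does.
function likeToRegex(pattern) {
  const quote = (c) => c.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  let out = '';
  for (let i = 0; i < pattern.length; i++) {
    const c = pattern[i];
    if (c === '\\' && i + 1 < pattern.length) out += quote(pattern[++i]);
    else if (c === '%') out += '.*';
    else if (c === '_') out += '.';
    else out += quote(c);
  }
  return new RegExp(`^${out}$`, 'i');
}

const loose = likeToRegex('%__tests__%');                   // '_' becomes a wildcard
const strict = likeToRegex(`%${escapeLike('__tests__')}%`); // '_' stays literal

console.log(loose.test('src/ab/tests/cd.js'));   // true: the false positive described above
console.log(strict.test('src/ab/tests/cd.js'));  // false
console.log(strict.test('src/__tests__/cd.js')); // true
```

The unescaped pattern matches any two characters around "tests", which is exactly the over-matching the review comment flags.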
Fixed in 6ca47ea — added escapeLike() helper (mirrors nodes.js:254) and applied it to all 6 likeToRegex call sites in in-memory-repository.js. The _ and % chars are now treated as literals, matching SQLite ESCAPE behavior.
Already addressed — all opts.file usages in the in-memory repo now use escapeLike(opts.file) before building the LIKE pattern. See findNodesByScope (line 210), findNodeByQualifiedName (line 221), findNodesForTriage (line 254), and #iterateFunctionNodesImpl (line 547). The parity test for _ as literal in file filter (line 356) validates this.
```js
import Database from 'better-sqlite3';
import { afterEach, beforeEach, describe, expect, it } from 'vitest';
import { initSchema } from '../../src/db/migrations.js';
import { InMemoryRepository } from '../../src/db/repository/in-memory-repository.js';
import { SqliteRepository } from '../../src/db/repository/sqlite-repository.js';

/**
 * Parity tests — run the same assertions against both SqliteRepository and
 * InMemoryRepository to verify behavioral equivalence.
 */
```
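The parity idea described in that doc comment can be sketched without any test framework: seed each implementation with the same fixture graph, then run identical assertions against every one. TinyRepo and both factories below are hypothetical stand-ins, not the project's API.

```js
// TinyRepo stands in for SqliteRepository / InMemoryRepository; in the real
// suite each factory seeds a different backend with identical fixture data.
class TinyRepo {
  constructor(nodes) { this.nodes = nodes; }
  countNodes() { return this.nodes.length; }
  findNodesByFile(file) { return this.nodes.filter((n) => n.file === file); }
}

const fixture = [
  { name: 'foo', file: 'src/a.js' },
  { name: 'bar', file: 'tests/foo.test.js' },
];

const implementations = [
  ['backend-A', () => new TinyRepo([...fixture])],
  ['backend-B', () => new TinyRepo([...fixture])],
];

// Parity: every assertion must hold for every implementation.
const failures = [];
for (const [name, makeRepo] of implementations) {
  const repo = makeRepo();
  if (repo.countNodes() !== 2) failures.push(`${name}: countNodes`);
  if (repo.findNodesByFile('src/a.js').length !== 1) failures.push(`${name}: findNodesByFile`);
}
console.log(failures.length === 0 ? 'parity holds' : failures.join(', '));
```

Any behavioral divergence between the two backends shows up as a failure attributed to exactly one implementation, which is what makes this shape effective at catching issues like the LIKE-escaping gap.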
Missing parity coverage for several methods
Several InMemoryRepository methods that have unit tests in in-memory-repository.test.js are absent from the parity suite, meaning potential SQLite-vs-in-memory divergences in those methods go undetected:
- findNodesByScope / findNodeByQualifiedName — the escaping divergence noted above wouldn't have been caught without parity tests
- findNodesWithFanIn (name-pattern filtering + fan-in computation)
- findFileNodes (LIKE pattern for file-kind nodes)
- findNodeChildren
- findImportTargets / findImportSources / findImportDependents
- getCallableNodes / getFileNodesAll / getImportEdges
- hasCfgTables / hasEmbeddings
The seeded SQLite fixture in seedSqliteRepo already has bar with scope = 'BarClass' and qualified_name = 'BarClass.bar', making it straightforward to add parity tests for the scope/qualified-name lookups without changing the fixture. Adding these tests would have caught the LIKE escaping issue automatically.
Fixed in 6ca47ea — added parity tests for: findNodesByScope, findNodeByQualifiedName (with file filter), findNodesWithFanIn, findFileNodes, getFileNodesAll, findNodeChildren, findImportTargets, findImportSources, findImportDependents, getImportEdges, getCallableNodes, hasCfgTables, hasEmbeddings. Also added a LIKE-escaping test (findNodesByScope treats _ as literal in file filter) that would have caught the divergence. Seed fixtures now include file-kind nodes, import edges, and parent_id relationships.
Added parity tests for findNodesByScope, findNodeByQualifiedName, findFileNodes, findNodeChildren, findNodesWithFanIn, import queries, getCallableNodes, getFileNodesAll, getImportEdges, hasCfgTables, and hasEmbeddings in the latest parity suite (92 tests total). The LIKE escaping test specifically covers the _ wildcard divergence.
findNodesByScope, findNodeByQualifiedName, findNodesWithFanIn, listFunctionNodes, and findNodesForTriage all passed user-provided strings directly into likeToRegex without escaping % and _ chars. This caused divergence from SQLite, which uses escapeLike + ESCAPE '\'. Also adds parity tests for findNodesByScope, findNodeByQualifiedName, findNodesWithFanIn, findFileNodes, findNodeChildren, import queries, getCallableNodes, getFileNodesAll, getImportEdges, and hasCfgTables.

Impact: 7 functions changed, 5 affected
Impact: 9 functions changed, 14 affected
```js
if (opts.noTests) {
  nodes = nodes.filter((n) => !n.file.includes('.test.') && !n.file.includes('.spec.'));
```
noTests filter misses 3 exclusion patterns from SQLite
The in-memory noTests guard only checks for .test. and .spec. in the file path, but the SQLite equivalent (NodeQuery.excludeTests() in src/db/query-builder.js lines 157–163) applies five conditions:
```sql
n.file NOT LIKE '%.test.%'
n.file NOT LIKE '%.spec.%'
n.file NOT LIKE '%__test__%'
n.file NOT LIKE '%__tests__%'
n.file NOT LIKE '%.stories.%'
```
Files inside __tests__/ or __test__/ directories (the Jest/Vitest convention) and Storybook .stories.js files would be included by InMemoryRepository when noTests: true but excluded by SqliteRepository. This same divergence exists in both #iterateFunctionNodesImpl (affecting listFunctionNodes / iterateFunctionNodes) and findNodesForTriage (lines 250–251).
Because the parity fixture only seeds a tests/foo.test.js filename, the parity tests do not exercise any of the three missing patterns.
Suggested change:

```js
if (opts.noTests) {
  nodes = nodes.filter(
    (n) =>
      !n.file.includes('.test.') &&
      !n.file.includes('.spec.') &&
      !n.file.includes('__test__') &&
      !n.file.includes('__tests__') &&
      !n.file.includes('.stories.'),
  );
}
```
Fixed — added the three missing patterns (__test__, __tests__, .stories.) to match the SQLite excludeTests behavior in #iterateFunctionNodesImpl. Also added parity test fixtures with __tests__ and .stories. nodes so this divergence is caught automatically going forward.
```js
if (opts.noTests) {
  nodes = nodes.filter((n) => !n.file.includes('.test.') && !n.file.includes('.spec.'));
```
Same noTests divergence in findNodesForTriage
The same three patterns (__test__, __tests__, .stories.) that are excluded by the SQLite implementation are missing here. See the comment at line 554 for the full analysis. Both occurrences need the same fix.
Suggested change:

```js
if (opts.noTests) {
  nodes = nodes.filter(
    (n) =>
      !n.file.includes('.test.') &&
      !n.file.includes('.spec.') &&
      !n.file.includes('__test__') &&
      !n.file.includes('__tests__') &&
      !n.file.includes('.stories.'),
  );
}
```
Fixed — same three patterns added here in findNodesForTriage.
```js
if (opts.file) {
  const fileRe = likeToRegex(`%${escapeLike(opts.file)}%`);
  nodes = nodes.filter((n) => fileRe.test(n.file));
}
```
findNodesWithFanIn file filter escaping is stricter than SQLite
The SQLite findNodesWithFanIn calls q.fileFilter(opts.file) (query-builder.js line 170), which wraps the value with %…% but does not call escapeLike. This means in SQLite a _ in opts.file is treated as a LIKE wildcard (any single character). The in-memory version calls escapeLike(opts.file) first, making _ a literal.
In practice, this is a minor discrepancy (the in-memory version is actually more correct), but it causes a subtle parity gap for callers that pass file fragments containing underscores (e.g. src_utils, __init__). A parity test covering opts.file with an underscore would expose this difference. Consider either:
- Removing escapeLike here to match SQLite behavior, or
- Adding escapeLike to the SQLite fileFilter method (a wider fix) and documenting the semantic difference.
Acknowledged — the in-memory version is stricter (more correct) because it applies escapeLike. The SQLite fileFilter method in query-builder.js doesn't call escapeLike, which is a pre-existing bug on the SQLite side. We'll fix that separately rather than regressing the in-memory version to match the buggy behavior.
Fixed in #446 — added escapeLike to fileFilter and nameLike in query-builder.js, so _ and % are now treated as literals. Also deduplicated the escapeLike function — in-memory-repository.js now imports the shared helper instead of keeping its own copy. Added a parity test for an underscore in the file filter.
```js
it('listFunctionNodes excludes tests', () => {
  const rows = repo.listFunctionNodes({ noTests: true });
  expect(rows.length).toBe(4); // foo, bar, Baz, bazMethod
  expect(rows.every((r) => !r.file.includes('.test.'))).toBe(true);
});
```
Parity tests don't cover __tests__/.stories. exclusion patterns
The parity test for listFunctionNodes excludes tests seeds only a tests/foo.test.js filename (matched by .test.), so all three missing noTests patterns (__test__, __tests__, .stories.) are never exercised. Consider adding a node whose file contains one of these patterns to the seedSqliteRepo / seedInMemoryRepo fixtures, then asserting rows.every((r) => !r.file.includes('__tests__')). This would have caught the divergence described in the comment on in-memory-repository.js.
Fixed — added jestFn (src/__tests__/helper.js) and storyFn (src/Button.stories.js) to both seed functions. The listFunctionNodes excludes tests parity test now asserts that none of the excluded patterns appear in the results.
The in-memory noTests filter only checked .test. and .spec. patterns, missing __test__, __tests__, and .stories. that the SQLite query-builder excludes. Add the missing patterns to both findNodesForTriage and iterateFunctionNodes. Extend parity test fixtures with __tests__ and .stories. nodes so the divergence is caught automatically.

Impact: 5 functions changed, 0 affected
```js
}
if (opts.pattern) {
  const patternRe = likeToRegex(`%${escapeLike(opts.pattern)}%`);
  nodes = nodes.filter((n) => patternRe.test(n.name));
```
opts.pattern escaping diverges from SQLite's nameLike
The in-memory version applies escapeLike(opts.pattern) before building the regex, so a _ in the pattern is treated as a literal underscore. The SQLite path calls nameLike(opts.pattern) in query-builder.js, which does not escape, so _ is a LIKE wildcard (any single character) in SQL.
Concrete divergence: listFunctionNodes({ pattern: 'get_User' }) will return only get_User in InMemoryRepository, but will also return getAUser, getBUser, etc., in SqliteRepository.
This is the same class of divergence as the acknowledged fileFilter issue in findNodesWithFanIn, but it affects the opts.pattern / nameLike path and has not been discussed in previous threads. The resolution options are the same: either add escapeLike to nameLike in query-builder.js (preferred, fixes SQLite too), or document the intentional asymmetry here.
```js
it('listFunctionNodes filters by pattern', () => {
  const rows = repo.listFunctionNodes({ pattern: 'Baz' });
  expect(rows.length).toBe(2); // Baz + bazMethod (LIKE is case-insensitive)
});
```
Parity test for listFunctionNodes pattern doesn't cover LIKE wildcards
The pattern 'Baz' contains no LIKE special characters, so the escaping divergence between escapeLike(opts.pattern) (in-memory) and the raw nameLike(pattern) call on the SQL side is never exercised here. Adding a node whose name would only be matched if _ is treated as a wildcard would catch the divergence:
```js
it('listFunctionNodes treats _ as literal in pattern', () => {
  // '_az' should NOT match 'Baz' — underscore is a literal, not a wildcard
  const rows = repo.listFunctionNodes({ pattern: '_az' });
  expect(rows.length).toBe(0);
});
```

Note: this test would currently fail on SqliteRepository because SQLite's nameLike doesn't escape, confirming the bug described in the comment on in-memory-repository.js.
Summary
- Repository base class, InMemoryRepository (Maps-backed, zero SQLite dependency), and SqliteRepository (delegates to existing fn(db, ...) functions)
- TestRepoBuilder fixture factory (tests/helpers/fixtures.js) for ergonomic test graph construction

Test plan
- npx vitest run tests/unit/in-memory-repository.test.js — 51 tests pass
- npx vitest run tests/unit/repository-parity.test.js — 54 tests pass
- npm test — no regressions in existing test suite
- npm run lint — clean