diff --git a/CLAUDE.md b/CLAUDE.md index 2d3b7bb1..389695ea 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -45,7 +45,7 @@ JS source is plain JavaScript (ES modules) in `src/`. No transpilation step. The | `queries.js` | Query functions: symbol search, file deps, impact analysis, diff-impact; `SYMBOL_KINDS` constant defines all node kinds | | `embedder.js` | Semantic search with `@huggingface/transformers`; multi-query RRF ranking | | `db.js` | SQLite schema and operations (`better-sqlite3`) | -| `mcp.js` | MCP server exposing graph queries to AI agents | +| `mcp.js` | MCP server exposing graph queries to AI agents; single-repo by default, `--multi-repo` to enable cross-repo access | | `cycles.js` | Circular dependency detection | | `export.js` | DOT/Mermaid/JSON graph export | | `watcher.js` | Watch mode for incremental rebuilds | @@ -66,6 +66,7 @@ JS source is plain JavaScript (ES modules) in `src/`. No transpilation step. The - Non-required parsers (all except JS/TS/TSX) fail gracefully if their WASM grammar is unavailable - Import resolution uses a 6-level priority system with confidence scoring (import-aware → same-file → directory → parent → global → method hierarchy) - Incremental builds track file hashes in the DB to skip unchanged files +- **MCP single-repo isolation:** `startMCPServer` defaults to single-repo mode — tools have no `repo` property and `list_repos` is not exposed. Passing `--multi-repo` or `--repos` to the CLI (or `options.multiRepo` / `options.allowedRepos` programmatically) enables multi-repo access. `buildToolList(multiRepo)` builds the tool list dynamically; the backward-compatible `TOOLS` export equals `buildToolList(true)` - **Credential resolution:** `loadConfig` pipeline is `mergeConfig → applyEnvOverrides → resolveSecrets`. The `apiKeyCommand` config field shells out to an external secret manager via `execFileSync` (no shell). Priority: command output > env var > file config > defaults. 
On failure, warns and falls back gracefully **Database:** SQLite at `.codegraph/graph.db` with tables: `nodes`, `edges`, `metadata`, `embeddings` @@ -94,9 +95,25 @@ Releases are triggered via the `publish.yml` workflow (`workflow_dispatch`). By The workflow can be overridden with a specific version via the `version-override` input. Locally, `npm run release:dry-run` previews the bump and changelog. +## Dogfooding — codegraph on itself + +Codegraph is **our own tool**. Use it to analyze this repository before making changes: + +```bash +node src/cli.js build . # Build/update the graph +node src/cli.js cycles # Check for circular dependencies +node src/cli.js map --limit 20 # Module overview & coupling hotspots +node src/cli.js diff-impact main # See impact of current branch changes +node src/cli.js fn <name> # Trace function-level dependency chains +node src/cli.js deps src/<file>.js # See what imports/depends on a file +``` + +If codegraph reports an error, crashes, or produces wrong results when analyzing itself, **fix the bug in the codebase** — don't just work around it. This is the best way to find and resolve real issues. + ## Git Conventions - Never add AI co-authorship lines (`Co-Authored-By` or similar) to commit messages. +- Never add "Built with Claude Code", "Generated with Claude Code", or any variation referencing Claude Code or Anthropic to commit messages, PR descriptions, code comments, or any other output. 
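These conventions are easy to machine-check. A minimal `commit-msg` hook sketch (hypothetical, not part of this repo; the helper name and regexes are illustrative):

```javascript
// Hypothetical commit-msg check for the conventions above (illustrative only).
const FORBIDDEN = [
  /^co-authored-by:/im, // no AI co-authorship trailer lines
  /(built|generated) with claude code/i, // no tool-attribution lines
];

// Returns true when the commit message violates none of the rules.
function commitMessageAllowed(msg) {
  return FORBIDDEN.every((re) => !re.test(msg));
}

// A real hook would read the message file passed as process.argv[2]
// and call process.exit(1) when commitMessageAllowed(...) is false.
```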
## Node Version diff --git a/README.md b/README.md index d51cd3da..b1f23880 100644 --- a/README.md +++ b/README.md @@ -128,7 +128,7 @@ codegraph deps src/index.ts # file-level import/export map | 📤 | **Export** | DOT (Graphviz), Mermaid, and JSON graph export | | 🧠 | **Semantic search** | Embeddings-powered natural language search with multi-query RRF ranking | | 👀 | **Watch mode** | Incrementally update the graph as files change | -| 🤖 | **MCP server** | 12-tool MCP server with multi-repo support for AI assistants | +| 🤖 | **MCP server** | 13-tool MCP server for AI assistants; single-repo by default, opt-in multi-repo | | 🔒 | **Fully local** | No network calls, no data exfiltration, SQLite-backed | ## 📦 Commands @@ -215,7 +215,7 @@ The model used during `embed` is stored in the database, so `search` auto-detect ### Multi-Repo Registry -Manage a global registry of codegraph-enabled projects. AI agents can query any registered repo from a single MCP session using the `repo` parameter. +Manage a global registry of codegraph-enabled projects. The registry stores paths to your built graphs so the MCP server can query them when multi-repo mode is enabled. ```bash codegraph registry list # List all registered repos @@ -230,9 +230,13 @@ codegraph registry remove <name> # Unregister ### AI Integration ```bash -codegraph mcp # Start MCP server for AI assistants +codegraph mcp # Start MCP server (single-repo, current project only) +codegraph mcp --multi-repo # Enable access to all registered repos +codegraph mcp --repos a,b # Restrict to specific repos (implies --multi-repo) ``` +By default, the MCP server only exposes the local project's graph. AI agents cannot access other repositories unless you explicitly opt in with `--multi-repo` or `--repos`. 
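The same opt-in is available programmatically via `startMCPServer` options. A sketch of the flag-to-options mapping (mirroring the CLI wiring in `src/cli.js`; the helper name is illustrative):

```javascript
// Sketch: how the mcp flags above translate to startMCPServer options.
// Passing --repos implies multi-repo mode.
function mcpOptionsFromFlags(flags) {
  const opts = { multiRepo: Boolean(flags.multiRepo || flags.repos) };
  if (flags.repos) {
    // --repos a,b → allowlist of registered repo names
    opts.allowedRepos = flags.repos.split(',').map((s) => s.trim());
  }
  return opts;
}

// e.g. `codegraph mcp --repos a,b`:
// mcpOptionsFromFlags({ repos: 'a,b' })
//   → { multiRepo: true, allowedRepos: ['a', 'b'] }
```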
+ ### Common Flags | Flag | Description | @@ -324,13 +328,17 @@ Benchmarked on a ~3,200-file TypeScript project: ### MCP Server -Codegraph includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) server with 12 tools, so AI assistants can query your dependency graph directly: +Codegraph includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) server with 13 tools, so AI assistants can query your dependency graph directly: ```bash -codegraph mcp +codegraph mcp # Single-repo mode (default) — only local project +codegraph mcp --multi-repo # Multi-repo — all registered repos accessible +codegraph mcp --repos a,b # Multi-repo with allowlist ``` -All MCP tools accept an optional `repo` parameter to target any registered repository. Use `list_repos` to see available repos. When `repo` is omitted, the local `.codegraph/graph.db` is used (backwards compatible). +**Single-repo mode (default):** Tools operate only on the local `.codegraph/graph.db`. The `repo` parameter and `list_repos` tool are not exposed to the AI agent. + +**Multi-repo mode (`--multi-repo`):** All tools gain an optional `repo` parameter to target any registered repository, and `list_repos` becomes available. Use `--repos` to restrict which repos the agent can access. ### CLAUDE.md / Agent Instructions diff --git a/docs/dogfooding-guide.md b/docs/dogfooding-guide.md new file mode 100644 index 00000000..14c71e19 --- /dev/null +++ b/docs/dogfooding-guide.md @@ -0,0 +1,102 @@ +# Codegraph Dogfooding Guide + +Codegraph analyzing its own codebase. This guide documents findings from self-analysis and lists improvements — both automated fixes already applied and items requiring human judgment. + +## Running the Self-Analysis + +```bash +# Build the graph (from repo root) +node src/cli.js build . 
+ +# Core analysis commands +node src/cli.js cycles # Circular dependency check +node src/cli.js cycles --functions # Function-level cycles +node src/cli.js map --limit 20 --json # Module coupling overview +node src/cli.js diff-impact main --json # Impact of current branch +node src/cli.js deps src/<file>.js # File dependency inspection +node src/cli.js fn <name> # Function call chain trace +node src/cli.js fn-impact <name> # What breaks if function changes +``` + +## Action Items + +These findings require human judgment to address properly: + +### HIGH PRIORITY + +#### 1. parser.js is a 2200+ line monolith (47 function definitions) +**Found by:** `codegraph deps src/parser.js` and `codegraph map` + +`parser.js` has the highest fan-in (14 files import it) and contains extractors for **all 11 languages** in a single file. Each language extractor (Python, Go, Rust, Java, C#, PHP, Ruby, HCL) has its own `walk()` function, creating duplicate names that confuse function-level analysis. + +**Recommendation:** Split per-language extractors into separate files under `src/extractors/`: +``` +src/extractors/ + javascript.js # JS/TS/TSX extractor (currently inline) + python.js # extractPythonSymbols + findPythonParentClass + walk + go.js # extractGoSymbols + walk + rust.js # extractRustSymbols + extractRustUsePath + walk + java.js # extractJavaSymbols + findJavaParentClass + walk + csharp.js # extractCSharpSymbols + extractCSharpBaseTypes + walk + ruby.js # extractRubySymbols + findRubyParentClass + walk + php.js # extractPHPSymbols + findPHPParentClass + walk + hcl.js # extractHCLSymbols + walk +``` +**Impact:** Would improve codegraph's own function-level analysis (no more ambiguous `walk` matches), make each extractor independently testable, and reduce the cognitive load of the file. + +**Trade-off:** The Rust native engine already has this structure (`crates/codegraph-core/src/extractors/`). Aligning the WASM extractors would create parity. + + ### MEDIUM PRIORITY + +#### 3. 
builder.js has the highest fan-out (7 dependencies) +**Found by:** `codegraph map` + +`builder.js` imports from 7 modules: config, constants, db, logger, parser, resolve, and structure. As the build orchestrator this is somewhat expected, but it also means any change to builder.js has wide blast radius. + +**Recommendation:** Consider whether the `structure.js` integration (already lazy-loaded via dynamic import) pattern could apply to other optional post-build steps. + +#### 4. watcher.js fan-out vs fan-in imbalance (5 out, 2 in) +**Found by:** `codegraph map` + +The watcher depends on 5 modules but only 2 modules reference it. This suggests it might be pulling in more than it needs. + +**Recommendation:** Review whether watcher.js can use more targeted imports or lazy-load some dependencies. + +#### 5. diff-impact runs git in temp directories (test fragility) +**Found by:** Integration test output showing `git diff --no-index` errors in temp directories + +The `diff-impact` command runs `git diff` which fails in non-git temp directories used by tests. The error output is noisy but doesn't fail the test. + +**Recommendation:** Guard the git call or skip gracefully when not in a git repo. + +### LOW PRIORITY + +#### 6. Consider adding a `codegraph stats` command +There's no single command that shows a quick overview of graph health: node/edge counts, cycle count, top coupling hotspots, fan-out outliers. Currently you need to run `map`, `cycles`, and read the build output separately. + +#### 7. Embed and search the codebase itself +Running `codegraph embed .` and then `codegraph search "build dependency graph"` on the codegraph repo would exercise the embedding pipeline and could surface naming/discoverability issues in the API. + +## Known Environment Issue + +On this workstation, changes to files not already tracked as modified on the current git branch (`docs/architecture-audit`) get reverted by an external process (likely a VS Code extension). 
If you're applying the parser.js cycle fix, do it from a fresh branch or commit immediately. + +## Periodic Self-Check Routine + +Run this after significant changes: + +```bash +# 1. Rebuild the graph +node src/cli.js build . + +# 2. Check for regressions +node src/cli.js cycles # Should be 0 file-level cycles +node src/cli.js map --limit 10 # Verify no new coupling hotspots + +# 3. Check impact of your changes +node src/cli.js diff-impact main + +# 4. Run tests +npm test +``` diff --git a/docs/recommended-practices.md b/docs/recommended-practices.md index 7825d5cd..1b94f7b9 100644 --- a/docs/recommended-practices.md +++ b/docs/recommended-practices.md @@ -132,10 +132,16 @@ Speed up CI by caching `.codegraph/`: Start the MCP server so AI assistants can query your graph: ```bash -codegraph mcp +codegraph mcp # Single-repo mode (default) — only local project +codegraph mcp --multi-repo # Multi-repo — all registered repos accessible +codegraph mcp --repos a,b # Multi-repo with allowlist ``` -The server exposes tools for `query_function`, `file_deps`, `impact_analysis`, `find_cycles`, and `module_map`. +By default, the MCP server runs in **single-repo mode** — the AI agent can only query the current project's graph. The `repo` parameter and `list_repos` tool are not exposed, preventing agents from silently accessing other codebases. + +Enable `--multi-repo` to let the agent query any registered repository, or use `--repos` to restrict access to a specific set of repos. + +The server exposes tools for `query_function`, `file_deps`, `impact_analysis`, `find_cycles`, `module_map`, `fn_deps`, `fn_impact`, `diff_impact`, `semantic_search`, `export_graph`, `list_functions`, `structure`, and `hotspots`. 
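The single/multi-repo gating can be pictured with a stripped-down version of the tool-list builder (a minimal sketch; the real `buildToolList` lives in `src/mcp.js` and covers all thirteen tools):

```javascript
// Sketch of the tool-list gating: in single-repo mode tools carry no `repo`
// property and `list_repos` is absent; multi-repo mode injects both.
const BASE_TOOLS = [
  { name: 'file_deps', inputSchema: { type: 'object', properties: { file: { type: 'string' } } } },
];
const REPO_PROP = { repo: { type: 'string', description: 'Registered repo name to query' } };
const LIST_REPOS_TOOL = { name: 'list_repos', inputSchema: { type: 'object', properties: {} } };

function buildToolList(multiRepo) {
  if (!multiRepo) return BASE_TOOLS;
  return [
    ...BASE_TOOLS.map((tool) => ({
      ...tool,
      inputSchema: {
        ...tool.inputSchema,
        properties: { ...tool.inputSchema.properties, ...REPO_PROP },
      },
    })),
    LIST_REPOS_TOOL,
  ];
}
```

Because the base tool objects are never mutated, the server can rebuild the list per `tools/list` request without leaking the `repo` property into single-repo sessions.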
### CLAUDE.md for your project diff --git a/src/builder.js b/src/builder.js index 35317b97..31e0eeea 100644 --- a/src/builder.js +++ b/src/builder.js @@ -10,18 +10,20 @@ import { computeConfidence, resolveImportPath, resolveImportsBatch } from './res export { resolveImportPath } from './resolve.js'; -export function collectFiles(dir, files = [], config = {}) { +export function collectFiles(dir, files = [], config = {}, directories = null) { + const trackDirs = directories !== null; let entries; try { entries = fs.readdirSync(dir, { withFileTypes: true }); } catch (err) { warn(`Cannot read directory ${dir}: ${err.message}`); - return files; + return trackDirs ? { files, directories } : files; } // Merge config ignoreDirs with defaults const extraIgnore = config.ignoreDirs ? new Set(config.ignoreDirs) : null; + let hasFiles = false; for (const entry of entries) { if (entry.name.startsWith('.') && entry.name !== '.') { if (IGNORE_DIRS.has(entry.name)) continue; @@ -32,12 +34,16 @@ export function collectFiles(dir, files = [], config = {}) { const full = path.join(dir, entry.name); if (entry.isDirectory()) { - collectFiles(full, files, config); + collectFiles(full, files, config, directories); } else if (EXTENSIONS.has(path.extname(entry.name))) { files.push(full); + hasFiles = true; } } - return files; + if (trackDirs && hasFiles) { + directories.add(dir); + } + return trackDirs ? 
{ files, directories } : files; } export function loadPathAliases(rootDir) { @@ -163,7 +169,9 @@ export async function buildGraph(rootDir, opts = {}) { ); } - const files = collectFiles(rootDir, [], config); + const collected = collectFiles(rootDir, [], config, new Set()); + const files = collected.files; + const discoveredDirs = collected.directories; console.log(`Found ${files.length} files to parse`); // Check for incremental build @@ -179,23 +187,28 @@ export async function buildGraph(rootDir, opts = {}) { if (isFullBuild) { db.exec( - 'PRAGMA foreign_keys = OFF; DELETE FROM edges; DELETE FROM nodes; PRAGMA foreign_keys = ON;', + 'PRAGMA foreign_keys = OFF; DELETE FROM node_metrics; DELETE FROM edges; DELETE FROM nodes; PRAGMA foreign_keys = ON;', ); } else { console.log(`Incremental: ${changed.length} changed, ${removed.length} removed`); - // Remove nodes/edges for changed and removed files + // Remove metrics/edges/nodes for changed and removed files const deleteNodesForFile = db.prepare('DELETE FROM nodes WHERE file = ?'); const deleteEdgesForFile = db.prepare(` DELETE FROM edges WHERE source_id IN (SELECT id FROM nodes WHERE file = @f) OR target_id IN (SELECT id FROM nodes WHERE file = @f) `); + const deleteMetricsForFile = db.prepare( + 'DELETE FROM node_metrics WHERE node_id IN (SELECT id FROM nodes WHERE file = ?)', + ); for (const relPath of removed) { deleteEdgesForFile.run({ f: relPath }); + deleteMetricsForFile.run(relPath); deleteNodesForFile.run(relPath); } for (const item of changed) { const relPath = item.relPath || normalizePath(path.relative(rootDir, item.file)); deleteEdgesForFile.run({ f: relPath }); + deleteMetricsForFile.run(relPath); deleteNodesForFile.run(relPath); } } @@ -539,6 +552,30 @@ export async function buildGraph(rootDir, opts = {}) { }); buildEdges(); + // Build line count map for structure metrics + const lineCountMap = new Map(); + for (const [relPath] of fileSymbols) { + const absPath = path.join(rootDir, relPath); + try { + 
const content = fs.readFileSync(absPath, 'utf-8'); + lineCountMap.set(relPath, content.split('\n').length); + } catch { + lineCountMap.set(relPath, 0); + } + } + + // Build directory structure, containment edges, and metrics + const relDirs = new Set(); + for (const absDir of discoveredDirs) { + relDirs.add(normalizePath(path.relative(rootDir, absDir))); + } + try { + const { buildStructure } = await import('./structure.js'); + buildStructure(db, fileSymbols, rootDir, lineCountMap, relDirs); + } catch (err) { + debug(`Structure analysis failed: ${err.message}`); + } + const nodeCount = db.prepare('SELECT COUNT(*) as c FROM nodes').get().c; console.log(`Graph built: ${nodeCount} nodes, ${edgeCount} edges`); console.log(`Stored in ${dbPath}`); diff --git a/src/cli.js b/src/cli.js index ff9373d7..e1868d0f 100644 --- a/src/cli.js +++ b/src/cli.js @@ -19,7 +19,13 @@ import { moduleMap, queryName, } from './queries.js'; -import { listRepos, REGISTRY_PATH, registerRepo, unregisterRepo } from './registry.js'; +import { + listRepos, + pruneRegistry, + REGISTRY_PATH, + registerRepo, + unregisterRepo, +} from './registry.js'; import { watchProject } from './watcher.js'; const program = new Command(); @@ -187,9 +193,16 @@ program .command('mcp') .description('Start MCP (Model Context Protocol) server for AI assistant integration') .option('-d, --db <path>', 'Path to graph.db') + .option('--multi-repo', 'Enable access to all registered repositories') + .option('--repos <names>', 'Comma-separated list of allowed repo names (restricts access)') .action(async (opts) => { const { startMCPServer } = await import('./mcp.js'); - await startMCPServer(opts.db); + const mcpOpts = {}; + mcpOpts.multiRepo = opts.multiRepo || !!opts.repos; + if (opts.repos) { + mcpOpts.allowedRepos = opts.repos.split(',').map((s) => s.trim()); + } + await startMCPServer(opts.db, mcpOpts); }); // ─── Registry commands ────────────────────────────────────────────────── @@ -242,6 +255,21 @@ registry } }); +registry + 
.command('prune') + .description('Remove registry entries whose directories no longer exist') + .action(() => { + const pruned = pruneRegistry(); + if (pruned.length === 0) { + console.log('No stale entries found.'); + } else { + for (const entry of pruned) { + console.log(`Pruned "${entry.name}" (${entry.path})`); + } + console.log(`\nRemoved ${pruned.length} stale ${pruned.length === 1 ? 'entry' : 'entries'}.`); + } + }); + // ─── Embedding commands ───────────────────────────────────────────────── program @@ -295,6 +323,53 @@ program }); }); +program + .command('structure [dir]') + .description( + 'Show project directory structure with hierarchy, cohesion scores, and per-file metrics', + ) + .option('-d, --db <path>', 'Path to graph.db') + .option('--depth <n>', 'Max directory depth') + .option('--sort <key>', 'Sort by: cohesion | fan-in | fan-out | density | files', 'files') + .option('-j, --json', 'Output as JSON') + .action(async (dir, opts) => { + const { structureData, formatStructure } = await import('./structure.js'); + const data = structureData(opts.db, { + directory: dir, + depth: opts.depth ? 
parseInt(opts.depth, 10) : undefined, + sort: opts.sort, + }); + if (opts.json) { + console.log(JSON.stringify(data, null, 2)); + } else { + console.log(formatStructure(data)); + } + }); + +program + .command('hotspots') + .description( + 'Find structural hotspots: files or directories with extreme fan-in, fan-out, or symbol density', + ) + .option('-d, --db <path>', 'Path to graph.db') + .option('-n, --limit <n>', 'Number of results', '10') + .option('--metric <name>', 'fan-in | fan-out | density | coupling', 'fan-in') + .option('--level <name>', 'file | directory', 'file') + .option('-j, --json', 'Output as JSON') + .action(async (opts) => { + const { hotspotsData, formatHotspots } = await import('./structure.js'); + const data = hotspotsData(opts.db, { + metric: opts.metric, + level: opts.level, + limit: parseInt(opts.limit, 10), + }); + if (opts.json) { + console.log(JSON.stringify(data, null, 2)); + } else { + console.log(formatHotspots(data)); + } + }); + program .command('watch [dir]') .description('Watch project for file changes and incrementally update the graph') diff --git a/src/constants.js b/src/constants.js index 6bd4bfdf..2bcbb1af 100644 --- a/src/constants.js +++ b/src/constants.js @@ -20,8 +20,6 @@ export const IGNORE_DIRS = new Set([ '.env', ]); -// Re-export as an indirect binding to avoid TDZ in the circular -// parser.js ↔ constants.js import (no value read at evaluation time). 
export { SUPPORTED_EXTENSIONS as EXTENSIONS }; export function shouldIgnore(dirName) { diff --git a/src/db.js b/src/db.js index ab0b9eaa..17a7d1de 100644 --- a/src/db.js +++ b/src/db.js @@ -33,6 +33,19 @@ export const MIGRATIONS = [ CREATE INDEX IF NOT EXISTS idx_edges_source ON edges(source_id); CREATE INDEX IF NOT EXISTS idx_edges_target ON edges(target_id); CREATE INDEX IF NOT EXISTS idx_edges_kind ON edges(kind); + CREATE TABLE IF NOT EXISTS node_metrics ( + node_id INTEGER PRIMARY KEY, + line_count INTEGER, + symbol_count INTEGER, + import_count INTEGER, + export_count INTEGER, + fan_in INTEGER, + fan_out INTEGER, + cohesion REAL, + file_count INTEGER, + FOREIGN KEY(node_id) REFERENCES nodes(id) + ); + CREATE INDEX IF NOT EXISTS idx_node_metrics_node ON node_metrics(node_id); `, }, { diff --git a/src/export.js b/src/export.js index 7cee9746..7595e8a9 100644 --- a/src/export.js +++ b/src/export.js @@ -24,25 +24,60 @@ export function exportDOT(db, opts = {}) { `) .all(); + // Try to use directory nodes from DB (built by structure analysis) + const hasDirectoryNodes = + db.prepare("SELECT COUNT(*) as c FROM nodes WHERE kind = 'directory'").get().c > 0; + const dirs = new Map(); const allFiles = new Set(); for (const { source, target } of edges) { allFiles.add(source); allFiles.add(target); } - for (const file of allFiles) { - const dir = path.dirname(file) || '.'; - if (!dirs.has(dir)) dirs.set(dir, []); - dirs.get(dir).push(file); + + if (hasDirectoryNodes) { + // Use DB directory structure with cohesion labels + const dbDirs = db + .prepare(` + SELECT n.id, n.name, nm.cohesion + FROM nodes n + LEFT JOIN node_metrics nm ON n.id = nm.node_id + WHERE n.kind = 'directory' + `) + .all(); + + for (const d of dbDirs) { + const containedFiles = db + .prepare(` + SELECT n.name FROM edges e + JOIN nodes n ON e.target_id = n.id + WHERE e.source_id = ? 
AND e.kind = 'contains' AND n.kind = 'file' + `) + .all(d.id) + .map((r) => r.name) + .filter((f) => allFiles.has(f)); + + if (containedFiles.length > 0) { + dirs.set(d.name, { files: containedFiles, cohesion: d.cohesion }); + } + } + } else { + // Fallback: reconstruct from path.dirname() + for (const file of allFiles) { + const dir = path.dirname(file) || '.'; + if (!dirs.has(dir)) dirs.set(dir, { files: [], cohesion: null }); + dirs.get(dir).files.push(file); + } } let clusterIdx = 0; - for (const [dir, files] of [...dirs].sort()) { + for (const [dir, info] of [...dirs].sort((a, b) => a[0].localeCompare(b[0]))) { lines.push(` subgraph cluster_${clusterIdx++} {`); - lines.push(` label="${dir}";`); + const cohLabel = info.cohesion !== null ? ` (cohesion: ${info.cohesion.toFixed(2)})` : ''; + lines.push(` label="${dir}${cohLabel}";`); lines.push(` style=dashed;`); lines.push(` color="#999999";`); - for (const f of files) { + for (const f of info.files) { const label = path.basename(f); lines.push(` "${f}" [label="${label}"];`); } diff --git a/src/index.js b/src/index.js index f1df2118..7ce90860 100644 --- a/src/index.js +++ b/src/index.js @@ -50,11 +50,22 @@ export { export { listRepos, loadRegistry, + pruneRegistry, REGISTRY_PATH, registerRepo, resolveRepoDbPath, saveRegistry, unregisterRepo, } from './registry.js'; +// Structure analysis +export { + buildStructure, + formatHotspots, + formatModuleBoundaries, + formatStructure, + hotspotsData, + moduleBoundariesData, + structureData, +} from './structure.js'; // Watch mode export { watchProject } from './watcher.js'; diff --git a/src/mcp.js b/src/mcp.js index 7b5b61c5..d8f5be82 100644 --- a/src/mcp.js +++ b/src/mcp.js @@ -16,7 +16,7 @@ const REPO_PROP = { }, }; -const TOOLS = [ +const BASE_TOOLS = [ { name: 'query_function', description: 'Find callers and callees of a function by name', @@ -29,7 +29,6 @@ const TOOLS = [ description: 'Traversal depth for transitive callers', default: 2, }, - ...REPO_PROP, }, 
required: ['name'], }, @@ -41,7 +40,6 @@ const TOOLS = [ type: 'object', properties: { file: { type: 'string', description: 'File path (partial match supported)' }, - ...REPO_PROP, }, required: ['file'], }, @@ -53,7 +51,6 @@ const TOOLS = [ type: 'object', properties: { file: { type: 'string', description: 'File path to analyze' }, - ...REPO_PROP, }, required: ['file'], }, @@ -63,9 +60,7 @@ const TOOLS = [ description: 'Detect circular dependencies in the codebase', inputSchema: { type: 'object', - properties: { - ...REPO_PROP, - }, + properties: {}, }, }, { @@ -75,7 +70,6 @@ const TOOLS = [ type: 'object', properties: { limit: { type: 'number', description: 'Number of top files to show', default: 20 }, - ...REPO_PROP, }, }, }, @@ -88,7 +82,6 @@ const TOOLS = [ name: { type: 'string', description: 'Function/method/class name (partial match)' }, depth: { type: 'number', description: 'Transitive caller depth', default: 3 }, no_tests: { type: 'boolean', description: 'Exclude test files', default: false }, - ...REPO_PROP, }, required: ['name'], }, @@ -103,7 +96,6 @@ const TOOLS = [ name: { type: 'string', description: 'Function/method/class name (partial match)' }, depth: { type: 'number', description: 'Max traversal depth', default: 5 }, no_tests: { type: 'boolean', description: 'Exclude test files', default: false }, - ...REPO_PROP, }, required: ['name'], }, @@ -118,7 +110,6 @@ const TOOLS = [ ref: { type: 'string', description: 'Git ref to diff against (default: HEAD)' }, depth: { type: 'number', description: 'Transitive caller depth', default: 3 }, no_tests: { type: 'boolean', description: 'Exclude test files', default: false }, - ...REPO_PROP, }, }, }, @@ -132,7 +123,6 @@ const TOOLS = [ query: { type: 'string', description: 'Natural language search query' }, limit: { type: 'number', description: 'Max results to return', default: 15 }, min_score: { type: 'number', description: 'Minimum similarity score (0-1)', default: 0.2 }, - ...REPO_PROP, }, required: 
['query'], }, @@ -153,7 +143,6 @@ const TOOLS = [ description: 'File-level graph (true) or function-level (false)', default: true, }, - ...REPO_PROP, }, required: ['format'], }, @@ -168,27 +157,94 @@ const TOOLS = [ file: { type: 'string', description: 'Filter by file path (partial match)' }, pattern: { type: 'string', description: 'Filter by function name (partial match)' }, no_tests: { type: 'boolean', description: 'Exclude test files', default: false }, - ...REPO_PROP, }, }, }, { - name: 'list_repos', - description: 'List all repositories registered in the codegraph registry', + name: 'structure', + description: + 'Show project structure with directory hierarchy, cohesion scores, and per-file metrics', inputSchema: { type: 'object', - properties: {}, + properties: { + directory: { type: 'string', description: 'Filter to a specific directory path' }, + depth: { type: 'number', description: 'Max directory depth to show' }, + sort: { + type: 'string', + enum: ['cohesion', 'fan-in', 'fan-out', 'density', 'files'], + description: 'Sort directories by metric', + }, + }, + }, + }, + { + name: 'hotspots', + description: + 'Find structural hotspots: files or directories with extreme fan-in, fan-out, or symbol density', + inputSchema: { + type: 'object', + properties: { + metric: { + type: 'string', + enum: ['fan-in', 'fan-out', 'density', 'coupling'], + description: 'Metric to rank by', + }, + level: { + type: 'string', + enum: ['file', 'directory'], + description: 'Rank files or directories', + }, + limit: { type: 'number', description: 'Number of results to return', default: 10 }, + }, }, }, ]; -export { TOOLS }; +const LIST_REPOS_TOOL = { + name: 'list_repos', + description: 'List all repositories registered in the codegraph registry', + inputSchema: { + type: 'object', + properties: {}, + }, +}; + +/** + * Build the tool list based on multi-repo mode. 
+ * @param {boolean} multiRepo - If true, inject `repo` prop into each tool and append `list_repos` + * @returns {object[]} + */ +function buildToolList(multiRepo) { + if (!multiRepo) return BASE_TOOLS; + return [ + ...BASE_TOOLS.map((tool) => ({ + ...tool, + inputSchema: { + ...tool.inputSchema, + properties: { ...tool.inputSchema.properties, ...REPO_PROP }, + }, + })), + LIST_REPOS_TOOL, + ]; +} + +// Backward-compatible export: full multi-repo tool list +const TOOLS = buildToolList(true); + +export { TOOLS, buildToolList }; /** * Start the MCP server. * This function requires @modelcontextprotocol/sdk to be installed. + * + * @param {string} [customDbPath] - Path to a specific graph.db + * @param {object} [options] + * @param {boolean} [options.multiRepo] - Enable multi-repo access (default: false) + * @param {string[]} [options.allowedRepos] - Restrict access to these repo names only */ -export async function startMCPServer(customDbPath) { +export async function startMCPServer(customDbPath, options = {}) { + const { allowedRepos } = options; + const multiRepo = options.multiRepo || !!allowedRepos; let Server, StdioServerTransport; try { const sdk = await import('@modelcontextprotocol/sdk/server/index.js'); @@ -223,14 +279,28 @@ export async function startMCPServer(customDbPath) { { capabilities: { tools: {} } }, ); - server.setRequestHandler('tools/list', async () => ({ tools: TOOLS })); + server.setRequestHandler('tools/list', async () => ({ tools: buildToolList(multiRepo) })); server.setRequestHandler('tools/call', async (request) => { const { name, arguments: args } = request.params; try { + if (!multiRepo && args.repo) { + throw new Error( + 'Multi-repo access is disabled. Restart with `codegraph mcp --multi-repo` to access other repositories.', + ); + } + if (!multiRepo && name === 'list_repos') { + throw new Error( + 'Multi-repo access is disabled. 
Restart with `codegraph mcp --multi-repo` to list repositories.', + ); + } + let dbPath = customDbPath || undefined; if (args.repo) { + if (allowedRepos && !allowedRepos.includes(args.repo)) { + throw new Error(`Repository "${args.repo}" is not in the allowed repos list.`); + } const { resolveRepoDbPath } = await import('./registry.js'); const resolved = resolveRepoDbPath(args.repo); if (!resolved) @@ -333,9 +403,31 @@ export async function startMCPServer(customDbPath) { noTests: args.no_tests, }); break; + case 'structure': { + const { structureData } = await import('./structure.js'); + result = structureData(dbPath, { + directory: args.directory, + depth: args.depth, + sort: args.sort, + }); + break; + } + case 'hotspots': { + const { hotspotsData } = await import('./structure.js'); + result = hotspotsData(dbPath, { + metric: args.metric, + level: args.level, + limit: args.limit, + }); + break; + } case 'list_repos': { const { listRepos } = await import('./registry.js'); - result = { repos: listRepos() }; + let repos = listRepos(); + if (allowedRepos) { + repos = repos.filter((r) => allowedRepos.includes(r.name)); + } + result = { repos }; break; } default: diff --git a/src/parser.js b/src/parser.js index 52ca1ce1..d372d4b0 100644 --- a/src/parser.js +++ b/src/parser.js @@ -2,7 +2,6 @@ import fs from 'node:fs'; import path from 'node:path'; import { fileURLToPath } from 'node:url'; import { Language, Parser } from 'web-tree-sitter'; -import { normalizePath } from './constants.js'; import { warn } from './logger.js'; import { loadNative } from './native.js'; @@ -2145,7 +2144,7 @@ export async function parseFilesAuto(filePaths, rootDir, opts = {}) { const nativeResults = native.parseFiles(filePaths, rootDir); for (const r of nativeResults) { if (!r) continue; - const relPath = normalizePath(path.relative(rootDir, r.file)); + const relPath = path.relative(rootDir, r.file).split(path.sep).join('/'); result.set(relPath, normalizeNativeSymbols(r)); } return result; @@ 
-2163,7 +2162,7 @@ export async function parseFilesAuto(filePaths, rootDir, opts = {}) { } const symbols = wasmExtractSymbols(parsers, filePath, code); if (symbols) { - const relPath = normalizePath(path.relative(rootDir, filePath)); + const relPath = path.relative(rootDir, filePath).split(path.sep).join('/'); result.set(relPath, symbols); } } diff --git a/src/registry.js b/src/registry.js index a0b1f1ee..96bab195 100644 --- a/src/registry.js +++ b/src/registry.js @@ -36,12 +36,39 @@ export function saveRegistry(registry, registryPath = REGISTRY_PATH) { /** * Register a project directory. Idempotent. * Name defaults to `path.basename(rootDir)`. + * + * When no explicit name is provided and the basename already exists + * pointing to a different path, auto-suffixes (`api` → `api-2`, `api-3`, …). + * Re-registering the same path updates in place. Explicit names always overwrite. */ export function registerRepo(rootDir, name, registryPath = REGISTRY_PATH) { const absRoot = path.resolve(rootDir); - const repoName = name || path.basename(absRoot); + const baseName = name || path.basename(absRoot); const registry = loadRegistry(registryPath); + let repoName = baseName; + + // Auto-suffix only when no explicit name was provided + if (!name) { + const existing = registry.repos[baseName]; + if (existing && path.resolve(existing.path) !== absRoot) { + // Basename collision with a different path — find next available suffix + let suffix = 2; + while (registry.repos[`${baseName}-${suffix}`]) { + const entry = registry.repos[`${baseName}-${suffix}`]; + if (path.resolve(entry.path) === absRoot) { + // Already registered under this suffixed name — update in place + repoName = `${baseName}-${suffix}`; + break; + } + suffix++; + } + if (repoName === baseName) { + repoName = `${baseName}-${suffix}`; + } + } + } + registry.repos[repoName] = { path: absRoot, dbPath: path.join(absRoot, '.codegraph', 'graph.db'), @@ -93,3 +120,26 @@ export function resolveRepoDbPath(name, registryPath =
REGISTRY_PATH) { } return entry.dbPath; } + +/** + * Remove registry entries whose repo directory no longer exists on disk. + * Only checks the repo directory (not the DB file — a missing DB is normal pre-build state). + * Returns an array of `{ name, path }` for each pruned entry. + */ +export function pruneRegistry(registryPath = REGISTRY_PATH) { + const registry = loadRegistry(registryPath); + const pruned = []; + + for (const [name, entry] of Object.entries(registry.repos)) { + if (!fs.existsSync(entry.path)) { + pruned.push({ name, path: entry.path }); + delete registry.repos[name]; + } + } + + if (pruned.length > 0) { + saveRegistry(registry, registryPath); + } + + return pruned; +} diff --git a/src/structure.js b/src/structure.js new file mode 100644 index 00000000..e5e504a3 --- /dev/null +++ b/src/structure.js @@ -0,0 +1,491 @@ +import path from 'node:path'; +import { normalizePath } from './constants.js'; +import { openReadonlyOrFail } from './db.js'; +import { debug } from './logger.js'; + +// ─── Build-time: insert directory nodes, contains edges, and metrics ──── + +/** + * Build directory structure nodes, containment edges, and compute metrics. + * Called from builder.js after edge building. + * + * @param {import('better-sqlite3').Database} db - Open read-write database + * @param {Map} fileSymbols - Map of relPath → { definitions, imports, exports, calls } + * @param {string} _rootDir - Absolute root directory (reserved; currently unused) + * @param {Map} lineCountMap - Map of relPath → line count + * @param {Set} directories - Set of relative directory paths + */ +export function buildStructure(db, fileSymbols, _rootDir, lineCountMap, directories) { + const insertNode = db.prepare( + 'INSERT OR IGNORE INTO nodes (name, kind, file, line, end_line) VALUES (?, ?, ?, ?, ?)', + ); + const getNodeId = db.prepare( + 'SELECT id FROM nodes WHERE name = ? AND kind = ? AND file = ?
AND line = ?', + ); + const insertEdge = db.prepare( + 'INSERT INTO edges (source_id, target_id, kind, confidence, dynamic) VALUES (?, ?, ?, ?, ?)', + ); + const upsertMetric = db.prepare(` + INSERT OR REPLACE INTO node_metrics + (node_id, line_count, symbol_count, import_count, export_count, fan_in, fan_out, cohesion, file_count) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?) + `); + + // Clean previous directory nodes/edges (idempotent rebuild) + db.exec(` + DELETE FROM edges WHERE kind = 'contains'; + DELETE FROM node_metrics; + DELETE FROM nodes WHERE kind = 'directory'; + `); + + // Step 1: Ensure all directories are represented (including intermediate parents) + const allDirs = new Set(); + for (const dir of directories) { + let d = dir; + while (d && d !== '.') { + allDirs.add(d); + d = normalizePath(path.dirname(d)); + } + } + // Also add dirs derived from file paths + for (const relPath of fileSymbols.keys()) { + let d = normalizePath(path.dirname(relPath)); + while (d && d !== '.') { + allDirs.add(d); + d = normalizePath(path.dirname(d)); + } + } + + // Step 2: Insert directory nodes + const insertDirs = db.transaction(() => { + for (const dir of allDirs) { + insertNode.run(dir, 'directory', dir, 0, null); + } + }); + insertDirs(); + + // Step 3: Insert 'contains' edges (dir → file, dir → subdirectory) + const insertContains = db.transaction(() => { + // dir → file + for (const relPath of fileSymbols.keys()) { + const dir = normalizePath(path.dirname(relPath)); + if (!dir || dir === '.') continue; + const dirRow = getNodeId.get(dir, 'directory', dir, 0); + const fileRow = getNodeId.get(relPath, 'file', relPath, 0); + if (dirRow && fileRow) { + insertEdge.run(dirRow.id, fileRow.id, 'contains', 1.0, 0); + } + } + // dir → subdirectory + for (const dir of allDirs) { + const parent = normalizePath(path.dirname(dir)); + if (!parent || parent === '.' 
|| parent === dir) continue; + const parentRow = getNodeId.get(parent, 'directory', parent, 0); + const childRow = getNodeId.get(dir, 'directory', dir, 0); + if (parentRow && childRow) { + insertEdge.run(parentRow.id, childRow.id, 'contains', 1.0, 0); + } + } + }); + insertContains(); + + // Step 4: Compute per-file metrics + // Pre-compute fan-in/fan-out per file from import edges + const fanInMap = new Map(); + const fanOutMap = new Map(); + const importEdges = db + .prepare(` + SELECT n1.file AS source_file, n2.file AS target_file + FROM edges e + JOIN nodes n1 ON e.source_id = n1.id + JOIN nodes n2 ON e.target_id = n2.id + WHERE e.kind IN ('imports', 'imports-type') + AND n1.file != n2.file + `) + .all(); + + for (const { source_file, target_file } of importEdges) { + fanOutMap.set(source_file, (fanOutMap.get(source_file) || 0) + 1); + fanInMap.set(target_file, (fanInMap.get(target_file) || 0) + 1); + } + + const computeFileMetrics = db.transaction(() => { + for (const [relPath, symbols] of fileSymbols) { + const fileRow = getNodeId.get(relPath, 'file', relPath, 0); + if (!fileRow) continue; + + const lineCount = lineCountMap.get(relPath) || 0; + // Deduplicate definitions by name+kind+line + const seen = new Set(); + let symbolCount = 0; + for (const d of symbols.definitions) { + const key = `${d.name}|${d.kind}|${d.line}`; + if (!seen.has(key)) { + seen.add(key); + symbolCount++; + } + } + const importCount = symbols.imports.length; + const exportCount = symbols.exports.length; + const fanIn = fanInMap.get(relPath) || 0; + const fanOut = fanOutMap.get(relPath) || 0; + + upsertMetric.run( + fileRow.id, + lineCount, + symbolCount, + importCount, + exportCount, + fanIn, + fanOut, + null, + null, + ); + } + }); + computeFileMetrics(); + + // Step 5: Compute per-directory metrics + // Build a map of dir → descendant files + const dirFiles = new Map(); + for (const dir of allDirs) { + dirFiles.set(dir, []); + } + for (const relPath of fileSymbols.keys()) { + let d 
= normalizePath(path.dirname(relPath)); + while (d && d !== '.') { + if (dirFiles.has(d)) { + dirFiles.get(d).push(relPath); + } + d = normalizePath(path.dirname(d)); + } + } + + const computeDirMetrics = db.transaction(() => { + for (const [dir, files] of dirFiles) { + const dirRow = getNodeId.get(dir, 'directory', dir, 0); + if (!dirRow) continue; + + const fileCount = files.length; + let symbolCount = 0; + let totalFanIn = 0; + let totalFanOut = 0; + const filesInDir = new Set(files); + + for (const f of files) { + const sym = fileSymbols.get(f); + if (sym) { + const seen = new Set(); + for (const d of sym.definitions) { + const key = `${d.name}|${d.kind}|${d.line}`; + if (!seen.has(key)) { + seen.add(key); + symbolCount++; + } + } + } + } + + // Compute cross-boundary fan-in/fan-out and cohesion + let intraEdges = 0; + let crossEdges = 0; + for (const { source_file, target_file } of importEdges) { + const srcInside = filesInDir.has(source_file); + const tgtInside = filesInDir.has(target_file); + if (srcInside && tgtInside) { + intraEdges++; + } else if (srcInside || tgtInside) { + crossEdges++; + if (!srcInside && tgtInside) totalFanIn++; + if (srcInside && !tgtInside) totalFanOut++; + } + } + + const totalEdges = intraEdges + crossEdges; + const cohesion = totalEdges > 0 ? intraEdges / totalEdges : null; + + upsertMetric.run( + dirRow.id, + null, + symbolCount, + null, + null, + totalFanIn, + totalFanOut, + cohesion, + fileCount, + ); + } + }); + computeDirMetrics(); + + const dirCount = allDirs.size; + debug(`Structure: ${dirCount} directories, ${fileSymbols.size} files with metrics`); +} + +// ─── Query functions (read-only) ────────────────────────────────────── + +/** + * Return hierarchical directory tree with metrics. 
+ */ +export function structureData(customDbPath, opts = {}) { + const db = openReadonlyOrFail(customDbPath); + const filterDir = opts.directory || null; + const maxDepth = opts.depth || null; + const sortBy = opts.sort || 'files'; + + // Get all directory nodes with their metrics + let dirs = db + .prepare(` + SELECT n.id, n.name, n.file, nm.symbol_count, nm.fan_in, nm.fan_out, nm.cohesion, nm.file_count + FROM nodes n + LEFT JOIN node_metrics nm ON n.id = nm.node_id + WHERE n.kind = 'directory' + `) + .all(); + + if (filterDir) { + const norm = normalizePath(filterDir); + dirs = dirs.filter((d) => d.name === norm || d.name.startsWith(`${norm}/`)); + } + + if (maxDepth) { + const baseDepth = filterDir ? normalizePath(filterDir).split('/').length : 0; + dirs = dirs.filter((d) => { + const depth = d.name.split('/').length - baseDepth; + return depth <= maxDepth; + }); + } + + // Sort + const sortFn = getSortFn(sortBy); + dirs.sort(sortFn); + + // Get file metrics for each directory + const result = dirs.map((d) => { + const files = db + .prepare(` + SELECT n.name, nm.line_count, nm.symbol_count, nm.import_count, nm.export_count, nm.fan_in, nm.fan_out + FROM edges e + JOIN nodes n ON e.target_id = n.id + LEFT JOIN node_metrics nm ON n.id = nm.node_id + WHERE e.source_id = ? AND e.kind = 'contains' AND n.kind = 'file' + `) + .all(d.id); + + const subdirs = db + .prepare(` + SELECT n.name + FROM edges e + JOIN nodes n ON e.target_id = n.id + WHERE e.source_id = ? AND e.kind = 'contains' AND n.kind = 'directory' + `) + .all(d.id); + + return { + directory: d.name, + fileCount: d.file_count || 0, + symbolCount: d.symbol_count || 0, + fanIn: d.fan_in || 0, + fanOut: d.fan_out || 0, + cohesion: d.cohesion, + density: d.file_count > 0 ? 
(d.symbol_count || 0) / d.file_count : 0, + files: files.map((f) => ({ + file: f.name, + lineCount: f.line_count || 0, + symbolCount: f.symbol_count || 0, + importCount: f.import_count || 0, + exportCount: f.export_count || 0, + fanIn: f.fan_in || 0, + fanOut: f.fan_out || 0, + })), + subdirectories: subdirs.map((s) => s.name), + }; + }); + + db.close(); + return { directories: result, count: result.length }; +} + +/** + * Return top N files or directories ranked by a chosen metric. + */ +export function hotspotsData(customDbPath, opts = {}) { + const db = openReadonlyOrFail(customDbPath); + const metric = opts.metric || 'fan-in'; + const level = opts.level || 'file'; + const limit = opts.limit || 10; + + const kind = level === 'directory' ? 'directory' : 'file'; + + const HOTSPOT_QUERIES = { + 'fan-in': db.prepare(` + SELECT n.name, n.kind, nm.line_count, nm.symbol_count, nm.import_count, nm.export_count, + nm.fan_in, nm.fan_out, nm.cohesion, nm.file_count + FROM nodes n JOIN node_metrics nm ON n.id = nm.node_id + WHERE n.kind = ? ORDER BY nm.fan_in DESC NULLS LAST LIMIT ?`), + 'fan-out': db.prepare(` + SELECT n.name, n.kind, nm.line_count, nm.symbol_count, nm.import_count, nm.export_count, + nm.fan_in, nm.fan_out, nm.cohesion, nm.file_count + FROM nodes n JOIN node_metrics nm ON n.id = nm.node_id + WHERE n.kind = ? ORDER BY nm.fan_out DESC NULLS LAST LIMIT ?`), + density: db.prepare(` + SELECT n.name, n.kind, nm.line_count, nm.symbol_count, nm.import_count, nm.export_count, + nm.fan_in, nm.fan_out, nm.cohesion, nm.file_count + FROM nodes n JOIN node_metrics nm ON n.id = nm.node_id + WHERE n.kind = ? ORDER BY nm.symbol_count DESC NULLS LAST LIMIT ?`), + coupling: db.prepare(` + SELECT n.name, n.kind, nm.line_count, nm.symbol_count, nm.import_count, nm.export_count, + nm.fan_in, nm.fan_out, nm.cohesion, nm.file_count + FROM nodes n JOIN node_metrics nm ON n.id = nm.node_id + WHERE n.kind = ? 
ORDER BY (COALESCE(nm.fan_in, 0) + COALESCE(nm.fan_out, 0)) DESC NULLS LAST LIMIT ?`), + }; + + const stmt = HOTSPOT_QUERIES[metric] || HOTSPOT_QUERIES['fan-in']; + const rows = stmt.all(kind, limit); + + const hotspots = rows.map((r) => ({ + name: r.name, + kind: r.kind, + lineCount: r.line_count, + symbolCount: r.symbol_count, + importCount: r.import_count, + exportCount: r.export_count, + fanIn: r.fan_in, + fanOut: r.fan_out, + cohesion: r.cohesion, + fileCount: r.file_count, + density: + r.file_count > 0 + ? (r.symbol_count || 0) / r.file_count + : r.line_count > 0 + ? (r.symbol_count || 0) / r.line_count + : 0, + coupling: (r.fan_in || 0) + (r.fan_out || 0), + })); + + db.close(); + return { metric, level, limit, hotspots }; +} + +/** + * Return directories with cohesion at or above a threshold, along with their member files. + */ +export function moduleBoundariesData(customDbPath, opts = {}) { + const db = openReadonlyOrFail(customDbPath); + const threshold = opts.threshold ?? 0.3; // use ?? so an explicit threshold of 0 is honored + + const dirs = db + .prepare(` + SELECT n.id, n.name, nm.symbol_count, nm.fan_in, nm.fan_out, nm.cohesion, nm.file_count + FROM nodes n + JOIN node_metrics nm ON n.id = nm.node_id + WHERE n.kind = 'directory' AND nm.cohesion IS NOT NULL AND nm.cohesion >= ? + ORDER BY nm.cohesion DESC + `) + .all(threshold); + + const modules = dirs.map((d) => { + // Get files inside this directory + const files = db + .prepare(` + SELECT n.name FROM edges e + JOIN nodes n ON e.target_id = n.id + WHERE e.source_id = ?
AND e.kind = 'contains' AND n.kind = 'file' + `) + .all(d.id) + .map((f) => f.name); + + return { + directory: d.name, + cohesion: d.cohesion, + fileCount: d.file_count || 0, + symbolCount: d.symbol_count || 0, + fanIn: d.fan_in || 0, + fanOut: d.fan_out || 0, + files, + }; + }); + + db.close(); + return { threshold, modules, count: modules.length }; +} + +// ─── Formatters ─────────────────────────────────────────────────────── + +export function formatStructure(data) { + if (data.count === 0) return 'No directory structure found. Run "codegraph build" first.'; + + const lines = [`\nProject structure (${data.count} directories):\n`]; + for (const d of data.directories) { + const cohStr = d.cohesion !== null ? ` cohesion=${d.cohesion.toFixed(2)}` : ''; + const depth = d.directory.split('/').length - 1; + const indent = ' '.repeat(depth); + lines.push( + `${indent}${d.directory}/ (${d.fileCount} files, ${d.symbolCount} symbols, <-${d.fanIn} ->${d.fanOut}${cohStr})`, + ); + for (const f of d.files) { + lines.push( + `${indent} ${path.basename(f.file)} ${f.lineCount}L ${f.symbolCount}sym <-${f.fanIn} ->${f.fanOut}`, + ); + } + } + return lines.join('\n'); +} + +export function formatHotspots(data) { + if (data.hotspots.length === 0) return 'No hotspots found. Run "codegraph build" first.'; + + const lines = [`\nHotspots by ${data.metric} (${data.level}-level, top ${data.limit}):\n`]; + let rank = 1; + for (const h of data.hotspots) { + const extra = + h.kind === 'directory' + ? `${h.fileCount} files, cohesion=${h.cohesion !== null ? h.cohesion.toFixed(2) : 'n/a'}` + : `${h.lineCount || 0}L, ${h.symbolCount || 0} symbols`; + lines.push( + ` ${String(rank++).padStart(2)}. 
${h.name} <-${h.fanIn || 0} ->${h.fanOut || 0} (${extra})`, + ); + } + return lines.join('\n'); +} + +export function formatModuleBoundaries(data) { + if (data.count === 0) return `No modules found with cohesion >= ${data.threshold}.`; + + const lines = [`\nModule boundaries (cohesion >= ${data.threshold}, ${data.count} modules):\n`]; + for (const m of data.modules) { + lines.push( + ` ${m.directory}/ cohesion=${m.cohesion.toFixed(2)} (${m.fileCount} files, ${m.symbolCount} symbols)`, + ); + lines.push(` Incoming: ${m.fanIn} edges Outgoing: ${m.fanOut} edges`); + if (m.files.length > 0) { + lines.push( + ` Files: ${m.files.slice(0, 5).join(', ')}${m.files.length > 5 ? ` ... +${m.files.length - 5}` : ''}`, + ); + } + lines.push(''); + } + return lines.join('\n'); +} + +// ─── Helpers ────────────────────────────────────────────────────────── + +function getSortFn(sortBy) { + switch (sortBy) { + case 'cohesion': + return (a, b) => (b.cohesion ?? -1) - (a.cohesion ?? -1); + case 'fan-in': + return (a, b) => (b.fan_in || 0) - (a.fan_in || 0); + case 'fan-out': + return (a, b) => (b.fan_out || 0) - (a.fan_out || 0); + case 'density': + return (a, b) => { + const da = a.file_count > 0 ? (a.symbol_count || 0) / a.file_count : 0; + const db_ = b.file_count > 0 ? 
(b.symbol_count || 0) / b.file_count : 0; + return db_ - da; + }; + case 'files': + return (a, b) => (b.file_count || 0) - (a.file_count || 0); + default: + return (a, b) => a.name.localeCompare(b.name); + } +} diff --git a/tests/integration/cli.test.js b/tests/integration/cli.test.js index 662443c2..d1950636 100644 --- a/tests/integration/cli.test.js +++ b/tests/integration/cli.test.js @@ -140,6 +140,29 @@ describe('CLI smoke tests', () => { expect(data).toHaveProperty('edges'); }); + // ─── Structure ────────────────────────────────────────────────────── + test('structure --json returns valid JSON with directories', () => { + const out = run('structure', '--db', dbPath, '--json'); + const data = JSON.parse(out); + expect(data).toHaveProperty('directories'); + expect(data).toHaveProperty('count'); + }); + + // ─── Hotspots ────────────────────────────────────────────────────── + test('hotspots --json returns valid JSON with hotspots', () => { + const out = run('hotspots', '--db', dbPath, '--json'); + const data = JSON.parse(out); + expect(data).toHaveProperty('hotspots'); + expect(data).toHaveProperty('metric'); + expect(data).toHaveProperty('level'); + }); + + test('hotspots --level directory returns directory hotspots', () => { + const out = run('hotspots', '--db', dbPath, '--level', 'directory', '--json'); + const data = JSON.parse(out); + expect(data.level).toBe('directory'); + }); + // ─── Info ──────────────────────────────────────────────────────────── test('info outputs engine diagnostics', () => { const out = run('info'); @@ -217,4 +240,38 @@ describe('Registry CLI commands', () => { expect(err.stderr || err.stdout).toContain('not found'); } }); + + test('registry prune removes stale entries', () => { + const staleDir = path.join(tmpHome, 'stale-project'); + fs.mkdirSync(staleDir, { recursive: true }); + + runReg('registry', 'add', staleDir, '-n', 'stale'); + // Remove the directory to make it stale + fs.rmSync(staleDir, { recursive: true, force: true }); + + const out = runReg('registry', 'prune'); +
expect(out).toContain('Pruned'); + expect(out).toContain('stale'); + }); + + test('registry prune reports nothing when no stale entries', () => { + // Add a valid repo + runReg('registry', 'add', tmpDir, '-n', 'valid-proj'); + + const out = runReg('registry', 'prune'); + expect(out).toContain('No stale entries found'); + }); + + test('registry add auto-suffixes on basename collision', () => { + const dir1 = path.join(tmpHome, 'ws1', 'api'); + const dir2 = path.join(tmpHome, 'ws2', 'api'); + fs.mkdirSync(dir1, { recursive: true }); + fs.mkdirSync(dir2, { recursive: true }); + + const out1 = runReg('registry', 'add', dir1); + expect(out1).toContain('"api"'); + + const out2 = runReg('registry', 'add', dir2); + expect(out2).toContain('"api-2"'); + }); }); diff --git a/tests/integration/structure.test.js b/tests/integration/structure.test.js new file mode 100644 index 00000000..9bd4607b --- /dev/null +++ b/tests/integration/structure.test.js @@ -0,0 +1,182 @@ +/** + * Integration tests for the structure analysis module. + * Builds a real graph from a multi-directory fixture and tests structure queries.
+ */ + +import fs from 'node:fs'; +import os from 'node:os'; +import path from 'node:path'; +import Database from 'better-sqlite3'; +import { afterAll, beforeAll, describe, expect, test } from 'vitest'; +import { buildGraph } from '../../src/builder.js'; +import { hotspotsData, moduleBoundariesData, structureData } from '../../src/structure.js'; + +// Multi-directory fixture with cross-directory imports +const FIXTURE_FILES = { + 'src/math.js': ` +export function add(a, b) { return a + b; } +export function multiply(a, b) { return a * b; } +`.trimStart(), + 'src/utils.js': ` +import { add } from './math.js'; +export function double(x) { return add(x, x); } +`.trimStart(), + 'lib/format.js': ` +import { add } from '../src/math.js'; +export function formatSum(a, b) { return String(add(a, b)); } +`.trimStart(), + 'lib/helpers.js': ` +import { formatSum } from './format.js'; +export function printSum(a, b) { console.log(formatSum(a, b)); } +`.trimStart(), + 'index.js': ` +import { double } from './src/utils.js'; +import { printSum } from './lib/helpers.js'; +export function main() { printSum(1, double(2)); } +`.trimStart(), +}; + +let tmpDir, dbPath; + +beforeAll(async () => { + tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-structure-')); + + // Create directories first + fs.mkdirSync(path.join(tmpDir, 'src'), { recursive: true }); + fs.mkdirSync(path.join(tmpDir, 'lib'), { recursive: true }); + + for (const [relPath, content] of Object.entries(FIXTURE_FILES)) { + fs.writeFileSync(path.join(tmpDir, relPath), content); + } + + await buildGraph(tmpDir, { engine: 'wasm' }); + dbPath = path.join(tmpDir, '.codegraph', 'graph.db'); +}); + +afterAll(() => { + if (tmpDir) fs.rmSync(tmpDir, { recursive: true, force: true }); +}); + +describe('Structure integration', () => { + test('build creates directory nodes', () => { + const db = new Database(dbPath, { readonly: true }); + const dirs = db.prepare("SELECT name FROM nodes WHERE kind = 'directory' ORDER BY 
name").all(); + db.close(); + const dirNames = dirs.map((d) => d.name); + expect(dirNames).toContain('src'); + expect(dirNames).toContain('lib'); + }); + + test('build creates contains edges', () => { + const db = new Database(dbPath, { readonly: true }); + const containsCount = db + .prepare("SELECT COUNT(*) as c FROM edges WHERE kind = 'contains'") + .get().c; + db.close(); + expect(containsCount).toBeGreaterThan(0); + }); + + test('build creates node_metrics', () => { + const db = new Database(dbPath, { readonly: true }); + const metricsCount = db.prepare('SELECT COUNT(*) as c FROM node_metrics').get().c; + db.close(); + expect(metricsCount).toBeGreaterThan(0); + }); + + test('file metrics have line counts', () => { + const db = new Database(dbPath, { readonly: true }); + const metrics = db + .prepare(` + SELECT n.name, nm.line_count FROM nodes n + JOIN node_metrics nm ON n.id = nm.node_id + WHERE n.kind = 'file' AND nm.line_count > 0 + `) + .all(); + db.close(); + expect(metrics.length).toBeGreaterThan(0); + }); +}); + +describe('structureData', () => { + test('returns directories with metrics', () => { + const data = structureData(dbPath); + expect(data.directories.length).toBeGreaterThan(0); + for (const d of data.directories) { + expect(d).toHaveProperty('directory'); + expect(d).toHaveProperty('fileCount'); + expect(d).toHaveProperty('symbolCount'); + expect(d).toHaveProperty('fanIn'); + expect(d).toHaveProperty('fanOut'); + expect(d).toHaveProperty('files'); + } + }); + + test('filters by directory', () => { + const data = structureData(dbPath, { directory: 'src' }); + for (const d of data.directories) { + expect(d.directory).toMatch(/^src/); + } + }); + + test('limits by depth', () => { + const data = structureData(dbPath, { depth: 1 }); + for (const d of data.directories) { + expect(d.directory.split('/').length).toBeLessThanOrEqual(1); + } + }); + + test('supports JSON output format', () => { + const data = structureData(dbPath); + 
expect(data).toHaveProperty('count'); + expect(data).toHaveProperty('directories'); + expect(typeof data.count).toBe('number'); + }); +}); + +describe('hotspotsData', () => { + test('returns file hotspots ranked by fan-in', () => { + const data = hotspotsData(dbPath, { metric: 'fan-in', level: 'file', limit: 5 }); + expect(data).toHaveProperty('hotspots'); + expect(data.hotspots.length).toBeGreaterThan(0); + expect(data.hotspots.length).toBeLessThanOrEqual(5); + // Should be sorted descending by fan-in + for (let i = 1; i < data.hotspots.length; i++) { + expect(data.hotspots[i - 1].fanIn).toBeGreaterThanOrEqual(data.hotspots[i].fanIn); + } + }); + + test('returns directory hotspots', () => { + const data = hotspotsData(dbPath, { metric: 'fan-in', level: 'directory', limit: 5 }); + expect(data).toHaveProperty('hotspots'); + for (const h of data.hotspots) { + expect(h.kind).toBe('directory'); + } + }); + + test('supports coupling metric', () => { + const data = hotspotsData(dbPath, { metric: 'coupling', level: 'file', limit: 3 }); + expect(data.metric).toBe('coupling'); + expect(data.hotspots.length).toBeGreaterThan(0); + }); +}); + +describe('moduleBoundariesData', () => { + test('returns modules with cohesion above threshold', () => { + const data = moduleBoundariesData(dbPath, { threshold: 0.0 }); + expect(data).toHaveProperty('modules'); + expect(data).toHaveProperty('count'); + for (const m of data.modules) { + expect(m).toHaveProperty('directory'); + expect(m).toHaveProperty('cohesion'); + expect(m.cohesion).toBeGreaterThanOrEqual(0); + } + }); + + test('high threshold may return fewer or no modules', () => { + const data = moduleBoundariesData(dbPath, { threshold: 0.99 }); + // Either empty or all have high cohesion + for (const m of data.modules) { + expect(m.cohesion).toBeGreaterThanOrEqual(0.99); + } + }); +}); diff --git a/tests/unit/mcp.test.js b/tests/unit/mcp.test.js index 3081465c..bb54d51c 100644 --- a/tests/unit/mcp.test.js +++
b/tests/unit/mcp.test.js @@ -6,7 +6,7 @@ */ import { describe, expect, it, vi } from 'vitest'; -import { TOOLS } from '../../src/mcp.js'; +import { buildToolList, TOOLS } from '../../src/mcp.js'; const ALL_TOOL_NAMES = [ 'query_function', @@ -20,6 +20,8 @@ const ALL_TOOL_NAMES = [ 'semantic_search', 'export_graph', 'list_functions', + 'structure', + 'hotspots', 'list_repos', ]; @@ -113,6 +115,24 @@ describe('TOOLS', () => { expect(lf.inputSchema.properties).toHaveProperty('no_tests'); }); + it('structure has no required parameters', () => { + const st = TOOLS.find((t) => t.name === 'structure'); + expect(st).toBeDefined(); + expect(st.inputSchema.required).toBeUndefined(); + expect(st.inputSchema.properties).toHaveProperty('directory'); + expect(st.inputSchema.properties).toHaveProperty('depth'); + expect(st.inputSchema.properties).toHaveProperty('sort'); + }); + + it('hotspots has no required parameters', () => { + const hs = TOOLS.find((t) => t.name === 'hotspots'); + expect(hs).toBeDefined(); + expect(hs.inputSchema.required).toBeUndefined(); + expect(hs.inputSchema.properties).toHaveProperty('metric'); + expect(hs.inputSchema.properties).toHaveProperty('level'); + expect(hs.inputSchema.properties).toHaveProperty('limit'); + }); + it('every tool except list_repos has optional repo property', () => { for (const tool of TOOLS) { if (tool.name === 'list_repos') continue; @@ -132,6 +152,30 @@ describe('TOOLS', () => { }); }); +// ─── buildToolList ────────────────────────────────────────────────── + +describe('buildToolList', () => { + it('single-repo mode excludes list_repos and repo property', () => { + const tools = buildToolList(false); + const names = tools.map((t) => t.name); + expect(names).not.toContain('list_repos'); + for (const tool of tools) { + expect(tool.inputSchema.properties).not.toHaveProperty('repo'); + } + }); + + it('multi-repo mode includes list_repos and repo property on all other tools', () => { + const tools = buildToolList(true); + const 
names = tools.map((t) => t.name); + expect(names).toContain('list_repos'); + for (const tool of tools) { + if (tool.name === 'list_repos') continue; + expect(tool.inputSchema.properties).toHaveProperty('repo'); + expect(tool.inputSchema.properties.repo.type).toBe('string'); + } + }); +}); + // ─── startMCPServer handler logic ──────────────────────────────────── describe('startMCPServer handler dispatch', () => { @@ -169,9 +213,10 @@ describe('startMCPServer handler dispatch', () => { const { startMCPServer } = await import('../../src/mcp.js'); await startMCPServer('/tmp/test.db'); - // Test tools/list + // Test tools/list — single-repo mode by default (no list_repos) const toolsList = await handlers['tools/list'](); - expect(toolsList.tools.length).toBe(ALL_TOOL_NAMES.length); + expect(toolsList.tools.length).toBe(ALL_TOOL_NAMES.length - 1); + expect(toolsList.tools.map((t) => t.name)).not.toContain('list_repos'); // Test query_function dispatch const result = await handlers['tools/call']({ @@ -404,7 +449,7 @@ describe('startMCPServer handler dispatch', () => { })); const { startMCPServer } = await import('../../src/mcp.js'); - await startMCPServer(); + await startMCPServer(undefined, { multiRepo: true }); const result = await handlers['tools/call']({ params: { name: 'query_function', arguments: { name: 'test', repo: 'my-project' } }, @@ -447,7 +492,7 @@ describe('startMCPServer handler dispatch', () => { })); const { startMCPServer } = await import('../../src/mcp.js'); - await startMCPServer(); + await startMCPServer(undefined, { multiRepo: true }); const result = await handlers['tools/call']({ params: { name: 'query_function', arguments: { name: 'test', repo: 'unknown-repo' } }, @@ -461,4 +506,309 @@ describe('startMCPServer handler dispatch', () => { vi.doUnmock('../../src/registry.js'); vi.doUnmock('../../src/queries.js'); }); + + it('rejects repo not in allowedRepos list', async () => { + const handlers = {}; + + 
+    vi.doMock('@modelcontextprotocol/sdk/server/index.js', () => ({
+      Server: class MockServer {
+        setRequestHandler(name, handler) {
+          handlers[name] = handler;
+        }
+        async connect() {}
+      },
+    }));
+    vi.doMock('@modelcontextprotocol/sdk/server/stdio.js', () => ({
+      StdioServerTransport: class MockTransport {},
+    }));
+    vi.doMock('../../src/registry.js', () => ({
+      resolveRepoDbPath: vi.fn(() => '/some/path'),
+    }));
+    vi.doMock('../../src/queries.js', () => ({
+      queryNameData: vi.fn(),
+      impactAnalysisData: vi.fn(),
+      moduleMapData: vi.fn(),
+      fileDepsData: vi.fn(),
+      fnDepsData: vi.fn(),
+      fnImpactData: vi.fn(),
+      diffImpactData: vi.fn(),
+      listFunctionsData: vi.fn(),
+    }));
+
+    const { startMCPServer } = await import('../../src/mcp.js');
+    await startMCPServer(undefined, { allowedRepos: ['allowed-repo'] });
+
+    const result = await handlers['tools/call']({
+      params: { name: 'query_function', arguments: { name: 'test', repo: 'blocked-repo' } },
+    });
+    expect(result.isError).toBe(true);
+    expect(result.content[0].text).toContain('blocked-repo');
+    expect(result.content[0].text).toContain('not in the allowed');
+
+    vi.doUnmock('@modelcontextprotocol/sdk/server/index.js');
+    vi.doUnmock('@modelcontextprotocol/sdk/server/stdio.js');
+    vi.doUnmock('../../src/registry.js');
+    vi.doUnmock('../../src/queries.js');
+  });
+
+  it('allows repo in allowedRepos list', async () => {
+    const handlers = {};
+
+    vi.doMock('@modelcontextprotocol/sdk/server/index.js', () => ({
+      Server: class MockServer {
+        setRequestHandler(name, handler) {
+          handlers[name] = handler;
+        }
+        async connect() {}
+      },
+    }));
+    vi.doMock('@modelcontextprotocol/sdk/server/stdio.js', () => ({
+      StdioServerTransport: class MockTransport {},
+    }));
+    vi.doMock('../../src/registry.js', () => ({
+      resolveRepoDbPath: vi.fn(() => '/resolved/db'),
+    }));
+
+    const queryMock = vi.fn(() => ({ query: 'test', results: [] }));
+    vi.doMock('../../src/queries.js', () => ({
+      queryNameData: queryMock,
+      impactAnalysisData: vi.fn(),
+      moduleMapData: vi.fn(),
+      fileDepsData: vi.fn(),
+      fnDepsData: vi.fn(),
+      fnImpactData: vi.fn(),
+      diffImpactData: vi.fn(),
+      listFunctionsData: vi.fn(),
+    }));
+
+    const { startMCPServer } = await import('../../src/mcp.js');
+    await startMCPServer(undefined, { allowedRepos: ['my-repo'] });
+
+    const result = await handlers['tools/call']({
+      params: { name: 'query_function', arguments: { name: 'test', repo: 'my-repo' } },
+    });
+    expect(result.isError).toBeUndefined();
+    expect(queryMock).toHaveBeenCalledWith('test', '/resolved/db');
+
+    vi.doUnmock('@modelcontextprotocol/sdk/server/index.js');
+    vi.doUnmock('@modelcontextprotocol/sdk/server/stdio.js');
+    vi.doUnmock('../../src/registry.js');
+    vi.doUnmock('../../src/queries.js');
+  });
+
+  it('list_repos filters by allowedRepos', async () => {
+    const handlers = {};
+
+    vi.doMock('@modelcontextprotocol/sdk/server/index.js', () => ({
+      Server: class MockServer {
+        setRequestHandler(name, handler) {
+          handlers[name] = handler;
+        }
+        async connect() {}
+      },
+    }));
+    vi.doMock('@modelcontextprotocol/sdk/server/stdio.js', () => ({
+      StdioServerTransport: class MockTransport {},
+    }));
+    vi.doMock('../../src/registry.js', () => ({
+      resolveRepoDbPath: vi.fn(),
+      listRepos: vi.fn(() => [
+        { name: 'alpha', path: '/alpha' },
+        { name: 'beta', path: '/beta' },
+        { name: 'gamma', path: '/gamma' },
+      ]),
+    }));
+    vi.doMock('../../src/queries.js', () => ({
+      queryNameData: vi.fn(),
+      impactAnalysisData: vi.fn(),
+      moduleMapData: vi.fn(),
+      fileDepsData: vi.fn(),
+      fnDepsData: vi.fn(),
+      fnImpactData: vi.fn(),
+      diffImpactData: vi.fn(),
+      listFunctionsData: vi.fn(),
+    }));
+
+    const { startMCPServer } = await import('../../src/mcp.js');
+    await startMCPServer(undefined, { allowedRepos: ['alpha', 'gamma'] });
+
+    const result = await handlers['tools/call']({
+      params: { name: 'list_repos', arguments: {} },
+    });
+    const data = JSON.parse(result.content[0].text);
+    expect(data.repos).toHaveLength(2);
+    expect(data.repos.map((r) => r.name)).toEqual(['alpha', 'gamma']);
+
+    vi.doUnmock('@modelcontextprotocol/sdk/server/index.js');
+    vi.doUnmock('@modelcontextprotocol/sdk/server/stdio.js');
+    vi.doUnmock('../../src/registry.js');
+    vi.doUnmock('../../src/queries.js');
+  });
+
+  it('list_repos returns all repos when no allowlist', async () => {
+    const handlers = {};
+
+    vi.doMock('@modelcontextprotocol/sdk/server/index.js', () => ({
+      Server: class MockServer {
+        setRequestHandler(name, handler) {
+          handlers[name] = handler;
+        }
+        async connect() {}
+      },
+    }));
+    vi.doMock('@modelcontextprotocol/sdk/server/stdio.js', () => ({
+      StdioServerTransport: class MockTransport {},
+    }));
+    vi.doMock('../../src/registry.js', () => ({
+      resolveRepoDbPath: vi.fn(),
+      listRepos: vi.fn(() => [
+        { name: 'alpha', path: '/alpha' },
+        { name: 'beta', path: '/beta' },
+      ]),
+    }));
+    vi.doMock('../../src/queries.js', () => ({
+      queryNameData: vi.fn(),
+      impactAnalysisData: vi.fn(),
+      moduleMapData: vi.fn(),
+      fileDepsData: vi.fn(),
+      fnDepsData: vi.fn(),
+      fnImpactData: vi.fn(),
+      diffImpactData: vi.fn(),
+      listFunctionsData: vi.fn(),
+    }));
+
+    const { startMCPServer } = await import('../../src/mcp.js');
+    await startMCPServer(undefined, { multiRepo: true });
+
+    const result = await handlers['tools/call']({
+      params: { name: 'list_repos', arguments: {} },
+    });
+    const data = JSON.parse(result.content[0].text);
+    expect(data.repos).toHaveLength(2);
+
+    vi.doUnmock('@modelcontextprotocol/sdk/server/index.js');
+    vi.doUnmock('@modelcontextprotocol/sdk/server/stdio.js');
+    vi.doUnmock('../../src/registry.js');
+    vi.doUnmock('../../src/queries.js');
+  });
+
+  it('rejects repo param in single-repo mode', async () => {
+    const handlers = {};
+
+    vi.doMock('@modelcontextprotocol/sdk/server/index.js', () => ({
+      Server: class MockServer {
+        setRequestHandler(name, handler) {
+          handlers[name] = handler;
+        }
+        async connect() {}
+      },
+    }));
+    vi.doMock('@modelcontextprotocol/sdk/server/stdio.js', () => ({
+      StdioServerTransport: class MockTransport {},
+    }));
+    vi.doMock('../../src/queries.js', () => ({
+      queryNameData: vi.fn(),
+      impactAnalysisData: vi.fn(),
+      moduleMapData: vi.fn(),
+      fileDepsData: vi.fn(),
+      fnDepsData: vi.fn(),
+      fnImpactData: vi.fn(),
+      diffImpactData: vi.fn(),
+      listFunctionsData: vi.fn(),
+    }));
+
+    const { startMCPServer } = await import('../../src/mcp.js');
+    await startMCPServer('/tmp/test.db');
+
+    const result = await handlers['tools/call']({
+      params: { name: 'query_function', arguments: { name: 'test', repo: 'some-repo' } },
+    });
+    expect(result.isError).toBe(true);
+    expect(result.content[0].text).toContain('Multi-repo access is disabled');
+    expect(result.content[0].text).toContain('--multi-repo');
+
+    vi.doUnmock('@modelcontextprotocol/sdk/server/index.js');
+    vi.doUnmock('@modelcontextprotocol/sdk/server/stdio.js');
+    vi.doUnmock('../../src/queries.js');
+  });
+
+  it('rejects list_repos in single-repo mode', async () => {
+    const handlers = {};
+
+    vi.doMock('@modelcontextprotocol/sdk/server/index.js', () => ({
+      Server: class MockServer {
+        setRequestHandler(name, handler) {
+          handlers[name] = handler;
+        }
+        async connect() {}
+      },
+    }));
+    vi.doMock('@modelcontextprotocol/sdk/server/stdio.js', () => ({
+      StdioServerTransport: class MockTransport {},
+    }));
+    vi.doMock('../../src/queries.js', () => ({
+      queryNameData: vi.fn(),
+      impactAnalysisData: vi.fn(),
+      moduleMapData: vi.fn(),
+      fileDepsData: vi.fn(),
+      fnDepsData: vi.fn(),
+      fnImpactData: vi.fn(),
+      diffImpactData: vi.fn(),
+      listFunctionsData: vi.fn(),
+    }));
+
+    const { startMCPServer } = await import('../../src/mcp.js');
+    await startMCPServer('/tmp/test.db');
+
+    const result = await handlers['tools/call']({
+      params: { name: 'list_repos', arguments: {} },
+    });
+    expect(result.isError).toBe(true);
+    expect(result.content[0].text).toContain('Multi-repo access is disabled');
+    expect(result.content[0].text).toContain('--multi-repo');
+
+    vi.doUnmock('@modelcontextprotocol/sdk/server/index.js');
+    vi.doUnmock('@modelcontextprotocol/sdk/server/stdio.js');
+    vi.doUnmock('../../src/queries.js');
+  });
+
+  it('tools/list in single-repo mode has no repo property and no list_repos', async () => {
+    const handlers = {};
+
+    vi.doMock('@modelcontextprotocol/sdk/server/index.js', () => ({
+      Server: class MockServer {
+        setRequestHandler(name, handler) {
+          handlers[name] = handler;
+        }
+        async connect() {}
+      },
+    }));
+    vi.doMock('@modelcontextprotocol/sdk/server/stdio.js', () => ({
+      StdioServerTransport: class MockTransport {},
+    }));
+    vi.doMock('../../src/queries.js', () => ({
+      queryNameData: vi.fn(),
+      impactAnalysisData: vi.fn(),
+      moduleMapData: vi.fn(),
+      fileDepsData: vi.fn(),
+      fnDepsData: vi.fn(),
+      fnImpactData: vi.fn(),
+      diffImpactData: vi.fn(),
+      listFunctionsData: vi.fn(),
+    }));
+
+    const { startMCPServer } = await import('../../src/mcp.js');
+    await startMCPServer('/tmp/test.db');
+
+    const toolsList = await handlers['tools/list']();
+    const names = toolsList.tools.map((t) => t.name);
+    expect(names).not.toContain('list_repos');
+    for (const tool of toolsList.tools) {
+      expect(tool.inputSchema.properties).not.toHaveProperty('repo');
+    }
+
+    vi.doUnmock('@modelcontextprotocol/sdk/server/index.js');
+    vi.doUnmock('@modelcontextprotocol/sdk/server/stdio.js');
+    vi.doUnmock('../../src/queries.js');
+  });
 });
diff --git a/tests/unit/registry.test.js b/tests/unit/registry.test.js
index f59596cc..a594ea12 100644
--- a/tests/unit/registry.test.js
+++ b/tests/unit/registry.test.js
@@ -5,6 +5,7 @@ import { afterEach, beforeEach, describe, expect, it } from 'vitest';
 import {
   listRepos,
   loadRegistry,
+  pruneRegistry,
   REGISTRY_PATH,
   registerRepo,
   resolveRepoDbPath,
@@ -148,6 +149,65 @@ describe('registerRepo', () => {
     const { entry } = registerRepo(dir, 'proj', registryPath);
     expect(entry.addedAt).toMatch(/^\d{4}-\d{2}-\d{2}T/);
   });
+
+  it('auto-suffixes when basename collides with different path', () => {
+    const dir1 = path.join(tmpDir, 'workspace1', 'api');
+    const dir2 = path.join(tmpDir, 'workspace2', 'api');
+    fs.mkdirSync(dir1, { recursive: true });
+    fs.mkdirSync(dir2, { recursive: true });
+
+    const { name: name1 } = registerRepo(dir1, undefined, registryPath);
+    const { name: name2 } = registerRepo(dir2, undefined, registryPath);
+
+    expect(name1).toBe('api');
+    expect(name2).toBe('api-2');
+
+    const reg = loadRegistry(registryPath);
+    expect(reg.repos.api.path).toBe(dir1);
+    expect(reg.repos['api-2'].path).toBe(dir2);
+  });
+
+  it('auto-suffix increments past existing suffixes', () => {
+    const dir1 = path.join(tmpDir, 'a', 'app');
+    const dir2 = path.join(tmpDir, 'b', 'app');
+    const dir3 = path.join(tmpDir, 'c', 'app');
+    fs.mkdirSync(dir1, { recursive: true });
+    fs.mkdirSync(dir2, { recursive: true });
+    fs.mkdirSync(dir3, { recursive: true });
+
+    registerRepo(dir1, undefined, registryPath);
+    registerRepo(dir2, undefined, registryPath);
+    const { name: name3 } = registerRepo(dir3, undefined, registryPath);
+
+    expect(name3).toBe('app-3');
+  });
+
+  it('re-registering same path with no explicit name updates in place', () => {
+    const dir = path.join(tmpDir, 'mylib');
+    fs.mkdirSync(dir, { recursive: true });
+
+    const { name: first } = registerRepo(dir, undefined, registryPath);
+    const { name: second } = registerRepo(dir, undefined, registryPath);
+
+    expect(first).toBe('mylib');
+    expect(second).toBe('mylib');
+    expect(Object.keys(loadRegistry(registryPath).repos)).toHaveLength(1);
+  });
+
+  it('explicit name always overwrites (no suffix)', () => {
+    const dir1 = path.join(tmpDir, 'one');
+    const dir2 = path.join(tmpDir, 'two');
+    fs.mkdirSync(dir1, { recursive: true });
+    fs.mkdirSync(dir2, { recursive: true });
+
+    registerRepo(dir1, 'shared', registryPath);
+    const { name } = registerRepo(dir2, 'shared', registryPath);
+
+    expect(name).toBe('shared');
+    const reg = loadRegistry(registryPath);
+    expect(reg.repos.shared.path).toBe(dir2);
+    expect(Object.keys(reg.repos)).toHaveLength(1);
+  });
 });

 // ─── unregisterRepo ─────────────────────────────────────────────────
@@ -225,3 +285,54 @@ describe('resolveRepoDbPath', () => {
     expect(result).toBeUndefined();
   });
 });
+
+// ─── pruneRegistry ─────────────────────────────────────────────────
+
+describe('pruneRegistry', () => {
+  it('removes entries whose directories no longer exist', () => {
+    const dir1 = path.join(tmpDir, 'exists');
+    const dir2 = path.join(tmpDir, 'gone');
+    fs.mkdirSync(dir1, { recursive: true });
+    fs.mkdirSync(dir2, { recursive: true });
+
+    registerRepo(dir1, 'exists', registryPath);
+    registerRepo(dir2, 'gone', registryPath);
+
+    // Remove the directory to make it stale
+    fs.rmSync(dir2, { recursive: true, force: true });
+
+    const pruned = pruneRegistry(registryPath);
+    expect(pruned).toHaveLength(1);
+    expect(pruned[0].name).toBe('gone');
+    expect(pruned[0].path).toBe(dir2);
+
+    const reg = loadRegistry(registryPath);
+    expect(reg.repos.exists).toBeDefined();
+    expect(reg.repos.gone).toBeUndefined();
+  });
+
+  it('returns empty array when nothing to prune', () => {
+    const dir = path.join(tmpDir, 'healthy');
+    fs.mkdirSync(dir, { recursive: true });
+    registerRepo(dir, 'healthy', registryPath);
+
+    const pruned = pruneRegistry(registryPath);
+    expect(pruned).toEqual([]);
+  });
+
+  it('does not write file when nothing pruned', () => {
+    const dir = path.join(tmpDir, 'ok');
+    fs.mkdirSync(dir, { recursive: true });
+    registerRepo(dir, 'ok', registryPath);
+
+    const mtimeBefore = fs.statSync(registryPath).mtimeMs;
+    pruneRegistry(registryPath);
+    const mtimeAfter = fs.statSync(registryPath).mtimeMs;
+    expect(mtimeAfter).toBe(mtimeBefore);
+  });
+
+  it('returns empty array for empty registry', () => {
+    const pruned = pruneRegistry(registryPath);
+    expect(pruned).toEqual([]);
+  });
+});
diff --git a/tests/unit/structure.test.js b/tests/unit/structure.test.js
new file mode 100644
index 00000000..003a879d
--- /dev/null
+++ b/tests/unit/structure.test.js
@@ -0,0 +1,277 @@
+/**
+ * Unit tests for src/structure.js
+ *
+ * Tests buildStructure metrics computation and query functions
+ * using an in-memory SQLite database.
+ */
+
+import Database from 'better-sqlite3';
+import { beforeEach, describe, expect, it } from 'vitest';
+import { initSchema } from '../../src/db.js';
+import { buildStructure } from '../../src/structure.js';
+
+let db;
+
+function setup() {
+  db = new Database(':memory:');
+  db.pragma('journal_mode = WAL');
+  initSchema(db);
+  return db;
+}
+
+function insertFileNode(name, file) {
+  db.prepare(
+    'INSERT OR IGNORE INTO nodes (name, kind, file, line, end_line) VALUES (?, ?, ?, ?, ?)',
+  ).run(name, 'file', file, 0, null);
+}
+
+function insertImportEdge(sourceFile, targetFile) {
+  const src = db
+    .prepare('SELECT id FROM nodes WHERE name = ? AND kind = ?')
+    .get(sourceFile, 'file');
+  const tgt = db
+    .prepare('SELECT id FROM nodes WHERE name = ? AND kind = ?')
+    .get(targetFile, 'file');
+  if (src && tgt) {
+    db.prepare(
+      'INSERT INTO edges (source_id, target_id, kind, confidence, dynamic) VALUES (?, ?, ?, ?, ?)',
+    ).run(src.id, tgt.id, 'imports', 1.0, 0);
+  }
+}
+
+describe('buildStructure', () => {
+  beforeEach(() => {
+    setup();
+  });
+
+  it('creates directory nodes and contains edges', () => {
+    // Set up: two files in src/
+    insertFileNode('src/a.js', 'src/a.js');
+    insertFileNode('src/b.js', 'src/b.js');
+
+    const fileSymbols = new Map([
+      [
+        'src/a.js',
+        {
+          definitions: [{ name: 'foo', kind: 'function', line: 1 }],
+          imports: [],
+          exports: [],
+          calls: [],
+        },
+      ],
+      [
+        'src/b.js',
+        {
+          definitions: [{ name: 'bar', kind: 'function', line: 1 }],
+          imports: [],
+          exports: [],
+          calls: [],
+        },
+      ],
+    ]);
+    const lineCountMap = new Map([
+      ['src/a.js', 10],
+      ['src/b.js', 20],
+    ]);
+    const directories = new Set(['src']);
+
+    buildStructure(db, fileSymbols, '/root', lineCountMap, directories);
+
+    // Check directory node was created
+    const dirNode = db
+      .prepare("SELECT * FROM nodes WHERE kind = 'directory' AND name = 'src'")
+      .get();
+    expect(dirNode).toBeDefined();
+
+    // Check contains edges exist
+    const containsEdges = db
+      .prepare("SELECT COUNT(*) as c FROM edges WHERE kind = 'contains'")
+      .get();
+    expect(containsEdges.c).toBeGreaterThanOrEqual(2); // src -> a.js, src -> b.js
+  });
+
+  it('computes per-file metrics', () => {
+    insertFileNode('src/a.js', 'src/a.js');
+    insertFileNode('src/b.js', 'src/b.js');
+    insertImportEdge('src/b.js', 'src/a.js');
+
+    const fileSymbols = new Map([
+      [
+        'src/a.js',
+        {
+          definitions: [
+            { name: 'foo', kind: 'function', line: 1 },
+            { name: 'bar', kind: 'function', line: 5 },
+          ],
+          imports: [],
+          exports: [{ name: 'foo', kind: 'function', line: 1 }],
+          calls: [],
+        },
+      ],
+      [
+        'src/b.js',
+        {
+          definitions: [{ name: 'baz', kind: 'function', line: 1 }],
+          imports: [{ source: './a.js', names: ['foo'] }],
+          exports: [],
+          calls: [],
+        },
+      ],
+    ]);
+    const lineCountMap = new Map([
+      ['src/a.js', 10],
+      ['src/b.js', 5],
+    ]);
+
+    buildStructure(db, fileSymbols, '/root', lineCountMap, new Set(['src']));
+
+    // Check file metrics
+    const aNode = db
+      .prepare("SELECT id FROM nodes WHERE name = 'src/a.js' AND kind = 'file'")
+      .get();
+    const aMetrics = db.prepare('SELECT * FROM node_metrics WHERE node_id = ?').get(aNode.id);
+    expect(aMetrics.line_count).toBe(10);
+    expect(aMetrics.symbol_count).toBe(2);
+    expect(aMetrics.fan_in).toBe(1); // b.js imports a.js
+    expect(aMetrics.export_count).toBe(1);
+
+    const bNode = db
+      .prepare("SELECT id FROM nodes WHERE name = 'src/b.js' AND kind = 'file'")
+      .get();
+    const bMetrics = db.prepare('SELECT * FROM node_metrics WHERE node_id = ?').get(bNode.id);
+    expect(bMetrics.fan_out).toBe(1); // b.js imports a.js
+    expect(bMetrics.import_count).toBe(1);
+  });
+
+  it('computes directory cohesion', () => {
+    // Set up: src/a.js imports src/b.js (intra), lib/c.js imports src/a.js (cross)
+    insertFileNode('src/a.js', 'src/a.js');
+    insertFileNode('src/b.js', 'src/b.js');
+    insertFileNode('lib/c.js', 'lib/c.js');
+    insertImportEdge('src/a.js', 'src/b.js'); // intra-src edge
+    insertImportEdge('lib/c.js', 'src/a.js'); // cross edge (lib -> src)
+
+    const fileSymbols = new Map([
+      [
+        'src/a.js',
+        { definitions: [], imports: [{ source: './b.js', names: [] }], exports: [], calls: [] },
+      ],
+      ['src/b.js', { definitions: [], imports: [], exports: [], calls: [] }],
+      [
+        'lib/c.js',
+        {
+          definitions: [],
+          imports: [{ source: '../src/a.js', names: [] }],
+          exports: [],
+          calls: [],
+        },
+      ],
+    ]);
+    const lineCountMap = new Map([
+      ['src/a.js', 5],
+      ['src/b.js', 5],
+      ['lib/c.js', 5],
+    ]);
+
+    buildStructure(db, fileSymbols, '/root', lineCountMap, new Set(['src', 'lib']));
+
+    // src directory has 1 intra edge (a->b) and 1 cross edge (c->a)
+    // cohesion = 1 / (1 + 1) = 0.5
+    const srcDir = db
+      .prepare("SELECT id FROM nodes WHERE kind = 'directory' AND name = 'src'")
+      .get();
+    const srcMetrics = db.prepare('SELECT * FROM node_metrics WHERE node_id = ?').get(srcDir.id);
+    expect(srcMetrics.cohesion).toBeCloseTo(0.5);
+  });
+
+  it('deduplicates definitions in symbol count', () => {
+    insertFileNode('src/a.js', 'src/a.js');
+
+    const fileSymbols = new Map([
+      [
+        'src/a.js',
+        {
+          definitions: [
+            { name: 'foo', kind: 'function', line: 1 },
+            { name: 'foo', kind: 'function', line: 1 }, // duplicate
+            { name: 'bar', kind: 'function', line: 5 },
+          ],
+          imports: [],
+          exports: [],
+          calls: [],
+        },
+      ],
+    ]);
+    const lineCountMap = new Map([['src/a.js', 10]]);
+
+    buildStructure(db, fileSymbols, '/root', lineCountMap, new Set(['src']));
+
+    const aNode = db
+      .prepare("SELECT id FROM nodes WHERE name = 'src/a.js' AND kind = 'file'")
+      .get();
+    const metrics = db.prepare('SELECT * FROM node_metrics WHERE node_id = ?').get(aNode.id);
+    expect(metrics.symbol_count).toBe(2); // foo + bar (not 3)
+  });
+
+  it('creates intermediate directory nodes', () => {
+    insertFileNode('src/utils/helper.js', 'src/utils/helper.js');
+
+    const fileSymbols = new Map([
+      ['src/utils/helper.js', { definitions: [], imports: [], exports: [], calls: [] }],
+    ]);
+    const lineCountMap = new Map([['src/utils/helper.js', 5]]);
+
+    buildStructure(db, fileSymbols, '/root', lineCountMap, new Set(['src/utils']));
+
+    // Both src and src/utils should exist as directory nodes
+    const srcDir = db
+      .prepare("SELECT * FROM nodes WHERE kind = 'directory' AND name = 'src'")
+      .get();
+    const utilsDir = db
+      .prepare("SELECT * FROM nodes WHERE kind = 'directory' AND name = 'src/utils'")
+      .get();
+    expect(srcDir).toBeDefined();
+    expect(utilsDir).toBeDefined();
+
+    // src -> src/utils contains edge
+    const containsEdge = db
+      .prepare("SELECT * FROM edges WHERE source_id = ? AND target_id = ? AND kind = 'contains'")
+      .get(srcDir.id, utilsDir.id);
+    expect(containsEdge).toBeDefined();
+  });
+
+  it('handles empty fileSymbols gracefully', () => {
+    const fileSymbols = new Map();
+    const lineCountMap = new Map();
+
+    buildStructure(db, fileSymbols, '/root', lineCountMap, new Set());
+
+    const dirCount = db.prepare("SELECT COUNT(*) as c FROM nodes WHERE kind = 'directory'").get().c;
+    expect(dirCount).toBe(0);
+  });
+
+  it('is idempotent — rebuilding clears old directory data', () => {
+    insertFileNode('src/a.js', 'src/a.js');
+
+    const fileSymbols = new Map([
+      [
+        'src/a.js',
+        {
+          definitions: [{ name: 'foo', kind: 'function', line: 1 }],
+          imports: [],
+          exports: [],
+          calls: [],
+        },
+      ],
+    ]);
+    const lineCountMap = new Map([['src/a.js', 10]]);
+    const dirs = new Set(['src']);
+
+    buildStructure(db, fileSymbols, '/root', lineCountMap, dirs);
+    buildStructure(db, fileSymbols, '/root', lineCountMap, dirs);
+
+    // Should only have 1 directory node, not 2
+    const dirCount = db.prepare("SELECT COUNT(*) as c FROM nodes WHERE kind = 'directory'").get().c;
+    expect(dirCount).toBe(1);
+  });
+});