feat: add query + incremental regression benchmarks #80
…int section Add 4 new regression benchmarks (query depth scaling, diff-impact latency, incremental build tiers, import resolution throughput) with dual-engine support, report updaters, and CI workflow jobs. Add lightweight footprint section to README with live shields.io badges for unpacked size, dependency stars, and weekly downloads.
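As a rough illustration of the query depth scaling measurement, the sketch below times a depth-parameterised query at several depths and reports the median per depth. The depths (1/3/5) and the median of 5 runs come from the PR summary; everything else is an assumption: `median`, `medianTimeMs`, `benchDepths`, and the `depth<N>Ms` result keys are hypothetical names, and the query is assumed to be synchronous. It is not the code in `scripts/query-benchmark.js`.

```javascript
// Hypothetical sketch: these helpers are illustrative, not codegraph's API.
const { performance } = require('node:perf_hooks');

// Median of a numeric array (does not mutate the input).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Call a synchronous `fn` `runs` times and return the median duration in ms.
function medianTimeMs(fn, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  return median(samples);
}

// Time `query(depth)` at each depth; `query` stands in for a call such as
// a dependency or impact lookup (assumed signature, not the real one).
function benchDepths(query, depths = [1, 3, 5], runs = 5) {
  const result = {};
  for (const depth of depths) {
    result[`depth${depth}Ms`] = medianTimeMs(() => query(depth), runs);
  }
  return result;
}
```

Using the median rather than the mean keeps a single slow run (for example a cold cache or a GC pause) from skewing the reported figure, which matters when the numbers feed a regression check in CI.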
Claude finished @carlos-alm's task
PR Review: Query + Incremental Regression Benchmarks
Review Tasks:
Phase 1: Root Cause Analysis ✅
Problem Being Solved: This PR addresses the need for comprehensive regression testing of codegraph's query and incremental build performance. The existing benchmark CI only covers build benchmarks and embedding benchmarks, but doesn't test query performance (fnDeps, fnImpact, diffImpact) or incremental build scenarios (full/no-op/1-file rebuilds).
Why This Approach: The solution follows the established benchmark pattern used by existing benchmarks in the codebase:
Risk Assessment: This is a LOW-RISK PR that adds monitoring/observability without changing core functionality. The benchmarks are measurement-only and the CI jobs follow the same secure pattern as existing benchmark jobs.
Phase 2: Configuration Change Detection ✅
No High-Risk Configuration Changes Found:
The PR adds new CI jobs but follows the same secure pattern as existing benchmark workflows.
Phase 3: Test Integrity ✅
No Test Weakening Found:
Phase 4: Code Quality ✅
Security Review:
Code Quality:
Implementation Details:
Final Assessment
Root Cause Analysis
Problem Being Solved: Expanding regression testing coverage to include query performance and incremental build scenarios that were previously unmeasured.
Critical Concerns
None identified. This is a well-implemented observability enhancement that follows security best practices and existing codebase patterns.
Final Recommendation
Minor observations for future consideration:
Greptile Summary
Added query and incremental regression benchmarks to complement existing build/embedding benchmarks. The new scripts measure query depth scaling (
Confidence Score: 5/5
Important Files Changed
Last reviewed commit: 0fd1967
Summary
- `scripts/query-benchmark.js`: measures `fnDepsData` and `fnImpactData` at depth 1/3/5 (median of 5 runs) plus `diffImpactData` with a synthetic staged diff, both engines
- `scripts/incremental-benchmark.js`: measures full/no-op/1-file rebuild tiers and import resolution throughput (native batch vs JS fallback), both engines
- `update-query-report.js`, `update-incremental-report.js`: follow the exact pattern of existing updaters (trend arrows, HTML comment history, deduplicate by version)
- `benchmark.yml`: `query-benchmark`, `incremental-benchmark`

Test plan
- `node scripts/query-benchmark.js 2>/dev/null | jq .`: produces valid JSON
- `node scripts/incremental-benchmark.js 2>/dev/null | jq .`: produces valid JSON
- `npm run lint` passes
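The summary describes the report updaters as tracking trend arrows, keeping history in an HTML comment, and de-duplicating by version. A minimal sketch of those three ideas follows. It is an assumption-laden illustration, not the contents of `update-query-report.js`: the `BENCH_HISTORY` marker, the entry shape (`{ version, ... }`), the 2% tolerance, and all function names are hypothetical.

```javascript
// Hypothetical sketch; marker name, entry shape, and helpers are assumptions.

// Keep one entry per version: a re-run for the same version replaces the
// older entry, and the newest entry is placed first.
function upsertByVersion(history, entry) {
  const rest = history.filter((h) => h.version !== entry.version);
  return [entry, ...rest];
}

// Arrow for a lower-is-better metric such as latency in ms.
// Changes within `tolerance` (a fraction, 0.02 = 2%) count as flat.
function trendArrow(current, previous, tolerance = 0.02) {
  if (previous == null || previous === 0) return '';
  const change = (current - previous) / previous;
  if (Math.abs(change) <= tolerance) return '→';
  return change > 0 ? '↑' : '↓';
}

// Store history as JSON inside an HTML comment so it stays invisible in the
// rendered Markdown report but can be read back on the next run.
function embedHistory(history) {
  return `<!-- BENCH_HISTORY ${JSON.stringify(history)} -->`;
}

// Read the history back; returns an empty array when no marker is present.
function extractHistory(markdown) {
  const match = markdown.match(/<!-- BENCH_HISTORY (.*?) -->/s);
  return match ? JSON.parse(match[1]) : [];
}
```

One limitation of this sketch: a history value containing the literal text ` -->` would end the comment early, so a real updater would need to escape or reject such values.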