Skip to content

fix(cfg): avoid dual-connection WAL conflict in native bulkInsertCfg#719

Merged
carlos-alm merged 4 commits intomainfrom
fix/cfg-wal-conflict
Mar 31, 2026
Merged

fix(cfg): avoid dual-connection WAL conflict in native bulkInsertCfg#719
carlos-alm merged 4 commits intomainfrom
fix/cfg-wal-conflict

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

  • The v3.6.0 native binary introduced bulkInsertCfg on NativeDatabase, causing buildCFGData to take the native bulk insert path for the first time
  • This path called deleteCfgForNode via the JS better-sqlite3 connection, then nativeDb.bulkInsertCfg via the native rusqlite connection — a dual-connection WAL conflict
  • The cfg_blocks table ended up empty after builds, failing the incremental-parity test on all PRs
  • Fix: Move the DELETE statements into the Rust bulk_insert_cfg function so all writes happen on a single (native) connection, eliminating the WAL conflict

Test plan

…d WAL conflict

The native bulk insert path called deleteCfgForNode via the JS
better-sqlite3 connection, then suspended the JS DB and inserted via
the native rusqlite connection. This dual-connection pattern caused a
WAL conflict that left cfg_blocks empty after builds.

Move the DELETE statements into the Rust bulk_insert_cfg function so
all operations happen on a single (native) connection within one
transaction. The JS side now sends entries with empty blocks/edges for
nodes that need deletion only.
@claude
Copy link
Copy Markdown

claude bot commented Mar 30, 2026

Claude finished @carlos-alm's task —— View job


PR Review Completed

Analyzing PR #719: fix(cfg): avoid dual-connection WAL conflict in native bulkInsertCfg

Review Checklist

  • Read and analyze changed files
  • Assess root cause analysis and solution approach
  • Check for configuration changes or test modifications
  • Evaluate code quality and potential issues
  • Assess backlog compliance metrics
  • Provide structured review feedback with rating

Root Cause Analysis

Problem Being Solved: WAL (Write-Ahead Logging) conflict introduced in v3.6.0 when buildCFGData used dual database connections - calling deleteCfgForNode() via the JavaScript better-sqlite3 connection, then nativeDb.bulkInsertCfg() via the native rusqlite connection. This caused the cfg_blocks table to end up empty after builds, breaking the incremental-parity test.

Why This Approach: The fix correctly moves the DELETE operations into the native Rust side so all CFG write operations happen on a single connection. This eliminates the WAL conflict by ensuring atomicity of the delete-then-insert operation.

Risk Assessment: ✅ Low risk - This is a genuine bug fix that restores correct behavior. The approach maintains the same functional behavior (delete existing CFG data before inserting new data) but does it through a single connection.

Backlog Compliance

  • Zero-dep: ✅ - No new runtime dependencies added
  • Foundation-aligned: ✅ - Fully compliant, fixes a core functionality bug
  • Problem-fit: 5/5 - Directly addresses a critical bug that was breaking CFG data persistence, essential for AI agents understanding control flow
  • Breaking: No - Internal implementation change with no API surface changes
  • Tier: 1 (zero-dep ✓ + foundation-aligned ✓)

Code Quality Assessment

Positive aspects:

  1. Proper root cause identification: Correctly identified the dual-connection WAL conflict as the core issue
  2. Minimal, targeted fix: Only changes what's necessary to fix the problem
  3. Maintains existing behavior: The delete-before-insert logic is preserved, just moved to the right place
  4. Good error handling: Uses let _ = for DELETE operations to gracefully handle cases where no data exists
  5. Clear documentation: Comments explain the WAL conflict rationale

Technical correctness:

  • ✅ Rust code correctly prepares DELETE statements within the transaction
  • ✅ Maintains the same HashMap-based block ID tracking logic
  • ✅ JavaScript side correctly removes the deleteCfgForNode(db, nodeId) call from the native fast path
  • ✅ Preserves the fallback path behavior for non-native scenarios

Test coverage: The existing test preserves CFG blocks for changed file (#468) in tests/integration/incremental-parity.test.ts directly validates this fix works.

Critical Concerns

None. This is a well-executed bug fix.

Final Recommendation

  • Rating: ⭐⭐⭐⭐⭐ (5/5)
  • Action: APPROVE
  • Reasoning: This is an exemplary bug fix - correctly identifies root cause (WAL conflict from dual connections), implements minimal targeted solution (move deletes to native side), preserves all existing behavior, and directly addresses a critical functionality failure. The code is clean, well-documented, and has existing test coverage that validates the fix.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 30, 2026

Greptile Summary

This PR fixes a dual-connection WAL conflict introduced when v3.6.0's bulkInsertCfg on NativeDatabase caused buildCFGData to split its writes across two SQLite connections — deletes via the JS better-sqlite3 connection and inserts via the native rusqlite connection — leaving cfg_blocks empty and breaking the incremental-parity test on every PR.

Changes:

  • crates/codegraph-core/src/native_db.rs: Adds DELETE FROM cfg_edges and DELETE FROM cfg_blocks (by function_node_id) inside bulk_insert_cfg's transaction before the inserts, making the function self-contained. Both deletes propagate errors via ? — consistent with how tx.commit() failures are surfaced. The previously flagged silent-failure concern from the last review round is resolved.
  • src/features/cfg.ts: Removes the native bulkInsertCfg fast path entirely and falls back to the JS-only persistNativeFileCfg path, which performs deletes and inserts on the same better-sqlite3 connection wrapped in a single JS transaction. The engineOpts parameter is renamed _engineOpts to signal it is intentionally unused.

Minor note: The comment at line 382 of cfg.ts cites the WAL conflict as the reason the native path is not used, but this same PR's Rust change resolves that conflict. The comment should be updated to reflect the current state so future developers are not misled about whether the native path is still broken.

Confidence Score: 5/5

Safe to merge — the WAL conflict is fully resolved and the prior P1 (silent delete errors) has been addressed.

All previously flagged concerns are resolved. The only remaining finding is a P2 comment-accuracy issue: the stale explanation in cfg.ts describes a WAL conflict that the Rust change in this same PR fixes. No logic bugs, data-integrity risks, or correctness issues remain.

No files require special attention.

Important Files Changed

Filename Overview
crates/codegraph-core/src/native_db.rs Adds DELETE-before-INSERT inside bulk_insert_cfg's transaction; both deletes now propagate errors correctly via ?, consistent with the commit error handling. Previously flagged silent-failure concern is resolved.
src/features/cfg.ts Removes the native bulkInsertCfg fast path and falls back to the JS-only persistNativeFileCfg path; parameter renamed to _engineOpts to signal it is intentionally unused. Comment describing the WAL conflict is now slightly stale given the Rust fix.

Sequence Diagram

sequenceDiagram
    participant TS as buildCFGData (cfg.ts)
    participant JS as better-sqlite3 (JS conn)
    participant Rust as bulk_insert_cfg (rusqlite)

    Note over TS,Rust: BEFORE this PR — dual-connection WAL conflict
    TS->>JS: deleteCfgForNode(db, nodeId)
    JS-->>TS: OK
    TS->>Rust: nativeDb.bulkInsertCfg(entries)
    Note right of Rust: WAL conflict! Both connections writing to same WAL file
    Rust-->>TS: cfg_blocks empty ❌

    Note over TS,Rust: AFTER this PR — single-connection JS path
    TS->>JS: tx() — deleteCfgForNode + insertBlock + insertEdge
    JS-->>TS: All writes committed on one connection ✅

    Note over Rust: bulk_insert_cfg (Rust) also fixed: DELETE edges → DELETE blocks → INSERT all within one rusqlite transaction (ready for future re-enablement)
Loading

Reviews (2): Last reviewed commit: "fix: propagate CFG delete errors instead..." | Re-trigger Greptile

…onflict

The v3.6.0 native binary introduced bulkInsertCfg, but the CFG path
requires delete-before-insert where deletes go through JS (better-sqlite3)
and inserts through native (rusqlite), creating a WAL conflict that
silently drops all CFG data. Disable the native bulk path so the proven
JS-only persistNativeFileCfg handles both operations on a single connection.

The Rust-side fix (previous commit) will allow re-enabling the native path
once the next binary is published with delete logic inside bulkInsertCfg.
Comment on lines +874 to +875
let _ = del_edges.execute(params![entry.node_id]);
let _ = del_blocks.execute(params![entry.node_id]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Silent delete errors may mask partial state

Both DELETE statements discard their Result with let _ = ..., silently swallowing any error. If either delete fails inside the transaction (e.g., a locked auxiliary index or a statement constraint), the subsequent INSERTs still proceed, potentially leaving duplicate rows for the same function_node_id.

The existing INSERT code uses the same pattern (if let Ok(_) = block_stmt.execute(...)) so this is consistent, but the deletes are "load-bearing" here — a silent failure means old rows survive alongside new ones. Consider propagating delete errors the same way the tx.commit() error is propagated:

Suggested change
let _ = del_edges.execute(params![entry.node_id]);
let _ = del_blocks.execute(params![entry.node_id]);
del_edges.execute(params![entry.node_id])
.map_err(|e| napi::Error::from_reason(format!("cfg_edges delete failed: {e}")))?;
del_blocks.execute(params![entry.node_id])
.map_err(|e| napi::Error::from_reason(format!("cfg_blocks delete failed: {e}")))?;

This keeps the whole function consistent with how tx.commit() failures are surfaced, and ensures the caller gets a clear error rather than silently storing duplicate blocks.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Changed both del_edges.execute() and del_blocks.execute() to propagate errors via .map_err(|e| napi::Error::from_reason(...))?; — consistent with how tx.commit() errors are surfaced. A failed delete now aborts the transaction instead of silently proceeding with inserts.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai review

@carlos-alm carlos-alm merged commit 4717232 into main Mar 31, 2026
23 checks passed
@carlos-alm carlos-alm deleted the fix/cfg-wal-conflict branch March 31, 2026 04:02
@github-actions github-actions bot locked and limited conversation to collaborators Mar 31, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant