fix(lineage): unify dbt model producer and consumer nodes (#32) #42

Open
melonamin wants to merge 4 commits into master from fix/issue-32-dbt-multi-model-lineage

@melonamin

Summary

  • Fixes #32 ("dbt model is single"): dbt multi-model chains now render as connected lineage instead of disconnected per-file fragments.
  • Materializes a dbt model's bare SELECT as the canonical Table node (sharing its id with consumers) instead of a per-statement Output node, so the flattener unifies producer and consumer by canonical name.
  • Passes the sink id through to analyze_query as the target, so table-level DataFlow edges are emitted between models (matching the shape CTAS already produces).

Before: supplies, stg_supplies, and int_supplies show up as disconnected boxes in the mermaid output, and fct_supplies is missing entirely.
After: supplies -> stg_supplies -> int_supplies -> fct_supplies renders as one connected chain.

Test plan

  • New regression test dbt_chained_models_unify_producer_and_consumer_nodes covering a 3-model A -> B -> C chain (asserts node unification, absence of stray Output nodes, cross-statement edges, and table-level DataFlow arrows).
  • Three existing dbt tests updated to assert the corrected Table sink shape instead of the old Output shape.
  • cargo test --workspace passes.
  • cargo clippy --workspace --all-targets -- -D warnings clean.
  • cargo fmt --check clean.
  • CLI mermaid repro on a 3-file dbt chain shows the full raw.supplies -> stg -> int -> fct lineage.
  • Reviewer: visual check in Vite dev server (just build-wasm && just build-ts && just dev) with a dbt-style 3-model repro, plus diamond and unreferenced-leaf cases.

In dbt mode, a bare SELECT previously emitted a per-statement Output
node while downstream consumers referencing the same model via
`{{ ref(...) }}` emitted separate Table nodes. With matching canonical
names but different node types, the flattener could not merge them,
leaving multi-hop chains (A -> B -> C) as disconnected fragments and
omitting leaf models from the mermaid table view.

Materialize the dbt sink as the canonical relation node and pass its
id to analyze_query as the target, so producer and consumer collapse
into one node and table-level DataFlow edges connect the chain - the
same shape CTAS already produces.
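
The failure mode described above can be illustrated with a minimal sketch. It assumes a flattener that merges nodes by an id derived from both the node kind and the canonical name. `NodeKind`, `Node`, `node_id`, `flatten`, and `producer_and_consumer` are hypothetical names chosen for illustration; they are not the project's actual types or id scheme.

```rust
use std::collections::BTreeMap;

/// Hypothetical node kinds; the real project likely has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeKind {
    Table,
    Output,
}

/// A per-statement node as a statement analyzer might emit it (assumed shape).
#[derive(Debug, Clone)]
pub struct Node {
    pub kind: NodeKind,
    pub canonical_name: String,
}

/// Assumed id scheme: the id depends on kind AND canonical name, so an
/// Output node and a Table node with the same name get different ids.
pub fn node_id(node: &Node) -> String {
    format!("{:?}:{}", node.kind, node.canonical_name)
}

/// Merge per-statement nodes by id and return the distinct ids, sorted.
pub fn flatten(nodes: &[Node]) -> Vec<String> {
    let mut merged: BTreeMap<String, &Node> = BTreeMap::new();
    for node in nodes {
        merged.entry(node_id(node)).or_insert(node);
    }
    merged.into_keys().collect()
}

/// Nodes for one producer statement (the model's bare SELECT) and one
/// consumer reference to the same model. `sink_kind` is the kind the
/// producer emits for its sink.
pub fn producer_and_consumer(sink_kind: NodeKind) -> Vec<Node> {
    vec![
        Node { kind: sink_kind, canonical_name: "stg_supplies".to_string() },
        Node { kind: NodeKind::Table, canonical_name: "stg_supplies".to_string() },
    ]
}
```

Under these assumptions, `flatten(&producer_and_consumer(NodeKind::Output))` yields two distinct ids (the disconnected case), while `flatten(&producer_and_consumer(NodeKind::Table))` collapses to a single node that both statements share.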

Preserve dbt view materialization, keep producer occurrence metadata on merged model nodes, and treat dbt relation sinks as writes in script dependency views.

Scope `materialized` kwarg parsing to the body of the first `config(...)` call via a paren-depth scanner that respects string literals, eliminating false positives from comments or unrelated SQL. Extend `DbtMaterialization` to cover table/incremental/snapshot/ephemeral/materialized_view, with ephemeral and materialized_view correctly mapped to `NodeType::View`. Pre-declare view-materialized dbt models during DDL pre-collection so consumers resolved before the producer still merge onto the canonical view sink, and rework `getCreatedRelationNodeIds` to identify statement-local projection columns via edge-shape instead of `qualifiedName`, which can be inherited across producer/consumer merges.
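
A paren-depth scanner of the kind described above might look like the following sketch. All names here (`first_config_body`, `materialized_kwarg`, `SinkKind`, `sink_kind`) are hypothetical and not taken from the PR. The sketch simplifies in several ways: it locates the call with a plain substring search, does not handle escaped quotes, and returns `None` for unknown or non-literal materializations. Mapping table/incremental/snapshot to a table sink is an assumption; the PR only states that ephemeral and materialized_view map to `NodeType::View`.

```rust
/// Return the text between the parentheses of the first `config(...)` call,
/// or None if no balanced call is found. Parentheses inside single- or
/// double-quoted string literals do not change the depth counter.
pub fn first_config_body(src: &str) -> Option<&str> {
    let start = src.find("config(")? + "config(".len();
    let mut depth = 1usize;
    let mut quote: Option<char> = None;
    for (offset, ch) in src[start..].char_indices() {
        match quote {
            Some(q) => {
                if ch == q {
                    quote = None; // closing quote ends the literal
                }
            }
            None => match ch {
                '\'' | '"' => quote = Some(ch),
                '(' => depth += 1,
                ')' => {
                    depth -= 1;
                    if depth == 0 {
                        return Some(&src[start..start + offset]);
                    }
                }
                _ => {}
            },
        }
    }
    None
}

/// Split a kwarg list on commas that sit outside quotes and nested brackets.
fn split_top_level(body: &str) -> Vec<&str> {
    let (mut parts, mut depth, mut last) = (Vec::new(), 0usize, 0usize);
    let mut quote: Option<char> = None;
    for (i, ch) in body.char_indices() {
        match quote {
            Some(q) => {
                if ch == q {
                    quote = None;
                }
            }
            None => match ch {
                '\'' | '"' => quote = Some(ch),
                '(' | '[' | '{' => depth += 1,
                ')' | ']' | '}' => depth = depth.saturating_sub(1),
                ',' if depth == 0 => {
                    parts.push(&body[last..i]);
                    last = i + 1;
                }
                _ => {}
            },
        }
    }
    parts.push(&body[last..]);
    parts
}

/// Extract `materialized='...'` when its value is a plain quoted literal.
pub fn materialized_kwarg(body: &str) -> Option<String> {
    for part in split_top_level(body) {
        let Some((key, value)) = part.split_once('=') else { continue };
        if key.trim() == "materialized" {
            let v = value.trim();
            let unquoted = v
                .strip_prefix('\'')
                .and_then(|s| s.strip_suffix('\''))
                .or_else(|| v.strip_prefix('"').and_then(|s| s.strip_suffix('"')))?;
            return Some(unquoted.to_string());
        }
    }
    None
}

/// Hypothetical stand-in for the sink node type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SinkKind {
    Table,
    View,
}

/// Assumed mapping from materialization name to sink kind.
pub fn sink_kind(materialization: &str) -> Option<SinkKind> {
    match materialization {
        "table" | "incremental" | "snapshot" => Some(SinkKind::Table),
        "view" | "ephemeral" | "materialized_view" => Some(SinkKind::View),
        _ => None,
    }
}
```

Because the scanner only reads inside the `config(...)` body, a `materialized='table'` mention in a SQL comment elsewhere in the file is ignored, and a value such as `var('m')` is treated as unknown rather than guessed.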
Bundle analyze_statement source params into a StatementSource struct
so the signature drops under clippy's 7-arg limit, rename the sink
node id field to match its broadened semantics (Output node or
canonical relation sink), and avoid a String round-trip on the dbt
sink id by keeping it as Option<Arc<str>>.
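
The refactor described above follows a common parameter-object pattern, sketched here. The field names, the `StatementLineage` result type, and this `analyze_statement` signature are all assumptions made for illustration; the PR does not list the real fields or arguments.

```rust
use std::sync::Arc;

/// Hypothetical bundle of the "where did this statement come from"
/// parameters that would otherwise be separate function arguments.
#[derive(Debug, Clone)]
pub struct StatementSource<'a> {
    pub file_name: &'a str,
    pub statement_index: usize,
    pub sql: &'a str,
}

/// Hypothetical analysis result. The sink id may refer to an Output node
/// or to a canonical relation sink, hence the neutral field name.
#[derive(Debug)]
pub struct StatementLineage {
    pub label: String,
    pub sink_node_id: Option<Arc<str>>,
}

/// Assumed signature: one struct replaces several positional arguments,
/// and the sink id is passed as a shared `Arc<str>` rather than being
/// converted to `String` and back.
pub fn analyze_statement(
    source: &StatementSource<'_>,
    dbt_sink_id: Option<Arc<str>>,
) -> StatementLineage {
    StatementLineage {
        label: format!("{}#{}", source.file_name, source.statement_index),
        sink_node_id: dbt_sink_id,
    }
}
```

Cloning an `Arc<str>` only bumps a reference count, so the id stored in the result points at the same allocation the caller holds instead of a fresh copy of the string.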

Extend dbt materialization parsing to honor multiple config(...)
calls (last materialized= wins, matching dbt's override behavior)
and predeclare model producers so forward refs resolve without
leaving placeholder nodes behind. Extract isCreatedProjectionColumn
and isRelationNode helpers in lineageHelpers.ts so the projection
detection rule reads as a named predicate rather than an inline
boolean soup.
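
The two behaviors in this commit, last-config-wins and producer predeclaration, can be sketched as follows. `Model`, `resolve_refs`, and `effective_materialization` are hypothetical names, and the two-pass shape is an assumption about how predeclaration could work, not a description of the project's code.

```rust
use std::collections::HashMap;

/// Hypothetical model description: its canonical name plus the names it
/// references via `{{ ref(...) }}`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Model {
    pub name: String,
    pub refs: Vec<String>,
}

/// Pass 1 predeclares every model as a producer under its canonical name.
/// Pass 2 resolves each ref against those producers, so a consumer that
/// appears before its producer still links to the real node.
/// Returns (edges as (producer index, consumer index), unresolved names).
pub fn resolve_refs(models: &[Model]) -> (Vec<(usize, usize)>, Vec<String>) {
    let producers: HashMap<&str, usize> = models
        .iter()
        .enumerate()
        .map(|(i, m)| (m.name.as_str(), i))
        .collect();

    let mut edges = Vec::new();
    let mut unresolved = Vec::new();
    for (consumer, model) in models.iter().enumerate() {
        for r in &model.refs {
            match producers.get(r.as_str()) {
                Some(&producer) => edges.push((producer, consumer)),
                None => unresolved.push(r.clone()),
            }
        }
    }
    (edges, unresolved)
}

/// Given the `materialized=` value found in each `config(...)` call, in
/// source order (None where a call does not set it), return the last one.
pub fn effective_materialization<'a>(per_call: &[Option<&'a str>]) -> Option<&'a str> {
    per_call.iter().rev().find_map(|v| *v)
}
```

With models listed in the order fct_supplies, int_supplies, stg_supplies, every ref points forward, yet all of them resolve in pass 2 and nothing is left as a placeholder.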

Add tests covering unknown adapter materializations, dynamic Jinja
materializations, later-config-wins override, and forward-ref
resolution.