
feat: markdown link checking and doc-to-doc anchors #18

Merged
laulauland merged 9 commits into main from lau/markdown-links on Apr 10, 2026


@laulauland
Member

Summary

drift can now detect broken markdown links and track staleness between docs, not just between docs and code.

Two new capabilities:

  1. Broken link detection — during drift lint, all [text](path.md) links in drift-managed docs are parsed via tree-sitter markdown and checked for file existence. Missing targets are reported as BROKEN. No lockfile entry needed — any relative markdown link is checked automatically.

  2. Doc-to-doc anchors — lockfile bindings where the target is a .md file with an optional #heading-slug fragment (e.g. docs/overview.md -> docs/auth.md#authentication sig:...). Headings are resolved via tree-sitter markdown's section/atx_heading nodes and fingerprinted for staleness detection — same model as code symbol anchors.

Also removes the legacy @./ inline anchor syntax, frontmatter-based anchors (YAML frontmatter, <!-- drift: --> HTML comments), scanner.zig, and frontmatter.zig. Net result: -658 lines.
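
As a rough illustration of the broken link check described above, the following Python sketch extracts relative inline links and tests whether each target file exists. It is not drift's implementation: drift uses tree-sitter markdown, while this sketch substitutes a simple regex that does not handle code spans, images, or nested brackets. The function names `relative_link_targets` and `broken_links` are hypothetical.

```python
import re
from pathlib import Path
from urllib.parse import urlparse

# Regex stand-in for tree-sitter's inline_link nodes (an assumption made
# for brevity; a real parser also handles code spans, images, nesting).
INLINE_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def relative_link_targets(markdown: str) -> list:
    """Return (line_number, target) pairs for links that look relative."""
    found = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for match in INLINE_LINK.finditer(line):
            target = match.group(1)
            if urlparse(target).scheme:         # https://..., mailto:...
                continue
            if target.startswith(("/", "#")):   # absolute path or fragment-only
                continue
            found.append((lineno, target))
    return found


def broken_links(doc_path: Path) -> list:
    """Return the links in doc_path whose target file does not exist."""
    text = doc_path.read_text(encoding="utf-8")
    broken = []
    for lineno, target in relative_link_targets(text):
        file_part = target.split("#", 1)[0]     # ignore any #fragment
        # Targets are resolved relative to the directory of the linking doc.
        if not (doc_path.parent / file_part).exists():
            broken.append((lineno, target))
    return broken
```

Resolving targets relative to the linking document's directory is an assumption of this sketch; the PR text does not state the resolution base.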

Changes by commit

  1. Design docs — updated DESIGN.md, DECISIONS.md (new Decision 14), CLI.md, check-json-schema.md with markdown links and doc-to-doc anchor documentation. Removed references to @./ anchors and frontmatter.

  2. Remove @./ inline anchors and frontmatter — deleted frontmatter.zig (1110 lines) and scanner.zig (297 lines). Extracted anchorFileIdentity's # split logic into a small target.zig module. Cleaned up link.zig, refs.zig, unlink.zig.

  3. Add tree-sitter markdown grammar — added tree-sitter-markdown (block + inline) as a lazy build dependency. Both grammars are separate ts.Language instances from MDeiml/tree-sitter-markdown, compiled via linkGrammars() in build.zig.

  4. Extract markdown links — rewrote markdown.zig with tree-sitter two-pass parsing (block grammar for structure, inline grammar for inline_link nodes via setIncludedRanges). Extracts relative links, skipping URLs, absolute paths, and fragment-only links.

  5. Broken link checking in lint — added link_target_not_found reason code, links array in JSON output (docs[*].links[*] with target, line, result, reason), links_total/links_broken in summary. Text mode prints BROKEN for missing link targets. Exit code 1 if any link is broken. Updated drift.check.v1 schema and schema generator.

  6. Doc-to-doc anchor heading resolution: .md files with #fragment targets resolve headings via slug matching (GitHub-style: lowercase, non-alphanumeric → hyphens). The lockfile stores the slug form to avoid space-tokenization issues in the line parser. Section content is fingerprinted via a normalized syntax tree hash.

  7. drift link for doc-to-doc — linking docs/a.md docs/b.md#Heading slugifies the heading, validates it exists in the target doc, computes section fingerprint, writes binding. Blanket relink also refreshes doc-to-doc anchors.

  8. Tests — integration tests for broken link detection (text + JSON), doc-to-doc anchor staleness (fresh, stale, heading removed), drift link for doc-to-doc targets. Updated payload validation tests for new schema fields.
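
The slug matching in step 6 can be sketched in a few lines of Python. This is a minimal illustration, not drift's Zig code. The PR text only says "lowercase, non-alphanumeric → hyphens", so collapsing runs of separators, trimming edge hyphens, and treating only ASCII letters and digits as alphanumeric are all assumptions here, as are the names `slugify` and `split_target`.

```python
import re
from typing import Optional, Tuple


def slugify(heading: str) -> str:
    """Lowercase the heading and turn non-alphanumeric runs into hyphens."""
    # Assumptions: runs collapse to a single hyphen, leading and trailing
    # hyphens are trimmed, and only ASCII letters/digits are kept.
    return re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")


def split_target(target: str) -> Tuple[str, Optional[str]]:
    """Split 'path.md#Heading' into (path, slug), or (path, None)."""
    path, sep, fragment = target.partition("#")
    return path, (slugify(fragment) if sep else None)
```

Under these assumptions, `split_target("docs/a.md#Token Validation")` yields the path `docs/a.md` and the slug `token-validation`. Splitting the raw target on whitespace gives two tokens, while the slugified form stays a single token, which is the tokenization problem the slug form avoids.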

Design decisions

  • Lockfile stores heading slugs, not raw text — the lockfile parser tokenizes on spaces, so docs/a.md#Token Validation would break parsing. Slugified form docs/a.md#token-validation is a single token. Both drift link (write) and lint (match) slugify.

  • Section fingerprinting includes nested subsections — anchoring to an H2 heading fingerprints everything under it until the next H2 or higher. Subsection changes trigger staleness. This is intentional: "the Authentication section changed" includes its children.

  • Reference links deferred — only inline_link ([text](url)) is extracted for now. full_reference_link ([text][ref]) support is a follow-up since it requires building a label→destination map from block-level link_reference_definition nodes.

  • Only lockfile-managed docs get link-checked — consistent with drift's model where a "drift doc" is a file with at least one binding in drift.lock. Repo-wide link checking is a different tool's job.
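
To make the section fingerprinting decision concrete, here is a small Python sketch of how a section's extent might be computed: start at the matching heading and stop at the next heading of the same or a higher level. It is an illustration only. It scans ATX headings with a regex rather than tree-sitter section nodes, so `#` lines inside fenced code would be misread, and it hashes whitespace-normalized text instead of a normalized syntax tree. `section_text`, `fingerprint`, and the 16-character digest length are hypothetical.

```python
import hashlib
import re
from typing import Optional

ATX_HEADING = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$")


def slugify(heading: str) -> str:
    # Same assumed slug rule as in the PR text: lowercase, hyphen-separated.
    return re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")


def section_text(markdown: str, slug: str) -> Optional[str]:
    """Return the heading line plus everything under it, or None if absent."""
    lines = markdown.splitlines()
    start = level = None
    for i, line in enumerate(lines):
        m = ATX_HEADING.match(line)
        if not m:
            continue
        if start is None:
            if slugify(m.group(2)) == slug:
                start, level = i, len(m.group(1))
        elif len(m.group(1)) <= level:
            # A heading of the same or higher level closes the section;
            # deeper headings (nested subsections) stay inside it.
            return "\n".join(lines[start:i])
    return None if start is None else "\n".join(lines[start:])


def fingerprint(section: str) -> str:
    # Stand-in for the syntax tree hash: hash of whitespace-normalized text.
    normalized = " ".join(section.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
```

With this shape, an edit inside a `###` subsection of the anchored `##` section changes the fingerprint, while an edit in a sibling `##` section does not. A `None` result corresponds to the case where the anchored heading has been deleted.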

Test plan

  • zig build test passes (unit + integration)
  • drift lint reports BROKEN for docs with dead markdown links
  • drift lint --format json includes links array with broken entries
  • drift link docs/a.md docs/b.md#Heading creates lockfile binding with slug and sig
  • drift lint reports STALE when anchored heading section content changes
  • drift lint reports STALE with symbol_not_found when anchored heading is deleted
  • Legacy @./ anchors and frontmatter no longer parsed (no regressions in existing tests)

Two new capabilities documented:

1. Broken link detection — drift lint parses markdown links via
   tree-sitter and reports BROKEN for targets that don't exist.
   No lockfile entry needed.

2. Doc-to-doc anchors — lockfile bindings where target is a .md
   file with optional #Heading fragment. Headings resolved via
   tree-sitter markdown section nodes, fingerprinted for staleness.

Also removes references to @./ inline anchors and frontmatter-based
anchors (both being removed from the codebase).

Adds Decision 14 (tree-sitter markdown for link checking).
Updates CLI examples, JSON schema docs with links array and
link_target_not_found reason code, heading anchor kind.

Unlink deleted files (src/frontmatter.zig, src/scanner.zig) from
drift.lock, relink all stale docs to refresh signatures after
code changes. Update SKILL.md to remove @./ inline anchor and
frontmatter migration references, add doc-to-doc anchor and
broken link checking documentation.
@laulauland laulauland merged commit a3cfbbb into main Apr 10, 2026
5 checks passed
@laulauland laulauland deleted the lau/markdown-links branch April 10, 2026 13:38