Skip to content

docs: add /review skill for automated PR review sweeps#474

Open
carlos-alm wants to merge 35 commits intomainfrom
docs/review-skill
Open

docs: add /review skill for automated PR review sweeps#474
carlos-alm wants to merge 35 commits intomainfrom
docs/review-skill

Conversation

@carlos-alm
Copy link
Contributor

Summary

  • Adds /review Claude Code skill that sweeps all open PRs
  • Resolves merge conflicts (via merge, never rebase), fixes CI failures, addresses all Claude + Greptile review comments, replies to each, and re-triggers reviewers until clean

Test plan

  • Invoke /review and verify it discovers open PRs
  • Verify conflict resolution uses merge (not rebase)
  • Verify Greptile is always re-triggered; Claude only when its feedback was addressed

findDbPath() walks up from cwd looking for .codegraph/graph.db. In a git
worktree (e.g. .claude/worktrees/agent-xxx/), this crosses the worktree
boundary and finds the main repo's database instead.

Add findRepoRoot() using `git rev-parse --show-toplevel` (which returns
the correct root for both repos and worktrees) and use it as a ceiling
in findDbPath(). The walk-up now stops at the git boundary, so each
worktree resolves to its own database.
…t test

- Ceiling test now uses `git init` to create a real git repo boundary,
  and verifies the outer DB is NOT found (without the ceiling fix it
  would be). Added separate test for finding DB within the boundary.
- Non-git test uses a fresh mkdtemp dir instead of os.tmpdir().
- Added debug log when ceiling stops the walk-up.
- Removed unused `origFindRepoRoot` variable.
- Mock execFileSync via vi.mock to verify caching calls exactly once
  and cache bypass calls twice (not just comparing return values)
- Mock execFileSync to throw in "not in git repo" test so the null
  path is always exercised regardless of host environment
- Rename "does not use cache" test to "bypasses cache" with spy assertion
…rt name

- Mock execFileSync to throw in "falls back gracefully when not in a git
  repo" test, preventing flakiness when os.tmpdir() is inside a git repo
- Rename realExecFileSync → execFileSyncForSetup with comment explaining
  it resolves to the spy due to vi.mock hoisting
Combine PR's connection.js imports (findRepoRoot, _resetRepoRootCache)
with main's barrel db/index.js pattern. Add missing re-exports to barrel.

Impact: 22 functions changed, 10 affected
On macOS, os.tmpdir() returns /var/folders/... but git rev-parse
returns /private/var/folders/... (resolved symlink). The ceiling
comparison failed because the paths didn't match. Use fs.realpathSync
on cwd to normalize symlinks before comparing against the ceiling.

Impact: 1 functions changed, 1 affected
Impact: 1 functions changed, 1 affected
…boundary

Impact: 9 functions changed, 19 affected
The 'returns default path when no DB found' test didn't control the git
ceiling — if tmpDir was inside a git repo, findRepoRoot() would return
a non-null ceiling and the default path would differ from emptyDir.
Mock execFileSync to throw so the cwd fallback is always exercised.
…m ceiling

On macOS, os.tmpdir() returns /var/... but git resolves symlinks to
/private/var/..., and on Windows, 8.3 short names (RUNNER~1) differ
from long names (runneradmin). findDbPath normalizes dir via
fs.realpathSync but findRepoRoot only used path.resolve, causing the
ceiling comparison to fail — the walk crossed the worktree boundary.

Fix findRepoRoot to use fs.realpathSync on git output, and resolve
test paths after directory creation so assertions match.

Impact: 1 functions changed, 77 affected
git rev-parse resolves 8.3 short names (RUNNER~1 → runneradmin) but
fs.realpathSync may not, causing the string comparison to fail and the
walk to escape the git ceiling boundary. Replace string equality with
isSameDirectory() that falls back to dev+ino comparison when paths
differ textually but refer to the same directory.

Also fix the test assertion to use findRepoRoot() for the expected
ceiling path instead of the test's worktreeRoot, since git may resolve
paths differently than realpathSync.

Impact: 2 functions changed, 137 affected
Impact: 2 functions changed, 2 affected
…hort names

Impact: 1 functions changed, 1 affected
…tave/codegraph into fix/db-path-worktree-boundary

Impact: 13 functions changed, 3 affected
On Windows CI, fs.realpathSync(process.cwd()) and git rev-parse
--show-toplevel can resolve 8.3 short names (RUNNER~1) differently
than long names (runneradmin). Apply realpathSync to both sides of
the assertion so the comparison is path-equivalent.
…leanup

Five skills implementing the full Titan Paradigm pipeline:
- /titan-recon: graph build, embeddings, health baseline, domain mapping
- /titan-gauntlet: 4-pillar audit (17 rules) with batch processing and resume
- /titan-sync: dependency cluster analysis and ordered execution planning
- /titan-gate: staged change validation with auto-rollback on failure
- /titan-reset: escape hatch to clean all artifacts and snapshots

Skills use artifact-passing between phases to manage context window limits.
Published as docs/examples for users to copy, with live copies in .claude/skills/.
Updated titan-paradigm.md with skill references and new backlog items.
- Add openRepo to db/index.js barrel export (from main)
- Keep realpathSync-based assertions for Windows 8.3 path normalization (from HEAD)

Impact: 33 functions changed, 28 affected
Skill checks all open PRs, resolves conflicts (via merge, never rebase),
fixes CI failures, addresses all reviewer comments (Claude + Greptile),
replies to each comment, and re-triggers reviewers in a loop until clean.
Skill checks all open PRs, resolves conflicts (via merge, never rebase),
fixes CI failures, addresses all reviewer comments (Claude + Greptile),
replies to each comment, and re-triggers reviewers in a loop until clean.
@claude
Copy link

claude bot commented Mar 16, 2026

Claude finished @carlos-alm's task —— View job


PR Review Complete

Reviewing PR #474: "docs: add /review skill for automated PR review sweeps"

Todo List

  • Read the new skill file to understand functionality
  • Apply 5-phase review methodology
  • Assess backlog compliance
  • Provide structured final assessment with rating

Root Cause Analysis

Problem Being Solved: This PR adds a Claude Code skill to automate the manual process of reviewing and fixing open PRs. The skill handles common PR maintenance tasks: resolving conflicts, fixing CI failures, addressing review feedback, and re-triggering reviewers.

Why This Approach: The skill is documented as a structured workflow that systematically processes all open PRs. It uses the GitHub CLI (gh) and git commands that Claude Code already has access to, avoiding new dependencies.

Risk Assessment: This is documentation-only (no code execution risk). However, the documented workflow involves destructive git operations and automated code changes based on review comments.

Backlog Compliance

  • Zero-dep: ✓ — Uses only gh CLI and git (already assumed available)
  • Foundation-aligned: ✗ — Violates Principle 8: "We are not an AI agent" and "We don't make decisions, run multi-step workflows, or modify code"
  • Problem-fit: 1 — This is purely about PR maintenance automation, not helping AI agents navigate codebases or understand dependencies
  • Breaking: No — Documentation addition only
  • Tier: 3 (foundation-misaligned)

Critical Concerns

Foundation Violation (Principle 8)

The skill directly contradicts FOUNDATION.md Principle 8: "We are not an application. We are not a coding tool. We are not an AI agent. We don't make decisions, run multi-step workflows, or modify code."

This skill is explicitly designed as a multi-step automated workflow that:

  • Makes decisions about how to resolve merge conflicts
  • Runs complex multi-step workflows across multiple PRs
  • Modifies code by "reading conflicting files, understanding both sides, and making the correct resolution"
  • Automatically addresses review feedback by making code changes

Scope Creep Risk

The skill transforms Claude Code from a "code intelligence engine" into a PR automation agent. Per FOUNDATION.md: "We expose tools to AI agents via MCP, but we are not an agent ourselves."

Operational Safety Concerns

The documented workflow includes potentially dangerous operations:

  • Automatic merge conflict resolution without human review
  • Automated code changes based on review comments
  • Force operations on PR branches
  • No rollback mechanisms for incorrect automated changes

Technical Assessment

Positive Aspects

  • Well-structured documentation with clear steps
  • Uses existing tools (gh CLI, git) without new dependencies
  • Includes safety rules like "never rebase" and "never force-push"
  • Comprehensive coverage of PR maintenance tasks

Documentation Quality

The skill is thoroughly documented with clear steps, examples, and safety rules. The writing is precise and actionable.

Final Recommendation

  • Rating: ⭐⭐☆☆☆ (2/5)
  • Action: REQUEST CHANGES
  • Reasoning: While technically well-executed, this feature fundamentally misaligns with codegraph's stated purpose and principles. The project explicitly positions itself as a code intelligence engine, not an automation agent. Adding agent-like capabilities represents significant scope creep that could confuse the project's identity and competitive positioning.

Suggested Path Forward

If PR automation is valuable, consider:

  1. Moving this to a separate repository/package that uses codegraph as a dependency
  2. Clearly documenting this as an experimental extension outside core codegraph scope
  3. Adding explicit warnings about the experimental/agent nature if kept

The skill itself is well-designed, but belongs in a codegraph-powered automation tool, not in the core intelligence engine.

Replace realpathSync comparison in db.test.js ceiling test with
existence-based assertions to avoid Windows 8.3 short name mismatch
(RUNNER~1 vs runneradmin). Fix artifact count (5 -> 6 files) in
skills README and rule count (31 -> 17 rules) in titan-paradigm.md.
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 16, 2026

Greptile Summary

This PR adds the /review skill for automated PR review sweeps alongside the full Titan Paradigm skill suite (/titan-recon, /titan-gauntlet, /titan-sync, /titan-gate, /titan-reset), plus matching documentation and a minor test robustness fix. All four issues raised in the previous review round (worktree isolation, fork-PR checkout, one-concern-per-commit, and the retry cap) have been addressed correctly.

Key findings:

  • Missing clean-tree guard between PR switches (.claude/skills/review/SKILL.md, Step 2a): the skill runs gh pr checkout <number> for every PR sequentially inside the same worktree, but never verifies the working tree is clean first. Partial edits from a skipped or aborted prior PR can silently bleed into the next checkout.
  • Hard-coded repository name (.claude/skills/review/SKILL.md, Steps 1, 2d–2g): optave/codegraph is repeated in every gh and API call, breaking portability and making the skill fragile to repo renames. The gh CLI can infer the repo from the git remote without an explicit --repo flag.
  • Titan RESET capped at 10 batch snapshots (.claude/skills/titan-reset/SKILL.md): batch snapshot deletion is hard-coded for titan-batch-1 through titan-batch-10. Large codebases producing more than 10 batches will accumulate orphaned snapshots that RESET cannot clean.
  • Misleading NDJSON example in GAUNTLET (.claude/skills/titan-gauntlet/SKILL.md and its docs/ mirror): the sample record shows cognitive: 28 with level: "fail" and detail "28 > 30 threshold", but 28 does not exceed the FAIL threshold of 30 — this is a WARN, not a FAIL.
  • Test change is functionally a no-op (tests/unit/db.test.js): the comment update explains the Windows CI issue, but the code change (template literal → string concatenation) produces an identical assertion. The underlying path-normalisation concern it describes remains unaddressed — though this does not regress existing behaviour.

Confidence Score: 3/5

  • Safe to merge for documentation-only consumption, but the missing clean-tree check in the active /review skill can corrupt PR branches when processing multiple PRs in sequence.
  • The Titan skills and docs are high quality and the previous feedback has been correctly addressed. The /review skill has one real logic gap (no working-tree guard between sequential PR checkouts) that can cause cross-contamination at runtime, and a portability issue with the hard-coded repo name. The RESET skill's 10-batch cap is a correctness bug for larger projects. These are concentrated in three files and are straightforward to fix.
  • .claude/skills/review/SKILL.md (clean-tree guard + hard-coded repo), .claude/skills/titan-reset/SKILL.md (batch snapshot cap), .claude/skills/titan-gauntlet/SKILL.md (example NDJSON threshold value)

Important Files Changed

Filename Overview
.claude/skills/review/SKILL.md New /review skill for automated PR sweep. Core logic is sound after addressing previous feedback (worktree isolation, fork PRs, retry cap), but lacks a clean-working-tree check between PR switches and hard-codes the repository name throughout.
.claude/skills/titan-reset/SKILL.md RESET skill hard-codes batch snapshot deletion up to titan-batch-10 only; any project generating more than 10 batches will leave orphaned snapshots that accumulate indefinitely.
.claude/skills/titan-gauntlet/SKILL.md Comprehensive 4-pillar GAUNTLET audit skill; example NDJSON record contains a misleading threshold detail (cognitive 28 flagged as FAIL when the FAIL threshold is > 30).
tests/unit/db.test.js Comment and assertion style updated for Windows CI path compatibility; the actual assertion logic is functionally unchanged (template literal replaced with equivalent string concatenation).
docs/use-cases/titan-paradigm.md Existing use-case doc updated with skill callout blurbs, a new skills table, and a backlog section surfaced by skill development. No issues found.

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant ReviewSkill as /review skill
    participant GH as GitHub API
    participant Greptile as @greptileai
    participant Claude as @claude
    participant CI as CI/CD

    Dev->>ReviewSkill: /review
    ReviewSkill->>ReviewSkill: Step 0 — /worktree isolation
    ReviewSkill->>GH: gh pr list (Step 1)
    GH-->>ReviewSkill: [PR #N, PR #M, ...]

    loop For each open PR
        ReviewSkill->>ReviewSkill: git status --porcelain (guard)
        ReviewSkill->>GH: gh pr checkout <number> (Step 2a)
        ReviewSkill->>GH: gh pr view --json mergeable (Step 2b)
        alt CONFLICTING
            ReviewSkill->>ReviewSkill: git merge origin/<base> (never rebase)
            ReviewSkill->>GH: push resolved branch
        end
        ReviewSkill->>GH: gh pr checks <number> (Step 2c)
        alt CI failing
            ReviewSkill->>ReviewSkill: diagnose + fix + npm test + npm run lint
            ReviewSkill->>GH: push fix, wait for green
        end
        ReviewSkill->>GH: fetch PR review comments (Step 2d)
        ReviewSkill->>ReviewSkill: address each comment (Step 2e)
        ReviewSkill->>GH: group commits by concern and push (Step 2f)
        ReviewSkill->>GH: @greptileai trigger comment (Step 2g, always)
        opt Claude feedback was addressed
            ReviewSkill->>GH: @claude trigger comment (Step 2g)
        end

        loop Up to 3 rounds
            Greptile->>GH: new review comments
            Claude->>GH: new review comments (if triggered)
            ReviewSkill->>GH: fetch new comments (Step 2h)
            alt No new actionable comments
                ReviewSkill->>ReviewSkill: mark PR as ready
            else New comments remain
                ReviewSkill->>ReviewSkill: go back to Step 2e
            end
        end

        alt Still unclean after 3 rounds
            ReviewSkill->>ReviewSkill: mark as needs human review
        end
    end

    ReviewSkill->>Dev: summary table (Step 3)
Loading

Last reviewed commit: 6ab8371

Comment on lines +28 to +33

```bash
git fetch origin
git checkout <head-branch>
git pull origin <head-branch>
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing /worktree isolation step

CLAUDE.md (line 195) mandates: "Every session must start with /worktree to get an isolated copy of the repo before making any changes." This skill directly runs git checkout <head-branch> on the shared working tree, which can corrupt another session's in-progress work.

The skill should either instruct the agent to run /worktree at the start of the session, or use gh pr checkout <number> inside a dedicated worktree per PR. Without this, invoking /review while another Claude Code session is active violates the repo's parallel-session safety rules.

Rule Used: CLAUDE.md (source)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added a /worktree isolation step (Step 0) at the start of the skill workflow, consistent with the CLAUDE.md parallel-session rules.

Comment on lines +110 to +116
### 2f. Commit and push fixes

After addressing all comments for a PR:

1. Stage only the files you changed.
2. Commit: `fix: address review feedback on #<number>`
3. Push to the PR branch.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contradicts "one concern per commit" rule

Step 2f instructs the agent to batch all review-feedback fixes into a single commit (fix: address review feedback on #<number>), but the Rules section at line 171 says "One concern per commit — don't lump conflict resolution with code fixes." If the review comments span multiple unrelated concerns (e.g., a logic fix in one file and a naming change in another), this step would violate that rule. Consider instructing the agent to group commits by concern rather than lumping all feedback into one commit.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — updated step 2f to group commits by concern rather than batching all fixes into a single commit, consistent with the one-concern-per-commit rule in the Rules section.

Add /worktree isolation step at the start of the workflow, consistent
with the CLAUDE.md parallel-session rules. Update step 2f to group
commits by concern rather than batching all fixes into a single commit.
@carlos-alm
Copy link
Contributor Author

@greptileai

Comment on lines +39 to +45
```

### 2b. Resolve merge conflicts

Check if the PR has conflicts with its base branch:

```bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git checkout breaks for fork PRs

Step 2a fetches from origin and then does git checkout <head-branch>. For PRs opened from a forked repository, the head branch lives on the fork's remote — not on origin — so the checkout will fail with "pathspec did not match any file(s) known to git".

gh pr checkout <number> handles this transparently: it fetches from the correct remote (fork or origin) and sets up a local tracking branch. Replace the three-line block with:

### 2a. Switch to the PR branch

```bash
gh pr checkout <number>
git pull

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced git fetch/checkout with gh pr checkout which handles fork PRs transparently.

Comment on lines +149 to +158
3. If there are **new** comments from Greptile or Claude, go back to Step 2e and address them.
4. **Repeat this loop** until Greptile's latest review has no actionable comments.
5. Verify CI is still green after all changes.

---

## Step 3: Summary

After processing all PRs, output a summary table:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No iteration cap on the retry loop

Step 2h says "Repeat this loop until Greptile's latest review has no actionable comments" with no upper bound. If Greptile consistently surfaces new issues (e.g., due to churn in the PR or a recurring false positive), the agent will loop indefinitely.

Consider adding an explicit escape condition, for example:

4. **Repeat this loop**, up to a maximum of **3 rounds**. If the PR is still not clean after 3 rounds, note it in the summary as "needs human review" and move to the next PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added a maximum of 3 retry rounds. After 3 rounds, the PR is noted as 'needs human review' in the summary and the skill moves to the next PR.

@carlos-alm
Copy link
Contributor Author

@greptileai

Comment on lines +39 to +43
### 2b. Resolve merge conflicts

Check if the PR has conflicts with its base branch:

```bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No clean-tree check before each PR checkout

Step 2a runs gh pr checkout <number> back-to-back for each PR, but there is no guard that verifies the working tree is clean first. If a previous iteration left behind unstaged or uncommitted changes (e.g., the PR was declared "fundamentally broken" and skipped after partial file edits, or a conflict resolution was staged but not committed), those changes will silently bleed into the next PR's branch — causing incorrect commits or confusing diffs.

A git status --porcelain guard should be added before every checkout, with an explicit git reset --hard HEAD && git clean -fd (or git stash) if the tree is dirty:

Suggested change
### 2b. Resolve merge conflicts
Check if the PR has conflicts with its base branch:
```bash
### 2a. Switch to the PR branch
Ensure the working tree is clean before switching:
```bash
git status --porcelain

If any output is shown, stash or reset:

git stash push -m "pre-checkout stash for PR <previous-number>" || git reset --hard HEAD

Then check out the PR branch:

gh pr checkout <number>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant