fix(code-review): dedup prior comments and raise max-turns to 256#14
fix(code-review): dedup prior comments and raise max-turns to 256#14
Conversation
Two fixes for the reusable code review workflow: - Add a Step 1 that fetches prior claude[bot] inline comments, checks whether each is fixed in the current revision, replies (✅ Fixed / still present / partially fixed) and resolves resolved threads, instead of re-opening the same finding on every push. - Raise --max-turns from 30 to 256. The 30-turn cap was being hit on non-trivial PRs (observed on taptap/maker#439, error_max_turns after ~4 minutes), aborting the review before any comments were posted. Also tighten the prompt: confidence ≥ 75 threshold, explicit false-positive filters, and add Bash(gh api:*) / Bash(gh pr comment:*) to allowedTools so the dedup logic and GraphQL thread resolution work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
| Filename | Overview |
|---|---|
| .github/workflows/code-review.yml | 新增 Step 1 去重逻辑(获取旧 comment、检查是否 fixed、解析 thread)、将 --max-turns 从 30 提升至 256、扩展 allowedTools,并收紧提示词。主要风险:reviewThreads(first:100) 分页截断、timeout-minutes: 15 与新增 Step 1 开销的交互,以及之前已标注的 bot 用户名硬编码问题。 |
Sequence Diagram
sequenceDiagram
participant GH as GitHub Actions
participant Claude as Claude (Opus)
participant API as GitHub REST API
participant GQL as GitHub GraphQL API
GH->>Claude: 触发 code-review job(PR push)
Note over Claude: Step 1: 去重旧评论
Claude->>API: GET /pulls/{pr}/comments (filter: claude[bot])
API-->>Claude: 旧 inline comments 列表
Claude->>GH: gh pr diff {pr}
GH-->>Claude: 当前 diff
loop 每条旧 comment
Claude->>GH: Read 当前文件内容
GH-->>Claude: 文件内容
alt 已修复
Claude->>API: POST /comments/{id}/replies (✅ Fixed)
Claude->>GQL: resolveReviewThread(threadId)
else 未修复
Claude->>API: POST /comments/{id}/replies (still present)
end
end
Note over Claude: Step 2: 审查新变更
Claude->>GH: 读取完整文件 / git blame / git log
GH-->>Claude: 文件内容 & 历史
Note over Claude: Step 3: 输出结果
Claude->>API: mcp__github_inline_comment (新发现 ≤10)
Claude->>GH: gh pr comment --body "…摘要…"
Reviews (2): Last reviewed commit: "fix(code-review): address greptile feedb..." | Re-trigger Greptile
| Before reviewing anything new, handle your prior comments on this PR so issues are not reported twice across commits. | ||
|
|
||
| 1. Fetch your previous inline comments: | ||
| `gh api repos/${{ github.repository }}/pulls/${{ github.event.pull_request.number || inputs.pr_number }}/comments --jq '[.[] | select(.user.login == "claude[bot]")]'` |
There was a problem hiding this comment.
select(.user.login == "claude[bot]") 使用了硬编码的 bot 用户名。GitHub Apps 的 bot 登录名格式为 <app-slug>[bot],具体取决于 Anthropic 注册 GitHub App 时使用的 slug(可能是 claude-code[bot]、anthropic-claude[bot] 等)。如果名称不匹配,Step 1 的去重逻辑会静默失效——jq 返回空数组,Claude 跳过所有去重工作,每次 push 仍然重复上报相同问题,与本 PR 目标背道而驰。
建议先在某个 consumer repo 上通过 gh api repos/{owner}/{repo}/pulls/{pr}/comments | jq '.[].user.login' 确认实际 bot 用户名后,再将其固定在此处。
- Narrow Bash(gh api:*) to Bash(gh api repos/*/pulls/*/comments*) +
Bash(gh api graphql*) for least-privilege. Both subpaths are what the
Step 1 dedup logic actually needs (PR comments REST + GraphQL
resolveReviewThread mutation).
- Add Step 3 summary-comment instruction so the existing
Bash(gh pr comment:*) permission is actually used. Mirrors the
pattern already in taptap/UrhoX/.github/workflows/pr-review.yml: one
PR-level summary with verdict + new findings + count of resolved
prior issues.
The third greptile P1 ("verify claude[bot] login name") is empirically
correct as-is: gh api repos/taptap/maker/pulls/439/comments confirms
the bot login is `claude[bot]`. No change needed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
回复 greptile 的 review: P1 — bot 登录名就是 P2 — P3 — 均见 db1a084。 |
Six reviewer threads addressed: §4.2 (claude bot): Stop claiming "no yaml changes needed" — we do need to update the prompt one-liner from "REVIEW.md" to ".github/REVIEW.md". Document the tradeoff (semantic clarity vs root-level discoverability) explicitly. §4.3 (greptile + claude bot): Two real implementation traps in the prompt template — trailing comma when extra_allowed_tools is empty, and shell metacharacter injection through raw input interpolation. Move allowedTools composition to a dedicated shell step, document why claude_args block scalar is fine (PR #14 already proves the action parses it correctly). §5 (greptile): Add inline NOTE comment on `uses: ...@main` in both consumer examples pointing at §8 Q3 — flagging that this is a current convention, not an endorsement. §6 (claude bot — security): Substantially expanded. Added §6.1 on character-level sanitization (reject `"`, backtick, `$`, `\`, `;`, `|`, `&`, `>`, `<`, newlines), §6.2 changing match semantics from prefix to strict per-pattern equality, §6.4 on validation step placement, §6.5 on logging. §7 (claude bot — factual accuracy): Replace the misleading "consistent with PR #14" sentence with a per-input table that distinguishes inputs that mirror current behavior (model, max_turns) from genuinely new defaults (max_budget_usd, confidence_min, max_findings) and call out that confidence_min/max_findings are "formally new" — they were prompt literals before, now lifted to parameters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(rfc): RFC 0001 — code-review workflow per-repo customization Drafts the architecture for letting consumer repos customize the org-wide reusable code-review workflow without forking it. Covers: - A 5-category taxonomy of customization needs (numeric knobs, prompt content, tool surface, path filters, review goals) and which lives in workflow inputs vs which lives in a per-repo .github/REVIEW.md. - New input list (model, max_turns, max_budget_usd, confidence_min, max_findings, paths_focus, paths_skip, extra_allowed_tools) — all with defaults so existing consumers (maker) need zero changes. - Security model for extra_allowed_tools: explicit safe-tool allowlist, deny-list for network/destructive commands, fail-fast validation step. - Phased rollout (5 input + path filter + tool extension + docs) so each phase is independently mergeable and backward-compatible. - Open questions surfaced for discussion: REVIEW.md location, whether to also support a structured config file, version tag strategy, allowlist maintenance, missing-REVIEW.md nudge. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(rfc): address review feedback on RFC 0001 Six reviewer threads addressed: §4.2 (claude bot): Stop claiming "no yaml changes needed" — we do need to update the prompt one-liner from "REVIEW.md" to ".github/REVIEW.md". Document the tradeoff (semantic clarity vs root-level discoverability) explicitly. §4.3 (greptile + claude bot): Two real implementation traps in the prompt template — trailing comma when extra_allowed_tools is empty, and shell metacharacter injection through raw input interpolation. Move allowedTools composition to a dedicated shell step, document why claude_args block scalar is fine (PR #14 already proves the action parses it correctly). §5 (greptile): Add inline NOTE comment on `uses: ...@main` in both consumer examples pointing at §8 Q3 — flagging that this is a current convention, not an endorsement. §6 (claude bot — security): Substantially expanded. Added §6.1 on character-level sanitization (reject `"`, backtick, `$`, `\`, `;`, `|`, `&`, `>`, `<`, newlines), §6.2 changing match semantics from prefix to strict per-pattern equality, §6.4 on validation step placement, §6.5 on logging. §7 (claude bot — factual accuracy): Replace the misleading "consistent with PR #14" sentence with a per-input table that distinguishes inputs that mirror current behavior (model, max_turns) from genuinely new defaults (max_budget_usd, confidence_min, max_findings) and call out that confidence_min/max_findings are "formally new" — they were prompt literals before, now lifted to parameters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(rfc): use env: for compose-tools input passing Address claude bot's round-2 finding (PRRT thread on §4.3): the compose-tools snippet I added in 02bc76b still had the GitHub Actions script-injection antipattern — \`EXTRA='\${{ inputs.extra_allowed_tools }}'\` expands at YAML parse time and concatenates the raw input into the shell script, exactly what §6.1 is meant to block. Switch to \`env: EXTRA: \${{ ... }}\` so EXTRA is a proper shell environment variable, not a parse-time string interpolation. Add an inline comment explaining why this matters and propagate the same pattern to the validate-extra-tools step (defense in depth before the validator runs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(rfc): paths_skip is a strong prompt instruction, not a hard filter Address claude bot's round-3 finding: §4.2 called paths_skip a "硬过滤 / 绝不 review" but the implementation just injects the value into the prompt as text — LLMs don't provide a 100% instruction-following guarantee, so the language overpromised. Reword §4.2 to call it a "structured strong prompt instruction" (high but not absolute compliance) and explicitly warn against using paths_skip as a security boundary — sensitive files shouldn't be in the diff in the first place; rely on .gitignore / git-secrets / pre-commit hooks upstream. Add alternative E to §10 documenting the actual hard-filter approach (pre-process diff to physically strip paths) as deferred until a concrete need surfaces. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
claude[bot]inline comments, checks each against the current source, replies (✅ Fixed / still present / partially fixed) and resolves the threads that are fixed — instead of re-opening the same finding every push.--max-turnsfrom 30 to 256. Observed ontaptap/maker#439: non-trivial PRs hiterror_max_turnsafter ~4 min and aborted before any comment was posted. 30 was a hard cap, not a budget — small PRs still use few turns.taptap/UrhoX/.github/workflows/pr-review.yml(org-generic parts only — no language-specific checks).allowedTools: addBash(gh api:*)andBash(gh pr comment:*)because the dedup logic needsgh apifor the comments endpoint and the GraphQLresolveReviewThreadmutation.Why org-wide and not per-repo
The reusable workflow exposes only
pr_number / is_draft / head_repo_full_nameas inputs, so consumers (e.g.maker/.github/workflows/code-review.yml) cannot override the prompt or args today. All four changes here are bug fixes / generic quality improvements — no language- or project-specific content was copied from UrhoX. If a repo later wants specialization, the right move is to add aextra_prompt/max_turnsinput to this workflow, not to fork it.Test plan
taptap/maker) and verify the Code Review job reaches the Claude step and posts inline comments.error_max_turns.🤖 Generated with Claude Code