fix(code-review): dedup prior comments and raise max-turns to 256 by lishuceo · Pull Request #14 · taptap/.github

lishuceo · 2026-04-20T10:28:40Z

Summary

Stop duplicate review comments across pushes. Add a Step 1 to the prompt that pulls prior claude[bot] inline comments, checks each against the current source, replies (✅ Fixed / still present / partially fixed) and resolves the threads that are fixed — instead of re-opening the same finding every push.
Raise --max-turns from 30 to 256. Observed on taptap/maker#439: non-trivial PRs hit error_max_turns after ~4 min and aborted before any comment was posted. 30 was a hard cap, not a budget — small PRs still use few turns.
Tighten the prompt: confidence ≥ 75 threshold and an explicit false-positive filter list, modeled on the working pattern from taptap/UrhoX/.github/workflows/pr-review.yml (org-generic parts only — no language-specific checks).
Expand allowedTools: add Bash(gh api:*) and Bash(gh pr comment:*) because the dedup logic needs gh api for the comments endpoint and the GraphQL resolveReviewThread mutation.

Why org-wide and not per-repo

The reusable workflow exposes only pr_number / is_draft / head_repo_full_name as inputs, so consumers (e.g. maker/.github/workflows/code-review.yml) cannot override the prompt or args today. All four changes here are bug fixes / generic quality improvements — no language- or project-specific content was copied from UrhoX. If a repo later wants specialization, the right move is to add a extra_prompt / max_turns input to this workflow, not to fork it.

Test plan

Open a PR in a consumer repo (e.g. taptap/maker) and verify the Code Review job reaches the Claude step and posts inline comments.
Push a follow-up commit that fixes one of the reported issues; confirm Claude replies "✅ Fixed" and resolves that thread instead of opening a new comment.
Push a follow-up commit that does NOT fix a reported issue; confirm Claude replies "still unresolved" and does NOT re-open the same finding.
Confirm a large PR no longer aborts with error_max_turns.

🤖 Generated with Claude Code

Two fixes for the reusable code review workflow: - Add a Step 1 that fetches prior claude[bot] inline comments, checks whether each is fixed in the current revision, replies (✅ Fixed / still present / partially fixed) and resolves resolved threads, instead of re-opening the same finding on every push. - Raise --max-turns from 30 to 256. The 30-turn cap was being hit on non-trivial PRs (observed on taptap/maker#439, error_max_turns after ~4 minutes), aborting the review before any comments were posted. Also tighten the prompt: confidence ≥ 75 threshold, explicit false-positive filters, and add Bash(gh api:*) / Bash(gh pr comment:*) to allowedTools so the dedup logic and GraphQL thread resolution work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

greptile-apps · 2026-04-20T10:31:16Z

Greptile Summary

此 PR 通过在 prompt 中添加 Step 1（获取旧 inline comment → 判断是否已修复 → 回复并 resolve thread）来解决跨 push 重复评论的问题，同时将 --max-turns 从 30 提升至 256 以修复大型 PR 被截断的问题，并收窄了 allowedTools 中的 gh api 权限范围。整体方向正确，之前已有评论覆盖了 bot 用户名硬编码（claude[bot] 实际 slug 待确认）和工具权限范围等核心问题。

需关注：Step 1 对每个历史 comment 均需独立执行文件读取 + API 调用，在有多轮历史的 PR 上会消耗显著的时间和 turns；现有 timeout-minutes: 15 未随 --max-turns 一同上调，对历史评论较多的 PR 存在超时风险，建议考虑调整为 30–45 分钟。

Confidence Score: 4/5

可以合并，但建议在合并后确认实际 bot 用户名并评估 timeout-minutes 是否充足。

P1 级别的 bot 用户名问题已在之前评论中指出（未修复会导致去重逻辑静默失效），本轮新发现为 P2 级分页截断和超时风险，不阻断合并但应跟进。

.github/workflows/code-review.yml：Step 1 去重逻辑（bot 用户名、reviewThreads 分页、timeout 交互）需重点关注。

Important Files Changed

Filename	Overview
.github/workflows/code-review.yml	新增 Step 1 去重逻辑（获取旧 comment、检查是否 fixed、解析 thread）、将 `--max-turns` 从 30 提升至 256、扩展 allowedTools，并收紧提示词。主要风险：`reviewThreads(first:100)` 分页截断、`timeout-minutes: 15` 与新增 Step 1 开销的交互，以及之前已标注的 bot 用户名硬编码问题。

Sequence Diagram

sequenceDiagram
    participant GH as GitHub Actions
    participant Claude as Claude (Opus)
    participant API as GitHub REST API
    participant GQL as GitHub GraphQL API

    GH->>Claude: 触发 code-review job（PR push）

    Note over Claude: Step 1: 去重旧评论
    Claude->>API: GET /pulls/{pr}/comments (filter: claude[bot])
    API-->>Claude: 旧 inline comments 列表

    Claude->>GH: gh pr diff {pr}
    GH-->>Claude: 当前 diff

    loop 每条旧 comment
        Claude->>GH: Read 当前文件内容
        GH-->>Claude: 文件内容
        alt 已修复
            Claude->>API: POST /comments/{id}/replies (✅ Fixed)
            Claude->>GQL: resolveReviewThread(threadId)
        else 未修复
            Claude->>API: POST /comments/{id}/replies (still present)
        end
    end

    Note over Claude: Step 2: 审查新变更
    Claude->>GH: 读取完整文件 / git blame / git log
    GH-->>Claude: 文件内容 & 历史

    Note over Claude: Step 3: 输出结果
    Claude->>API: mcp__github_inline_comment (新发现 ≤10)
    Claude->>GH: gh pr comment --body "…摘要…"

_{Reviews (2): Last reviewed commit: "fix(code-review): address greptile feedb..." | Re-trigger Greptile}

greptile-apps · 2026-04-20T10:31:23Z

+            Before reviewing anything new, handle your prior comments on this PR so issues are not reported twice across commits.
+
+            1. Fetch your previous inline comments:
+               `gh api repos/${{ github.repository }}/pulls/${{ github.event.pull_request.number || inputs.pr_number }}/comments --jq '[.[] | select(.user.login == "claude[bot]")]'`


Bot 用户名 claude[bot] 需确认

select(.user.login == "claude[bot]") 使用了硬编码的 bot 用户名。GitHub Apps 的 bot 登录名格式为 <app-slug>[bot]，具体取决于 Anthropic 注册 GitHub App 时使用的 slug（可能是 claude-code[bot]、anthropic-claude[bot] 等）。如果名称不匹配，Step 1 的去重逻辑会静默失效——jq 返回空数组，Claude 跳过所有去重工作，每次 push 仍然重复上报相同问题，与本 PR 目标背道而驰。

建议先在某个 consumer repo 上通过 gh api repos/{owner}/{repo}/pulls/{pr}/comments | jq '.[].user.login' 确认实际 bot 用户名后，再将其固定在此处。

- Narrow Bash(gh api:*) to Bash(gh api repos/*/pulls/*/comments*) + Bash(gh api graphql*) for least-privilege. Both subpaths are what the Step 1 dedup logic actually needs (PR comments REST + GraphQL resolveReviewThread mutation). - Add Step 3 summary-comment instruction so the existing Bash(gh pr comment:*) permission is actually used. Mirrors the pattern already in taptap/UrhoX/.github/workflows/pr-review.yml: one PR-level summary with verdict + new findings + count of resolved prior issues. The third greptile P1 ("verify claude[bot] login name") is empirically correct as-is: gh api repos/taptap/maker/pulls/439/comments confirms the bot login is `claude[bot]`. No change needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

lishuceo · 2026-04-20T10:34:54Z

回复 greptile 的 review：

P1 — claude[bot] 用户名：已实地验证，无需修改。在 taptap/maker#439 上跑：

$ gh api repos/taptap/maker/pulls/439/comments --jq '[.[] | .user.login] | unique'
["claude[bot]","greptile-apps[bot]","lishuceo"]

bot 登录名就是 claude[bot]。

P2 — Bash(gh api:*) 权限过宽：已采纳，收窄到 Bash(gh api repos/*/pulls/*/comments*) + Bash(gh api graphql*)，对应 Step 1 实际用到的两条路径（PR comments REST + resolveReviewThread GraphQL mutation）。

P3 — Bash(gh pr comment:*) 未使用：反向修复——保留权限，在 prompt 里加 Step 3「发布单条 PR 级汇总评论」（verdict + 新发现列表 + 已解决的旧问题数），对齐 taptap/UrhoX/.github/workflows/pr-review.yml 的现有模式。一行 verdict 比纯 inline 评论更易读。

均见 db1a084。

Six reviewer threads addressed: §4.2 (claude bot): Stop claiming "no yaml changes needed" — we do need to update the prompt one-liner from "REVIEW.md" to ".github/REVIEW.md". Document the tradeoff (semantic clarity vs root-level discoverability) explicitly. §4.3 (greptile + claude bot): Two real implementation traps in the prompt template — trailing comma when extra_allowed_tools is empty, and shell metacharacter injection through raw input interpolation. Move allowedTools composition to a dedicated shell step, document why claude_args block scalar is fine (PR #14 already proves the action parses it correctly). §5 (greptile): Add inline NOTE comment on `uses: ...@main` in both consumer examples pointing at §8 Q3 — flagging that this is a current convention, not an endorsement. §6 (claude bot — security): Substantially expanded. Added §6.1 on character-level sanitization (reject `"`, backtick, `$`, `\`, `;`, `|`, `&`, `>`, `<`, newlines), §6.2 changing match semantics from prefix to strict per-pattern equality, §6.4 on validation step placement, §6.5 on logging. §7 (claude bot — factual accuracy): Replace the misleading "consistent with PR #14" sentence with a per-input table that distinguishes inputs that mirror current behavior (model, max_turns) from genuinely new defaults (max_budget_usd, confidence_min, max_findings) and call out that confidence_min/max_findings are "formally new" — they were prompt literals before, now lifted to parameters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(rfc): RFC 0001 — code-review workflow per-repo customization Drafts the architecture for letting consumer repos customize the org-wide reusable code-review workflow without forking it. Covers: - A 5-category taxonomy of customization needs (numeric knobs, prompt content, tool surface, path filters, review goals) and which lives in workflow inputs vs which lives in a per-repo .github/REVIEW.md. - New input list (model, max_turns, max_budget_usd, confidence_min, max_findings, paths_focus, paths_skip, extra_allowed_tools) — all with defaults so existing consumers (maker) need zero changes. - Security model for extra_allowed_tools: explicit safe-tool allowlist, deny-list for network/destructive commands, fail-fast validation step. - Phased rollout (5 input + path filter + tool extension + docs) so each phase is independently mergeable and backward-compatible. - Open questions surfaced for discussion: REVIEW.md location, whether to also support a structured config file, version tag strategy, allowlist maintenance, missing-REVIEW.md nudge. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(rfc): address review feedback on RFC 0001 Six reviewer threads addressed: §4.2 (claude bot): Stop claiming "no yaml changes needed" — we do need to update the prompt one-liner from "REVIEW.md" to ".github/REVIEW.md". Document the tradeoff (semantic clarity vs root-level discoverability) explicitly. §4.3 (greptile + claude bot): Two real implementation traps in the prompt template — trailing comma when extra_allowed_tools is empty, and shell metacharacter injection through raw input interpolation. Move allowedTools composition to a dedicated shell step, document why claude_args block scalar is fine (PR #14 already proves the action parses it correctly). §5 (greptile): Add inline NOTE comment on `uses: ...@main` in both consumer examples pointing at §8 Q3 — flagging that this is a current convention, not an endorsement. §6 (claude bot — security): Substantially expanded. Added §6.1 on character-level sanitization (reject `"`, backtick, `$`, `\`, `;`, `|`, `&`, `>`, `<`, newlines), §6.2 changing match semantics from prefix to strict per-pattern equality, §6.4 on validation step placement, §6.5 on logging. §7 (claude bot — factual accuracy): Replace the misleading "consistent with PR #14" sentence with a per-input table that distinguishes inputs that mirror current behavior (model, max_turns) from genuinely new defaults (max_budget_usd, confidence_min, max_findings) and call out that confidence_min/max_findings are "formally new" — they were prompt literals before, now lifted to parameters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(rfc): use env: for compose-tools input passing Address claude bot's round-2 finding (PRRT thread on §4.3): the compose-tools snippet I added in 02bc76b still had the GitHub Actions script-injection antipattern — \`EXTRA='\${{ inputs.extra_allowed_tools }}'\` expands at YAML parse time and concatenates the raw input into the shell script, exactly what §6.1 is meant to block. Switch to \`env: EXTRA: \${{ ... }}\` so EXTRA is a proper shell environment variable, not a parse-time string interpolation. Add an inline comment explaining why this matters and propagate the same pattern to the validate-extra-tools step (defense in depth before the validator runs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(rfc): paths_skip is a strong prompt instruction, not a hard filter Address claude bot's round-3 finding: §4.2 called paths_skip a "硬过滤 / 绝不 review" but the implementation just injects the value into the prompt as text — LLMs don't provide a 100% instruction-following guarantee, so the language overpromised. Reword §4.2 to call it a "structured strong prompt instruction" (high but not absolute compliance) and explicitly warn against using paths_skip as a security boundary — sensitive files shouldn't be in the diff in the first place; rely on .gitignore / git-secrets / pre-commit hooks upstream. Add alternative E to §10 documenting the actual hard-filter approach (pre-process diff to physically strip paths) as deferred until a concrete need surfaces. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions Bot requested a review from Copilot April 20, 2026 10:28

lishuceo review requested due to automatic review settings April 20, 2026 10:28

greptile-apps Bot reviewed Apr 20, 2026

View reviewed changes

github-actions Bot requested a review from Copilot April 20, 2026 10:34

lishuceo review requested due to automatic review settings April 20, 2026 10:34

lishuceo merged commit 7ac3834 into main Apr 20, 2026
4 of 5 checks passed

lishuceo deleted the fix/code-review-dedup-and-turns branch April 20, 2026 12:11

claude Bot mentioned this pull request Apr 20, 2026

RFC 0001: code-review workflow per-repo customization #15

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(code-review): dedup prior comments and raise max-turns to 256#14

fix(code-review): dedup prior comments and raise max-turns to 256#14
lishuceo merged 2 commits intomainfrom
fix/code-review-dedup-and-turns

lishuceo commented Apr 20, 2026

Uh oh!

greptile-apps Bot commented Apr 20, 2026 •

edited

Loading

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot Apr 20, 2026

Uh oh!

Uh oh!

Uh oh!

lishuceo commented Apr 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

lishuceo commented Apr 20, 2026

Summary

Why org-wide and not per-repo

Test plan

Uh oh!

greptile-apps Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

lishuceo commented Apr 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

greptile-apps Bot commented Apr 20, 2026 •

edited

Loading