Improve vmr-codeflow-status skill: current-state-first analysis, force push & empty diff detection#124231

Merged: lewing merged 11 commits into dotnet:main from lewing:skill/vmr-codeflow-force-push-detection on Feb 11, 2026

lewing (Member) commented Feb 10, 2026

This improves the vmr-codeflow-status Copilot CLI skill with enhancements discovered while investigating a stale codeflow PR (#124095).

Design Change: Current State First

Inspired by the code-review skill restructuring in #124229, the analysis now follows a "current state first, comments as history" pattern:

  1. Assess current state from primary signals (PR state, diff size, force pushes, timeline) before reading any Maestro comments
  2. Form an independent verdict (MERGED / CLOSED / NO-OP / IN PROGRESS / STALE / ACTIVE)
  3. Then read comments as historical context to explain how we got to the current state
  4. Recommendations are generated by the agent from a structured JSON summary, not hardcoded script logic

Previously, the script read Maestro conflict/staleness comments first and drew conclusions from them — even when those comments were stale (e.g., after someone had already force-pushed a resolution). Recommendations were a 130-line if/elseif chain that couldn't adapt to edge cases.

New Capabilities

1. Current State Assessment (Step 2)

Synthesizes primary signals into an immediate verdict before any comment parsing:

  • ✅ MERGED — PR has been merged, no action needed
  • ✖️ CLOSED — PR was closed without merging, Maestro should create a replacement
  • 📭 NO-OP — empty diff, likely already resolved
  • 🔄 IN PROGRESS — recent force push, awaiting update
  • ⏳ STALE — no activity for >3 days
  • ✅ ACTIVE — PR has content and recent activity
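
The verdict order above can be sketched as a small function. This is a minimal illustration, not the script's actual code: the function name, its parameters, and the 24-hour window used for a "recent" force push are all assumptions. The only thresholds taken from this description are the 3-day staleness limit and the rule that MERGED and CLOSED take precedence.

```powershell
# Hypothetical helper: name, parameters, and the 24-hour "recent force push"
# window are assumptions for illustration only.
function Get-CurrentStateVerdict {
    param(
        [string]$PrState,                       # 'OPEN', 'MERGED', or 'CLOSED'
        [int]$ChangedFiles,
        [int]$Additions,
        [int]$Deletions,
        [Nullable[datetime]]$LastForcePushUtc,  # $null when no force push was seen
        [datetime]$LastActivityUtc,
        [datetime]$NowUtc = [datetime]::UtcNow
    )

    # Terminal PR states override every other heuristic.
    if ($PrState -eq 'MERGED') { return 'MERGED' }
    if ($PrState -eq 'CLOSED') { return 'CLOSED' }

    # An empty diff alone is enough for NO-OP, with or without a force push.
    if ($ChangedFiles -eq 0 -and $Additions -eq 0 -and $Deletions -eq 0) {
        return 'NO-OP'
    }

    # Assumed window: a force push in the last 24 hours counts as "recent".
    if ($null -ne $LastForcePushUtc -and ($NowUtc - $LastForcePushUtc).TotalHours -lt 24) {
        return 'IN PROGRESS'
    }

    # No activity for more than 3 days.
    if (($NowUtc - $LastActivityUtc).TotalDays -gt 3) { return 'STALE' }

    return 'ACTIVE'
}
```

Passing the clock in as `$NowUtc` keeps the function deterministic, which makes the ordering of the checks easy to exercise with fixed timestamps.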

2. Force Push Detection

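Queries the PR timeline for head_ref_force_pushed events, showing who force-pushed and when. Uses --slurp so that paginated results are merged into a single JSON array before filtering, which avoids invalid JSON when the timeline spans multiple pages.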
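
A rough sketch of the parsing side of this, assuming the timeline was fetched with `gh api --paginate --slurp`, which returns an outer array containing one array per page. The helper name and the output property names are hypothetical; `event`, `actor.login`, `created_at`, and `commit_id` are the timeline event fields this sketch relies on.

```powershell
# Hypothetical helper: flattens the array-of-pages produced by
# `gh api --paginate --slurp` and keeps only force-push events.
function Get-ForcePushEvents {
    param([string]$SlurpedTimelineJson)

    $pages = $SlurpedTimelineJson | ConvertFrom-Json

    # The double @() loop copes with both a single page and several pages,
    # regardless of how ConvertFrom-Json unrolls the outer array.
    $events = foreach ($page in @($pages)) {
        foreach ($ev in @($page)) {
            if ($ev.event -eq 'head_ref_force_pushed') {
                [pscustomobject]@{
                    Actor = $ev.actor.login
                    When  = $ev.created_at
                    Sha   = $ev.commit_id
                }
            }
        }
    }
    return @($events)
}

# Possible usage ($PRNumber is a placeholder; {owner}/{repo} is filled in by gh):
# $json = gh api --paginate --slurp "repos/{owner}/{repo}/issues/$PRNumber/timeline"
# if ($LASTEXITCODE -eq 0) { Get-ForcePushEvents (($json) -join "`n") }
```

Callers may want to wrap the call in `@(...)` so that zero or one results still behave like an array.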

3. Empty Diff Detection

Checks changedFiles/additions/deletions to flag 0-change PRs as NO-OP (regardless of whether a force push caused it).
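
As an illustration, the empty-diff check can be reduced to a predicate over the JSON that `gh pr view --json changedFiles,additions,deletions` returns. The function name is hypothetical, and treating a missing field as "not empty" is an assumption of this sketch rather than documented behavior of the script.

```powershell
# Hypothetical helper: returns $true only when all three counters are present and zero.
function Test-EmptyDiff {
    param([string]$PrViewJson)

    $pr = $PrViewJson | ConvertFrom-Json

    # `$null -eq 0` is $false in PowerShell, so a missing field is never
    # mistaken for an empty diff.
    return ($pr.changedFiles -eq 0 -and $pr.additions -eq 0 -and $pr.deletions -eq 0)
}

# Possible usage ($PRNumber and $Repo are placeholders):
# $json = gh pr view $PRNumber -R $Repo --json changedFiles,additions,deletions
# if ($LASTEXITCODE -eq 0) { Test-EmptyDiff (($json) -join "`n") }
```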

4. Post-Action Staleness Analysis

Cross-references force push timestamps against conflict/staleness warnings. When a force push post-dates these warnings, the script correctly identifies the issues as potentially resolved.
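
The timestamp comparison can be sketched as follows. The function name and parameters are hypothetical; the sketch only captures the idea that warnings older than the latest force push are treated as potentially resolved.

```powershell
# Hypothetical helper: $true when the most recent force push is newer than
# the most recent conflict/staleness warning.
function Test-WarningsPredateForcePush {
    param(
        [datetime[]]$ForcePushTimes,
        [datetime[]]$WarningTimes
    )

    # Without both kinds of event there is nothing to cross-reference.
    if ($null -eq $ForcePushTimes -or $ForcePushTimes.Count -eq 0) { return $false }
    if ($null -eq $WarningTimes -or $WarningTimes.Count -eq 0) { return $false }

    $lastPush = $ForcePushTimes | Sort-Object | Select-Object -Last 1
    $lastWarn = $WarningTimes | Sort-Object | Select-Object -Last 1

    return ($lastPush -gt $lastWarn)
}
```

A `$true` result only means the warnings may be outdated; the current diff and PR state still decide the verdict.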

5. JSON Summary + Agent-Generated Recommendations

The script's hardcoded recommendations (130+ lines of branching logic) have been replaced with:

  • A [CODEFLOW_SUMMARY] JSON block emitted at the end of the script with all key facts
  • isCodeflowPR boolean for explicit codeflow vs non-codeflow classification
  • A "Generating Recommendations" section in SKILL.md that teaches the agent how to reason over the summary
  • The agent now synthesizes contextual, nuanced recommendations instead of picking from a canned decision tree
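
To make the handoff concrete, here is a minimal sketch of emitting such a block. The field names (currentState, isCodeflowPR, isUpToDate, daysSinceUpdate, forcePushes.fetchSucceeded) appear in this PR's commit messages; the helper name, the exact layout of the block, and the sample values are assumptions.

```powershell
# Hypothetical helper: serializes the collected facts under a [CODEFLOW_SUMMARY] marker line.
function ConvertTo-CodeflowSummaryBlock {
    param([System.Collections.IDictionary]$Facts)

    $json = $Facts | ConvertTo-Json -Depth 5
    return "[CODEFLOW_SUMMARY]`n$json"
}

# Sample values are invented for illustration.
$facts = [ordered]@{
    currentState    = 'NO-OP'
    isCodeflowPR    = $true
    isUpToDate      = $null   # null rather than false when freshness data is unavailable
    daysSinceUpdate = 0
    forcePushes     = [ordered]@{ fetchSucceeded = $true }
}
ConvertTo-CodeflowSummaryBlock -Facts $facts
```

Keeping the facts in one structured object means the agent can reason over unknowns such as a null isUpToDate explicitly instead of inferring them from free-form text.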

6. Codeflow History (renamed section)

The former "Staleness & Conflict Check" is now "Codeflow History" with a framing line: "Maestro warnings (historical — see Current State for present status)". This signals that comments describe past events, not necessarily current state.

Testing

…and post-action state

Add three improvements based on investigating a stale codeflow PR:

1. Force push detection - Query PR timeline for head_ref_force_pushed
   events, showing who force-pushed and when.

2. Empty diff detection - Check changedFiles/additions/deletions from
   the PR API to flag 0-change PRs that are effectively no-ops.

3. Post-action staleness analysis - Cross-reference force push timestamps
   against conflict/staleness warnings. When a force push post-dates
   these warnings and produces an empty diff, the PR is identified as a
   no-op with tailored recommendations (merge empty, close, or
   force-trigger).
Copilot AI review requested due to automatic review settings February 10, 2026 17:10
github-actions bot added the needs-area-label label Feb 10, 2026
lewing requested a review from steveisok February 10, 2026 17:12
lewing added the area-skills label and removed the needs-area-label label Feb 10, 2026
lewing requested a review from akoeplinger February 10, 2026 17:14
Copilot AI left a comment

Pull request overview

Updates the vmr-codeflow-status Copilot CLI skill to better diagnose codeflow PR state transitions by detecting force-pushes, recognizing empty/no-op PR diffs, and refining recommendations when warnings have likely already been acted on.

Changes:

  • Adds PR “empty diff” detection using changedFiles/additions/deletions from gh pr view.
  • Queries the PR timeline for head_ref_force_pushed events and reports actor/time/SHA.
  • Cross-references force-push timestamps against conflict/staleness warning timestamps to adjust recommendations (merge/close/force-trigger vs resolve-conflict).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| .github/skills/vmr-codeflow-status/scripts/Get-CodeflowStatus.ps1 | Adds force-push timeline querying, empty-diff detection, and post-action staleness/conflict analysis to refine recommendations. |
| .github/skills/vmr-codeflow-status/SKILL.md | Documents the new force-push and empty-diff detection behavior and adds guidance for “Maestro may be stuck” scenarios. |

Inspired by the code-review skill pattern (PR dotnet#124229), restructure
the codeflow analysis to form an independent assessment from primary
signals before consulting Maestro comments:

- New 'Current State' section (Step 2) synthesizes empty diff, force
  push events, and activity recency into an immediate verdict:
  NO-OP / IN PROGRESS / STALE / ACTIVE

- PR Branch Analysis now appears before comments, providing commit
  categorization as primary data

- Renamed 'Staleness & Conflict Check' to 'Codeflow History' to
  clearly signal these are past events, not current state. Framing
  line directs reader to Current State for present status.

- Recommendations driven by current state assessment, with comment
  history as supporting context

The principle: comments tell you the history, not the present.
Determine the PR's actual state from primary signals (diff, branch,
timeline) before consulting Maestro comments for context.
lewing changed the title from "Improve vmr-codeflow-status skill: detect force pushes, empty diffs, and post-action state" to "Improve vmr-codeflow-status skill: current-state-first analysis, force push & empty diff detection" Feb 10, 2026
- Current State now checks PR state first: MERGED and CLOSED override
  all other heuristics (force push, staleness, etc.)
- Empty diff alone is now sufficient for NO-OP verdict (previously
  required both empty diff AND force push)
- Recommendations are state-aware: skip merge/close advice for
  already-terminal PRs
- Updated SKILL.md with MERGED/CLOSED state documentation

Fixes consensus findings from multi-model review (Claude Sonnet 4,
GPT-5).
Copilot AI review requested due to automatic review settings February 10, 2026 17:39
Use --slurp to merge paginated results into a single JSON array
before filtering, avoiding invalid JSON when timeline spans
multiple pages. Add try/catch for parse resilience.

Addresses Copilot review comment.
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

.github/skills/vmr-codeflow-status/scripts/Get-CodeflowStatus.ps1:1523

  • The closing-brace comment # end of else (non-terminal state) appears to annotate the wrong else block (it actually closes the $issues.Count -eq 0 branch). Consider updating/removing the comment (and/or re-indenting the nested blocks) to avoid confusion when modifying this section later.
        Write-Host "    1. Wait — Maestro should auto-update the PR" -ForegroundColor White
        Write-Host "    2. Trigger manually — if auto-updates seem delayed" -ForegroundColor White
        if ($subscriptionId) {

The script's Step 9 (Recommendations) was 130+ lines of hardcoded
if/elseif branching logic reasoning about state combinations. This
is exactly what LLMs do better — given structured facts, produce
contextual advice.

Changes:
- Script now emits a [CODEFLOW_SUMMARY] JSON block with all key
  facts (currentState, freshness, warnings, commits, etc.)
- Removed hardcoded recommendations from script
- Added 'Generating Recommendations' section to SKILL.md that
  teaches the agent how to reason over the JSON summary
- Agent synthesizes contextual, nuanced recommendations instead
  of canned text from a decision tree

The script still produces full human-readable output for Steps 1-8.
The JSON summary is the structured handoff point where the agent
takes over for the reasoning-heavy final step.
…exit code

Multi-model testing (Claude Sonnet 4, GPT-5) identified 5 issues:
- Add isCodeflowPR boolean to JSON summary for explicit classification
- isUpToDate is now null (not false) when freshness data unavailable
- daysSinceUpdate clamped to 0 minimum (was negative due to clock skew)
- Script exits 0 on completion (gh api failures leaked LASTEXITCODE=1)
- SKILL.md: non-codeflow PRs skip all darc command suggestions
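
The daysSinceUpdate clamp could look something like the sketch below. The function name and the use of whole days via `Floor` are assumptions; only the "never below 0" rule comes from the commit message above.

```powershell
# Hypothetical helper: whole days since the last update, never negative.
function Get-DaysSinceUpdate {
    param(
        [datetime]$UpdatedAtUtc,
        [datetime]$NowUtc = [datetime]::UtcNow
    )

    $days = [int][Math]::Floor(($NowUtc - $UpdatedAtUtc).TotalDays)

    # Clock skew can place the update slightly in the future; clamp to 0.
    return [Math]::Max(0, $days)
}
```

For the exit-code fix, ending the script with an explicit `exit 0` is one way to keep a failed `gh api` call from leaking a non-zero `$LASTEXITCODE` to the caller.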
Copilot AI review requested due to automatic review settings February 10, 2026 18:28
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

- Collapse duplicate isEmptyDiff check (with/without force push both
  returned NO-OP)
- Rename lastWarnTime2 to lastStalenessTime for consistency
Force-trigger overwrites the existing PR branch; it does not create
a new PR. A normal trigger after closing does create a new PR.
Added a table and warning to prevent the common mistake of
recommending 'close then force-trigger'.
Copilot AI review requested due to automatic review settings February 10, 2026 18:51
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Copilot AI review requested due to automatic review settings February 10, 2026 23:42
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

.github/skills/vmr-codeflow-status/scripts/Get-CodeflowStatus.ps1:176

  • In Get-CodeflowPRHealth, clearing HasStaleness depends on successfully fetching/parsing gh pr checks (the commit-date comparison is nested under the $checksJson block). If gh pr checks fails/returns nothing (auth/rate limit), staleness will remain flagged even when the PR has commits newer than the last staleness warning. Consider running the “last commit after last warning” check independently of gh pr checks (and independently of mergeability), using the already-fetched comments + a gh pr view --json commits call guarded with try/catch/exit-code checks.
        $checksJson = gh pr checks $PRNumber -R $Repo --json name,state 2>$null
        if ($LASTEXITCODE -eq 0 -and $checksJson) {
            try {
                $checks = ($checksJson -join "`n") | ConvertFrom-Json
                $codeflowCheck = @($checks | Where-Object { $_.name -match 'Codeflow verification' }) | Select-Object -First 1
                if (($codeflowCheck -and $codeflowCheck.state -eq 'SUCCESS') -or $isMergeable) {
                    # No merge conflict — either Codeflow verification passes or PR is mergeable
                    $hasConflict = $false
                    # For staleness, check if there are commits after the last staleness warning
                    if ($hasStaleness) {
                        $commitsJson = gh pr view $PRNumber -R $Repo --json commits --jq '.commits[-1].committedDate' 2>$null
                        if ($LASTEXITCODE -eq 0 -and $commitsJson) {
                            $lastCommitTime = ($commitsJson -join "").Trim()
                            $lastWarnTime = $null
                            foreach ($comment in $prDetail.comments) {
                                if ($comment.author.login -match '^dotnet-maestro' -and $comment.body -match 'codeflow cannot continue|the source repository has received code changes') {
                                    $warnDt = [DateTimeOffset]::Parse($comment.createdAt).UtcDateTime
                                    if (-not $lastWarnTime -or $warnDt -gt $lastWarnTime) {
                                        $lastWarnTime = $warnDt
                                    }
                                }
                            }
                            $commitDt = if ($lastCommitTime) { [DateTimeOffset]::Parse($lastCommitTime).UtcDateTime } else { $null }
                            if ($lastWarnTime -and $commitDt -and $commitDt -gt $lastWarnTime) {
                                $hasStaleness = $false
                            }
                        }

Addresses review comment: when gh api .../timeline fails, emit
Write-Warning so the user knows force push data is missing, and
add forcePushes.fetchSucceeded boolean to the JSON summary so
recommendation logic can account for incomplete timeline data.
@lewing lewing enabled auto-merge (squash) February 11, 2026 15:23
@lewing lewing merged commit 964b374 into dotnet:main Feb 11, 2026
17 checks passed