NO-ISSUE: Claude Agent for analyzing prow jobs #5800

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:main from copejon:no-issue-claude-prow-failure-analyzing-agent on Dec 15, 2025

Conversation

@copejon
Contributor

@copejon copejon commented Nov 24, 2025

Adds an initial agent capable of analyzing CI failures in Prow. The agent's workflow takes a methodical approach to failure analysis, following these steps:

  1. Create a list of the errors and failures found in the build.log.
  2. Characterize each error and failure based on context from the build log, and use this to determine whether it is an infra issue, a MicroShift runtime error, or a legitimate test failure.
  3. Investigate further depending on the nature of the error:
    • For legitimate test errors, analyze the test logs.
    • For runtime errors, download and analyze the sos report.
  4. Produce a report based on the findings of step 3.
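
Steps 1 through 3 above can be pictured with a small sketch. This is only an illustration of the triage idea, not the agent's implementation: the agent is prompt-driven, and every regular expression and function name below is a hypothetical stand-in chosen for the example.

```python
import re
from collections import defaultdict

# ASSUMPTION: these patterns are invented for illustration. The agent in this
# PR reasons over log context with an LLM; it does not use keyword matching.
ERROR_LINE = re.compile(r"\b(errors?|fail(ed|ures?|s)?|panic)\b", re.I)
CATEGORY_PATTERNS = {
    "infra": re.compile(r"timed out|connection refused|no space left", re.I),
    "runtime": re.compile(r"microshift.*(panic|crash|failed to start)", re.I),
    "test": re.compile(r"\bFAIL\b|assertion|expected .* but", re.I),
}
# Follow-up actions taken from step 3 of the description. The description
# names no follow-up for infra issues, so none is listed here.
NEXT_STEP = {
    "test": "analyze the test logs",
    "runtime": "download and analyze the sos report",
}


def collect_errors(log_text):
    """Step 1: list (line_number, line) for every line that looks like an error."""
    return [
        (number, line.strip())
        for number, line in enumerate(log_text.splitlines(), 1)
        if ERROR_LINE.search(line)
    ]


def classify(line):
    """Step 2: return the first category whose pattern matches, else 'unknown'."""
    for category, pattern in CATEGORY_PATTERNS.items():
        if pattern.search(line):
            return category
    return "unknown"


def triage(log_text):
    """Group the collected errors by category so step 3 can pick a follow-up."""
    buckets = defaultdict(list)
    for number, line in collect_errors(log_text):
        buckets[classify(line)].append((number, line))
    return dict(buckets)


if __name__ == "__main__":
    sample = (
        "INFO starting\n"
        "ERROR: dial tcp: connection refused\n"
        "microshift.service: Failed to start\n"
        "FAIL: test_router expected 200 but got 503\n"
    )
    for category, hits in triage(sample).items():
        print(category, hits, "->", NEXT_STEP.get(category, "no follow-up defined"))
```

Keeping the category patterns in one table mirrors the description's three-way split and makes it easy to see which follow-up each category triggers.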

To invoke the agent, pass the Prow job's URL to claude, e.g.:

$ claude https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_microshift/5596/pull-ci-openshift-microshift-main-e2e-aws-tests-arm/1995881118070476800
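
For readers unfamiliar with the shape of these URLs, here is a minimal sketch that splits the example URL above into its parts. It assumes the pull-request layout seen in that URL (`view/gs/<bucket>/pr-logs/pull/<org_repo>/<pr>/<job>/<build>`); other job types may use a different layout. The `ProwJob` type and both function names are hypothetical, and the `gs://` prefix is inferred from the `view/gs/` path segment rather than confirmed by this PR.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class ProwJob(NamedTuple):
    """Hypothetical container for the pieces of a PR job URL."""
    bucket: str
    org_repo: str
    pull_number: str
    job_name: str
    build_id: str


def parse_prow_url(url: str) -> ProwJob:
    """Split a Prow PR job URL into its components.

    ASSUMPTION: only the pr-logs/pull layout from the example is handled.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if (
        len(parts) != 9
        or parts[:2] != ["view", "gs"]
        or parts[3:5] != ["pr-logs", "pull"]
    ):
        raise ValueError(f"unrecognized Prow PR job URL: {url}")
    return ProwJob(parts[2], parts[5], parts[6], parts[7], parts[8])


def gcs_prefix(job: ProwJob) -> str:
    """Rebuild the storage prefix that the view/gs/ path appears to mirror."""
    return (
        f"gs://{job.bucket}/pr-logs/pull/{job.org_repo}/"
        f"{job.pull_number}/{job.job_name}/{job.build_id}"
    )


if __name__ == "__main__":
    example = (
        "https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/"
        "openshift_microshift/5596/"
        "pull-ci-openshift-microshift-main-e2e-aws-tests-arm/1995881118070476800"
    )
    job = parse_prow_url(example)
    print(job)
    print(gcs_prefix(job))
```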

There's plenty of room for improvement here. For future contributions, consider:

  • Delegation: use sub-agents to perform specialized, lower-level analysis (sos-report agent, MicroShift source code agent, etc.). This is especially useful for scoping an agent's context to its task.
  • Additional workflow steps: e.g. after identifying a legitimate test failure, analyze the MicroShift code base (or the diff, for PRs) to determine where the error was introduced.
  • Honing suggested remediations: in this PR, the agent is given little direction on HOW to fix errors and bases its recommendations on the context it's given.

@openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Nov 24, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 24, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Nov 24, 2025
@copejon
Contributor Author

copejon commented Nov 24, 2025

/test test-unit
/test verify

@kasturinarra
Contributor

@copejon hey, should you change this command? $ claude https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_microshift/9999/pull-ci-openshift-microshift-release-4.20-metal-periodic-test/1234567894561234156

I tried to run it using @openshift-ci-analysis <job_url_name>

@kasturinarra
Contributor

/lgtm

@openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) on Nov 26, 2025
@@ -0,0 +1,18 @@
{
"permissions": {
"allow": [
Contributor

Do these allow permissions override the allow-tools from other Claude commands? For example

I'd set permissions individually on each Claude command rather than adding global allow permissions.

Contributor Author

This PR adds an agent, so the permissions have to be specified in settings.json. That said, settings.json doesn't override the commands' own permissions.

@copejon force-pushed the no-issue-claude-prow-failure-analyzing-agent branch from 8c01b3f to be7e359 on December 2, 2025 17:44
@openshift-ci bot removed the lgtm label (indicates that a PR is ready to be merged) on Dec 2, 2025
@copejon
Contributor Author

copejon commented Dec 2, 2025

@copejon hey, should you change this command? $ claude https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_microshift/9999/pull-ci-openshift-microshift-release-4.20-metal-periodic-test/1234567894561234156

I tried to run it using @openshift-ci-analysis <job_url_name>

@kasturinarra That's my fault. The URL in the description isn't for a real job. Will fix!

Also, this is structured as an agent. Just passing the URL to claude (as long as claude is run in the project root) is enough to trigger the agent.

@copejon force-pushed the no-issue-claude-prow-failure-analyzing-agent branch from be7e359 to 893a6ea on December 12, 2025 17:10
@copejon copejon marked this pull request as ready for review December 12, 2025 17:24
@openshift-ci bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Dec 12, 2025
@openshift-ci openshift-ci bot requested review from agullon and pacevedom December 12, 2025 17:25
@copejon changed the title from "NO-ISSUE add prow job analyzing claude agent" to "NO-ISSUE Claude Agent for analyzing prow jobs" on Dec 12, 2025
@ggiguash
Contributor

/lgtm
/verified by manual-testing

@openshift-ci-robot added the verified label (signifies that the PR passed pre-merge verification criteria) on Dec 15, 2025
@openshift-ci-robot

@ggiguash: This PR has been marked as verified by manual-testing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@ggiguash
Contributor

/retitle NO-ISSUE: Claude Agent for analyzing prow jobs

@openshift-ci bot changed the title from "NO-ISSUE Claude Agent for analyzing prow jobs" to "NO-ISSUE: Claude Agent for analyzing prow jobs" on Dec 15, 2025
@openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) on Dec 15, 2025
@openshift-ci-robot

@copejon: This pull request explicitly references no jira issue.

@openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) on Dec 15, 2025
@openshift-ci
Contributor

openshift-ci bot commented Dec 15, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: copejon, ggiguash, kasturinarra

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [copejon,ggiguash,kasturinarra]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Contributor

openshift-ci bot commented Dec 15, 2025

@copejon: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 73acdc1 into openshift:main Dec 15, 2025
6 checks passed