[smoke-detector] 🚨 CRITICAL: GenAIScript Invalid Model (gpt-4.1) - 5th Consecutive Failure Post-v0.24.0 #2227

# 🚨 CRITICAL RECURRING FAILURE - 5th Consecutive Occurrence

## Summary
The Smoke GenAIScript workflow has **FAILED AGAIN** after the v0.24.0 release with the **EXACT SAME ROOT CAUSE** that has been reported in **THREE previous issues** (#2157, #2204, #2207). This is the **5th consecutive failure** of this smoke test since 2025-10-22. Despite multiple investigations and issue reports, the configuration has never been corrected.

## Failure Details
- **Run**: [#18757658104](https://github.com/githubnext/gh-aw/actions/runs/18757658104)
- **Commit**: [8993988](https://github.com/githubnext/gh-aw/commit/899398806856cfa98a8b385c1bba0f12c077f20f) - "Release v0.24.0"
- **Trigger**: schedule (automated smoke test)
- **Duration**: 3.5 minutes
- **Failed Job**: detection (1.2 minutes)
- **Status**: ❌ **FAILED**

## Root Cause Analysis

### The Problem Persists UNCHANGED

The GenAIScript configuration **STILL** uses an invalid OpenAI model name:

**Location**: `.github/workflows/shared/genaiscript.md` line 6
```yaml
GH_AW_AGENT_MODEL_VERSION: "openai:gpt-4.1"
```

**Problem**: `gpt-4.1` **DOES NOT EXIST** in OpenAI's model catalog.

**Valid OpenAI models**:
- `gpt-4o` ✅ (recommended)
- `gpt-4-turbo` ✅
- `gpt-4` ✅
- `gpt-3.5-turbo` ✅

### Error Chain (Identical to All Previous Occurrences)

1. GenAIScript attempts to resolve and use model `openai:gpt-4.1`
2. OpenAI API rejects the request (invalid model)
3. GenAIScript receives undefined/null response
4. GenAIScript crashes: **`TypeError: Cannot read properties of undefined (reading 'text')`**
5. Detection job fails with exit code 255
6. Smoke test marked as failed
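
Steps 3 and 4 describe an unguarded property access on a missing result. The sketch below is a Python analogue of a guard that would turn that crash into a descriptive error naming the model. It is illustrative only: GenAIScript is JavaScript, and `extract_text`, `ModelResponseError`, and the dictionary-shaped response are hypothetical names and shapes, not GenAIScript's actual code.

```python
from typing import Any, Optional


class ModelResponseError(RuntimeError):
    """Raised when the model call produced no usable result (hypothetical type)."""


def extract_text(response: Optional[dict[str, Any]], model: str) -> str:
    # Check for a missing response first, so the failure names the model
    # instead of surfacing later as an opaque attribute or key error.
    if response is None:
        raise ModelResponseError(f"no response received for model {model!r}")
    text = response.get("text")
    if text is None:
        raise ModelResponseError(f"response for model {model!r} has no 'text' field")
    return text
```

With a guard like this, a rejected request would be reported against the configured model string rather than as a `TypeError` several frames away from the cause.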

### Stack Trace
```
2025-10-23T18:10:09.4293104Z 2025-10-23T18:10:09.429Z genaiscript:error {
2025-10-23T18:10:09.4293428Z   name: 'TypeError',
2025-10-23T18:10:09.4293872Z   message: "Cannot read properties of undefined (reading 'text')",
2025-10-23T18:10:09.4294339Z   stack: "TypeError: Cannot read properties of undefined (reading 'text')\n" +
2025-10-23T18:10:09.4295107Z     '    at githubActionSetOutputs ((redacted))\n' +
2025-10-23T18:10:09.4296330Z     '    at async Command.runScriptWithExitCode ((redacted))'
2025-10-23T18:10:09.4297303Z }
```

## Failed Jobs and Errors

### Job Execution Summary
1. ✅ **activation** - succeeded (2s)
2. ✅ **agent** - succeeded (1.6m) - Agent completed successfully
3. ❌ **detection** - **FAILED** (1.2m) - Threat detection crashed
4. ✅ **create_issue** - succeeded (5s)
5. ⏭️ **missing_tool** - skipped

## Investigation Findings

### Complete Failure Timeline

| # | Run ID | Date/Time (UTC) | Trigger | Issue Created | Issue Status |
|---|--------|-----------------|---------|---------------|--------------|
| 1 | [18727962258](https://github.com/githubnext/gh-aw/actions/runs/18727962258) | 2025-10-22 19:45:52 | workflow_dispatch | #2157 | Closed as "not_planned" |
| 2 | [18733557489](https://github.com/githubnext/gh-aw/actions/runs/18733557489) | 2025-10-23 00:19:22 | schedule | - | Covered by #2157 |
| 3 | [18739169072](https://github.com/githubnext/gh-aw/actions/runs/18739169072) | 2025-10-23 06:07:04 | schedule | #2204 | Closed as "completed" |
| 4 | [18747816413](https://github.com/githubnext/gh-aw/actions/runs/18747816413) | 2025-10-23 12:08:41 | schedule | #2207 | Closed as "completed" |
| 5 | **[18757658104](https://github.com/githubnext/gh-aw/actions/runs/18757658104)** | **2025-10-23 18:06:57** | **schedule** | **This issue** | **Open** |

**Pattern**: Failing every ~6 hours on scheduled runs  
**Duration**: Over 22 hours of continuous failures  
**Failure Rate**: 100% since first occurrence
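
The interval and duration figures can be rechecked from the timestamps in the timeline table. The following is a minimal sketch; the helper names `hours_between` and `total_hours` are made up for illustration.

```python
from datetime import datetime

# Run start times (UTC) copied from the failure timeline table.
RUN_STARTS = [
    "2025-10-22 19:45:52",
    "2025-10-23 00:19:22",
    "2025-10-23 06:07:04",
    "2025-10-23 12:08:41",
    "2025-10-23 18:06:57",
]


def _parse(stamp: str) -> datetime:
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")


def hours_between(starts: list[str]) -> list[float]:
    """Gap in hours between each pair of consecutive runs, to one decimal."""
    times = [_parse(s) for s in starts]
    return [
        round((later - earlier).total_seconds() / 3600, 1)
        for earlier, later in zip(times, times[1:])
    ]


def total_hours(starts: list[str]) -> float:
    """Hours from the first run to the last, to one decimal."""
    return round((_parse(starts[-1]) - _parse(starts[0])).total_seconds() / 3600, 1)


print(hours_between(RUN_STARTS))  # [4.6, 5.8, 6.0, 6.0]
print(total_hours(RUN_STARTS))    # 22.4
```

The first gap is shorter because run 1 was a manual `workflow_dispatch`; the scheduled runs are roughly 6 hours apart. These values differ slightly from the `hours_between_occurrences` array in the investigation record under Historical Context, which may have been derived from different timestamps.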

### Why This Is Critical NOW

1. **Post-Release Failure**: This failure occurred immediately after the v0.24.0 release, indicating the configuration issue persists across releases
2. **Multiple Closed Issues**: Three separate issues (#2157, #2204, #2207) have been created and closed without fixing the root cause
3. **Wasted Resources**: Every scheduled run (every ~6 hours) consumes CI minutes while producing no value
4. **Security Gap**: Threat detection has been non-functional for over 22 hours
5. **False Confidence**: The team may not realize smoke tests are failing continuously

## Recommended Actions

### 🔴 CRITICAL - Immediate Fix (1 minute)

**Update `.github/workflows/shared/genaiscript.md` line 6:**

```diff
- GH_AW_AGENT_MODEL_VERSION: "openai:gpt-4.1"
+ GH_AW_AGENT_MODEL_VERSION: "openai:gpt-4o"
```

**That's it.** A one-line change. It should stop the recurring failure on subsequent runs, since all 5 failures trace back to this setting.

### 🟡 Alternative: Disable Scheduled Workflow

If GenAIScript smoke tests are not being maintained, **disable the scheduled trigger** to stop generating failed runs and investigation overhead:

```yaml
# .github/workflows/smoke-genaiscript.md
# Comment out or remove the schedule trigger
```

### 🟢 Long-Term: Prevent Recurrence

1. **Add Pre-Flight Model Validation** - Validate model names before execution
2. **Schema Validation** - Use JSON schema to validate workflow configurations
3. **Better Error Handling** - Work with GenAIScript team to improve error messages
4. **Documentation** - Document valid model names in configuration files
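
As a rough illustration of item 1, a pre-flight check could scan the workflow configuration text for the model setting before any job runs. This is a sketch under assumptions: the allowlist is simply the "Valid OpenAI models" list from this report (a real check would more likely query the provider for its current models), and `KNOWN_MODELS`, `MODEL_SETTING`, and `find_unknown_models` are hypothetical names.

```python
import re

# Allowlist copied from the "Valid OpenAI models" list in this report.
# Assumption: a production check would fetch this from the provider instead.
KNOWN_MODELS = {"openai": {"gpt-4o", "gpt-4-turbo", "gpt-4", "gpt-3.5-turbo"}}

# Matches lines such as: GH_AW_AGENT_MODEL_VERSION: "openai:gpt-4.1"
MODEL_SETTING = re.compile(
    r'GH_AW_AGENT_MODEL_VERSION:\s*"?([\w.-]+):([\w.-]+)"?'
)


def find_unknown_models(config_text: str) -> list[str]:
    """Return every provider:model value in the text that is not allowlisted."""
    unknown = []
    for provider, model in MODEL_SETTING.findall(config_text):
        if model not in KNOWN_MODELS.get(provider, set()):
            unknown.append(f"{provider}:{model}")
    return unknown
```

A CI step could read `.github/workflows/shared/genaiscript.md`, pass its contents to `find_unknown_models`, and exit non-zero when the returned list is not empty, so a bad model name fails fast with a clear message.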

## Historical Context

From investigation database (`/tmp/gh-aw/cache-memory/investigations/`):

```json
{
  "pattern_signature": "GENAISCRIPT_INVALID_MODEL",
  "first_occurrence": "2025-10-22T19:45:52Z",
  "recurrence_count": 5,
  "failure_rate": "100%",
  "days_recurring": 1,
  "hours_between_occurrences": [5.5, 6.2, 6.6, 6.0],
  "is_flaky": false,
  "external_dependency": "OpenAI API",
  "persistence_across_releases": true
}
```

## Impact Assessment

**Severity**: 🔴 **CRITICAL**
- All GenAIScript smoke tests failing continuously
- Threat detection non-functional for 22+ hours
- Multiple issues created and closed without resolution
- Post-release failure indicates configuration persists across versions

**Urgency**: 🔴 **IMMEDIATE**
- Simple one-line fix available
- Continues to fail every 6 hours indefinitely
- Wasting CI resources and investigation time

**Scope**:
- Affects: All workflows using `shared/genaiscript.md`
- Frequency: Every scheduled smoke test run
- Duration: Ongoing since 2025-10-22 19:45 UTC (22+ hours)

## Reproduction Steps

1. Configure GenAIScript with model: `openai:gpt-4.1`
2. Set `OPENAI_API_KEY` so that the API key check passes
3. Run any GenAIScript workflow
4. Observe the failure when the request is made with the invalid model
5. See `TypeError: Cannot read properties of undefined (reading 'text')` in the job log

## Related Issues

- #2157 - Original investigation (closed as "not_planned")
- #2204 - 3rd occurrence (closed as "completed")
- #2207 - 4th occurrence (closed as "completed")
- #2142 - Similar GenAIScript error (different root cause - missing API key)

---

## Request for Action

This issue is being created to request a decision on one of the following:

1. **Fix the configuration** (1-line change) to resolve the issue permanently
2. **Disable the scheduled workflow** if GenAIScript smoke tests are not planned to be maintained
3. **Explain the strategy** if this is expected behavior (so future investigations understand context)

The current situation - where the same failure occurs every 6 hours, generates investigation reports, creates issues that get closed, but nothing gets fixed - is not sustainable.

---

## Investigation Metadata

- **Investigator**: Smoke Detector (Failure Investigation Agent)
- **Investigation Run**: [#18757754195](https://github.com/githubnext/gh-aw/actions/runs/18757754195)
- **Pattern**: `GENAISCRIPT_INVALID_MODEL` (5th occurrence)
- **Investigation Record**: `/tmp/gh-aw/cache-memory/investigations/2025-10-23-18757658104.json`
- **Created**: 2025-10-23T18:15:00Z

> 🤖 AI generated by [Smoke Detector - Smoke Test Failure Investigator](https://github.com/githubnext/gh-aw/actions/runs/18757754195)
> This is an automated investigation of recurring smoke test failures.



