feat: verification modes + evidence fields + transport combo rejection (hosted parity) by govindkavaturi-art · Pull Request #18 · cueapi/cueapi-core

govindkavaturi-art · 2026-04-17T01:30:37Z

Summary

Ports the outcome-verification feature from the hosted monorepo into cueapi-core, and fixes the partially-ported /verify endpoint from PR #15 so it honors the {valid, reason} contract documented by the hosted API.

What changed

Schema (`app/schemas/`)

VerificationMode enum: none, require_external_id, require_result_url, require_artifacts, manual
VerificationPolicy sub-object with a single mode field today (leaves room for future fields without breaking the shape)
verification: Optional[VerificationPolicy] on CueCreate and CueUpdate
OutcomeRequest extended with optional external_id, result_url, result_ref, result_type, summary, artifacts. Legacy shape ({success, result, error, metadata}) is unchanged
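The schema shape described above can be sketched roughly as follows. This is a minimal illustration using stdlib dataclasses rather than the project's actual Pydantic models; the `name` field on `CueCreate` is a hypothetical placeholder, and only the enum values and the single `mode` field come from the description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class VerificationMode(str, Enum):
    # Enum values as listed in the PR description.
    NONE = "none"
    REQUIRE_EXTERNAL_ID = "require_external_id"
    REQUIRE_RESULT_URL = "require_result_url"
    REQUIRE_ARTIFACTS = "require_artifacts"
    MANUAL = "manual"


@dataclass
class VerificationPolicy:
    # A sub-object with a single field today, so new fields can be added
    # later without changing the request shape.
    mode: VerificationMode

    def __post_init__(self) -> None:
        # Coerce raw strings such as "manual"; unknown values raise ValueError.
        self.mode = VerificationMode(self.mode)


@dataclass
class CueCreate:
    name: str  # hypothetical field, for illustration only
    verification: Optional[VerificationPolicy] = None
```

Wrapping the mode in a sub-object rather than exposing a bare string is what leaves room for future policy fields.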

Model (`app/models/cue.py`)

verification_mode column: String(50), nullable, CHECK constraint over the enum values. NULL and 'none' are treated identically
evidence_* columns already exist (added in PR #15, "feat: port 10 missing endpoints for feature parity with hosted service") and are reused

Migration

017_add_verification_mode.py — adds the column + CHECK constraint. Applies cleanly on a blank DB (verified locally). Downgrade drops the constraint first, then the column

Services

outcome_service.record_outcome computes outcome_state from (success, verification_mode, evidence):

| success | mode | evidence | outcome_state |
|---|---|---|---|
| false | any | n/a | `reported_failure` |
| true | none/NULL | n/a | `reported_success` |
| true | manual | n/a | `verification_pending` |
| true | `require_external_id` | present | `verified_success` |
| true | `require_external_id` | missing | `verification_failed` |
| true | `require_result_url` | present | `verified_success` |
| true | `require_result_url` | missing | `verification_failed` |
| true | `require_artifacts` | present | `verified_success` |
| true | `require_artifacts` | missing | `verification_failed` |
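The rule table can be read as a small pure function. The sketch below is not the code in `outcome_service`; the function name, the mode-to-field mapping, and the reading of "present" as "non-empty value" are assumptions made for illustration.

```python
from typing import Any, Mapping, Optional

# Assumption: each evidence-requiring mode checks exactly one evidence field.
REQUIRED_FIELD = {
    "require_external_id": "external_id",
    "require_result_url": "result_url",
    "require_artifacts": "artifacts",
}


def compute_outcome_state(
    success: bool, mode: Optional[str], evidence: Mapping[str, Any]
) -> str:
    # A reported failure bypasses verification entirely.
    if not success:
        return "reported_failure"
    # NULL and 'none' are treated identically.
    if mode is None or mode == "none":
        return "reported_success"
    # Manual mode parks the execution until someone calls /verify.
    if mode == "manual":
        return "verification_pending"
    # Assumption: "present" means the field holds a non-empty value.
    field = REQUIRED_FIELD[mode]
    return "verified_success" if evidence.get(field) else "verification_failed"
```

For example, `compute_outcome_state(True, "require_external_id", {})` lands in `verification_failed`, which is the row the alerts integration later relies on.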

cue_service._check_transport_verification_combo rejects worker transport paired with evidence-requiring modes at both create and update (see "Restriction" below)

Router

POST /v1/executions/{id}/verify now accepts {valid: bool, reason: str?} via a typed VerifyRequest body.
- valid=true (default) → verified_success (legacy behavior preserved — empty body still works)
- valid=false → verification_failed, reason recorded on evidence_summary (truncated to 500 chars, prepended to any existing summary)
- Accepted starting states expanded to include reported_failure; it was rejected before, but there was no semantic reason for that
OutcomeResponse now surfaces outcome_state
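The /verify transition described above could look roughly like this. It is a sketch, not the router code: the function name is hypothetical, the newline separator between the reason and the existing summary is an assumption, and so is applying the 500-character cap to the combined text (the description does not say whether the cap applies to the reason alone or to the result).

```python
from typing import Optional, Tuple

SUMMARY_MAX = 500  # cap mentioned in the description


def apply_verify(
    existing_summary: Optional[str],
    valid: bool = True,
    reason: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Return (new outcome_state, new evidence_summary)."""
    # valid defaults to True, so an empty request body keeps the legacy result.
    if valid:
        return "verified_success", existing_summary
    if not reason:
        return "verification_failed", existing_summary
    # Assumption: the reason goes first, separated by a newline, then the cap applies.
    combined = reason if not existing_summary else f"{reason}\n{existing_summary}"
    return "verification_failed", combined[:SUMMARY_MAX]
```

Calling `apply_verify("uploaded report", valid=False, reason="wrong file")` keeps the earlier summary after the new reason instead of overwriting it.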

Intentional behavior change

POST /v1/executions/{id}/verify with an explicit {valid: false} body now transitions to verification_failed instead of verified_success. Before this PR, the endpoint ignored the request body and always transitioned to verified_success — a silent-failure bug that made the valid=false branch impossible to exercise. Callers relying on the always-success behavior were getting broken semantics anyway. Empty-body requests remain verified_success (the previous default).

Restriction

Worker-transport cues cannot combine with require_external_id / require_result_url / require_artifacts. Attempting to do so at create or PATCH time returns:

{
  "error": {
    "code": "unsupported_verification_for_transport",
    "transport": "worker",
    "verification_mode": "require_external_id",
    "supported_worker_modes": ["none", "manual"]
  }
}

This is because cueapi-worker < 0.3.0 has no mechanism to attach evidence to the outcome POST. The restriction will be lifted in a follow-up PR once cueapi-worker 0.3.0 (evidence reporting via CUEAPI_OUTCOME_FILE) is published to PyPI.
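A rough sketch of the combo check follows, assuming it only needs the transport name and the verification mode. The real `_check_transport_verification_combo` lives in `cue_service` and may raise an HTTP error rather than return a payload; the return-a-dict shape here is an illustrative choice.

```python
from typing import Any, Dict, Optional

EVIDENCE_MODES = {"require_external_id", "require_result_url", "require_artifacts"}
SUPPORTED_WORKER_MODES = ["none", "manual"]


def check_transport_verification_combo(
    transport: str, mode: Optional[str]
) -> Optional[Dict[str, Any]]:
    """Return the error payload for a rejected combo, or None when allowed."""
    if transport == "worker" and mode in EVIDENCE_MODES:
        return {
            "error": {
                "code": "unsupported_verification_for_transport",
                "transport": transport,
                "verification_mode": mode,
                "supported_worker_modes": SUPPORTED_WORKER_MODES,
            }
        }
    # Webhook cues, and worker cues with none/manual/NULL, pass through.
    return None
```

Running the same check at create and at PATCH time is what stops an existing worker cue from being moved into an evidence-requiring mode after the fact.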

Tests

35 new tests across four files:

tests/test_verification_modes.py — 10 tests, 5 modes × (satisfied, unsatisfied / applicable variants)
tests/test_transport_verification_combo.py — 13 tests: 3 evidence modes rejected × (create, PATCH) + 2 worker-compatible modes accepted + 5 webhook-always-allowed modes + 3 PATCH transitions
tests/test_outcome_evidence.py — 4 tests: inline evidence persists, legacy shape still works, Pydantic length caps enforced, PATCH evidence still works
tests/test_verify_endpoints.py — 8 tests covering both branches of valid, empty body default, reason-preserves-existing-summary, invalid-state rejections, and /verification-pending

Test-suite delta

+35 new passing tests, 0 new failures
Amended one existing test: test_execution_parity.py::TestVerify::test_verify_wrong_state now uses a pre-outcome state (since reported_failure is now a valid starting state)
Pre-existing failures in test_sdk_integration.py (7) — ModuleNotFoundError: No module named 'cueapi'. Confirmed on clean origin/main (stashed this PR's changes and re-ran). These are environment-dependent tests that expect the Python SDK to be installed; CI handles that

Backward compatibility

POST /outcome without evidence fields → identical behavior to before
POST /verify with empty body → identical behavior to before (verified_success)
Cues without a verification field → verification_mode = NULL → outcome-state engine treats as none → same reported_success / reported_failure semantics as before
PATCH /v1/executions/{id}/evidence → untouched, still accepts the two-step flow

References

Private monorepo sources: app/schemas/cue.py (VerificationMode/VerificationConfig), app/schemas/outcome.py (evidence fields), app/services/outcome_service.py (rule engine), app/services/cue_service.py (_check_transport_verification_combo), app/routers/executions.py (/verify body contract)
Audit context: this PR addresses the ABSENT/PARTIAL items 1, 2, 3, 4, 5, 6, 8 from the cueapi-core drift re-audit. Items 7, 9–14 are out of scope (alerts + sync-discipline land in follow-up PRs)

Test plan

35 new tests pass locally (pytest tests/test_verification_modes.py tests/test_transport_verification_combo.py tests/test_outcome_evidence.py tests/test_verify_endpoints.py)
Full pytest tests/ — no new failures (SDK-integration failures pre-exist)
Migration 017 applies cleanly on a blank DB (alembic upgrade head from empty schema)
Column + CHECK constraint verified in Postgres (\d cues)

🤖 Generated with Claude Code

Ports the outcome-verification feature from the hosted monorepo into cueapi-core and fixes the partial /verify endpoint that PR #15 left behind.

Schema:
- VerificationMode enum (none, require_external_id, require_result_url, require_artifacts, manual) + VerificationPolicy on CueCreate/CueUpdate.
- OutcomeRequest accepts evidence fields inline (external_id, result_url, result_ref, result_type, summary, artifacts). Legacy shape still works.

Model:
- Migration 017: verification_mode column on cues (String(50), nullable, CHECK-constrained enum). NULL == 'none'. evidence_* columns already existed from PR #15 and are reused.

Services:
- outcome_service computes outcome_state from (success, mode, evidence). Missing required evidence -> verification_failed. Manual mode parks in verification_pending. Failure bypasses verification entirely.
- cue_service _check_transport_verification_combo rejects worker+evidence at create and update. Lifted in a follow-up PR once cueapi-worker 0.3.0 lands on PyPI.

Router:
- POST /v1/executions/{id}/verify now accepts {valid: bool, reason: str?}. valid=true preserves legacy behavior; valid=false -> verification_failed with reason recorded on evidence_summary. Accepted starting states expanded to include reported_failure. Empty body defaults to valid=true for full backward compat.

Tests:
- 35 new tests across 4 files (verification_modes, transport_verification_combo, outcome_evidence, verify_endpoints).
- Amended test_execution_parity.py::test_verify_wrong_state to use a pre-outcome state (reported_failure is now a valid starting state).
- Full-suite delta: +35 passing, 0 new failures. Pre-existing SDK-integration failures (cueapi Python package not installed locally) unchanged.

Alert firing for verification_failed deferred to PR 2 (alerts feature).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Ports the alerts feature to OSS. Deliberately excludes SendGrid/email: self-hosters configure alert_webhook_url and forward to their own Slack/Discord/ntfy/SMTP relay. Hosted cueapi.ai keeps managed email.

Model + migrations:
- app/models/alert.py: id/user_id/cue_id/execution_id/alert_type/severity/message/alert_metadata (column 'metadata')/acknowledged/created_at. CHECK on alert_type IN ('outcome_timeout', 'verification_failed', 'consecutive_failures'). CHECK on severity. Indexes: user_id, (user_id, created_at), execution_id.
- alembic 018: alerts table.
- alembic 019: users.alert_webhook_url (String 2048) + alert_webhook_secret (String 64), both nullable.
- 018.down_revision = '016' intentionally: PR #18 introduces 017 but isn't merged yet. When PR #18 merges first, rebase this PR to chain 017 -> 018. Documented in the migration docstring.

Services:
- app/services/alert_service.py: create_alert with 5-min dedup on (user_id, alert_type, execution_id|cue_id). count_consecutive_failures walks execution history backwards, stops at first non-failed. Threshold = 3. Webhook delivery is fire-and-forget via asyncio.create_task.
- app/services/alert_webhook.py: deliver_alert with HMAC-SHA256 over '{timestamp}.{sorted_payload_json}', 10s timeout, SSRF re-resolve at delivery, never raises. No-URL short-circuits silently. URL-without-secret logs a warning and skips.

Router + auth:
- app/routers/alerts.py: GET /v1/alerts with alert_type/since/limit/offset filters, 400 on invalid type, auth-scoped.
- app/routers/auth_routes.py: PATCH /me accepts alert_webhook_url (empty string clears; SSRF-validated). GET /alert-webhook-secret lazy-generates on first call. POST /alert-webhook-secret/regenerate requires X-Confirm-Destructive.

Integration into outcome_service.record_outcome (post-commit):
- verification_failed alert fires when execution.outcome_state == 'verification_failed'. Dormant on current main (the rule engine that sets this state lives in PR #18); activates automatically once #18 merges. No rebase of integration code required; only the migration chain needs updating.
- consecutive_failures alert fires when the streak reaches 3 on a failed outcome. Independent of PR #18; works on current main.
- outcome_timeout alert firing deferred: it requires a deadline-checking poller that cueapi-core doesn't have yet. CHECK constraint and router already accept the type, so the wiring is drop-in when that poller lands.
- Alert firing is wrapped in try/except; it must never break outcome reporting.

Tests (36 new, all passing):
- test_alert_model.py (6): CRUD, CHECK rejection for invalid type/severity, parametrized valid types, index existence.
- test_alert_service.py (7): create persists, dedup within window, dedup doesn't cross alert types, consecutive_failures counter + streak-breaking + threshold constant.
- test_alert_webhook_delivery.py (7): no-URL short-circuit, URL-without-secret skip, SSRF block, HMAC signature recomputation, timeout/non-2xx/RuntimeError all swallowed.
- test_alerts_api.py (8): empty list, own alerts, type filter, invalid type rejected, pagination, cross-user scoping, auth required.
- test_alert_webhook_config.py (6): set valid URL, empty string clears, SSRF rejection at config, lazy secret generation, confirmation required, rotation.
- test_outcome_triggers_alert.py (3): verification_failed end-to-end (seeds outcome_state to exercise the integration path), consecutive failures end-to-end, isolated failure does NOT fire.

Full-suite delta: +36 passing, 0 new failures. Pre-existing SDK-integration failures (cueapi Python package not installed locally) unchanged.

Docs:
- README 'Alerts' section with alert types, querying, webhook setup.
- examples/alert_webhook_receiver.py: 30-line Flask receiver with signature verification.
- CHANGELOG [Unreleased] entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
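The signing scheme mentioned for deliver_alert (HMAC-SHA256 over '{timestamp}.{sorted_payload_json}') might be recomputed on the receiving side roughly as below. This is a sketch under assumptions: the function names are hypothetical, "sorted" is taken to mean `json.dumps(..., sort_keys=True)`, and the compact separators and hex encoding are guesses. A real receiver should match whatever app/services/alert_webhook.py actually emits.

```python
import hashlib
import hmac
import json
from typing import Any, Mapping


def sign_alert(secret: str, timestamp: str, payload: Mapping[str, Any]) -> str:
    # Assumption: keys sorted and compact separators; hex digest output.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    message = f"{timestamp}.{body}".encode("utf-8")
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_alert(
    secret: str, timestamp: str, payload: Mapping[str, Any], signature: str
) -> bool:
    expected = sign_alert(secret, timestamp, payload)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

Binding the timestamp into the signed message lets a receiver reject replays of old deliveries, provided it also checks that the timestamp is recent.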

argus-qa-ai

All CI checks passing. Approved by Argus.

PR #18 (verification modes) landed first, so record_outcome now unconditionally overwrites execution.outcome_state from the rule engine: (success, verification_mode, evidence). Pre-seeding outcome_state='verification_failed' before calling the endpoint was a pre-#18 strategy: the seed gets overwritten to 'reported_success' when the test sends success=True on a cue with no verification_mode.

Fix: configure the cue with verification_mode='require_external_id' and report success=True without external_id. The rule engine naturally lands in verification_failed, which triggers the alert hook. This is the real production path users hit.

No behavior change in the alert hook. Test fixture _cue() now accepts verification_mode kwarg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: alerts with webhook delivery + outcome integration
* test(alerts): trigger verification_failed via rule engine, not pre-seed

Co-authored-by: Gk <gk@Gks-MacBook-Pro.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cueapi-worker 0.3.0 (released 2026-04-17 to PyPI) closes the worker-side evidence gap via CUEAPI_OUTCOME_FILE. The daemon reads the handler's per-run temp file after exit and merges the evidence fields into its outcome POST. All five verification modes now work on both transports.

Changes:
- app/services/cue_service.py: remove _check_transport_verification_combo and the two calls in create_cue + update_cue. Replace with info-level logging when a worker cue is configured with an evidence-requiring mode (breadcrumb for operators still running older cueapi-worker).
- tests/test_transport_verification_combo.py: flip expected 400 → 201 on create, 400 → 200 on PATCH. Header comment documents the history. Two test classes renamed from WorkerEvidenceRejected* to WorkerEvidenceAccepted* / PatchTransitions::test_patch_worker_to_evidence_mode_accepted.
- README.md: update transport-compatibility footnote to reflect the new accept-everything reality, with an upgrade hint for users on cueapi-worker < 0.3.0.
- CHANGELOG: replace the "Restricted" entry (worker+evidence rejection) with a "Removed" entry describing the lift.

Tests: 13/13 pass locally on the updated combo suite.

Preconditions met:
- cueapi-worker 0.3.0 published to PyPI (2026-04-17 22:04:39 UTC)
- cueapi-core #18 merged to main (verification_mode column + rule engine in place to read verification_mode and produce outcome_state transitions)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
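From the handler's side, reporting evidence through CUEAPI_OUTCOME_FILE might look something like the sketch below. Only the environment variable name and the evidence field names come from the text above. The helper name is hypothetical, and writing the file as a JSON object is an assumption, since the file format is not stated here; check the cueapi-worker documentation before relying on it.

```python
import json
import os
from typing import Any, Dict


def write_outcome_evidence(evidence: Dict[str, Any]) -> bool:
    """Write evidence where the worker daemon can pick it up after exit.

    Returns False when CUEAPI_OUTCOME_FILE is not set (for example when the
    handler runs outside the worker, or under cueapi-worker < 0.3.0).
    """
    path = os.environ.get("CUEAPI_OUTCOME_FILE")
    if not path:
        return False
    # Assumption: the daemon expects a single JSON object in this file.
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(evidence, fh)
    return True
```

A handler for a cue in `require_external_id` mode would call `write_outcome_evidence({"external_id": "..."})` with its own identifier just before exiting successfully.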

govindkavaturi-art enabled auto-merge (squash) April 17, 2026 01:32

govindkavaturi-art mentioned this pull request Apr 17, 2026

feat: alerts with webhook delivery + outcome integration #20

Merged


argus-qa-ai approved these changes Apr 17, 2026


Merge branch 'main' into feat/verification-modes-and-evidence-parity

a1db0e0

govindkavaturi-art merged commit 498c301 into main Apr 17, 2026
3 checks passed

This was referenced Apr 17, 2026

feat: lift worker transport evidence-verification rejection #21

Open

docs(postman): add official Postman collection generated from OpenAPI #17

Closed

govindkavaturi-art merged 2 commits into main from feat/verification-modes-and-evidence-parity
