feat: Phase 3.5 — cookie import, QA testing, team retro (v0.3.1) (#29) by stoll · Pull Request #1 · stoll/gstack

stoll · 2026-03-13T18:00:53Z

Phase 2: Enhanced browser — dialog handling, upload, state checks, snapshots

CircularBuffer O(1) ring buffer for console/network/dialog (was O(n) array+shift)
Async buffer flush with Bun.write() (was appendFileSync)
Dialog auto-accept/dismiss with buffer + prompt text support
File upload command (upload <sel> <file...>)
Element state checks (is visible/hidden/enabled/disabled/checked/editable/focused)
Annotated screenshots with ref labels overlaid (-a flag)
Snapshot diffing against previous snapshot (-D flag)
Cursor-interactive element scan for non-ARIA clickables (-C flag)
Snapshot scoping depth limit (-d N flag)
Health check with page.evaluate + 2s timeout
Playwright error wrapping — actionable messages for AI agents
Fix useragent — context recreation preserves cookies/storage/URLs
wait --networkidle / --load / --domcontentloaded flags
console --errors filter (error + warning only)
cookie-import <json-file> with auto-fill domain from page URL
166 integration tests (was ~63)
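
The ring-buffer item above can be illustrated with a minimal sketch. This is not the project's actual CircularBuffer; the method names (push, toArray, length) are assumptions made for illustration. It shows why appends stay O(1): when the buffer is full, the oldest slot is overwritten in place instead of shifting an array.

```typescript
// Fixed-capacity ring buffer (illustrative sketch, not the gstack implementation).
// push() is O(1): when full, it overwrites the oldest entry instead of shifting.
class CircularBuffer<T> {
  private readonly capacity: number;
  private readonly items: (T | undefined)[];
  private head = 0; // index of the oldest live entry
  private size = 0; // number of live entries

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
    this.items = new Array<T | undefined>(capacity);
  }

  push(item: T): void {
    // The write slot is one past the newest entry, wrapping around.
    const tail = (this.head + this.size) % this.capacity;
    this.items[tail] = item;
    if (this.size < this.capacity) {
      this.size++;
    } else {
      // Buffer was full: the slot just written held the oldest entry.
      this.head = (this.head + 1) % this.capacity;
    }
  }

  // Returns live entries from oldest to newest.
  toArray(): T[] {
    const out: T[] = [];
    for (let i = 0; i < this.size; i++) {
      out.push(this.items[(this.head + i) % this.capacity] as T);
    }
    return out;
  }

  get length(): number {
    return this.size;
  }
}
```

With a capacity of 3, pushing 1 through 5 leaves 3, 4, 5 in the buffer, and no push ever moves existing elements.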

Phase 2: Rewrite SKILL.md as QA playbook + command reference

Reorient SKILL.md files from raw command reference to QA-first playbook with 10 workflow patterns (test user flows, verify deployments, dogfood features, responsive layouts, file upload, forms, dialogs, compare pages). Compact command reference tables at the bottom.

Phase 3: /qa skill — systematic QA testing with health scores

New /qa skill for systematic web app QA testing. Three modes:

full: 5-10 documented issues with screenshots and repro steps
quick: 30-second smoke test with health score
regression: compare against saved baseline

Includes issue taxonomy (7 categories, 4 severity levels), structured report template, health score rubric (weighted across 7 categories), framework detection guidance (Next.js, Rails, WordPress, SPA).

Also adds browse/bin/find-browse (DRY binary discovery using git rev-parse), .gstack/ to .gitignore, and updated TODO roadmap.

Bump to v0.3.0 — Phase 2 + Phase 3 changelog
feat: cookie-import-browser — Chromium cookie decryption module + tests

Pure logic module for reading and decrypting cookies from macOS Chromium browsers (Comet, Chrome, Arc, Brave, Edge). Supports v10 AES-128-CBC encryption with macOS Keychain access, PBKDF2 key derivation, and per-browser key caching. 18 unit tests with encrypted cookie fixtures.
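
As a rough sketch of the v10 scheme described above (not the module's actual code), the key derivation and decryption could look like the following. The function names are hypothetical. The constants are the ones Chromium is known to use on macOS: salt "saltysalt", 1003 PBKDF2-HMAC-SHA1 iterations, a 16-byte key, and an IV of 16 space characters. The password would normally come from the macOS Keychain, which this sketch takes as a plain argument. Newer Chromium builds reportedly prepend a host hash to the plaintext; that case is not handled here.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync } from "node:crypto";

// Constants used by Chromium on macOS for "v10" cookie values.
const SALT = "saltysalt";
const ITERATIONS = 1003;
const KEY_LENGTH = 16; // AES-128
const IV = Buffer.alloc(16, 0x20); // 16 space characters
const PREFIX = Buffer.from("v10");

// Derive the AES key from the browser's Keychain password (hypothetical helper name).
function deriveKey(keychainPassword: string): Buffer {
  return pbkdf2Sync(keychainPassword, SALT, ITERATIONS, KEY_LENGTH, "sha1");
}

// Decrypt one encrypted_value blob; throws if the version prefix is not "v10".
function decryptCookieValue(encrypted: Buffer, key: Buffer): string {
  if (!encrypted.subarray(0, PREFIX.length).equals(PREFIX)) {
    throw new Error("unsupported cookie encryption version");
  }
  const decipher = createDecipheriv("aes-128-cbc", key, IV);
  const plain = Buffer.concat([
    decipher.update(encrypted.subarray(PREFIX.length)),
    decipher.final(),
  ]);
  return plain.toString("utf8");
}

// Inverse operation, useful only for building encrypted test fixtures.
function encryptCookieValue(value: string, key: Buffer): Buffer {
  const cipher = createCipheriv("aes-128-cbc", key, IV);
  return Buffer.concat([PREFIX, cipher.update(value, "utf8"), cipher.final()]);
}
```

Deriving the key once per browser and reusing it is what makes per-browser key caching worthwhile, since PBKDF2 is the expensive step.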

feat: cookie picker web UI + route handler

Two-panel dark-theme picker served from the browse server. Left panel shows source browser domains with search and import buttons. Right panel shows imported domains with trash buttons. No cookie values exposed. 6 API endpoints, importedDomains Set tracking, inline clearCookies.

feat: wire cookie-import-browser into browse server

Add cookie-picker route dispatch (no auth, localhost-only), add cookie-import-browser to WRITE_COMMANDS and CHAIN_WRITE, add serverPort property to BrowserManager, add write command with two modes (picker UI vs --domain direct import), update CLI help text.

chore: /setup-browser-cookies skill + docs (Phase 3.5)
chore: bump version and changelog (v0.3.1)
security: redact sensitive values from command output (PR garrytan/gstack#21)

type no longer echoes text (reports character count), cookie redacts value with ****, header redacts Authorization/Cookie/X-API-Key/X-Auth-Token, storage set drops value, forms redacts password fields. Prevents secrets from persisting in LLM transcripts. 7 new tests.
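
A minimal sketch of this kind of output redaction, assuming hypothetical helper names and output strings (the actual command output format is not shown in this PR): sensitive header values are replaced with ****, and typed text is reported only by length.

```typescript
// Header names whose values should never reach command output (from the list above).
const SENSITIVE_HEADERS = new Set([
  "authorization",
  "cookie",
  "x-api-key",
  "x-auth-token",
]);

// Format a header for display, masking the value if the name is sensitive.
// Header names are case-insensitive, so compare in lowercase.
function redactHeader(name: string, value: string): string {
  const shown = SENSITIVE_HEADERS.has(name.toLowerCase()) ? "****" : value;
  return `${name}: ${shown}`;
}

// Report what was typed without echoing it (hypothetical message format).
function describeTyped(text: string): string {
  return `Typed ${text.length} characters`;
}
```

For example, redactHeader("Authorization", "Bearer abc") yields "Authorization: ****", so the token never lands in an LLM transcript.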

Credit: fredluz (PR garrytan#21)

security: path traversal prevention for screenshot/pdf/eval (PR garrytan/gstack#26)

Add validateOutputPath() for screenshot/pdf/responsive (restricts to /tmp and cwd) and validateReadPath() for eval (blocks .. sequences and absolute paths outside safe dirs). 7 new tests.

Credit: Jah-yee (PR garrytan#26)

fix: auto-install Playwright Chromium in setup (PR garrytan/gstack#22)

Setup now verifies Playwright can launch Chromium, and auto-installs it via `bunx playwright install chromium` if missing. Exits non-zero if build or Chromium launch fails.

Credit: AkbarDevop (PR garrytan#22)

security: fix path validation bypass, CORS restriction, cookie-import path check

startsWith('/tmp') matched '/tmpevil' — now requires trailing slash
CORS Access-Control-Allow-Origin changed from * to http://127.0.0.1:<port>
cookie-import now validates file paths (was missing validateReadPath)
3 new tests for prefix collision and cookie-import path traversal
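
The prefix collision in the first item can be reproduced with a short sketch. The function names are hypothetical, and the sketch assumes the candidate path has already been resolved to an absolute, normalized form (for example with path.resolve) before the check.

```typescript
// Buggy check: a plain prefix test also accepts sibling directories like /tmpevil.
function naiveIsWithinDir(resolvedPath: string, dir: string): boolean {
  return resolvedPath.startsWith(dir);
}

// Fixed check: accept the directory itself, or anything under "<dir>/".
// Assumes resolvedPath is already absolute and normalized (no ".." segments).
function isWithinDir(resolvedPath: string, dir: string): boolean {
  const base = dir.endsWith("/") ? dir.slice(0, -1) : dir;
  return resolvedPath === base || resolvedPath.startsWith(base + "/");
}
```

Here naiveIsWithinDir("/tmpevil/x.png", "/tmp") returns true, while isWithinDir returns false for the same input and still accepts "/tmp/shot.png".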

fix: address review informational issues + add regression tests

Add cookie-import to CHAIN_WRITE set for chain command routing
Add path validation to snapshot -a -o output path
Fix package.json version to match 0.3.1
Use crypto.randomUUID() for temp DB paths (unpredictable filenames)
Add regression tests for chain cookie-import and snapshot path validation

docs: add /qa, /setup-browser-cookies to README + update BROWSER.md

Add /qa and /setup-browser-cookies to skills table, install/update/uninstall blurbs
Add dedicated README sections for both new skills with usage examples
Update demo workflow to show cookie import → QA → browse flow
Update BROWSER.md: cookie import commands, new source files, test count (203)
Update skill count from 6 to 8

feat: team-aware /retro v2.0 — per-person praise and growth opportunities

Identify current user via git config, orient narrative as "you" vs teammates
Add per-author metrics: commits, LOC, focus areas, commit type mix, sessions
New "Your Week" section with personal deep-dive for whoever runs the command
New "Team Breakdown" with per-person praise and growth opportunities
Track AI-assisted commits via Co-Authored-By trailers
Personal + team shipping streaks
Tone: praise like a 1:1, growth like investment advice, never compare negatively
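
The trailer-based tracking above could be sketched as follows. This is an illustration under assumptions, not the /retro implementation: the Commit shape, the function names, and the idea of matching trailers against a caller-supplied list of AI email addresses are all hypothetical.

```typescript
// Minimal commit shape for illustration (hypothetical).
interface Commit {
  author: string;
  message: string;
}

// Extract lowercase emails from "Co-Authored-By: Name <email>" trailer lines.
function coAuthorEmails(message: string): string[] {
  const emails: string[] = [];
  for (const line of message.split("\n")) {
    const m = /^co-authored-by:\s*.*<([^>]+)>\s*$/i.exec(line.trim());
    if (m && m[1]) {
      emails.push(m[1].toLowerCase());
    }
  }
  return emails;
}

// Count commits that carry at least one trailer from a known AI address.
function countAiAssisted(commits: Commit[], aiEmails: string[]): number {
  const known = new Set(aiEmails.map((e) => e.toLowerCase()));
  return commits.filter((c) =>
    coAuthorEmails(c.message).some((e) => known.has(e)),
  ).length;
}
```

Trailer keys are matched case-insensitively because both Co-Authored-By and Co-authored-by spellings appear in practice.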

docs: add Conductor parallel sessions section to README


…ytan#384)

feat: /cso v2 - infrastructure-first security audit

Rewrite /cso from code-centric OWASP scanning to infrastructure-first attack surface analysis. 15 phases covering secrets archaeology, dependency supply chain, CI/CD pipeline security, webhook verification, LLM/AI security, skill supply chain scanning, plus OWASP Top 10, STRIDE, and data classification.

Key design decisions from eng review + Codex adversarial review:

Soft gate stack detection (prioritize, don't skip)
Error on conflicting scope flags (never silently ignore)
Permission gate before scanning ~/.claude/skills/
Graceful degradation when audit tools aren't installed
Finding fingerprints for cross-run trend tracking
Variant analysis: one verified vuln triggers codebase-wide search
Dual confidence modes: daily (8/10 gate) vs comprehensive (2/10)

docs: /cso v2 acknowledgements - 10 projects that informed the design

Credits: Sentry (confidence gating), Trail of Bits (mental model + variant analysis), Shannon/Keygraph (active verification validation), afiqiqmal (framework detection + LLM security), Snyk ToxicSkills (skill supply chain), Miessler PAI (incident playbooks), McGo (report format), Claude Code Security Pack (modular validation), Anthropic CCS (500+ zero-days), and @gus_argon (v1 blind spot identification).

test: /cso v2 E2E tests - full audit, diff mode, infra scope

Three E2E test cases with planted vulnerabilities:

cso-full-audit: hardcoded API key + .env tracked by git
cso-diff-mode: webhook without signature verification on feature branch
cso-infra-scope: unpinned GitHub Action + Dockerfile without USER

fix: /cso E2E tests - correct logCost and recordE2E signatures

logCost requires (label, result), recordE2E requires (collector, name, suite, result). Fixed all 3 test cases.

fix: /cso infra E2E test - increase timeout to 360s

The infra scope test runs Agent sub-tasks for parallel finding verification which can take longer than 240s. Increased maxTurns from 25 to 60 and timeout from 240s to 360s.

fix: /cso infra E2E test - sharper prompt to prevent exploration waste

The agent was burning 30+ turns exploring a 3-file repo (18 Glob calls, Explore subagent, 4 SKILL.md reads) before starting the audit. Two Agent verification subagents then ate ~100s, causing the 240s timeout. Fix: tell the agent the repo is tiny, list the exact files, skip the preamble, remove Agent from allowed tools, reduce maxTurns 60→30.

chore: bump version and changelog (v0.11.6.0)

fix: address Codex adversarial findings in /cso v2

Six fixes from Codex adversarial review:

1. Phase 2: Use `git log -G` (regex) instead of `-S` (literal) for patterns with alternation (ghp_|gho_|github_pat_, etc.)
2. Phase 12 exclusion #5: Add exception so CI/CD pipeline findings from Phase 4 are never auto-discarded when --infra is active
3. Phase 12 exclusion #6: Add exception that unpinned actions and missing CODEOWNERS are concrete risks, not "missing hardening"
4. Phase 12 exclusion #15: Add exception that SKILL.md files are executable prompt code, not documentation; Phase 8 findings in SKILL.md must not be excluded
5. Phase 12 exclusion #1: Add exception that LLM cost/spend amplification from Phase 7 is financial risk, not DoS
6. E2E tests: Add exitReason === 'success' assertion to all 3 tests; move finalizeEvalCollector to file-level afterAll

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
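
The reason for the first Codex fix (regex search instead of literal search for alternation patterns) can be shown with a small analogy in TypeScript. This does not invoke git; it only contrasts literal substring matching with regex matching on the same pattern string. The sample line and token value are made up for illustration.

```typescript
// A secret-prefix pattern that relies on regex alternation.
const SECRET_PATTERN = "ghp_|gho_|github_pat_";

// Literal search: looks for the exact characters, including the "|" symbols.
function literalMatch(line: string, pattern: string): boolean {
  return line.includes(pattern);
}

// Regex search: "|" is treated as alternation, so any one prefix matches.
function regexMatch(line: string, pattern: string): boolean {
  return new RegExp(pattern).test(line);
}

// A made-up line of the kind a secrets scan is meant to catch.
const sampleLine = 'const token = "ghp_exampleexampleexample";';
const foundLiterally = literalMatch(sampleLine, SECRET_PATTERN); // false
const foundByRegex = regexMatch(sampleLine, SECRET_PATTERN); // true
```

A literal search for the whole pattern string would only hit a line that contains the pipe characters verbatim, which is why an alternation pattern needs a regex-aware search.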

stoll merged commit e971973 into stoll:main Mar 13, 2026

stoll merged 1 commit into stoll:main from garrytan:main
