feat: Agent Improvement Engine (insights, improve, evolve) #765
alishakawaguchi wants to merge 32 commits into main
Adds EvolveSettings struct, EvolveConfig field on EntireSettings, and GetEvolveConfig() convenience method with defaults (SessionThreshold=5). Includes mergeJSON support and unit tests covering nil/zero/explicit cases. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Entire-Checkpoint: a9304e803fa1
Create cmd/entire/cli/llmcli with Runner.Execute(), StripGitEnv(), and ExtractJSONFromMarkdown() so the upcoming improve/generator.go can reuse the same CLI invocation logic without duplicating it from summarize. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Entire-Checkpoint: bb2d026600c0
Move terminal color detection, width calculation, token formatting, and lipgloss style construction into a new exported cmd/entire/cli/termstyle package so upcoming renderers (insights, improve) can reuse them without duplicating the logic or importing the cli package. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Entire-Checkpoint: 04021bde7725
After summary generation in CondenseSession(), compute a SessionScore using pure math via insights.ScoreSession/ComputeOverall and return it in CondenseResult. No AI call, no latency impact (<1ms). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Entire-Checkpoint: a7a24df28aa9
…gestion generator Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds the `entire insights` command that reads session quality data from a SQLite cache backed by the entire/checkpoints/v1 branch, then renders scores, trends, and agent comparisons in the terminal or as JSON. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add the evolve package with three responsibilities: threshold-based triggering (ShouldTrigger/IncrementSessionCount/RecordRun), in-memory suggestion lifecycle tracking (Tracker with Accept/Reject/MeasureImpact), and user-facing notification (CheckAndNotify) when the session count meets the configured threshold. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Entire-Checkpoint: a67b9372b855
Adds `entire improve` — a two-phase pipeline that queries recurring friction from the SQLite insights cache, deep-reads transcripts for evidence, detects context files, and calls Claude to generate unified diff suggestions. Also adds JSON tags to insightsdb.FrictionTheme. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Entire-Checkpoint: 64a14f6741b8
…engine
- Add context.Context to all insightsdb methods (noctx)
- Wrap external package errors (wrapcheck)
- Add nolint for tx.Rollback errcheck and maintidx on CondenseSession
- Fix nilerr in cache refresh, unparam in renderInsightsTerminal
- Add insightsdb cache/db files, insights scoring package
- Run go mod tidy for modernc.org/sqlite dependency
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: b5a4d26c47f7
Cursor Bugbot has reviewed your changes and found 3 potential issues.
cmd/entire/cli/improve_cmd.go
Outdated
| summaries := sessionRowsToSummaries(rows) | ||
| analysis := improve.AnalyzePatterns(summaries) | ||
| // Overlay the transcript excerpts we fetched into the analysis. | ||
| applyExcerpts(analysis.RepeatedFriction, patterns) |
Theme mismatch prevents transcript excerpts from being applied
High Severity
applyExcerpts matches dst and src patterns by Theme, but the two slices use incompatible theme formats. buildFrictionPatterns sets Theme to the raw friction text from the database (e.g., "Lint errors not caught by agent"), while AnalyzePatterns sets Theme to a classified keyword (e.g., "lint"). The map lookup in applyExcerpts will never find a match, so transcript excerpts collected in Phase 2 are silently discarded and never included in the prompt sent to Claude.
| return nil | ||
| } | ||
| return f | ||
| } |
nullableFloat stores legitimate zero scores as NULL
Medium Severity
nullableFloat converts any zero float to SQL NULL, but the scoring functions (scoreFriction, scoreFocus, scoreFirstPassSuccess) can legitimately produce 0.0 for poor-performing sessions (e.g., 5+ friction items yields ScoreFriction = 0). This conflates "score not yet computed" with "worst possible score" at the database level, making it impossible to distinguish the two states in queries.
| total += totalTokensFromUsage(tu.SubagentTokens) | ||
| } | ||
| return total | ||
| } |
Duplicate token summation function added in strategy package
Low Severity
totalTokensFromUsage is functionally identical to termstyle.TotalTokens (which was extracted from the old totalTokens in status_style.go in this same PR). Both recursively sum InputTokens + CacheCreationTokens + CacheReadTokens + OutputTokens including subagent tokens. The new function duplicates existing logic rather than reusing it.
Pull request overview
Adds an “Agent Improvement Engine” to Entire CLI: new insights and improve commands powered by a local SQLite cache, plus an opt-in “evolve” loop to nudge users to run improvements after N sessions.
Changes:
- Introduces `entire insights` (session scoring, trends, agent comparisons) backed by a SQLite cache (`modernc.org/sqlite`).
- Introduces `entire improve` (recurring friction detection + optional transcript deep-read + Claude CLI suggestions with unified diffs).
- Extracts shared utilities into new `llmcli` (Claude CLI runner) and `termstyle` (terminal styling) packages; adds evolve settings + basic evolve state helpers.
Reviewed changes
Copilot reviewed 40 out of 41 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| go.mod | Adds modernc.org/sqlite dependency (plus indirects). |
| go.sum | Updates module checksums for new dependencies. |
| cmd/entire/cli/termstyle/termstyle.go | New shared terminal styling helpers (color/width detection, rules, token formatting). |
| cmd/entire/cli/termstyle/termstyle_test.go | Unit tests for termstyle utilities. |
| cmd/entire/cli/summarize/claude.go | Refactors summarization Claude invocation to use shared llmcli. |
| cmd/entire/cli/llmcli/llmcli.go | New shared Claude CLI runner with git isolation + markdown JSON extraction. |
| cmd/entire/cli/llmcli/llmcli_test.go | Tests for runner defaults, error paths, git isolation, markdown extraction. |
| cmd/entire/cli/status_style.go | Switches existing status renderer to delegate styling/token helpers to termstyle. |
| cmd/entire/cli/strategy/manual_commit_types.go | Extends CondenseResult with optional SessionScore. |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Computes session score during condensation and returns it in CondenseResult. |
| cmd/entire/cli/strategy/manual_commit_condensation_test.go | Adds tests for token/turn-count helper functions used in scoring. |
| cmd/entire/cli/insightsdb/db.go | New SQLite cache DB open/migrations + table listing for tests. |
| cmd/entire/cli/insightsdb/db_test.go | Tests for DB creation/migrations/idempotency. |
| cmd/entire/cli/insightsdb/cache.go | Cache insert/query helpers and schema mapping structs. |
| cmd/entire/cli/insightsdb/cache_test.go | Tests for cache meta + insert behavior + denormalized fields. |
| cmd/entire/cli/insightsdb/queries.go | Cache query methods (last N, by agent, recurring friction, etc.). |
| cmd/entire/cli/insightsdb/queries_test.go | Tests for query ordering/filtering/recurring friction behavior. |
| cmd/entire/cli/insights/insights.go | New insights domain types (scores, trends, report). |
| cmd/entire/cli/insights/scorer.go | New scoring algorithm + overall weighting. |
| cmd/entire/cli/insights/scorer_test.go | Unit tests for scoring functions. |
| cmd/entire/cli/insights/trends.go | Trend analysis + per-agent comparisons. |
| cmd/entire/cli/insights/trends_test.go | Tests for trends + agent comparisons. |
| cmd/entire/cli/insights_cmd.go | New entire insights command: cache refresh + renderers + score computation. |
| cmd/entire/cli/improve/improve.go | New improve domain types (context files, suggestions, pattern analysis). |
| cmd/entire/cli/improve/context_files.go | Detects known context files and reads content. |
| cmd/entire/cli/improve/context_files_test.go | Tests for context file detection behavior. |
| cmd/entire/cli/improve/analyzer.go | Builds repeated-friction themes + deduped learnings/open items from summaries. |
| cmd/entire/cli/improve/analyzer_test.go | Tests for analyzer theming/deduplication/threshold behavior. |
| cmd/entire/cli/improve/generator.go | Generates context-file suggestions via Claude CLI using llmcli. |
| cmd/entire/cli/improve/generator_test.go | Tests for generator parsing, defaults, IDs, timestamps, error handling. |
| cmd/entire/cli/improve_cmd.go | New entire improve command: friction query, deep-read excerpts, suggestion rendering. |
| cmd/entire/cli/evolve/evolve.go | New evolve state + suggestion record types. |
| cmd/entire/cli/evolve/trigger.go | Trigger logic for session-threshold evolution loop. |
| cmd/entire/cli/evolve/trigger_test.go | Tests for trigger logic/state updates. |
| cmd/entire/cli/evolve/notify.go | Emits user-facing “tip” to run entire improve when threshold reached. |
| cmd/entire/cli/evolve/notify_test.go | Tests for notification behavior. |
| cmd/entire/cli/evolve/tracker.go | In-memory tracker for suggestion lifecycle and simple impact measurement. |
| cmd/entire/cli/evolve/tracker_test.go | Tests for tracker operations. |
| cmd/entire/cli/settings/settings.go | Adds evolve settings + defaulting + JSON merge handling. |
| cmd/entire/cli/settings/settings_evolve_test.go | Tests for evolve settings defaulting/override behavior. |
| cmd/entire/cli/root.go | Registers new insights and improve commands. |
| // Split into first and second halves. | ||
| mid := len(scores) / 2 | ||
| firstHalf := scores[:mid] | ||
| secondHalf := scores[mid:] | ||
|
|
||
| firstAvg := average(firstHalf, extract) | ||
| secondAvg := average(secondHalf, extract) | ||
|
|
ComputeTrends assumes the input slice is ordered oldest→newest when splitting into halves. In entire insights, sessions are queried ORDER BY created_at DESC, so the newest sessions land in the first half and trend directions will be inverted. Consider sorting by CreatedAt ascending inside ComputeTrends (or explicitly reversing in the caller) before computing averages/data points.
| // buildPrompt constructs the prompt for the Claude CLI. | ||
| // All untrusted content (friction text, learnings, context file content) is wrapped | ||
| // in XML tags to prevent prompt injection. | ||
| func buildPrompt(analysis PatternAnalysis, contextFiles []ContextFile) string { | ||
| var sb strings.Builder | ||
|
|
||
| sb.WriteString(`Analyze recurring patterns from recent AI coding sessions and suggest | ||
| improvements to the project's context files. | ||
|
|
||
| `) | ||
|
|
||
| // Repeated friction section | ||
| sb.WriteString("<repeated_friction>\n") | ||
| if len(analysis.RepeatedFriction) == 0 { | ||
| sb.WriteString("(no repeated friction patterns found)\n") | ||
| } else { | ||
| for _, p := range analysis.RepeatedFriction { | ||
| fmt.Fprintf(&sb, "Theme: %s issues (occurred %d times)\n", p.Theme, p.Count) | ||
| for _, ex := range p.Examples { | ||
| fmt.Fprintf(&sb, " - %q\n", ex) | ||
| } | ||
| if p.TranscriptExcerpt != "" { | ||
| fmt.Fprintf(&sb, " Excerpt: %q\n", p.TranscriptExcerpt) | ||
| } | ||
| } |
The prompt-building comment says untrusted content is wrapped in XML tags to prevent prompt injection, but the inserted friction/learnings/context contents are not escaped. A friction string containing < / </repeated_friction> could break the structure and undermine the mitigation. Consider escaping/encoding untrusted strings (e.g., XML-escape or JSON-marshal values) and updating the comment to reflect the actual guarantees.
cmd/entire/cli/improve/generator.go
Outdated
| suggestions := make([]Suggestion, 0, len(resp.Suggestions)) | ||
| for i, s := range resp.Suggestions { | ||
| suggestions = append(suggestions, Suggestion{ | ||
| ID: fmt.Sprintf("sug-%d-%d", now.Unix(), i), |
Suggestion IDs are based on now.Unix() (seconds) plus the loop index. Two entire improve runs within the same second can generate identical IDs (especially if they produce the same number of suggestions), which will collide with the suggestions.id primary key in SQLite. Consider using UnixNano, a UUID, or a random suffix for IDs.
| ID: fmt.Sprintf("sug-%d-%d", now.Unix(), i), | |
| ID: fmt.Sprintf("sug-%d-%d", now.UnixNano(), i), |
| // Build repeated friction list (threshold: 2+ occurrences) | ||
| var repeated []FrictionPattern | ||
| for theme, acc := range byTheme { | ||
| if acc.count < 2 { | ||
| continue | ||
| } | ||
| sessions := make([]string, 0, len(acc.sessions)) | ||
| for id := range acc.sessions { | ||
| sessions = append(sessions, id) | ||
| } | ||
| repeated = append(repeated, FrictionPattern{ | ||
| Theme: theme, | ||
| Count: acc.count, | ||
| Examples: acc.examples, | ||
| AffectedSessions: sessions, | ||
| }) | ||
| } |
AnalyzePatterns builds RepeatedFriction by iterating over a map, so the order of patterns (and therefore CLI output/prompt content) is nondeterministic across runs. This can lead to noisy diffs in generated suggestions and make behavior harder to reason about. Consider sorting repeated deterministically (e.g., by Count desc, then Theme asc) before returning.
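The suggested ordering (Count descending, then Theme ascending) could be sketched as follows. The `frictionPattern` type here is a trimmed, hypothetical copy of the PR's `FrictionPattern` with only the two fields the comparison needs:

```go
package main

import (
	"fmt"
	"sort"
)

// frictionPattern is a trimmed, hypothetical copy of the PR's FrictionPattern.
type frictionPattern struct {
	Theme string
	Count int
}

// sortPatterns orders patterns by Count descending, then Theme ascending,
// so the output is the same on every run regardless of map iteration order.
func sortPatterns(p []frictionPattern) {
	sort.Slice(p, func(i, j int) bool {
		if p[i].Count != p[j].Count {
			return p[i].Count > p[j].Count
		}
		return p[i].Theme < p[j].Theme
	})
}

func main() {
	patterns := []frictionPattern{{"test", 2}, {"lint", 3}, {"build", 2}}
	sortPatterns(patterns)
	for _, p := range patterns {
		fmt.Printf("%s (%d)\n", p.Theme, p.Count)
	}
}
```

Because themes are map keys and therefore unique, the two-level comparison is a total order and a non-stable sort is sufficient.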
cmd/entire/cli/insights_cmd.go
Outdated
| row.InputTokens = meta.TokenUsage.InputTokens + meta.TokenUsage.CacheCreationTokens + meta.TokenUsage.CacheReadTokens | ||
| row.OutputTokens = meta.TokenUsage.OutputTokens | ||
| row.TotalTokens = row.InputTokens + row.OutputTokens |
metadataToSessionRow populates InputTokens with input+cache tokens but never sets CacheTokens, even though the DB schema and SessionRow struct have a dedicated cache_tokens column. This makes cached token breakdown inconsistent and prevents showing cache token usage accurately. Suggest setting InputTokens to just input tokens, CacheTokens to cache creation+read, and TotalTokens to input+cache+output.
| row.InputTokens = meta.TokenUsage.InputTokens + meta.TokenUsage.CacheCreationTokens + meta.TokenUsage.CacheReadTokens | |
| row.OutputTokens = meta.TokenUsage.OutputTokens | |
| row.TotalTokens = row.InputTokens + row.OutputTokens | |
| // Keep token categories distinct so cache usage can be reported accurately. | |
| row.InputTokens = meta.TokenUsage.InputTokens | |
| row.CacheTokens = meta.TokenUsage.CacheCreationTokens + meta.TokenUsage.CacheReadTokens | |
| row.OutputTokens = meta.TokenUsage.OutputTokens | |
| row.TotalTokens = row.InputTokens + row.CacheTokens + row.OutputTokens |
cmd/entire/cli/insightsdb/cache.go
Outdated
| // nullableFloat converts a zero float to a SQL NULL value. | ||
| // Non-zero floats are passed through as-is. | ||
| func nullableFloat(f float64) interface{} { | ||
| if f == 0 { | ||
| return nil | ||
| } |
nullableFloat converts 0.0 to SQL NULL. For score fields, 0 can be a valid computed value (e.g., severe friction), so this loses information in the DB and can change behavior for any future SQL aggregates/filters. Prefer inserting 0 as 0 and using an explicit “score computed” flag/column if you need to represent “unknown”.
| // nullableFloat converts a zero float to a SQL NULL value. | |
| // Non-zero floats are passed through as-is. | |
| func nullableFloat(f float64) interface{} { | |
| if f == 0 { | |
| return nil | |
| } | |
| // nullableFloat passes floats through as-is so that 0.0 is preserved as a valid value. | |
| func nullableFloat(f float64) interface{} { |
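If "score not computed" does need to be representable, one alternative to a separate flag column is to make the absence explicit in the Go type. This sketch assumes a hypothetical `scoreParam` helper that takes a `*float64`; it is not the PR's code. `sql.NullFloat64` implements `driver.Valuer`, so the result can be passed directly as a query parameter.

```go
package main

import (
	"database/sql"
	"fmt"
)

// scoreParam converts an optional score into a SQL parameter value.
// A nil pointer means "not computed" and maps to NULL; a computed 0.0
// is stored as 0.0. Hypothetical helper, not part of the PR.
func scoreParam(score *float64) sql.NullFloat64 {
	if score == nil {
		return sql.NullFloat64{}
	}
	return sql.NullFloat64{Float64: *score, Valid: true}
}

func main() {
	worst := 0.0
	fmt.Printf("computed zero: %+v\n", scoreParam(&worst))
	fmt.Printf("not computed:  %+v\n", scoreParam(nil))
}
```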
| // QuerySessionsWithFriction returns checkpoint IDs of sessions containing | ||
| // friction matching the given SQL LIKE pattern (e.g., "%tool call failed%"). | ||
| func (idb *InsightsDB) QuerySessionsWithFriction(ctx context.Context, pattern string) ([]string, error) { | ||
| rows, err := idb.db.QueryContext(ctx, | ||
| "SELECT DISTINCT checkpoint_id FROM friction WHERE text LIKE ?", | ||
| pattern, | ||
| ) |
QuerySessionsWithFriction returns only checkpoint IDs, dropping session_index even though the friction table is keyed by (checkpoint_id, session_index). Callers (e.g., transcript deep-read) can end up reading the wrong session (or always index 0) and miss the friction evidence. Consider returning (checkpoint_id, session_index) pairs (or a small struct) and adjusting callers accordingly.
cmd/entire/cli/improve_cmd.go
Outdated
| // Phase 2: Deep-read transcripts for top friction themes. | ||
| patterns := buildFrictionPatterns(frictionThemes) | ||
| if err = attachTranscriptExcerpts(ctx, idb, patterns, worktreeRoot); err != nil { | ||
| // Non-fatal: proceed without transcript excerpts. | ||
| _ = err | ||
| } | ||
|
|
||
| // Phase 3: Detect context files. | ||
| contextFiles := improve.DetectContextFiles(worktreeRoot) | ||
|
|
||
| // Phase 4: Build analysis from session data + friction patterns, then generate. | ||
| summaries := sessionRowsToSummaries(rows) | ||
| analysis := improve.AnalyzePatterns(summaries) | ||
| // Overlay the transcript excerpts we fetched into the analysis. | ||
| applyExcerpts(analysis.RepeatedFriction, patterns) |
Phase 2/4 excerpt wiring looks broken: patterns := buildFrictionPatterns(frictionThemes) sets Theme to the raw friction text, but AnalyzePatterns produces RepeatedFriction themes like "lint", "test", etc. applyExcerpts matches by Theme, so transcript excerpts will almost never attach to the analysis passed into Generator.Generate. Consider running attachTranscriptExcerpts directly on analysis.RepeatedFriction (or ensuring both phases use the same theme key).
| // Phase 2: Deep-read transcripts for top friction themes. | |
| patterns := buildFrictionPatterns(frictionThemes) | |
| if err = attachTranscriptExcerpts(ctx, idb, patterns, worktreeRoot); err != nil { | |
| // Non-fatal: proceed without transcript excerpts. | |
| _ = err | |
| } | |
| // Phase 3: Detect context files. | |
| contextFiles := improve.DetectContextFiles(worktreeRoot) | |
| // Phase 4: Build analysis from session data + friction patterns, then generate. | |
| summaries := sessionRowsToSummaries(rows) | |
| analysis := improve.AnalyzePatterns(summaries) | |
| // Overlay the transcript excerpts we fetched into the analysis. | |
| applyExcerpts(analysis.RepeatedFriction, patterns) | |
| // Phase 2: (deprecated wiring) Deep-read transcripts will be attached after analysis is built. | |
| // Phase 3: Detect context files. | |
| contextFiles := improve.DetectContextFiles(worktreeRoot) | |
| // Phase 4: Build analysis from session data, then attach transcript excerpts and generate. | |
| summaries := sessionRowsToSummaries(rows) | |
| analysis := improve.AnalyzePatterns(summaries) | |
| // Attach transcript excerpts directly to the repeated friction patterns used in generation. | |
| if err = attachTranscriptExcerpts(ctx, idb, analysis.RepeatedFriction, worktreeRoot); err != nil { | |
| // Non-fatal: proceed without transcript excerpts. | |
| _ = err | |
| } |
cmd/entire/cli/improve_cmd.go
Outdated
| // truncateString truncates s to at most maxLen bytes, appending "..." if truncated. | ||
| func truncateString(s string, maxLen int) string { | ||
| if len(s) <= maxLen { | ||
| return s | ||
| } | ||
| return s[:maxLen] + "..." | ||
| } |
truncateString slices by bytes (s[:maxLen]), which can cut multi-byte UTF-8 sequences (common in transcripts) and produce invalid text. Consider truncating by runes (or using utf8.ValidString / []rune), and ensure the ellipsis doesn’t exceed the requested limit if that matters.
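A rune-based version might look like the sketch below. `truncateRunes` is a hypothetical name used for illustration, and the limit is counted in runes rather than bytes. As in the original, the ellipsis is appended on top of the limit.

```go
package main

import "fmt"

// truncateRunes returns s cut to at most maxRunes runes, appending "..."
// when anything was removed. It never splits a multi-byte UTF-8 sequence.
func truncateRunes(s string, maxRunes int) string {
	if maxRunes < 0 {
		maxRunes = 0
	}
	r := []rune(s)
	if len(r) <= maxRunes {
		return s
	}
	return string(r[:maxRunes]) + "..."
}

func main() {
	fmt.Println(truncateRunes("héllo wörld", 5))
	fmt.Println(truncateRunes("日本語", 5))
}
```

Converting to `[]rune` allocates a copy of the whole string; for very long transcripts, iterating with `range` and stopping at the cut point would avoid that.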
cmd/entire/cli/improve/generator.go
Outdated
| func (g *Generator) Generate(ctx context.Context, analysis PatternAnalysis, contextFiles []ContextFile) ([]Suggestion, error) { | ||
| prompt := buildPrompt(analysis, contextFiles) | ||
|
|
||
| raw, err := g.Runner.Execute(ctx, prompt) |
Generator.Generate assumes g.Runner is non-nil and will panic if a caller constructs Generator{}. Since Generator is exported, consider defaulting g.Runner to &llmcli.Runner{} when nil (or returning a clear error) to avoid panics in other packages/tests.
| raw, err := g.Runner.Execute(ctx, prompt) | |
| runner := g.Runner | |
| if runner == nil { | |
| runner = &llmcli.Runner{} | |
| } | |
| raw, err := runner.Execute(ctx, prompt) |
The ireturn linter now flags these regardless of nolint directives. These are type-assertion helpers that must return interfaces by design. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 47ce155f7bff
- Replace byte-based truncateString with UTF-8 safe stringutil.TruncateRunes
- Fix TotalTokens in insights_cmd to include subagent tokens via termstyle.TotalTokens
- Deduplicate totalTokensFromUsage in strategy by delegating to termstyle.TotalTokens
- Replace hand-rolled splitCSV with strings.Split
- DRY up Accept/Reject in evolve tracker via shared resolve method
- Cap friction examples at 10 per theme to prevent unbounded LLM prompt growth
- Remove unnecessary WHAT comments in evolve/tracker.go
- Restore nolint:ireturn directives removed in prior commit (fixes lint)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: d63aba52b8ce
Parse total_cost_usd and usage fields from Claude CLI JSON response. llmcli.Execute now returns UsageInfo alongside the result, and entire improve shows a cost/token summary line after the report. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 8e74831eb583
Keep session scoring from agent-improvements and writeOpts variable with v2 dual-write from main. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 71977e30b301
- Generate summaries on-demand for sessions that lack them during insights/improve runs, scoped to --last N sessions requested
- Fix token efficiency scoring: replace exp decay (always 0 for real data) with sigmoid centered at 500k tokens/turn
- Fix first-pass scoring: raise baseline to 90, reduce penalties (friction 10→5, turns 3→2, open items 5→3)
- Fix friction scoring: reduce penalty from 20→15 per item
- Fix focus scoring: use gaussian on turns/files ratio instead of flat band
- Add has_summary column to sessions table for tracking
- Add UpdateSessionSummary for backfill cache updates
- Remove nullableFloat (stored valid 0.0 scores as NULL)
- Always compute partial scores even without summaries
- Show "(no summary)" indicator in insights output
- Restore nolint:ireturn directives on capabilities.go
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: dea7491a6bda
Entire-Checkpoint: faa75c2dab8f
Entire-Checkpoint: b4d20b476fee
Entire-Checkpoint: 89347a654b11
Entire-Checkpoint: 74a14a71f350
Scaffold the memorylooptui package (styles, keys, messages, render helpers, root model with tab switching) and implement the memories tab with bubbles/table for record listing, status filter cycling, search mode, detail pane toggle, and lifecycle action keybindings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 0f0d39f4f655
- Fix truncate() to slice by rune instead of byte to prevent UTF-8 corruption - Replace hand-rolled ANSI width parser with lipgloss.Width() - Distinguish empty state messages: no store vs no records vs no filter matches - Refactor updateSearch to use key.Matches() and msg.Runes instead of msg.String() Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: c63d65192881
…d-memory form - Injection tab: log table with prompt tester and match results - History tab: refresh history table with R trigger - Settings tab: mode/policy/max cycling with auto-save - Add-memory form: inline textinput fields on memories tab (n key) - Root model: input capture coordination, refresh placeholder Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 0b5a0dcac774
…ail pane on by default - Change accent from cyan (6) to orange/amber (214) across all tabs - Make inactive tabs use lighter gray (245) so they're visible on dark terminals - Show detail pane by default instead of hiding behind Enter toggle Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 45aeb0cc713e
- Increase maxWidth from 80 to 120 for wider terminal support - Add rounded-border card to memories detail pane - Add injection log detail view with bordered card for selected entries - Add bordered card around prompt tester match results - Make prompt tester input more visible with orange arrow prompt - Inline pushState calls in handler methods for correct state propagation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 1b62f97ce468
…tions - Border color changed from dark gray (8) to light gray (245) for visibility - Added blank line between tab bar and filter bar - Added spacing between memory table and detail card - Fixed card width calculation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 14ef2cb624b1
…ngs key fix - Settings tab: each setting in a bordered card with proper layout - Injection tab: constrain log table height to prevent overflow - Settings update: capture changed msg in local struct to fix closure scope - Stats line condensed to single row Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 9872a596d85e
- Selected setting options shown with orange background + black text chip - Unselected options in light gray - Max injected number in bold orange - Removed debug file logging from settings and root Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 9393df2b476e
Entire-Checkpoint: 55da86a28935
Entire-Checkpoint: d0d138a8b16d
…ILS header Redesign the TUI navigation to look like proper tabs with an amber underline under the active tab and a MEMORY LOOP app title. Convert filter labels to uppercase chip buttons matching the Settings tab pattern. Add DETAILS section header above the memory detail card. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: abe1cf53d792
Add blank lines between title, metadata, body, why, and stats sections. Increase card padding and add spacing between filter chips and table. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: a68a4e85fda9
…larity Injection tab: use section headers, increase card padding, add blank lines between sections, align detail card fields. History tab: add descriptive header explaining what refreshes do, widen scope column from 8 to 16 chars. Increase prompt preview storage limit from 120 to 500 chars so injection detail cards show more of the original prompt. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 7c16b9136913
Add `entire skill` command that discovers skill files across agent config directories (.claude/skills/, .gemini/agents/, etc.), tracks their usage from insightsdb session data in a dedicated skill-analytics.db, and generates AI-powered improvement suggestions via an interactive Bubbletea TUI with picker screen and 3-tab dashboard (Stats, Friction, Improve). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 12bbd28d76e5


Summary
Adds three new features to help users improve their AI coding sessions based on collected session data:
- `entire insights`: Session quality scoring, cross-session trends, and agent comparisons. SQLite-cached, <1s response time.
- `entire improve`: Two-phase friction analysis (SQLite index → transcript deep-read → Claude CLI) that generates context file improvement suggestions (CLAUDE.md, AGENTS.md, .cursorrules, .gemini) with evidence and unified diffs.
Architecture
- … (`.entire/insights.db`): local analytics cache using `modernc.org/sqlite` (pure Go, CGO_ENABLED=0 compatible). Populated from the `entire/checkpoints/v1` branch with staleness detection.
- … `insights/scores/` on the checkpoint branch for future frontend consumption.
- `llmcli` package: Common Claude CLI execution extracted from `summarize/claude.go`. Both summarize and improve use it with different prompts.
- `termstyle` package: Shared terminal styling extracted from `status_style.go` to avoid duplication across renderers.
New packages
- `cmd/entire/cli/termstyle/`
- `cmd/entire/cli/llmcli/`
- `cmd/entire/cli/insightsdb/`
- `cmd/entire/cli/insights/`
- `cmd/entire/cli/improve/`
- `cmd/entire/cli/evolve/`
Key decisions
- … `IsSummarizeEnabled()`
- … (`evolve.enabled: false` by default)
- … `modernc.org/sqlite` (32MB → 40MB)
Test plan
- `mise run test:ci`: all unit + integration tests pass
- `entire insights` renders scores, trends, and agent comparisons
- `entire improve --dry-run` shows friction patterns without AI call
- `entire improve` generates context file suggestions
- `.entire/insights.db` is created and populated on first run
- … (`CondenseResult.SessionScore`)
🤖 Generated with Claude Code
Note
Medium Risk
Adds new CLI commands plus a new SQLite cache and hooks into session condensation to compute/store quality scores, which could impact core session/metadata workflows and local disk state. Also introduces Claude CLI execution via a shared runner and transcript reads, increasing integration surface with external tooling.
Overview
Introduces a new analytics and improvement workflow: `entire insights` computes per-session quality scores, trend metrics, and agent comparisons from a local SQLite cache, with both terminal and `--json` output.
Adds `entire improve` to analyze recent sessions for recurring friction (SQLite index + optional transcript deep-read) and then call the Claude CLI to generate context-file suggestions (with evidence and unified diffs), including a `--dry-run` mode that skips AI/transcript reads.
Adds an opt-in evolution loop (`settings.evolve`) to track sessions since the last improvement run and print a tip prompting users to run `entire improve` after a configurable threshold; also refactors shared infrastructure by extracting `llmcli` (Claude CLI runner + git isolation) and `termstyle` (shared lipgloss styling), and adds the `modernc.org/sqlite` dependency for the new cache.
Written by Cursor Bugbot for commit 7816386.