Conversation
- Extract entity logic to src/entities.ts with enhanced extraction (file paths, URLs, version numbers)
- Compute entity_retention, structural_integrity, reference_coherence, and a composite quality_score in CompressResult
- Add relevanceThreshold option: low-value messages are replaced with compact stubs instead of low-quality summaries
- Export bestSentenceScore for external relevance scoring
- Add roadmap-v2.md tracking all planned improvements
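As a rough illustration of the kind of extraction src/entities.ts performs, here is a minimal sketch; the patterns and the function name below are illustrative assumptions, not the shipped implementation:

```typescript
// Illustrative entity extraction: file paths, URLs, and version numbers.
// These regexes are a sketch; the real src/entities.ts may differ.
function extractEntitiesSketch(text: string): string[] {
  const patterns: RegExp[] = [
    /\bhttps?:\/\/[^\s)]+/g,              // URLs
    /\b[\w./-]+\.(?:ts|js|json|md)\b/g,   // file paths by known extensions
    /\bv?\d+\.\d+\.\d+\b/g,               // semver-style version numbers
  ];
  const found = new Set<string>();
  for (const p of patterns) {
    for (const m of text.match(p) ?? []) found.add(m);
  }
  return Array.from(found);
}
```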
- Tiered budget: keeps recencyWindow fixed and progressively compresses older content by priority tier (tighten → stub → truncate) instead of shrinking the recency window via binary search
- Adaptive summary budget: scales with content density; entity-dense messages get up to 45% of the budget, sparse content as little as 15%
- budgetStrategy option: 'binary-search' (default) or 'tiered'
- Both sync and async paths supported for the tiered strategy
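The density-scaled budget could look roughly like the sketch below. The 15%/45% endpoints come from the description above; the density formula, clamping, and function name are assumptions for illustration:

```typescript
// Sketch: scale the summary budget with entity density.
// Dense messages get up to 45% of the original length, sparse ones 15%.
// The density metric (entities per ~100 chars, capped at 1) is assumed.
function adaptiveSummaryBudget(contentLength: number, entityCount: number): number {
  const density = Math.min(1, entityCount / Math.max(1, contentLength / 100));
  const fraction = 0.15 + density * (0.45 - 0.15); // linear between endpoints
  return Math.round(contentLength * fraction);
}
```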
- New entropyScorer option: plug in a small LM for self-information-based sentence importance scoring (Selective Context paper)
- entropyScorerMode: 'replace' (entropy only) or 'augment' (weighted average with the heuristic, default)
- src/entropy.ts: splitSentences, normalizeScores, and combineScores utilities
- Sync and async paths supported; an async scorer throws in sync mode
- Zero new dependencies: the scorer is a user-provided function
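A minimal sketch of what the 'augment' mode's weighted average could look like; the 0.5 default weight and the function names are assumptions, not necessarily what src/entropy.ts exports:

```typescript
// Sketch of 'augment' mode: blend heuristic and entropy sentence scores.
// Each score list is normalized to [0, 1], then weighted-averaged.
function normalizeSketch(scores: number[]): number[] {
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  const range = max - min;
  return range === 0 ? scores.map(() => 0.5) : scores.map(s => (s - min) / range);
}

function combineScoresSketch(heuristic: number[], entropy: number[], weight = 0.5): number[] {
  const h = normalizeSketch(heuristic);
  const e = normalizeSketch(entropy);
  return h.map((hs, i) => weight * hs + (1 - weight) * e[i]);
}
```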
- Detects Q&A pairs, request → action → confirmation chains, corrections, and acknowledgment patterns in message history
- Groups flow chains into single compression units, producing more coherent summaries (e.g., "Q: how does X work? → A: it uses Y")
- conversationFlow option: opt-in, default false
- Flow chains override soft preservation (recency, short content) but not hard blocks (system role, dedup, tool_calls)
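The simplest of these patterns, Q&A pairing, can be sketched as below; the Msg shape and the heuristic (user message ending in "?" followed by an assistant reply) are illustrative assumptions:

```typescript
// Sketch: detect adjacent Q&A pairs (a user question followed by an
// assistant answer) and return their index pairs as chain candidates.
interface Msg { role: "user" | "assistant" | "system"; content: string; }

function findQAPairs(messages: Msg[]): Array<[number, number]> {
  const pairs: Array<[number, number]> = [];
  for (let i = 0; i + 1 < messages.length; i++) {
    const q = messages[i];
    const a = messages[i + 1];
    if (q.role === "user" && a.role === "assistant" && q.content.trimEnd().endsWith("?")) {
      pairs.push([i, i + 1]);
    }
  }
  return pairs;
}
```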
- compressionDepth option controls summarization aggressiveness
- gentle: standard sentence selection (default, backward compatible)
- moderate: 50% tighter budgets for more aggressive compression
- aggressive: entity-only stubs for maximum ratio
- auto: progressively tries gentle → moderate → aggressive until tokenBudget fits, with a quality gate (stops if quality < 0.60)
- Both sync and async paths supported
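The 'auto' escalation loop could be sketched as follows. The callback shape and the exact gate semantics (keep the previous depth when quality falls below 0.60) are assumptions; only the depth order, the budget check, and the 0.60 threshold come from the description:

```typescript
// Sketch of 'auto' depth escalation with a quality gate.
// compressAtDepth is a stand-in for the real compression call.
type Depth = "gentle" | "moderate" | "aggressive";

function autoDepth(
  compressAtDepth: (d: Depth) => { tokens: number; quality: number },
  tokenBudget: number,
): Depth {
  const depths: Depth[] = ["gentle", "moderate", "aggressive"];
  let chosen: Depth = "gentle";
  for (const d of depths) {
    const result = compressAtDepth(d);
    if (d !== "gentle" && result.quality < 0.60) break; // quality gate: keep previous depth
    chosen = d;
    if (result.tokens <= tokenBudget) break; // budget satisfied, stop escalating
  }
  return chosen;
}
```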
- Coreference tracking (coreference option): when a compressed message defines an entity referenced by a preserved message, the definition is inlined into the summary to prevent orphaned references
- Semantic clustering (semanticClustering option): groups messages by topic using TF-IDF cosine similarity plus entity-overlap Jaccard, then compresses each cluster as a unit for better topic coherence
- Both features are opt-in, with zero new dependencies
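The entity-overlap half of that similarity is a plain Jaccard index over each message's entity set; a minimal sketch (function name assumed):

```typescript
// Sketch: Jaccard index over two entity sets, the entity-overlap
// component of the clustering similarity (TF-IDF cosine is the other half).
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let inter = 0;
  a.forEach(x => { if (b.has(x)) inter++; });
  return inter / (a.size + b.size - inter); // |A ∩ B| / |A ∪ B|
}
```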
- Segments text into Elementary Discourse Units (EDUs) with a dependency graph
- Clause boundary detection via discourse markers (then, because, which, ...)
- Pronoun/demonstrative, temporal, and causal dependency edges
- When selecting EDUs for the summary, dependency parents are included (up to 2 levels) to prevent incoherent output
- discourseAware option: opt-in, default false
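The parent-inclusion step can be sketched as a bounded walk up the dependency graph; the data shapes here (EDU ids as indices, a child → parent map) are assumptions for illustration:

```typescript
// Sketch: when an EDU is selected, pull in its dependency parents
// up to two levels so pronoun/causal links stay resolvable.
function withParents(selected: number[], parentOf: Map<number, number>): number[] {
  const keep = new Set<number>(selected);
  for (const id of selected) {
    let cur = id;
    for (let level = 0; level < 2; level++) { // "up to 2 levels" per the description
      const p = parentOf.get(cur);
      if (p === undefined) break;
      keep.add(p);
      cur = p;
    }
  }
  return Array.from(keep).sort((a, b) => a - b);
}
```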
- 8 adversarial test cases: pronoun-heavy text, scattered entities, correction chains, code-interleaved prose, near-duplicates with critical differences, 10k+ char messages, mixed SQL/JSON/bash, and full round-trip integrity with all features enabled
- Update roadmap: 14 of 16 items complete
- ML token classifier (mlTokenClassifier option): per-token keep/remove classification via a user-provided model (LLMLingua-2 style). Includes sync/async support, a whitespace tokenizer, and a mock classifier for testing
- A/B comparison tool (npm run bench:compare): side-by-side comparison of default vs v2 features across coding, deep-conversation, and agentic scenarios. Reports ratio, quality, entity retention, and tokens
- All 16/16 roadmap items now complete
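A sketch of what the classifier contract and a trivial mock could look like; the type name, mock heuristic, and helper are illustrative assumptions, not the library's actual API:

```typescript
// Sketch of an LLMLingua-2-style per-token keep/remove classifier.
// true = keep the token, false = drop it.
type TokenClassifierSketch = (tokens: string[]) => boolean[];

// Trivial mock: keep tokens containing a digit or an uppercase letter.
const mockClassifier: TokenClassifierSketch = tokens =>
  tokens.map(t => /[A-Z0-9]/.test(t));

function applyClassifier(text: string, classify: TokenClassifierSketch): string {
  const tokens = text.split(/\s+/).filter(Boolean); // whitespace tokenizer
  const keep = classify(tokens);
  return tokens.filter((_, i) => keep[i]).join(" ");
}
```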
- bench/run.ts: new Quality Metrics (v2) table showing entity retention, structural integrity, reference coherence, and quality score per scenario
- bench/baseline.ts: QualityResult type, quality section in generated docs, average quality score in the summary table
- bench/compare.ts: add Long Q&A and Technical explanation scenarios; rename the V2 option set to "V2 balanced" (no relevanceThreshold)
- flow.ts: exclude messages with code fences from flow-chain detection to prevent Q&A chains from dropping code content
- package.json: add bench:compare script
- New docs/v2-features.md: full documentation for all 11 new features with usage examples, how-it-works sections, and explicit tradeoff analysis for each feature
- docs/api-reference.md: updated exports listing, 13 new options in the CompressOptions table, 5 new result fields, and new types (MLTokenClassifier, TokenClassification)
- docs/token-budget.md: added tiered budget strategy and compression depth sections with cross-links
- docs/README.md: added V2 Features to the index
- Each feature documents: what it does, how to use it, how it works internally, and what you give up (the tradeoff)
- Flow chains and clusters no longer skip non-member messages between chain endpoints. Previously, a chain spanning indices [1,4] would skip indices 2 and 3 even if they weren't chain members (dropping code)
- Importance threshold raised from 0.35 to 0.65. The old threshold preserved nearly all messages in entity-rich conversations, reducing the compression ratio by up to 30% with no quality benefit
- EDU scorer: replaced the length-based heuristic with information-density scoring (identifiers, numbers, emphasis) to avoid keeping long filler clauses over short technical ones
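The information-density idea can be sketched as a hits-per-token score; the specific token tests below are illustrative assumptions, not the shipped scorer:

```typescript
// Sketch: score a clause by the fraction of tokens that carry information
// (identifiers, numbers, emphasis) instead of rewarding sheer length.
function densityScore(clause: string): number {
  const tokens = clause.split(/\s+/).filter(Boolean);
  if (tokens.length === 0) return 0;
  let hits = 0;
  for (const t of tokens) {
    if (/\d/.test(t)) hits++;                    // contains a digit
    else if (/[_./]|[a-z][A-Z]/.test(t)) hits++; // identifier-ish: snake_case, path, camelCase
    else if (/^\*\*?.+\*\*?$/.test(t)) hits++;   // markdown emphasis
  }
  return hits / tokens.length;
}
```

A long filler clause now scores near zero while a short technical one scores high, which is exactly the inversion the fix was after.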
- Quick-reference table, feature section, and TSDoc all flag the 8-28% ratio regression without a custom ML scorer
- Explain why: dependency tracking inherently fights compression by pulling in parent EDUs, and the rule-based scorer can't distinguish load-bearing dependencies from decorative ones
- Recommend using the exported segmentEDUs/scoreEDUs/selectEDUs directly with a custom scorer instead of the discourseAware option
- Remove discourseAware from recommended feature combinations
Adaptive entity-aware budgets were changing the default compression output (a 6% regression on the coding scenario) because extractEntities was called unconditionally. Entity-adaptive budgets now activate only when compressionDepth is explicitly set to moderate, aggressive, or auto. The default path (no v2 options) now produces output identical to develop.
- Flow chains and clusters only mark themselves as processed after successful compression. Previously they were marked on entry, causing non-compressed chain members to be silently dropped
- Semantic clusters restricted to consecutive indices only; non-consecutive merges broke round-trip because uncompress can't restore interleaved message ordering
- Added a V2 Features Comparison section to the bench reporter showing each feature individually and the recommended combo vs default, with per-scenario ratio/quality and a delta row
- All 8 scenarios × 8 configs pass round-trip verification
```ts
    if (score > best) best = score;
  }
  return best;
}
```
Check failure (Code scanning / CodeQL): Polynomial regular expression used on uncontrolled data (severity: High)
```ts
    scoreMap.set(i, rawScores[i]);
  }
} else {
  // augment: weighted average of heuristic and entropy
```
Check failure (Code scanning / CodeQL): Polynomial regular expression used on uncontrolled data (severity: High)
```ts
if (userSummarizer) {
  next = gen.next(await withFallback(text, userSummarizer, budget));
} else {
  next = gen.next(summarize(text, budget, externalScores));
```
Check failure (Code scanning / CodeQL): Polynomial regular expression used on uncontrolled data (severity: High)
```ts
if (entities.length === 0) return '';

// For each entity, find the sentence where it first appears
const sentences = sourceContent.match(/[^.!?\n]+[.!?]+/g) ?? [sourceContent];
```
Check failure (Code scanning / CodeQL): Polynomial regular expression used on uncontrolled data (severity: High)
```ts
 */
export function segmentEDUs(text: string): EDU[] {
  // First split into sentences
  const sentences = text.match(/[^.!?\n]+[.!?]+/g) ?? [text.trim()];
```
Check failure (Code scanning / CodeQL): Polynomial regular expression used on uncontrolled data (severity: High)
```ts
 * Returns the sentences and their original indices for reassembly.
 */
export function splitSentences(text: string): string[] {
  const sentences = text.match(/[^.!?\n]+[.!?]+/g);
```
Check failure (Code scanning / CodeQL): Polynomial regular expression used on uncontrolled data (severity: High)
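All of these findings point at the same sentence-splitting pattern. One possible remediation (a sketch under assumptions, not the repo's actual fix) is to replace the regex with a single forward scan, which is linear in the input length regardless of input shape:

```typescript
// Linear-time sentence splitter: a sketch of replacing the flagged
// /[^.!?\n]+[.!?]+/g pattern with a single forward scan.
function splitSentencesLinear(text: string): string[] {
  const sentences: string[] = [];
  let start = 0; // start of the current sentence candidate
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    if (ch === "\n") {
      // The original pattern never crosses newlines; a run without a
      // terminator is discarded, as the regex would do.
      start = ++i;
    } else if (ch === "." || ch === "!" || ch === "?") {
      // Consume the whole terminal punctuation run ([.!?]+).
      let end = i;
      while (end < text.length && ".!?".includes(text[end])) end++;
      // Only emit if there is sentence body before the punctuation.
      if (text.slice(start, i).trim().length > 0) {
        sentences.push(text.slice(start, end).trim());
      }
      i = end;
      start = end;
    } else {
      i++;
    }
  }
  return sentences;
}
```

The same callers would still need the `?? [text]` fallback for input with no terminal punctuation.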
Summary
Feature impact (bench results, default recencyWindow=0)
- compressionDepth: 'moderate'
- relevanceThreshold: 3
- conversationFlow
- importanceScoring
- coreference
- semanticClustering
- discourseAware

Recommended usage
Key fixes during development
Test plan
- npm run bench: quality metrics all ≥ 0.94 on the default path
- npm run bench:compare: shows v1 vs v2 side-by-side