## Problem
DiffScope uses a single model for the entire review. All three competitors route different tasks to different models:
| Task | Greptile | CodeRabbit | Qodo |
|---|---|---|---|
| Triage/classification | — | Cheap model | model_weak |
| Summarization | GPT-5-nano / GPT-4o-mini | Light model | model_weak |
| Embeddings | text-embedding-3-small | Unknown | Qodo Embed-1 |
| Review | Claude Sonnet 4 | GPT-4/o1/Claude | Primary model |
| Verification | — | Separate agent | — |
| Self-reflection | — | — | model_reasoning |
| Complex tasks | Claude Opus 4 | — | — |
| Lightweight | Claude Haiku 4.5 | — | — |
Using frontier models for summarization and triage wastes tokens and money. Using cheap models for review misses bugs.
## Proposed Solution

### Model Hierarchy
```yaml
models:
  primary: claude-sonnet-4-6           # review, verification
  weak: claude-haiku-4-5               # triage, summarization, NL translation
  reasoning: claude-opus-4-6           # complex analysis, self-reflection
  embedding: text-embedding-3-small    # RAG indexing
  fallback:
    - claude-sonnet-4-6
    - gpt-4o
```
### Task Routing
| Task | Model | Rationale |
|---|---|---|
| File triage (NEEDS_REVIEW vs cosmetic) | weak | Simple classification |
| Per-file summarization | weak | Cheap, high-volume |
| NL translation of code chunks | weak | Indexing task |
| Embedding generation | embedding | Specialized model |
| Code review | primary | Quality matters |
| Verification pass | primary | Accuracy matters |
| Complex cross-file analysis | reasoning | Needs deep reasoning |
| Commit message generation | weak | Simple task |
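The routing table above can be sketched as a single match from task to role. The `Task` variants and `role_for_task` name here are illustrative, not DiffScope's actual identifiers:

```rust
// Hypothetical task-to-role routing mirroring the table above.
// Names are illustrative, not DiffScope's real API.

#[derive(Debug, PartialEq)]
enum ModelRole {
    Primary,
    Weak,
    Reasoning,
    Embedding,
}

enum Task {
    FileTriage,
    FileSummarization,
    NlTranslation,
    EmbeddingGeneration,
    CodeReview,
    Verification,
    CrossFileAnalysis,
    CommitMessage,
}

fn role_for_task(task: &Task) -> ModelRole {
    match task {
        // Cheap, high-volume classification and generation tasks.
        Task::FileTriage
        | Task::FileSummarization
        | Task::NlTranslation
        | Task::CommitMessage => ModelRole::Weak,
        // Specialized embedding model for RAG indexing.
        Task::EmbeddingGeneration => ModelRole::Embedding,
        // Quality- and accuracy-critical passes use the frontier model.
        Task::CodeReview | Task::Verification => ModelRole::Primary,
        // Deep multi-file analysis goes to the reasoning model.
        Task::CrossFileAnalysis => ModelRole::Reasoning,
    }
}

fn main() {
    assert_eq!(role_for_task(&Task::FileTriage), ModelRole::Weak);
    assert_eq!(role_for_task(&Task::CodeReview), ModelRole::Primary);
    println!("routing table ok");
}
```

Keeping the mapping exhaustive (no wildcard arm) means the compiler forces a routing decision whenever a new task variant is added.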
### Implementation
```rust
enum ModelRole {
    Primary,
    Weak,
    Reasoning,
    Embedding,
}

impl Config {
    /// Resolve a role to a concrete model, falling back to the
    /// primary model when no dedicated model is configured.
    fn model_for_role(&self, role: ModelRole) -> &ModelConfig {
        match role {
            ModelRole::Primary => &self.primary_model,
            ModelRole::Weak => self.weak_model.as_ref().unwrap_or(&self.primary_model),
            ModelRole::Reasoning => self.reasoning_model.as_ref().unwrap_or(&self.primary_model),
            ModelRole::Embedding => self.embedding_model.as_ref().unwrap_or(&self.primary_model),
        }
    }
}
```
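A self-contained sketch of the same resolution logic, with minimal `Config` and `ModelConfig` shapes that are assumptions for illustration, not DiffScope's actual types:

```rust
// Minimal sketch of fallback-to-primary resolution. Config and
// ModelConfig fields are assumed shapes, not DiffScope's real types.

#[derive(Debug, PartialEq)]
struct ModelConfig {
    name: String,
}

struct Config {
    primary_model: ModelConfig,
    weak_model: Option<ModelConfig>,
    reasoning_model: Option<ModelConfig>,
    embedding_model: Option<ModelConfig>,
}

enum ModelRole {
    Primary,
    Weak,
    Reasoning,
    Embedding,
}

impl Config {
    fn model_for_role(&self, role: ModelRole) -> &ModelConfig {
        match role {
            ModelRole::Primary => &self.primary_model,
            ModelRole::Weak => self.weak_model.as_ref().unwrap_or(&self.primary_model),
            ModelRole::Reasoning => self.reasoning_model.as_ref().unwrap_or(&self.primary_model),
            ModelRole::Embedding => self.embedding_model.as_ref().unwrap_or(&self.primary_model),
        }
    }
}

fn main() {
    // Only a weak model is configured; reasoning falls back to primary.
    let cfg = Config {
        primary_model: ModelConfig { name: "claude-sonnet-4-6".into() },
        weak_model: Some(ModelConfig { name: "claude-haiku-4-5".into() }),
        reasoning_model: None,
        embedding_model: None,
    };
    assert_eq!(cfg.model_for_role(ModelRole::Weak).name, "claude-haiku-4-5");
    assert_eq!(cfg.model_for_role(ModelRole::Reasoning).name, "claude-sonnet-4-6");
    println!("fallback-to-primary ok");
}
```

Storing the non-primary roles as `Option<ModelConfig>` makes the hierarchy opt-in: a single-model configuration keeps working unchanged.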
### Fallback Chain

Like Qodo: try the primary model, catch the error, then try the next model in the fallback list. Simple and robust.
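A minimal sketch of that chain. `call_model` is a stand-in for the real provider call (its signature is hypothetical), with a simulated primary-model failure so the fallback path is visible:

```rust
// Try each model in order until one succeeds; return the last error
// if they all fail. `call_model` is a mock, not a real provider API.

fn call_model(model: &str, prompt: &str) -> Result<String, String> {
    // Simulated transient failure of the primary model.
    if model == "claude-sonnet-4-6" {
        Err(format!("{model}: transient error"))
    } else {
        Ok(format!("[{model}] response to: {prompt}"))
    }
}

fn call_with_fallback(models: &[&str], prompt: &str) -> Result<String, String> {
    let mut last_err = String::from("no models configured");
    for model in models {
        match call_model(model, prompt) {
            Ok(resp) => return Ok(resp),
            // Remember the failure and try the next model in the chain.
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}

fn main() {
    let chain = ["claude-sonnet-4-6", "gpt-4o"];
    let resp = call_with_fallback(&chain, "review this diff").unwrap();
    assert!(resp.starts_with("[gpt-4o]"));
    println!("{resp}");
}
```

In a real implementation the loop would distinguish retryable errors (rate limits, timeouts) from permanent ones (invalid request), but the ordered-list structure stays the same.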
## Expected Impact
- Cost reduction: 60-80% for summarization/triage tasks
- Speed: Cheap models respond 3-5x faster
- Quality: Frontier models focused where they matter (review + verification)
- Greptile reports 75% lower inference costs despite 3x more context tokens, primarily through model routing + prompt caching
## Priority
Medium — cost optimization + quality improvement. Becomes critical at scale.