High-throughput Private Information Retrieval for Ethereum light clients.
- Full mainnet matrix PIR on H100: 53.0 ms latency, 1.3 TB/s throughput
- Subtree-optimized GPU kernel: 27.4 ms on B200 (2.51 TB/s)
- Storage lookups use 8-byte Cuckoo tags with full 52-byte verification
- Verifiable PIR (sumcheck/binius) gated behind the `verifiable-pir` feature
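A minimal sketch of how an 8-byte tag check followed by full 52-byte verification might look on the client, once the three candidate rows of a Cuckoo lookup have been retrieved. Everything here is an assumption for illustration: the `Row` layout, the `tag_of` function (which uses the standard library's `DefaultHasher` as a stand-in), the 32-byte value size, and the reading of the 52-byte key as a 20-byte address plus a 32-byte slot. None of these names come from the crates themselves.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumed key layout: 20-byte address followed by a 32-byte storage slot.
type StorageKey = [u8; 52];

/// Hypothetical row layout; the real storage format may differ.
#[derive(Clone)]
struct Row {
    tag: [u8; 8],
    key: StorageKey,
    value: [u8; 32],
}

/// Stand-in tag function. A real deployment would likely use a keyed or
/// cryptographic hash rather than `DefaultHasher`.
fn tag_of(key: &StorageKey) -> [u8; 8] {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish().to_le_bytes()
}

/// Pick the value for `wanted` among the three retrieved candidates.
/// The cheap 8-byte tag comparison filters first; the full 52-byte key
/// comparison then rules out tag collisions.
fn select_value(candidates: &[Row; 3], wanted: &StorageKey) -> Option<[u8; 32]> {
    let tag = tag_of(wanted);
    candidates
        .iter()
        .find(|row| row.tag == tag && row.key == *wanted)
        .map(|row| row.value)
}
```

The second comparison matters because an 8-byte tag can collide: a row whose tag matches but whose full key differs is rejected instead of being returned as a false hit.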
| Metric | Value |
|---|---|
| GPU scan throughput (H100) | 1,300 GB/s |
| GPU scan latency (H100, 68.8 GB) | 53.0 ms |
| GPU scan throughput (B200) | 2,510 GB/s |
| GPU scan latency (B200, 68.8 GB) | 27.4 ms |
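The throughput and latency figures are two views of the same measurement: a full pass over the 68.8 GB matrix divided by the time it takes. As a quick sanity check, the helper below (a hypothetical function, not part of the benchmark binary) reproduces the throughput figures from the latencies.

```rust
/// Effective scan throughput in GB/s for one full pass over `matrix_gb`
/// gigabytes completed in `latency_ms` milliseconds.
fn throughput_gb_per_s(matrix_gb: f64, latency_ms: f64) -> f64 {
    matrix_gb / (latency_ms / 1000.0)
}
```

Calling `throughput_gb_per_s(68.8, 53.0)` gives roughly 1,298 GB/s and `throughput_gb_per_s(68.8, 27.4)` roughly 2,511 GB/s, which round to the 1,300 GB/s and 2,510 GB/s reported above.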
    # Build with AVX-512 and parallel support
    cargo build --release --features avx512,parallel

    # Run benchmark (75GB matrix, 3 iterations)
    ./target/release/bench_scan --rows 78643200 --iterations 3 --warmup-iterations 1 --scan-only --parallel

This project uses mandatory tooling for task tracking and code quality:

    # View all tasks
    backlog task list

    # Work on a task
    backlog task edit TASK-X -s "In Progress"

    # Commit with task reference (auto-linked via git hook)
    git commit -m "feat: implement feature
    Addresses TASK-X"

    # View latest reviews
    roborev list

    # Check specific commit
    roborev show COMMIT_SHA

    # Address findings
    roborev address JOB_ID

The `.git/hooks/post-commit` hook automatically:
- Triggers roborev review for every commit
- Links commits to tasks when message includes "Addresses TASK-X"
- Updates zoekt search index
All commits must reference a task ID and will be automatically reviewed.
crates/
├── morphogen-core/ # Core types (DeltaBuffer, EpochSnapshot, GlobalState)
├── morphogen-dpf/ # DPF key trait and implementations
├── morphogen-storage/ # AlignedMatrix, ChunkedMatrix
└── morphogen-server/ # Scan kernel, server, benchmarks
- `avx512` - Enable AVX-512 SIMD optimizations
- `parallel` - Enable multi-threaded chunk processing (rayon)
- `profiling` - Enable detailed timing instrumentation
- `verifiable-pir` - Enable sumcheck/binius proof plumbing
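Cargo features like these are usually consumed in code through `cfg` checks. The sketch below shows the general pattern only; `scan_backend` and `timed` are hypothetical names, and the way the crates actually gate their code paths may differ.

```rust
/// Reports which scan path this build would select.
/// `cfg!` is evaluated at compile time, so only one branch survives.
fn scan_backend() -> &'static str {
    if cfg!(feature = "avx512") {
        "avx512"
    } else {
        "portable"
    }
}

/// With `profiling` enabled, time the closure and print the elapsed duration.
#[cfg(feature = "profiling")]
fn timed<T>(label: &str, f: impl FnOnce() -> T) -> T {
    let start = std::time::Instant::now();
    let out = f();
    eprintln!("{label}: {:?}", start.elapsed());
    out
}

/// Without `profiling`, the wrapper just runs the closure.
#[cfg(not(feature = "profiling"))]
fn timed<T>(_label: &str, f: impl FnOnce() -> T) -> T {
    f()
}
```

Keeping both variants of `timed` behind the same signature means call sites do not need their own `cfg` attributes.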
- Kanban - Project status and tasks (historic)
- Backlog - Active task management (use `backlog` CLI)
- Protocol & Architecture - v5.0
- Performance - Optimization findings
- Profiling Guide - How to profile
- Cryptography & Core Mechanics - Fused kernel and DPF logic
- Crypto Analysis - Why we use ChaCha8 over AES on GPUs
- Phase 79 Brief - Kernel optimization summary
- Trace Explanation - Sumcheck trace sizing
Modal benchmarks and data-prep scripts live in experiments/.
- Epoch-Based Delta-PIR: Wait-free snapshot isolation for live updates
- Parallel Cuckoo Retrieval: 3 simultaneous DPF queries per request
- Copy-on-Write Merge: Striped CoW for O(delta) epoch transitions
- AVX-512 Scan Kernel: 64-byte SIMD with 8-row unrolling
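To make the scan concrete: in DPF-based PIR, each server expands its key into one selection bit per row and XORs the selected rows into an accumulator, and the two servers' answers XOR together to the requested row. The following is a portable scalar sketch of that masked XOR scan, not the project's AVX-512 or GPU kernel. The 64-byte row width, the `bool` selection vector (standing in for evaluated DPF output), and the function name are assumptions made for illustration.

```rust
/// Assumed row width: one 64-byte SIMD lane. The real row size may differ.
const ROW_BYTES: usize = 64;

/// XOR every selected row into a single accumulator.
/// `select[i]` stands in for the i-th bit of an evaluated DPF key.
fn masked_xor_scan(rows: &[[u8; ROW_BYTES]], select: &[bool]) -> [u8; ROW_BYTES] {
    assert_eq!(rows.len(), select.len());
    let mut acc = [0u8; ROW_BYTES];
    // Walk the matrix in groups of 8 rows, mirroring the idea of 8-row unrolling.
    for (row_chunk, sel_chunk) in rows.chunks(8).zip(select.chunks(8)) {
        for (row, &sel) in row_chunk.iter().zip(sel_chunk) {
            // Branch-free selection: 0xFF when the bit is set, 0x00 otherwise.
            let mask = (sel as u8).wrapping_neg();
            for (a, b) in acc.iter_mut().zip(row) {
                *a ^= b & mask;
            }
        }
    }
    acc
}
```

If two selection vectors differ only at index `i`, XORing the two scan results yields row `i`, while each vector on its own reveals nothing about `i` to the server holding it. Every query touches the whole matrix, which is why scan throughput determines latency.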
MIT