fix: add cache to speed up `fn tipset_by_height` (#5944)
Conversation
Walkthrough

Adds `FullTipset::persist` to validate and store tipsets, changes `read_msg_cids` to accept a block header, introduces a `CHAIN_FINALITY`-based LRU checkpoint cache for `tipset_by_height`, and refactors the chain follower to borrow `TipsetKey`.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant ChainFollower
    participant ChainStore
    participant Network
    participant Blockstore
    Caller->>ChainFollower: get_full_tipset(&TipsetKey)
    ChainFollower->>ChainStore: load_full_tipset(&TipsetKey)
    alt present in store
        ChainStore-->>ChainFollower: FullTipset
    else missing
        ChainFollower->>Network: chain_exchange_full_tipset(peer, &TipsetKey)
        Network-->>ChainFollower: FullTipset
        ChainFollower->>Blockstore: FullTipset.persist(db)
    end
    ChainFollower-->>Caller: FullTipset
```
```mermaid
sequenceDiagram
    participant RPC
    participant ChainStore
    participant Network
    participant Blockstore
    participant VM
    RPC->>ChainStore: try load_full_tipset(ts.key())
    alt present
        ChainStore-->>RPC: FullTipset
    else missing
        RPC->>Network: chain_exchange_messages(None, &ts)
        Network-->>RPC: FullTipset
        RPC->>Blockstore: FullTipset.persist(db)
    end
    RPC->>VM: compute_tipset_state(ts)
    VM-->>RPC: Result
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Suggested labels
Suggested reviewers
Force-pushed from `2d94f03` to `518014f`
Actionable comments posted: 2
🧹 Nitpick comments (8)
src/chain/store/index.rs (2)
153-156: Consider caching both child and parent on boundaries

You only cache `child` at boundaries. Caching `parent` when it lands on a boundary can improve hit rates for lookups that resolve to older epochs in the same call. Optional tweak:

```diff
 for (child, parent) in self.chain(from).tuple_windows() {
     if child.epoch() % CHECKPOINT_INTERVAL == 0 {
         CACHE.push(child.epoch(), child.clone());
     }
+    if parent.epoch() % CHECKPOINT_INTERVAL == 0 {
+        CACHE.push(parent.epoch(), parent.clone());
+    }
     ...
 }
```
208-308: Add tests that exercise the checkpoint cache across forks

Current tests don't hit the checkpoint path (epochs are < 1000) and don't cover fork-selection correctness with cached checkpoints. Please add:

- A scenario that populates the cache at a checkpoint epoch on branch A, then calls `tipset_by_height` on branch B that passes through the same checkpoint epoch, asserting the result remains on branch B.
- A scenario verifying we never select a checkpoint > `from.epoch()`.

I can draft these tests if helpful.
src/chain_sync/validation.rs (1)
72-75: Nit: fix typo in the comment ("mst root" → "msg root")

Minor typo in the validation comment.

```diff
-// 1. Calculating the message root using all of the messages to ensure it
-//    matches the mst root in the block header 2. Ensuring it has not
+// 1. Calculating the message root using all of the messages to ensure it
+//    matches the msg root in the block header 2. Ensuring it has not
```

src/blocks/tipset.rs (1)
550-558: Persist full tipset with pre-persist msg-root validation; clarify side effects

The implementation validates msg roots before persisting each block, which is a good safety check. One concern: TipsetValidator::validate_msg_root likely computes and persists TxMeta/AMTs as a side effect. If a subsequent block.persist fails, the store may end up with TxMeta referencing messages not yet present. This is acceptable if failures are practically impossible or writes are idempotent, but the side effect is non-obvious from the name.
- Consider documenting validate_msg_root’s persistence side effects here, or switching to a side-effect-free check (if available), followed by an explicit persist of TxMeta.
- Alternatively, rename validate_msg_root to something that reflects its effect (e.g., ensure_msg_root_persisted) to make intent clear at call sites.
Would you like me to draft doc comments or a small refactor to make this explicit?
src/rpc/methods/f3.rs (1)
584-601: Persist after full validation and use FullTipset::persist to centralize logic

Currently, blocks are persisted before full tipset validation, and the per-block loop duplicates logic now encapsulated by FullTipset::persist (which also validates msg roots). To reduce the risk of persisting invalid data and to keep logic centralized, validate first, then persist the whole tipset via FullTipset::persist.
Proposed change:
- Run validator.validate first (already present).
- Replace the per-block loop with fts.persist(ctx.store())? after successful validation.
Apply this diff:
```diff
 let fts = ctx
     .sync_network_context
     .chain_exchange_full_tipset(None, &tsk)
     .await?;
-for block in fts.blocks() {
-    block.persist(ctx.store())?;
-}
 let validator = TipsetValidator(&fts);
 validator.validate(
     ctx.chain_store(),
     None,
     &ctx.chain_store().genesis_tipset(),
     ctx.chain_config().block_delay_secs,
 )?;
+// Persist the tipset after successful validation.
+fts.persist(ctx.store())?;
 let ts = Arc::new(Tipset::from(fts));
 ctx.chain_store().put_tipset(&ts)?;
 ctx.chain_store().set_heaviest_tipset(finalized_ts)?;
```

src/rpc/methods/state.rs (1)
1435-1449: Solid fallback: pre-loading FullTipset with network fetch when absent in store

Good addition. Ensures state computation has messages available by first attempting a local load and falling back to ChainExchange if needed. Persisting the fetched FullTipset closes the loop.
Two small improvements would help operability and diagnostics (optional):
- Add a debug trace when falling back to the network to aid troubleshooting.
- Add context to persist errors for quicker issue triage.
```diff
 // Attempt to load full tipset from the store
 if crate::chain_sync::load_full_tipset(ctx.chain_store(), ts.key()).is_err() {
+    tracing::debug!(epoch = %ts.epoch(), tsk = %ts.key(), "FullTipset not in store; fetching via ChainExchangeMessages");
     // Load full tipset from the network
     let fts = ctx
         .sync_network_context
         .chain_exchange_messages(None, &ts)
         .await
         .map_err(|e| anyhow::anyhow!(e))?;
-    fts.persist(ctx.store())?;
+    fts.persist(ctx.store()).context("persisting fetched FullTipset into local store")?;
 }
```

src/chain_sync/chain_follower.rs (2)
434-452: Consistent, robust FullTipset retrieval and persistence
- Early return on successful local load is optimal.
- Network fallback to `chain_exchange_full_tipset` and persisting the entire tipset is the right behavior.

Minor nit: prefer adding error context for the network call and persist path to improve debuggability.

```diff
-let tipset = network
-    .chain_exchange_full_tipset(peer_id, tipset_keys)
-    .await
-    .map_err(|e| anyhow::anyhow!(e))?;
-tipset.persist(&chain_store.db)?;
+let tipset = network
+    .chain_exchange_full_tipset(peer_id, tipset_keys)
+    .await
+    .map_err(|e| anyhow::anyhow!(e))?;
+tipset
+    .persist(&chain_store.db)
+    .context("persisting FullTipset fetched via ChainExchange")?;
```
458-477: Use FullTipset::persist instead of per-block persist in batch path

The single-tipset path persists the entire tipset atomically. Here, the batch path persists blocks individually, which is slower and risks subtle inconsistencies if future persist behavior diverges between block and tipset levels. For consistency and maintainability, persist per tipset.

```diff
 let tipsets = network
     .chain_exchange_full_tipsets(peer_id, tipset_keys)
     .await
     .map_err(|e| anyhow::anyhow!(e))?;
-for tipset in tipsets.iter() {
-    for block in tipset.blocks() {
-        block.persist(&chain_store.db)?;
-    }
-}
+for tipset in &tipsets {
+    tipset.persist(&chain_store.db)?;
+}
 Ok(tipsets)
```
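The load-locally-else-fetch-and-persist flow discussed in these comments can be sketched independently of Forest's types. Everything below is a hypothetical, simplified model: the `HashMap` store and the `fetch` closure are stand-ins for illustration, not Forest APIs.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for "check the local store first, fall back to the
// network, and persist what was fetched before returning it".
fn get_or_fetch(
    store: &mut HashMap<String, Vec<u8>>,
    key: &str,
    fetch: impl FnOnce(&str) -> Result<Vec<u8>, String>,
) -> Result<Vec<u8>, String> {
    if let Some(v) = store.get(key) {
        return Ok(v.clone()); // fast path: already persisted locally
    }
    let v = fetch(key)?; // network fallback (e.g. a ChainExchange request)
    store.insert(key.to_owned(), v.clone()); // persist before returning
    Ok(v)
}
```

The key property the review asks for is visible here: once a value is persisted, a later call never re-fetches it, so a failing network is harmless for already-stored keys.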
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (12)
- src/blocks/tipset.rs (2 hunks)
- src/chain/store/chain_store.rs (2 hunks)
- src/chain/store/index.rs (4 hunks)
- src/chain_sync/chain_follower.rs (7 hunks)
- src/chain_sync/mod.rs (1 hunk)
- src/chain_sync/network_context.rs (3 hunks)
- src/chain_sync/validation.rs (1 hunk)
- src/libp2p/chain_exchange/message.rs (1 hunk)
- src/libp2p/chain_exchange/provider.rs (1 hunk)
- src/rpc/methods/chain.rs (1 hunk)
- src/rpc/methods/f3.rs (2 hunks)
- src/rpc/methods/state.rs (1 hunk)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
🧬 Code Graph Analysis (11)
src/chain/store/chain_store.rs (4)
- src/message/chain_message.rs (1): cid (29-34)
- src/blocks/header.rs (2): cid (69-71), cid (304-306)
- src/shim/message.rs (1): cid (195-198)
- src/blocks/block.rs (1): cid (41-43)

src/rpc/methods/state.rs (1)
- src/chain_sync/chain_follower.rs (1): load_full_tipset (479-501)

src/blocks/tipset.rs (3)
- src/blocks/block.rs (1): persist (46-50)
- src/chain/store/chain_store.rs (1): db (502-502)
- src/chain_sync/validation.rs (1): validate_msg_root (106-116)

src/libp2p/chain_exchange/message.rs (2)
- src/shim/sector.rs (1): try_from (243-245)
- src/shim/crypto.rs (2): try_from (205-217), try_from (325-332)

src/chain_sync/mod.rs (1)
- src/chain_sync/chain_follower.rs (2): chain_follower (134-331), load_full_tipset (479-501)

src/rpc/methods/f3.rs (1)
- src/blocks/tipset.rs (1): tsk (265-269)

src/libp2p/chain_exchange/provider.rs (1)
- src/chain/store/chain_store.rs (2): read_msg_cids (494-513), db (502-502)

src/chain/store/index.rs (3)
- src/beacon/drand.rs (1): new (251-269)
- src/blocks/tipset.rs (11): new (242-260), new (494-511), epoch (306-308), epoch (543-545), from (105-110), from (160-162), from (166-168), from (172-177), from (181-186), from (205-214), from (479-484)
- src/utils/cache/lru.rs (1): new_with_default_metrics_registry (96-101)

src/rpc/methods/chain.rs (1)
- src/chain/store/chain_store.rs (1): read_msg_cids (494-513)

src/chain_sync/network_context.rs (2)
- src/blocks/tipset.rs (2): new (242-260), new (494-511)
- src/blocks/header.rs (1): new (282-288)

src/chain_sync/chain_follower.rs (4)
- src/blocks/tipset.rs (2): key (336-339), key (530-533)
- src/rpc/mod.rs (1): chain_store (466-468)
- src/state_manager/mod.rs (1): chain_store (296-298)
- src/blocks/chain4u.rs (1): tipset (183-185)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build MacOS
- GitHub Check: Build Ubuntu
- GitHub Check: All lint checks
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: Analyze (rust)
- GitHub Check: Analyze (go)
🔇 Additional comments (15)
src/libp2p/chain_exchange/message.rs (1)
128-130: Generic TryFrom bound + error mapping look good

Relaxing the bound to `T: TryFrom<TipsetBundle, Error: Display>` and mapping errors via `to_string()` is correct and backward-compatible with existing `TryFrom<TipsetBundle, Error = String>` impls. Control flow remains unchanged.

Also applies to: 137-141
src/chain/store/chain_store.rs (2)
470-476: Call-site update to pass block header is correct

Delegating to `read_msg_cids(db, bh)` aligns with the new signature and removes the need to plumb raw CID parameters around.

494-513: Improved diagnostics for missing TxMeta

Passing the full block header allows a much more actionable error: including the epoch and block CID is great for triage. Logic for loading AMTs remains unchanged.
src/chain_sync/validation.rs (1)
76-76: Static call to validate_msg_root is correct

Switching from `self.validate_msg_root` to `Self::validate_msg_root` matches the new static method signature and removes a needless receiver.

src/chain_sync/network_context.rs (1)
244-246: Broadened generic bounds align with chain_exchange/message changes

Updating `handle_chain_exchange_request` to accept `T: TryFrom<TipsetBundle, Error: Display>` syncs with `ChainExchangeResponse::into_result::<T>()`. Good cohesion across layers.

src/rpc/methods/chain.rs (1)
556-561: Correct API update: pass the whole block header to read_msg_cids

Switching to `read_msg_cids(ctx.store(), &blk)` aligns with the new signature and improves error context downstream. Looks good.
src/chain_sync/mod.rs (1)
16-21: Publicly re-exporting load_full_tipset is consistent with new call sites

The re-export looks correct and matches the updated RPC/state call sites relying on a borrowed TipsetKey. No issues.
src/libp2p/chain_exchange/provider.rs (1)
98-101: read_msg_cids call updated correctly

Passing block_header matches the revised API and enables improved error reporting (epoch + block cid). Change is correct.
src/rpc/methods/f3.rs (1)
166-171: Metric name change may break dashboards/alerts

Renaming the cache metrics key to "get_powertable" can affect existing dashboards/alerts expecting "get_powertable_cache". If this rename is intentional, ensure dashboards and alerts are updated accordingly.
Do you want a follow-up PR checklist for updating observability assets?
src/rpc/methods/state.rs (1)
1453-1453: LGTM: using the pre-loaded tipset for state computation

Reusing the locally bound `ts` after ensuring messages are available is correct and keeps the control flow cohesive.

src/chain_sync/chain_follower.rs (5)
179-180: LGTM: pass TipsetKey by reference

Borrowing `&tipset_keys` avoids an extra allocation/copy and aligns with the new API.

191-192: LGTM: consistent borrowing of TipsetKey

Using `&key` here keeps the call site consistent with the updated signature and avoids unnecessary moves.

479-485: LGTM: public, borrowed-key `load_full_tipset` with typed message load

Exposing a borrowed `&TipsetKey` API and reconstructing a `FullTipset` via `block_messages` is clean and efficient.

Also applies to: 501-501

601-601: LGTM: parent readiness check with borrowed-key loader

Using `load_full_tipset(&self.cs, tipset.parents())` for readiness keeps the new borrowing conventions consistent and avoids needless allocations.

875-876: LGTM: updated call site to pass borrowed key

The fetch task now correctly passes `&key` to `get_full_tipset_batch`.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/chain/store/index.rs (1)
125-147: Cache key is ambiguous across forks; guard adoption with a same-branch check

The global LRU is keyed only by ChainEpoch. Unconditionally replacing `from` with a cached tipset at that epoch can select a tipset from a different fork than the caller's `from`. This breaks the contract that `from` disambiguates branches.

Action:

- Only adopt a cached checkpoint if it's an ancestor of the provided `from` (i.e., on the same branch). This preserves correctness without giving up the performance benefit when the checkpoint matches the branch.

Apply this diff within the shown hunk:

```diff
-let mut checkpoint_from_epoch = to;
-while checkpoint_from_epoch < from_epoch {
-    if let Some(checkpoint_from) = CACHE.get_cloned(&checkpoint_from_epoch) {
-        from = checkpoint_from;
-        break;
-    }
-    checkpoint_from_epoch = next_checkpoint(checkpoint_from_epoch);
-}
+// Try to seed `from` from a checkpoint on the same branch.
+let mut checkpoint_from_epoch = to;
+while checkpoint_from_epoch < from_epoch {
+    if let Some(cp) = CACHE.get_cloned(&checkpoint_from_epoch) {
+        // Guard adoption: only if cached cp is an ancestor of current `from`.
+        let cp_epoch = cp.epoch();
+        let same_branch = self
+            .chain(from.clone())
+            .take_while(|ts| ts.epoch() >= cp_epoch)
+            .any(|ts| ts.key() == cp.key());
+        if same_branch {
+            from = cp;
+            break;
+        }
+        // else: ignore and keep searching older checkpoints
+    }
+    checkpoint_from_epoch = next_checkpoint(checkpoint_from_epoch);
+}
```

If you prefer a helper, I can provide a small `is_ancestor_on_same_branch(&self, child, ancestor)` utility to encapsulate the check. Let me know.
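The ancestry guard suggested in this comment can be modeled on a toy chain to check the fork behavior. Everything below is hypothetical scaffolding (the key/parent map and epoch values are invented for the example), not Forest code.

```rust
use std::collections::HashMap;

type Epoch = i64;

// Toy chain model: key -> (epoch, parent key). A cached checkpoint should
// only be adopted when it is an ancestor of the caller's `from` tipset,
// i.e. when walking `from`'s parents reaches the checkpoint key.
fn is_ancestor(
    chain: &HashMap<&str, (Epoch, Option<&str>)>,
    child: &str,
    ancestor: &str,
) -> bool {
    let ancestor_epoch = chain[ancestor].0;
    let mut cur = child;
    loop {
        let (epoch, parent) = chain[cur];
        if epoch < ancestor_epoch {
            return false; // walked below the checkpoint epoch without a match
        }
        if cur == ancestor {
            return true;
        }
        match parent {
            Some(p) => cur = p,
            None => return false, // reached genesis
        }
    }
}
```

On a forked chain this rejects a checkpoint from the other branch while still accepting a shared ancestor, which is exactly the property the guarded adoption relies on.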
🧹 Nitpick comments (3)
src/chain/store/index.rs (3)
125-133: Scope of static cache may cross DB/network boundaries

`static CACHE: LazyLock<...>` is shared across all `ChainIndex` instances and databases in-process. This can lead to:

- Cross-network/DB contamination (e.g., testnet/mainnet, separate MemoryDBs in tests).
- Attempting to walk parents of a cached tipset that do not exist in the current `self.db`.

Consider scoping this cache to `ChainIndex` (as a struct field) or including a partitioning dimension (e.g., network/chain ID) in the key to avoid cross-context pollution.
134-136: Minor: clarify checkpoint computation and boundary behavior

`next_checkpoint` always returns the next multiple strictly greater than `epoch` (even when already aligned). This is fine given the current loop starts at `to` and probes `to` first, but a quick comment here would prevent future misinterpretation.

Apply:

```diff
-fn next_checkpoint(epoch: ChainEpoch) -> ChainEpoch {
-    epoch - epoch.mod_floor(&CHECKPOINT_INTERVAL) + CHECKPOINT_INTERVAL
-}
+// Returns the next multiple of CHECKPOINT_INTERVAL strictly greater than `epoch`.
+fn next_checkpoint(epoch: ChainEpoch) -> ChainEpoch {
+    epoch - epoch.mod_floor(&CHECKPOINT_INTERVAL) + CHECKPOINT_INTERVAL
+}
```
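The strictly-greater behavior described above is easy to verify in isolation. This sketch assumes the `CHECKPOINT_INTERVAL = 1000` constant from the diff and substitutes the standard `rem_euclid`, which agrees with `mod_floor` for a positive modulus.

```rust
type ChainEpoch = i64;
const CHECKPOINT_INTERVAL: ChainEpoch = 1000;

// Returns the next multiple of CHECKPOINT_INTERVAL strictly greater than
// `epoch`, even when `epoch` is already aligned to a boundary.
fn next_checkpoint(epoch: ChainEpoch) -> ChainEpoch {
    epoch - epoch.rem_euclid(CHECKPOINT_INTERVAL) + CHECKPOINT_INTERVAL
}
```

Note that an already-aligned epoch (e.g. 1000) jumps a full interval ahead, which is the boundary behavior the comment asks to document.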
160-166: Use saturating subtraction to avoid potential overflow and reflect intent

The condition `child.epoch() + CHECKPOINT_INTERVAL <= from_epoch` can be expressed without addition, avoiding a potential overflow edge case and matching the "at least one interval behind" intent.

Apply:

```diff
-if child.epoch() % CHECKPOINT_INTERVAL == 0
-    && child.epoch() + CHECKPOINT_INTERVAL <= from_epoch
+if child.epoch() % CHECKPOINT_INTERVAL == 0
+    && child.epoch() <= from_epoch.saturating_sub(CHECKPOINT_INTERVAL)
 {
     CACHE.push(child.epoch(), child.clone());
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- src/chain/store/index.rs (4 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5944
File: src/chain/store/index.rs:0-0
Timestamp: 2025-08-18T03:09:47.907Z
Learning: In Forest's tipset_by_height caching implementation, hanabi1224 prefers performance-conscious solutions that leverage finality guarantees rather than expensive chain walking for fork detection. The approach of constraining cache lookups to finalized epochs (using CHECKPOINT_INTERVAL >= CHAIN_FINALITY) provides fork safety without the performance cost of ancestry verification.
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
📚 Learning: 2025-08-18T03:09:47.907Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5944
File: src/chain/store/index.rs:0-0
Timestamp: 2025-08-18T03:09:47.907Z
Learning: In Forest's tipset_by_height caching implementation, hanabi1224 prefers performance-conscious solutions that leverage finality guarantees rather than expensive chain walking for fork detection. The approach of constraining cache lookups to finalized epochs (using CHECKPOINT_INTERVAL >= CHAIN_FINALITY) provides fork safety without the performance cost of ancestry verification.
Applied to files:
src/chain/store/index.rs
🧬 Code Graph Analysis (1)
src/chain/store/index.rs (2)
- src/beacon/drand.rs (1): new (251-269)
- src/utils/cache/lru.rs (1): new_with_default_metrics_registry (96-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: cargo-publish-dry-run
- GitHub Check: All lint checks
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: Build MacOS
- GitHub Check: Build Ubuntu
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: Analyze (rust)
- GitHub Check: Analyze (go)
🔇 Additional comments (2)
src/chain/store/index.rs (2)
122-124: Allowing `mut from` is appropriate for cache seeding

Changing the parameter to `mut` is necessary to reseed the traversal starting point from a cached checkpoint. No concerns here.

4-4: `std::sync::LazyLock` is supported under MSRV 1.89.0

The project's rust-toolchain.toml pins the channel to 1.89.0, which includes `std::sync::LazyLock`. No change is needed here.
src/chain/store/index.rs (outdated)

```rust
    mut from: Arc<Tipset>,
    resolve: ResolveNullTipset,
) -> Result<Arc<Tipset>, Error> {
    const CHECKPOINT_INTERVAL: ChainEpoch = 1000;
```

Could we use the chain finality constant here, or a value expressed in terms of chain finality?
src/chain/store/index.rs (outdated)

```rust
    resolve: ResolveNullTipset,
) -> Result<Arc<Tipset>, Error> {
    const CHECKPOINT_INTERVAL: ChainEpoch = 1000;
    static CACHE: LazyLock<SizeTrackingLruCache<ChainEpoch, Arc<Tipset>>> =
```

Why choose Tipset for the value instead of just TipsetKey? I expect a Tipset to have a much larger memory footprint.

Good point. Changed to using TipsetKey.

Will a checkpoint tipset value be cached in the ChainStore's TipsetCache? If not, is this possible or do we even want this?

Changed to using `self.load_tipset` instead, which is tipset-cache aware.
It's pretty clear what the performance improvement is, but what was the bug that was fixed?
elmattic left a comment:
Does this PR also resolve the performance issues we've been having with certain RPC endpoints, such as eth_getBlockByNumber, etc.?
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/chain/store/index.rs (3)
127-133: LRU capacity: minor ergonomic tweak

Use the existing `nonzero!` macro (already imported) for clarity instead of `try_into().expect("infallible")`.

```diff
 static CACHE: LazyLock<SizeTrackingLruCache<ChainEpoch, TipsetKey>> = LazyLock::new(|| {
     SizeTrackingLruCache::new_with_default_metrics_registry(
         "tipset_by_height".into(),
-        4096.try_into().expect("infallible"),
+        nonzero!(4096_usize),
     )
 });
```
17-17: Drop `num::Integer` by using `rem_euclid`

You can avoid the extra trait import and dependency by using the standard `rem_euclid` for non-negative modulo. Behavior matches `mod_floor` here.

```diff
-use num::Integer;
+// use of rem_euclid below removes the need for num::Integer
```

```diff
-fn next_checkpoint(epoch: ChainEpoch) -> ChainEpoch {
-    epoch - epoch.mod_floor(&CHAIN_FINALITY) + CHAIN_FINALITY
-}
+fn next_checkpoint(epoch: ChainEpoch) -> ChainEpoch {
+    epoch - epoch.rem_euclid(CHAIN_FINALITY) + CHAIN_FINALITY
+}
```

Also applies to: 134-137
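The claim that `rem_euclid` matches `mod_floor` here can be spot-checked: for a positive modulus, `num::Integer::mod_floor` yields `((a % m) + m) % m`, which is exactly what `rem_euclid` returns. The inline `mod_floor` below is a re-implementation for the comparison only, not the `num` crate itself.

```rust
// Re-implementation of what num::Integer::mod_floor yields for a positive modulus.
fn mod_floor(a: i64, m: i64) -> i64 {
    ((a % m) + m) % m
}

// True when the two definitions agree for the given input and modulus.
fn agree(a: i64, m: i64) -> bool {
    mod_floor(a, m) == a.rem_euclid(m)
}
```

Since epochs here are non-negative and the modulus (`CHAIN_FINALITY`) is positive, the swap is behavior-preserving.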
141-150: Leverage the tipset cache when seeding from the checkpoint cache

Use `self.load_tipset` to populate the primary tipset cache and track metrics instead of calling `Tipset::load` directly.

```diff
-while checkpoint_from_epoch < from_epoch {
-    if let Some(checkpoint_from_key) = CACHE.get_cloned(&checkpoint_from_epoch)
-        && let Ok(Some(checkpoint_from)) = Tipset::load(&self.db, &checkpoint_from_key)
-    {
-        from = checkpoint_from.into();
-        break;
-    }
-    checkpoint_from_epoch = next_checkpoint(checkpoint_from_epoch);
-}
+while checkpoint_from_epoch < from_epoch {
+    if let Some(checkpoint_from_key) = CACHE.get_cloned(&checkpoint_from_epoch) {
+        if let Some(checkpoint_from) = self.load_tipset(&checkpoint_from_key)? {
+            from = checkpoint_from;
+            break;
+        }
+    }
+    checkpoint_from_epoch = next_checkpoint(checkpoint_from_epoch);
+}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
- src/blocks/tipset.rs (2 hunks)
- src/chain/store/chain_store.rs (2 hunks)
- src/chain/store/index.rs (4 hunks)
- src/shim/mod.rs (1 hunk)
- src/shim/policy.rs (1 hunk)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/blocks/tipset.rs
- src/chain/store/chain_store.rs
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5944
File: src/chain/store/index.rs:0-0
Timestamp: 2025-08-18T03:09:47.932Z
Learning: In Forest's tipset_by_height caching implementation, hanabi1224 prefers performance-conscious solutions that leverage finality guarantees rather than expensive chain walking for fork detection. The approach of constraining cache lookups to finalized epochs (using CHECKPOINT_INTERVAL >= CHAIN_FINALITY) provides fork safety without the performance cost of ancestry verification.
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
📚 Learning: 2025-08-18T03:09:47.932Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5944
File: src/chain/store/index.rs:0-0
Timestamp: 2025-08-18T03:09:47.932Z
Learning: In Forest's tipset_by_height caching implementation, hanabi1224 prefers performance-conscious solutions that leverage finality guarantees rather than expensive chain walking for fork detection. The approach of constraining cache lookups to finalized epochs (using CHECKPOINT_INTERVAL >= CHAIN_FINALITY) provides fork safety without the performance cost of ancestry verification.
Applied to files:
src/chain/store/index.rs
🧬 Code Graph Analysis (1)
src/chain/store/index.rs (2)
- src/beacon/drand.rs (1): new (251-269)
- src/utils/cache/lru.rs (1): new_with_default_metrics_registry (96-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build MacOS
- GitHub Check: Build Ubuntu
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: All lint checks
- GitHub Check: Analyze (rust)
- GitHub Check: Analyze (go)
🔇 Additional comments (4)
src/shim/policy.rs (1)
4-4: Straightforward re-export; looks good

Publicly re-exporting `policy_constants` via the shim is clean and keeps call sites tidy.

src/shim/mod.rs (1)

19-19: Module exposure approved

Adding `pub mod policy;` properly exposes the new shim module and unblocks downstream `CHAIN_FINALITY` usage.

src/chain/store/index.rs (2)
163-167: Finality-gated checkpoints are fork-safe and efficient

Caching only when `child.epoch() % CHAIN_FINALITY == 0` and `child.epoch() + CHAIN_FINALITY <= from_epoch` keeps entries finalized and safe to reuse across branches, aligning with your finality-first approach (per your stated preference in recent work). Nice.
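The finality gate in that condition can be isolated into a tiny predicate for illustration. `CHAIN_FINALITY = 900` is Filecoin's finality window in epochs; the function name `should_cache` is made up for this sketch and is not a Forest API.

```rust
type ChainEpoch = i64;
const CHAIN_FINALITY: ChainEpoch = 900; // Filecoin's finality window in epochs

// A checkpoint epoch is cached only when it lies on a finality boundary AND
// is at least one finality window behind the traversal head, i.e. it is
// final and therefore safe to reuse across branches.
fn should_cache(child_epoch: ChainEpoch, from_epoch: ChainEpoch) -> bool {
    child_epoch % CHAIN_FINALITY == 0 && child_epoch + CHAIN_FINALITY <= from_epoch
}
```

Both conditions matter: the boundary check bounds the number of cache entries, while the distance check is what provides the fork safety.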
4-4: MSRV (1.89.0) supports `std::sync::LazyLock`

rust-toolchain.toml pins the toolchain to 1.89.0, which includes `std::sync::LazyLock`. No further action needed.
The previous logic won't compute a state when the tipset messages are missing. Now it backfills the messages from the p2p network.
Yes, this RPC method uses
Actionable comments posted: 0
🧹 Nitpick comments (5)
src/chain/store/index.rs (5)
125-133: Finality-based checkpoint cache is sound; consider moving the static to module scope

The CHAIN_FINALITY-scoped LRU keyed by ChainEpoch→TipsetKey is the right trade-off here. One minor improvement: placing this `static` inside a generic method yields one cache per monomorphization (e.g., different `DB` types), which slightly increases memory and dilutes hit rates. Moving it to module scope (single global cache) avoids duplication.

Also, for clarity/consistency with the rest of the file that uses `nonzero!`, consider using it for the capacity.

Add this near the top of the module (outside the impl):

```rust
use std::sync::LazyLock;

use crate::shim::clock::ChainEpoch;
use crate::utils::cache::SizeTrackingLruCache;
use nonzero_ext::nonzero;

static TIPSET_BY_HEIGHT_CHECKPOINTS: LazyLock<SizeTrackingLruCache<ChainEpoch, TipsetKey>> =
    LazyLock::new(|| {
        SizeTrackingLruCache::new_with_default_metrics_registry(
            "tipset_by_height".into(),
            nonzero!(4096usize),
        )
    });
```

Then inside `tipset_by_height`, replace uses of `CACHE` with `TIPSET_BY_HEIGHT_CHECKPOINTS` and remove the in-function `static` block.
134-138: Simplify checkpoint alignment and drop the `num::Integer` dependency

Epochs are non-negative in this context, so you can avoid `mod_floor` and the extra `num` import. A simpler align-up is easier to read.

Apply this diff:

```diff
-// use `CHAIN_FINALITY` as checkpoint interval
-fn next_checkpoint(epoch: ChainEpoch) -> ChainEpoch {
-    epoch - epoch.mod_floor(&CHAIN_FINALITY) + CHAIN_FINALITY
-}
+// Align up to the next CHAIN_FINALITY boundary (or stay if already aligned).
+fn next_checkpoint(epoch: ChainEpoch) -> ChainEpoch {
+    let rem = epoch % CHAIN_FINALITY;
+    if rem == 0 { epoch } else { epoch + (CHAIN_FINALITY - rem) }
+}
```

And (if unused elsewhere) remove the now-unneeded import:

```diff
-use num::Integer;
```
141-151: Checkpoint seeding loop: correct clamping and safe adoption

Seeding from the first cached checkpoint between `to..from_epoch` is correct and efficient. The ≤ from_epoch clamp ensures we never "jump ahead," and the finality-based cache guarantees canonical-chain checkpoints, so branch ambiguity isn't a concern.

A tiny improvement: add a short comment stating the invariant that only finalized epochs are cached (and thus safe across branches), to prevent future regressions.
163-167: Cache only finalized epochs: LGTM; minor arithmetic robustness

The finality guard is correct. To avoid potential overflow in `child.epoch() + CHAIN_FINALITY` (theoretically, if epochs ever approach i64::MAX), compare using subtraction instead.

Apply this diff:

```diff
-if child.epoch() % CHAIN_FINALITY == 0 && child.epoch() + CHAIN_FINALITY <= from_epoch {
+if child.epoch() % CHAIN_FINALITY == 0 && child.epoch() <= from_epoch - CHAIN_FINALITY {
     CACHE.push(child.epoch(), child.key().clone());
 }
```
4-4: Imports: consider trimming `num::Integer` if you adopt the simplified alignment

If you switch `next_checkpoint` to the simpler arithmetic, you can drop `use num::Integer;`. The `LazyLock` import looks good and keeps dependencies minimal.

Also applies to: 17-17
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- src/chain/store/index.rs (4 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5944
File: src/chain/store/index.rs:0-0
Timestamp: 2025-08-18T03:09:47.932Z
Learning: In Forest's tipset_by_height caching implementation, hanabi1224 prefers performance-conscious solutions that leverage finality guarantees rather than expensive chain walking for fork detection. The approach of constraining cache lookups to finalized epochs (using CHECKPOINT_INTERVAL >= CHAIN_FINALITY) provides fork safety without the performance cost of ancestry verification.
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
📚 Learning: 2025-08-18T03:09:47.932Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5944
File: src/chain/store/index.rs:0-0
Timestamp: 2025-08-18T03:09:47.932Z
Learning: In Forest's tipset_by_height caching implementation, hanabi1224 prefers performance-conscious solutions that leverage finality guarantees rather than expensive chain walking for fork detection. The approach of constraining cache lookups to finalized epochs (using CHECKPOINT_INTERVAL >= CHAIN_FINALITY) provides fork safety without the performance cost of ancestry verification.
Applied to files:
src/chain/store/index.rs
🧬 Code Graph Analysis (1)
src/chain/store/index.rs (2)
- src/blocks/tipset.rs (11): from (105-110), from (160-162), from (166-168), from (172-177), from (181-186), from (205-214), from (479-484), new (242-260), new (494-511), epoch (306-308), epoch (543-545)
- src/utils/cache/lru.rs (1): new_with_default_metrics_registry (96-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: All lint checks
- GitHub Check: Build MacOS
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build Ubuntu
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (rust)
🔇 Additional comments (1)
src/chain/store/index.rs (1)
122-126: Good call making `from` mutable to allow checkpoint reseeding

Allowing `from` to be reseeded from a cached checkpoint is necessary for the fast path and keeps the external behavior intact.
@hanabi1224 Any reasons not to add the
Summary of changes
This is to facilitate @ADobrodey to generate historical lite snapshots with bug fixes and performance improvements
Changes introduced in this pull request:
- Add a cache to speed up `fn tipset_by_height`

Reference issue to close (if applicable)

Closes
Other information and links
Change checklist
Summary by CodeRabbit
New Features
Refactor
Chores