fix: optimize collect_events in Eth RPC#6750
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds ChainStore::load_child_tipset; changes Error::NotFound/State to carry Cow<'static, str>; replaces receipt/event cache with ExecutedMessage/ExecutedTipset and load_executed_tipset helpers; updates RPC, filter logic, dev tooling, tests, and removes two get_size helpers. Changes
Sequence Diagram(s)sequenceDiagram
participant RPC as RPC handler
participant SM as StateManager
participant CS as ChainStore
participant DB as Blockstore/DB
RPC->>SM: request executed tipset (include_events?)
SM->>CS: request tipset state / load child tipset (when resolving next)
CS->>DB: read tipset / headers
DB-->>CS: return tipset data
CS-->>SM: return tipset / child tipset
SM->>SM: execute messages -> build ExecutedMessage(s) (attach events if requested)
SM-->>RPC: return ExecutedTipset (state_root, executed_messages)
RPC->>RPC: build Block / receipts / filters from ExecutedTipset
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
✨ Simplify code
📝 Coding Plan
Comment |
284905f to
485c446
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/state_manager/mod.rs (1)
92-113: Document or narrow the new executed-tipset API.
ExecutedMessage,ExecutedTipset,LoadExecutedTipsetOptions,StateOutput, and the new loader methods are now public, but they do not describe their semantics, andLoadExecutedTipsetOptionsstill looks internal-only. Either add rustdoc here or reduce visibility where possible so we do not commit to an unnecessary API surface.As per coding guidelines, "Document public functions and structs with doc comments".
Also applies to: 463-481
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/state_manager/mod.rs` around lines 92 - 113, The new types ExecutedMessage, ExecutedTipset, LoadExecutedTipsetOptions, and StateOutput (and any new loader methods) are exported publicly without documentation or clear intended semantics; either add concise rustdoc comments describing the purpose, fields, and expected invariants for each public type and the loader methods, or reduce their visibility (e.g., make them pub(crate) or private) if they are intended for internal use only; update the declarations for ExecutedMessage, ExecutedTipset, LoadExecutedTipsetOptions, StateOutput, and the loader function signatures to include the chosen visibility and/or add short /// comments explaining each field (state_root, receipt, events, include_events, receipt_root, etc.) and the contract of the loader methods so the public API surface is intentional and documented.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/rpc/methods/eth/filter/mod.rs`:
- Around line 281-349: The event index bookkeeping is inconsistent: unresolved
events increment event_count before matching and matched unresolved events
increment it again, while resolved non-matches never advance it, making
event_idx unstable. Fix by capturing the current event index at the top of the
loop (e.g., let current_event_idx = event_count; then immediately increment
event_count by 1), use current_event_idx when constructing
CollectedEvent.event_idx, and remove the other event_count increments inside the
resolution/match branches (including the branch that currently increments before
continue and the increment after pushing a CollectedEvent); this ensures each
event advances the counter exactly once and event_idx reflects stable event
order (refer to event_count, CollectedEvent.event_idx, skip_event /
SkipEvent::OnUnresolvedAddress, and the loop over events).
In `@src/utils/get_size/mod.rs`:
- Around line 39-46: Add Rust doc comments (using ///) to the two public helpers
vec_with_stack_only_item_heap_size_helper and vec_heap_size_helper: describe
what each function computes (heap bytes used by a Vec's allocation), document
the type parameter and the v parameter (e.g., that T is item type and v is the
vector whose heap allocation is measured), state the return value (usize bytes),
and note any constraints/traits (for vec_heap_size_helper mention that T:
GetSize is required). Keep comments brief and follow examples in the module
(one-line summary plus short param/returns lines or a short example if present
elsewhere).
---
Nitpick comments:
In `@src/state_manager/mod.rs`:
- Around line 92-113: The new types ExecutedMessage, ExecutedTipset,
LoadExecutedTipsetOptions, and StateOutput (and any new loader methods) are
exported publicly without documentation or clear intended semantics; either add
concise rustdoc comments describing the purpose, fields, and expected invariants
for each public type and the loader methods, or reduce their visibility (e.g.,
make them pub(crate) or private) if they are intended for internal use only;
update the declarations for ExecutedMessage, ExecutedTipset,
LoadExecutedTipsetOptions, StateOutput, and the loader function signatures to
include the chosen visibility and/or add short /// comments explaining each
field (state_root, receipt, events, include_events, receipt_root, etc.) and the
contract of the loader methods so the public API surface is intentional and
documented.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 587e6a20-c54d-44d2-a66d-f68bf6783c7f
📒 Files selected for processing (10)
src/chain/store/chain_store.rssrc/chain/store/errors.rssrc/dev/subcommands/state_cmd.rssrc/rpc/methods/eth.rssrc/rpc/methods/eth/filter/mod.rssrc/state_manager/cache.rssrc/state_manager/mod.rssrc/state_manager/tests.rssrc/state_manager/utils.rssrc/utils/get_size/mod.rs
💤 Files with no reviewable changes (1)
- src/state_manager/utils.rs
3967263 to
206f271
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/rpc/methods/eth/filter/mod.rs (2)
303-312: Address resolution fallback is correct but could be more idiomatic.The nested
if letpattern works correctly. For slightly cleaner pattern matching:♻️ Optional: Use match for clearer control flow
- let resolved = if let Ok(resolved) = result { - resolved - } else { - if let SkipEvent::OnUnresolvedAddress = skip_event { - // Skip event - continue; - } else { - id_addr - } - }; + let resolved = match (result, skip_event) { + (Ok(resolved), _) => resolved, + (Err(_), SkipEvent::OnUnresolvedAddress) => continue, + (Err(_), SkipEvent::Never) => id_addr, + };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rpc/methods/eth/filter/mod.rs` around lines 303 - 312, Replace the nested `if let` with a `match` on `result` to make control flow clearer: match on `result` producing `resolved` for `Ok(resolved)`, and for `Err(_)` check `skip_event` to either `continue` when `SkipEvent::OnUnresolvedAddress` or fall back to `id_addr`; update the binding currently named `resolved` accordingly so callers use the same variable. Target the block handling `result`, `skip_event`, `SkipEvent::OnUnresolvedAddress`, and `id_addr` in this function.
271-354: Event collection logic looks correct.The refactored
collect_eventsproperly:
- Loads executed tipset data via the new
load_executed_tipsethelper- Iterates over
ExecutedMessageentries with correct destructuring- Tracks
event_idxwith stable, filter-independent incrementing at the start of each event iteration- Handles address resolution failures with appropriate context
- Enforces
max_filter_resultsbefore pushingThe
is_first_eventpattern at lines 284-290 is explicit about zero-based indexing intent, though it could be slightly simplified.♻️ Optional: Simplify event index tracking
The
is_first_eventflag pattern works but could be replaced with a simpler approach:- let mut event_idx = 0; - let mut is_first_event = true; + let mut event_idx: i64 = -1; for ( msg_idx, ExecutedMessage { message, events, .. }, ) in executed_tipset.executed_messages.into_iter().enumerate() { if let Some(events) = events { for event in events.iter() { - if is_first_event { - // event index should be zero-based. - event_idx = 0; - is_first_event = false; - } else { - event_idx += 1; - } + event_idx += 1;This achieves the same zero-based result with less branching.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rpc/methods/eth/filter/mod.rs` around lines 271 - 354, The is_first_event pattern is verbose; simplify event index tracking in collect_events by removing is_first_event and using a single mut event_idx: initialize let mut event_idx = 0usize; inside the inner loop capture the current index (e.g., let current_event_idx = event_idx;) then increment event_idx += 1; and use current_event_idx when constructing CollectedEvent (instead of event_idx). Update all references to event_idx in that scope accordingly so behavior remains zero-based and deterministic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/rpc/methods/eth/filter/mod.rs`:
- Around line 303-312: Replace the nested `if let` with a `match` on `result` to
make control flow clearer: match on `result` producing `resolved` for
`Ok(resolved)`, and for `Err(_)` check `skip_event` to either `continue` when
`SkipEvent::OnUnresolvedAddress` or fall back to `id_addr`; update the binding
currently named `resolved` accordingly so callers use the same variable. Target
the block handling `result`, `skip_event`, `SkipEvent::OnUnresolvedAddress`, and
`id_addr` in this function.
- Around line 271-354: The is_first_event pattern is verbose; simplify event
index tracking in collect_events by removing is_first_event and using a single
mut event_idx: initialize let mut event_idx = 0usize; inside the inner loop
capture the current index (e.g., let current_event_idx = event_idx;) then
increment event_idx += 1; and use current_event_idx when constructing
CollectedEvent (instead of event_idx). Update all references to event_idx in
that scope accordingly so behavior remains zero-based and deterministic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d8dfaee3-4825-4f76-8e2f-e468e6258cf3
📒 Files selected for processing (1)
src/rpc/methods/eth/filter/mod.rs
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/state_manager/mod.rs`:
- Around line 92-105: The new public types ExecutedMessage, ExecutedTipset, and
LoadExecutedTipsetOptions (and any other public structs/functions introduced
around lines 463-491) need rustdoc comments: add concise /// doc comments above
each pub struct and above each pub field describing purpose and semantics (e.g.,
ExecutedMessage: what ChainMessage, Receipt, and events represent;
ExecutedTipset: meaning of state_root and executed_messages;
LoadExecutedTipsetOptions: meaning of include_events), and add docs for any
newly public functions or methods in that range explaining their behavior,
inputs, and return values so the public API is fully documented per project
guidelines.
- Around line 467-468: The code is silently converting real storage/blockstore
errors into OK cache-miss behavior by calling .ok() on
self.chain_store().load_child_tipset(...) and using unwrap_or(false); update the
hot-paths (references: load_child_tipset, chain_store(),
load_executed_tipset_inner and the other occurrences around the noted lines) to
distinguish NotFound/Absent cases from genuine IO/errors: do not call .ok() or
blanket unwrap_or(false); instead propagate or return the error (or log and
return an Err) so callers (e.g., eth_getLogs paths) can fast-fail on blockstore
failures, or explicitly match the Result to treat only a NotFound/None as cache
miss while surfacing other errors upstream.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e04ccaf2-3752-4b72-a48c-f564e45a4b33
📒 Files selected for processing (3)
src/rpc/methods/eth.rssrc/rpc/methods/eth/filter/mod.rssrc/state_manager/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- src/rpc/methods/eth/filter/mod.rs
- src/rpc/methods/eth.rs
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/state_manager/mod.rs (1)
118-124: Add doc comments toStateOutput.This public struct lacks documentation while the other new structs (
ExecutedMessage,ExecutedTipset,LoadExecutedTipsetOptions) all have proper doc comments. Consider adding a brief description of whatstate_rootandreceipt_rootrepresent.📝 Suggested doc comment
+/// Output of tipset state computation. +/// +/// Contains the resulting state tree root and the receipt root +/// after executing all messages in a tipset. #[derive(Debug, Default, Clone, GetSize)] pub struct StateOutput { #[get_size(ignore)] + /// Root CID of the state tree after message execution. pub state_root: Cid, #[get_size(ignore)] + /// Root CID of the message receipts AMT. pub receipt_root: Cid, }As per coding guidelines,
**/*.rs: Document public functions and structs with doc comments.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/state_manager/mod.rs` around lines 118 - 124, Add a doc comment to the public struct StateOutput describing its purpose and briefly explaining the meaning of its fields: mention that StateOutput represents the resulting roots after execution and clarify that state_root is the CID of the post-execution actor state tree while receipt_root is the CID of the message receipt root; add short /// comments above the struct and each field (StateOutput, state_root, receipt_root) consistent with the style used on ExecutedMessage/ExecutedTipset/LoadExecutedTipsetOptions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/state_manager/mod.rs`:
- Around line 118-124: Add a doc comment to the public struct StateOutput
describing its purpose and briefly explaining the meaning of its fields: mention
that StateOutput represents the resulting roots after execution and clarify that
state_root is the CID of the post-execution actor state tree while receipt_root
is the CID of the message receipt root; add short /// comments above the struct
and each field (StateOutput, state_root, receipt_root) consistent with the style
used on ExecutedMessage/ExecutedTipset/LoadExecutedTipsetOptions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f1f62422-5729-4a5a-a698-08f2412b8a1d
📒 Files selected for processing (2)
Cargo.tomlsrc/state_manager/mod.rs
✅ Files skipped from review due to trivial changes (1)
- Cargo.toml
Summary of changes
This PR tries to optimize the performance of
collect_eventsby porting Sqlite index generation logic from lotus. Simplified caching logic and performance is now bound totipset_by_heightwhich will be improved later.Manually verified issues in #6249 are gone.
k6 benchmark results:
Changes introduced in this pull request:
Reference issue to close (if applicable)
Closes #6249
Other information and links
Change checklist
Outside contributions
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Chores