feat: util for calculating multihash from byte stream#5824
feat: util for calculating multihash from byte stream#5824LesnyRumcajs merged 5 commits intomainfrom
Conversation
WalkthroughA new method, Changes
Poem
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Clippy (1.86.0)warning: failed to write cache, path: /usr/local/registry/index/index.crates.io-1949cf8c6b5b557f/.cache/an/yh/anyhow, error: Permission denied (os error 13) Caused by: 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
53ca3b0 to
d359939
Compare
| Ok(hasher.finalize()) | ||
| } | ||
|
|
||
| Ok(match self { |
There was a problem hiding this comment.
nit: do you think it's a good opportunity for a macro to avoid code duplication?
There was a problem hiding this comment.
I don't think a macro would significantly reduce the LoC, but it would hurt readability. What do you think?
There was a problem hiding this comment.
No strong position here, I don't expect this code to be updated with many new algorithms anytime soon in the future. We can leave it as is.
src/utils/multihash.rs
Outdated
| self.wrap(hash(&mut hasher, bytes)?)? | ||
| } | ||
| _ => { | ||
| unimplemented!("{self:?}") |
There was a problem hiding this comment.
That's a bit scary - we shouldn't crash the entire Forest on a bad reader.
There was a problem hiding this comment.
changed to anyhow::bail
There was a problem hiding this comment.
let's add a test case for it for completeness?
src/utils/multihash.rs
Outdated
| fn test_digest_byte_stream() { | ||
| use MultihashCode::*; | ||
|
|
||
| let mut bytes = vec![0; 10000]; |
There was a problem hiding this comment.
if we're doing a bit of a fuzzy tests, perhaps it would make sense to randomize the input size as well.
There was a problem hiding this comment.
Or just manually check some edge conditions; input size of zero, 1, 1024. Also, there's an uncovered path with an unknown hashing algorithm.
There was a problem hiding this comment.
IdentityHasher is for testing and does not work for input > 64 bytes
/// Identity hasher with a maximum size.
///
/// # Panics
///
/// Panics if the input is bigger than the maximum size.
/// Ported from <https://github.com/multiformats/rust-multihash/pull/289>
There was a problem hiding this comment.
Added 0, 1, 1024
| let mut buf = [0; 1024]; | ||
| loop { | ||
| let n = bytes.read(&mut buf)?; | ||
| if n == 0 { | ||
| break; | ||
| } | ||
| if let Some(b) = buf.get(0..n) { | ||
| hasher.update(b); | ||
| } | ||
| } | ||
| Ok(hasher.finalize()) |
There was a problem hiding this comment.
| let mut buf = [0; 1024]; | |
| loop { | |
| let n = bytes.read(&mut buf)?; | |
| if n == 0 { | |
| break; | |
| } | |
| if let Some(b) = buf.get(0..n) { | |
| hasher.update(b); | |
| } | |
| } | |
| Ok(hasher.finalize()) | |
| while let Some(n) = bytes.read(&mut buf).ok() { | |
| if n == 0 { | |
| break; | |
| } | |
| hasher.update(&buf[..n]); | |
| } | |
| Ok(hasher.finalize()) |
nit: some code golf if you want
There was a problem hiding this comment.
I think the io::Error from .read should be escalated here.
Then I get
warning: irrefutable `while let` pattern
--> src/utils/multihash.rs:96:19
|
96 | while let n = bytes.read(&mut buf)? {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this pattern will always match, so the loop will never exit
= help: consider instead using a `loop { ... }` with a `let` inside it
= note: `#[warn(irrefutable_let_patterns)]` on by default
...
error: slicing may panic
--> src/utils/multihash.rs:100:32
|
100 | hasher.update(&buf[..n]);
| ^^^^^^^^
|
= help: consider using `.get(..n)`or `.get_mut(..n)` instead
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#indexing_slicing
There was a problem hiding this comment.
It could be a while loop like below but I'm not sure if it's better than a loop loop, what do you think?
let mut n = 0;
while {
n = bytes.read(&mut buf)?;
n > 0
} {
if let Some(b) = buf.get(0..n) {
hasher.update(b);
}
}There was a problem hiding this comment.
Let's leave the original version, then.
Summary of changes
Part of #5810
Changes introduced in this pull request:
digest_byte_streamfor calculating multihash and CID of a large byte stream, e.g. F3 snapshotReference issue to close (if applicable)
Closes
Other information and links
Change checklist
Summary by CodeRabbit
New Features
Tests