-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Simplify Spark sha2 implementation
#19475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| signature: Signature::user_defined(Volatility::Immutable), | ||
| aliases: vec![], | ||
| signature: Signature::coercible( | ||
| vec![ | ||
| Coercion::new_implicit( | ||
| TypeSignatureClass::Native(logical_binary()), | ||
| vec![TypeSignatureClass::Native(logical_string())], | ||
| NativeType::Binary, | ||
| ), | ||
| Coercion::new_implicit( | ||
| TypeSignatureClass::Native(logical_int32()), | ||
| vec![TypeSignatureClass::Integer], | ||
| NativeType::Int32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving away from user_defined; also we cast strings to binary to simplify implementation as we only need raw bytes either way
| let mut digest = sha2::Sha224::default(); | ||
| digest.update(value); | ||
| Some(hex_encode(digest.finalize())) | ||
| } | ||
| (Some(value), Some(0 | 256)) => { | ||
| let mut digest = sha2::Sha256::default(); | ||
| digest.update(value); | ||
| Some(hex_encode(digest.finalize())) | ||
| } | ||
| (Some(value), Some(384)) => { | ||
| let mut digest = sha2::Sha384::default(); | ||
| digest.update(value); | ||
| Some(hex_encode(digest.finalize())) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We directly use the sha2 crate to do the hashing now; previously we used functions exposed by datafusion-functions but that seemed like unnecessary indirection
| regex = "1.12" | ||
| rstest = "0.26.1" | ||
| serde_json = "1" | ||
| sha2 = "^0.10.9" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because datafusion-spark now uses sha2 directly, we extract it as a common dependency
Which issue does this PR close?
Signature::user_definedfor normal functions #12725Rationale for this change
Simplify implementation, also remove usage of user_defined.
What changes are included in this PR?
Refactor to be simpler.
Are these changes tested?
Added tests.
Are there any user-facing changes?
No.