perf: skip double lookup in multi group by hash map #19430
Which issue does this PR close?
N/A
Rationale for this change
Making DataFusion faster.
`insert_accounted` first looks up the entry (without increasing the capacity) and, if it is missing, inserts using `entry().insert()`, which, as far as I understand, searches for the entry again. This PR therefore avoids `insert_accounted` and always uses `entry`, inserting if the key is missing; `entry` also prepares the table for the insert if it does not have enough space.
What changes are included in this PR?
- Replaced `insert_accounted` on `hashbrown::HashTable` with `entry()` and insert if missing
- Added `allocated_size` to `hashbrown::HashTable`
- Removed `map_size`; it is now calculated on each call to `size`, as it is very cheap
Are these changes tested?
Existing tests
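The lookup pattern described in the rationale can be illustrated with a minimal sketch. It uses `std::collections::HashMap` as a stand-in for `hashbrown::HashTable`, and the function names `group_id_double_lookup` and `group_id_entry` are hypothetical; this is not the code changed in this PR, only the same idea under those assumptions.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Two-step version: `get` probes the map, and on a miss `insert` probes it again.
fn group_id_double_lookup(map: &mut HashMap<u64, usize>, key: u64) -> usize {
    if let Some(&id) = map.get(&key) {
        return id;
    }
    let id = map.len(); // next unused group id
    map.insert(key, id); // second hash + probe for the same key
    id
}

/// Single-step version: `entry` locates the slot once and reuses it for the insert.
fn group_id_entry(map: &mut HashMap<u64, usize>, key: u64) -> usize {
    let next_id = map.len(); // only used if the key is missing
    match map.entry(key) {
        Entry::Occupied(e) => *e.get(),
        Entry::Vacant(v) => *v.insert(next_id),
    }
}
```

Both functions assign the same group ids for the same sequence of keys; the second one simply avoids searching for a missing key twice.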
Are there any user-facing changes?
Added `allocated_size`, similar to the `allocated_size` that we added for `Vec`.
In my local tests, it showed good performance improvements.
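For readers unfamiliar with the `Vec` counterpart mentioned above, here is a rough sketch of what an allocated-size helper for a vector might compute. The name `vec_allocated_size` is hypothetical and this is not DataFusion's implementation; a hash table's allocation typically also includes per-bucket metadata, so the formula for `hashbrown::HashTable` would differ.

```rust
use std::mem::size_of;

/// Bytes reserved by the vector's heap buffer: capacity (not length) times element size.
fn vec_allocated_size<T>(v: &Vec<T>) -> usize {
    v.capacity() * size_of::<T>()
}
```

Because this is a single multiplication over values the container already tracks, recomputing it on every call to `size` is cheap, which is the reasoning given for not caching the map size separately.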