feat: Add HashJoin support for BuildRight#437
Conversation
152df93 to
383ff05
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #437 +/- ##
============================================
+ Coverage 34.17% 34.30% +0.13%
- Complexity 850 855 +5
============================================
Files 116 116
Lines 38547 38653 +106
Branches 8523 8545 +22
============================================
+ Hits 13172 13259 +87
- Misses 22609 22647 +38
+ Partials 2766 2747 -19 ☔ View full report in Codecov by Sentry. |
| join match { | ||
| case b: BroadcastHashJoinExec if b.isNullAwareAntiJoin => | ||
| // DataFusion HashJoin LeftAnti has bugs on null keys. | ||
| withInfo(join, "DataFusion doesn't support null-aware anti join") | ||
| return None | ||
| case _ => // no-op |
ee5952e to
9e25a6a
Compare
|
This requires apache/datafusion#10702 to be in DataFusion new release. |
| } | ||
| diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala | ||
| index f33432ddb6f..060f874ea72 100644 | ||
| index f33432ddb6f..9cf7a9dd4e3 100644 |
There was a problem hiding this comment.
I guess we need the same change for 3.4.3.diff ?
d41a68e to
d494b4f
Compare
andygrove
left a comment
There was a problem hiding this comment.
LGTM but I am seeing a performance regression with this PR. I am guessing it is related to the DataFusion upgrade.
|
Hmm, it's weird. I've updated the plan stability results. However, |
I finally produced the query plan same as CI by following same commands of the CI pipeline. |
|
Merged. Thanks @kazuyukitanimura @andygrove |
* feat: Add HashJoin support for BuildRight * Enable test * Update plan stability * More * Update plan stability * Refine * Fix * Update diffs to fix Spark tests * Update diff * Update Spark 3.4.3 diff * Use BuildSide enum * Update diffs * Update plan stability for Spark 4.0 * Update q5a plan
Which issue does this PR close?
Closes #390.
Rationale for this change
What changes are included in this PR?
How are these changes tested?