fix: interval analysis error when have two filterexec that inner filter proves zero selectivity #20743

Merged
alamb merged 3 commits into apache:main from haohuaijin:fix-statistics-panic on Mar 10, 2026
@haohuaijin
Contributor

@haohuaijin haohuaijin commented Mar 6, 2026

Which issue does this PR close?

Closes #20742.

Rationale for this change

See #20742.

What changes are included in this PR?

In collect_new_statistics, when a filter proves no rows can match, use a typed null (e.g., ScalarValue::Int32(None)) instead of untyped ScalarValue::Null for column min/max/sum values. The column's data type is looked up from the schema so that downstream interval analysis can still intersect intervals of the same type.
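
The difference between an untyped and a typed null can be sketched with a toy model. This is not DataFusion's code: `ToyType`, `ToyScalar`, `typed_null`, and `check_intersectable` are hypothetical stand-ins, and the type check is an assumption about why a downstream intersection would reject mismatched bounds.

```rust
// Toy stand-ins for a data type and a scalar value. The names and shapes
// are illustrative assumptions, not DataFusion's real API.
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Null,
    Int32,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
enum ToyScalar {
    Null,                 // untyped null: carries no column type
    Int32(Option<i32>),   // Int32(None) is a typed null
    Utf8(Option<String>), // Utf8(None) is a typed null
}

impl ToyScalar {
    /// Report the type this scalar carries, even when its value is null.
    fn data_type(&self) -> ToyType {
        match self {
            ToyScalar::Null => ToyType::Null,
            ToyScalar::Int32(_) => ToyType::Int32,
            ToyScalar::Utf8(_) => ToyType::Utf8,
        }
    }
}

/// Build a null that keeps the column's type, as looked up from a schema.
fn typed_null(column_type: &ToyType) -> ToyScalar {
    match column_type {
        ToyType::Null => ToyScalar::Null,
        ToyType::Int32 => ToyScalar::Int32(None),
        ToyType::Utf8 => ToyScalar::Utf8(None),
    }
}

/// Hypothetical precondition of an interval intersection: both bounds
/// must share a data type before they can be compared.
fn check_intersectable(a: &ToyScalar, b: &ToyScalar) -> Result<(), String> {
    if a.data_type() == b.data_type() {
        Ok(())
    } else {
        Err(format!(
            "type mismatch: {:?} vs {:?}",
            a.data_type(),
            b.data_type()
        ))
    }
}
```

In this model, checking `ToyScalar::Null` against an `Int32` bound returns an error, while `typed_null(&ToyType::Int32)` passes the same check, which mirrors the reasoning given for the fix.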

Are these changes tested?

Yes, one test case was added.

Are there any user-facing changes?

@github-actions github-actions bot added the physical-plan Changes to the physical-plan crate label Mar 6, 2026
@haohuaijin haohuaijin changed the title fix: interval analysis panic when have two filterexec that inner filter proves zero selectirity fix: interval analysis error when have two filterexec that inner filter proves zero selectirity Mar 6, 2026
@github-actions github-actions bot added the core Core DataFusion crate label Mar 6, 2026
Contributor

@jonathanc-n jonathanc-n left a comment

Very nice, makes sense to me!

@alamb alamb changed the title fix: interval analysis error when have two filterexec that inner filter proves zero selectirity fix: interval analysis error when have two filterexec that inner filter proves zero selectivity Mar 10, 2026
Contributor

@alamb alamb left a comment

@alamb alamb added this pull request to the merge queue Mar 10, 2026
Merged via the queue into apache:main with commit daa8f52 Mar 10, 2026
35 checks passed
@haohuaijin haohuaijin deleted the fix-statistics-panic branch March 10, 2026 15:56
@haohuaijin
Contributor Author

thanks @jonathanc-n @hengfeiyang @alamb for reviews

alamb pushed a commit to alamb/datafusion that referenced this pull request Mar 11, 2026
…er proves zero selectivity (apache#20743)

## Which issue does this PR close?

- Closes apache#20742

## Rationale for this change

- see apache#20742

## What changes are included in this PR?

In `collect_new_statistics`, when a filter proves no rows can match, use
a typed null (e.g., ScalarValue::Int32(None)) instead of untyped
ScalarValue::Null for column min/max/sum values. The column's data type
is looked up from the schema so that downstream interval analysis can
still intersect intervals of the same type.

## Are these changes tested?

add one test case

## Are there any user-facing changes?

alamb added a commit that referenced this pull request Mar 12, 2026
…t inner filter proves zero selectivity (#20743) (#20880)

- Part of #20855
- Closes #20742 on branch-52

This PR:
- Backports #20743 from
@haohuaijin to the branch-52 line

Co-authored-by: Huaijin <[email protected]>
alamb added a commit that referenced this pull request Mar 12, 2026
…t inner filter proves zero selectivity (#20743) (#20882)

- Part of #19692
- Closes #20742 on branch-53

This PR:
- Backports #20743 from
@haohuaijin to the branch-53 line

Co-authored-by: Huaijin <[email protected]>
lukekim pushed a commit to spiceai/datafusion that referenced this pull request Mar 12, 2026
…t inner filter proves zero selectivity (apache#20743) (apache#20880)

- Part of apache#20855
- Closes apache#20742 on branch-52

This PR:
- Backports apache#20743 from
@haohuaijin to the branch-52 line

Co-authored-by: Huaijin <[email protected]>
Labels

core Core DataFusion crate physical-plan Changes to the physical-plan crate

Development

Successfully merging this pull request may close these issues.

interval analysis error when have two filterexec that inner filter proves zero selectirity

4 participants