@srilman srilman commented Mar 22, 2025

Closes #1778.

Rationale for this change

Currently, filters applied to a top-level struct column do not work. For example, given a table with the schema:

table {
  2: id: optional int
  1: data: required string
  3: location: struct<5: latitude: optional float, 6: longitude: optional float>
}

We want to support applying filters to the field location, such as location is not null. Note that filters like location == {"latitude": ..., "longitude": ...} won't work yet, but they can be equivalently rewritten as location.latitude == ... and location.longitude == ....
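As a rough illustration of the rewrite described above, the following sketch expands a struct equality into per-field comparisons joined with `and`. The helper `rewrite_struct_equality` is hypothetical and is not part of PyIceberg; it only builds the filter string in the form used in this description. Emitting `is null` for a missing field value is an assumption made here, since an equality comparison against null would not match.

```python
from typing import Any, Mapping


def rewrite_struct_equality(column: str, value: Mapping[str, Any]) -> str:
    """Expand `column == {...}` into a conjunction of per-field filters.

    Hypothetical helper for illustration only; not a PyIceberg API.
    """
    if not value:
        raise ValueError("struct literal must name at least one field")

    clauses = []
    for field, literal in value.items():
        if literal is None:
            # Assumption: a null field is matched with `is null`, not `==`.
            clauses.append(f"{column}.{field} is null")
        else:
            # repr() keeps quotes around string literals.
            clauses.append(f"{column}.{field} == {literal!r}")
    return " and ".join(clauses)


row_filter = rewrite_struct_equality(
    "location", {"latitude": 52.37, "longitude": 4.89}
)
print(row_filter)
```

The resulting string only references the leaf fields, so it does not depend on equality being supported on the struct column itself.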

Are these changes tested?

Yes, tests were added at both the schema level and table reads.

Are there any user-facing changes?

Yes. Some basic filters on top-level struct columns are now supported.



- def test_add_top_level_primitives(primitive_fields: NestedField) -> None:
+ def test_add_top_level_primitives(primitive_fields: List[NestedField]) -> None:
Contributor Author

Fixing a typing error I noticed. The type of the fixture was incorrect.

@Fokko Fokko left a comment

Hey @srilman Thanks for fixing this, this looks great to me 👍

@Fokko Fokko merged commit 71cb247 into apache:main Mar 23, 2025
7 checks passed
@Fokko Fokko added this to the PyIceberg 0.9.1 milestone Apr 20, 2025
Fokko pushed a commit that referenced this pull request Apr 25, 2025
gabeiglio pushed a commit to Netflix/iceberg-python that referenced this pull request Aug 13, 2025
Development

Successfully merging this pull request may close these issues.

Applying Filter on Top-Level Struct Columns Throws Error

2 participants