Implemented Upper/lower for REE #16969

Closed
rich-t-kid-datadog wants to merge 1 commit into apache:main from rich-t-kid-datadog:baah/RunEndEncoding-string-functions


@rich-t-kid-datadog rich-t-kid-datadog commented Jul 29, 2025

Which issue does this PR close?

Works toward closing the REE epic.

Rationale for this change

Adding a RunEndEncoded branch in case_conversion allows REE arrays whose value type is a string type to be converted to upper or lower case using the DataFusion UDFs.

What changes are included in this PR?

Allows the Lower/Upper UDFs to be called on Run-End Encoded arrays.
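
To illustrate why a dedicated REE branch is attractive, here is a minimal, std-only sketch. It does not use the Arrow or DataFusion APIs, and it is not the code from this PR: `RunEndEncoded`, `map_values`, and `decode` are hypothetical names invented for the example. The assumption is that a case conversion can be applied once per run to the values while the run ends are carried over unchanged.

```rust
/// Hypothetical, simplified run-end encoded string column.
/// This is NOT Arrow's `RunArray`; it only mirrors the idea.
/// `run_ends[i]` is the exclusive logical end index of run `i`,
/// and `values[i]` is the (nullable) string repeated across that run.
#[derive(Debug, Clone, PartialEq)]
struct RunEndEncoded {
    run_ends: Vec<i32>,
    values: Vec<Option<String>>,
}

impl RunEndEncoded {
    /// Apply a string operation once per run instead of once per logical row.
    /// Nulls stay null, and the run ends are copied through untouched.
    fn map_values<F>(&self, op: F) -> Self
    where
        F: Fn(&str) -> String,
    {
        RunEndEncoded {
            run_ends: self.run_ends.clone(),
            values: self
                .values
                .iter()
                .map(|v| v.as_deref().map(|s| op(s)))
                .collect(),
        }
    }

    /// Expand the encoding to one entry per logical row (for inspection only).
    fn decode(&self) -> Vec<Option<String>> {
        let mut out = Vec::new();
        let mut start = 0i32;
        for (end, value) in self.run_ends.iter().zip(&self.values) {
            for _ in start..*end {
                out.push(value.clone());
            }
            start = *end;
        }
        out
    }
}

/// Upper-case every run value (hypothetical helper).
fn upper(input: &RunEndEncoded) -> RunEndEncoded {
    input.map_values(|s| s.to_uppercase())
}

/// Lower-case every run value (hypothetical helper).
fn lower(input: &RunEndEncoded) -> RunEndEncoded {
    input.map_values(|s| s.to_lowercase())
}
```

With five logical rows stored as three runs, `upper` performs at most three string conversions rather than five, and the result has the same run structure as the input. Whether the PR's `case_conversion_run_array` works exactly this way is not shown in this conversation.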

Are these changes tested?

Yes, both the upper and lower functions have tests attached for REE.

Are there any user-facing changes?

This is an additive change only. Users will see no changes unless they explicitly opt in to using the UDFs with REE arrays.

@github-actions github-actions bot added the functions Changes to functions implementation label Jul 29, 2025
@rich-t-kid-datadog rich-t-kid-datadog force-pushed the baah/RunEndEncoding-string-functions branch from 1a40951 to a5359f9 Compare July 30, 2025 13:07
@rich-t-kid-datadog rich-t-kid-datadog force-pushed the baah/RunEndEncoding-string-functions branch from a5359f9 to b746836 Compare July 30, 2025 13:08
@timsaucer
Member

@rich-t-kid-datadog Is this ready? It looks like it's fully implemented and has unit tests. I think this is a great way to start REE support, which I'm also interested in. If we merge in this PR then I think we can uncomment parts of the tests in #16715 so that it's not just merging in a commented out file.

@timsaucer timsaucer self-requested a review August 31, 2025 12:35
Comment on lines +348 to +355
    if value_index.data_type() == &DataType::Utf8 {
        case_conversion_run_array::<i32, _>(
            array,
            op,
            name,
            &run_index.data_type(),
        )
    } else if value_index.data_type() == &DataType::LargeUtf8 {

What about Utf8View? As I'm thinking about it, it doesn't seem to make sense to have a REE with value data type of Utf8View. I haven't dug deep enough to verify this though. I ask mostly for my own understanding.

@github-actions

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale PR has not had any activity for some time label Oct 31, 2025
@github-actions github-actions bot closed this Nov 7, 2025