Merged
Conversation
ahuang11
commented
Mar 19, 2026
Merged
alxmrs
requested changes
Mar 22, 2026
Owner
alxmrs
left a comment
There was a problem hiding this comment.
Tests need some improvements. In general, thank you for making this fix!
xarray_sql/sql_test.py
Outdated
Comment on lines
+215
to
+233
| def test_max_ignores_nan(self, nan_ds): | ||
| ctx = XarrayContext() | ||
| ctx.from_dataset("data", nan_ds) | ||
| result = ctx.sql("SELECT MAX(temp) AS mx FROM data").to_pandas() | ||
| assert result["mx"].iloc[0] == 8.0 | ||
|
|
||
| def test_min_ignores_nan(self, nan_ds): | ||
| ctx = XarrayContext() | ||
| ctx.from_dataset("data", nan_ds) | ||
| result = ctx.sql("SELECT MIN(temp) AS mn FROM data").to_pandas() | ||
| assert result["mn"].iloc[0] == 1.0 | ||
|
|
||
| def test_avg_ignores_nan(self, nan_ds): | ||
| import numpy as np | ||
| ctx = XarrayContext() | ||
| ctx.from_dataset("data", nan_ds) | ||
| result = ctx.sql("SELECT AVG(temp) AS avg FROM data").to_pandas() | ||
| expected = np.nanmean([1.0, 2.0, 4.0, 5.0, 7.0, 8.0]) | ||
| assert abs(result["avg"].iloc[0] - expected) < 1e-6 |
Owner
There was a problem hiding this comment.
I think these could be melded into one test -- or better yet, only one of these should imply the others.
Collaborator
Author
|
Made changes!
Playing devil's advocate, I'm used to having pandas df and xarray ds be returned, but this can be discussed more in #148 |
Signed-off-by: Andrew <15331990+ahuang11@users.noreply.github.com>
715b610 to
e78cc76
Compare
alxmrs
approved these changes
Mar 27, 2026
Owner
alxmrs
left a comment
There was a problem hiding this comment.
LGTM. Thank for this critical fix!!
alxmrs
added a commit
that referenced
this pull request
Mar 30, 2026
Closes #144 After attempting this, I realized it did not feel like a good first issue~ Note I used Claude Opus to plan and write this. Two tiers based on calendar type: 1. Gregorian-like (standard, noleap, all_leap) → pa.timestamp('us'). String SQL filters work naturally (WHERE time > '1980-01-01'). Microsecond resolution covers ±292k years, avoiding the nanosecond 1678–2262 overflow. 2. Non-Gregorian (360_day, julian) → pa.int64() with xarray:units/xarray:calendar metadata on the Arrow field. Lossless — a 360-day "Feb 30th" isn't forced into Gregorian. Auto-registered cftime() UDF provides ergonomic filtering: WHERE time > cftime('2000-07-01'). Now this runs: ``` import xarray as xr from metpy.cbook import get_test_data from xarray_sql import XarrayContext ds = xr.tutorial.open_dataset("rasm") context = XarrayContext().from_dataset("rasm", ds, chunks={"time": 2}) context.sql("SELECT * FROM rasm LIMIT 1000") ``` <img width="230" height="279" alt="image" src="https://github.com/user-attachments/assets/5f7ff224-d4b1-429f-a3b3-51e017475c21" /> However as you can see, there are a lot of NaNs; to be fixed in a separate PR in #151 --------- Co-authored-by: Alex Merose <al@merose.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
I noticed that the values were all NaN. This fixes it. Used Claude Opus during planning and execution.