Alignment of n-dimensional indexes with partially excluded dims#10293
Alignment of n-dimensional indexes with partially excluded dims#10293dcherian merged 11 commits intopydata:mainfrom
Conversation
Support checking exact alignment of indexes that use multiple dimensions in the cases where some of those dimensions are included in alignment while others are excluded. Added `exclude_dims` keyword argument to `Index.equals()` (and still support old signature with future warning). Also fixed bug: indexes associated with scalar coordinates were ignored during alignment. Added tests as well.
| @overload | ||
| def equals(self, other: Index) -> bool: ... | ||
|
|
||
| @overload | ||
| def equals( | ||
| self, other: Index, *, exclude_dims: frozenset[Hashable] | None = None | ||
| ) -> bool: ... | ||
|
|
||
| def equals(self, other: Index, **kwargs) -> bool: |
There was a problem hiding this comment.
I added those overloads so Mypy doesn't fail for 3rd-party Xarray indexes. But maybe we want Mypy to fail and thereby encourage maintainers updating their code?
xarray/structure/alignment.py
Outdated
| exclude_dims = idx_all_dims & self.exclude_dims | ||
| if exclude_dims == idx_all_dims: | ||
| continue | ||
| elif exclude_dims and self.join != "exact": |
There was a problem hiding this comment.
I'm having trouble deciding if any of the other cases make any sense. Seems OK to remove anything that requires actual reindexing, which leaves "override" and I'm not sure about that. Let's have the error message recommend the user to open an issue if they feel it should work.
There was a problem hiding this comment.
I think that you are right. I'll try with other join options except "override".
dcherian
left a comment
There was a problem hiding this comment.
This seems OK to me, but is definitely hard to reason about.
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
for more information, see https://pre-commit.ci
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Better readability.
|
Yeah this certainly doesn't help better understand the alignment rules, which are already complex. It looks to me like a pretty robust approach for dealing with "multi-dimensional" indexes and dimensions excluded from alignment, though. |
| # TODO: always copy object's index when no re-indexing is required? | ||
| # (instead of assigning the aligned index) | ||
| # (need performance assessment) | ||
| if key in self.keep_original_indexes: | ||
| assert self.reindex[key] is False | ||
| new_idx = obj_idx.copy(deep=self.copy) | ||
| new_idx_vars = new_idx.create_variables(obj_idx_vars) | ||
| else: | ||
| new_idx = aligned_idx | ||
| new_idx_vars = { | ||
| k: v.copy(deep=self.copy) for k, v in aligned_idx_vars.items() | ||
| } |
There was a problem hiding this comment.
I've made this to avoid any general performance regression for the pretty narrow-scope feature added here. However, self.keep_original_indexes may not be needed and we could write instead:
if self.reindex[key] is False:
new_idx = obj_idx.copy(deep=self.copy)
new_idx_vars = new_idx.create_variables(obj_idx_vars)
else:
new_idx = aligned_idx
new_idx_vars = {
k: v.copy(deep=self.copy) for k, v in aligned_idx_vars.items()
}(i.e., always copy the index of each original object when no-reindexing is required)
I'm not sure about the performance impact of obj_idx.copy(deep=self.copy). Actually, I'm not sure either about v.copy(deep=self.copy) below (what we do in the main branch) and if we even want that.
This needs further investigation / performance assessment. Maybe in a follow-up PR?
There was a problem hiding this comment.
IMO it's fine to address in a followup PR
Test a bit more than join="exact".
|
@shoyer any thoughts on this? |
whats-new.rstSupport checking exact alignment of indexes that use multiple dimensions in the cases where some of those dimensions are included in alignment while others are excluded.
Add the
exclude_dimskeyword argument toIndex.equals()(still support old signature withFutureWarning).Also fixed a bug: indexes associated with scalar coordinates were ignored during alignment.
TODO:
exclude_dimskwarg toCoordinateTransform.equals()as well