GH-38325: [Python] Implement PyCapsule interface for Device data in PyArrow #40717
Conversation
```python
if requested_schema is not None:
    target_type = DataType._import_from_c_capsule(requested_schema)

    if target_type != self.type:
        # TODO should protect from trying to cast non-CPU data
        try:
            casted_array = _pc().cast(self, target_type, safe=True)
            inner_array = pyarrow_unwrap_array(casted_array)
        except ArrowInvalid as e:
            raise ValueError(
                f"Could not cast {self.type} to requested type {target_type}: {e}"
            )
    else:
        inner_array = self.sp_array
else:
    inner_array = self.sp_array
```
This part is a bit repetitive with the non-device version. I could factor that out into a shared helper function
paleolimbot left a comment:
I had a look through here and it is looking great! I don't see anything out of place (but I am also only a little bit familiar with the existing code). I don't personally mind the repeated cast bit (as long as the repetition is tested, which it looks like it is).
Just for this PR I prototyped something similar in nanoarrow ( apache/arrow-nanoarrow#409 ), and the only difference I see is that the device_id for the CPU is -1. I can't find in the spec exactly what it should be...is -1 the accepted value?
```python
import pyarrow as pa

# ! pip install "https://github.com/paleolimbot/arrow-nanoarrow/archive/414bbc44d3e84ecac2807713438d6988ff4d5245.zip#egg=nanoarrow&subdirectory=python"
import nanoarrow as na
from nanoarrow import device

# Wrapper to prevent c_device_array() from falling back on __arrow_c_array__()
class DeviceArrayWrapper:
    def __init__(self, obj):
        self.obj = obj

    def __arrow_c_device_array__(self, requested_schema=None):
        return self.obj.__arrow_c_device_array__(requested_schema=requested_schema)

pa_array = pa.array([1, 2, 3])
device.c_device_array(DeviceArrayWrapper(pa_array))
#> <nanoarrow.device.CDeviceArray>
#> - device_type: 1
#> - device_id: -1
#> - array: <nanoarrow.c_lib.CArray int64>
#>   - length: 3
#>   - offset: 0
#>   - null_count: 0
#>   - buffers: (0, 2199023452480)
#>   - dictionary: NULL
#>   - children[0]:
```
Thanks for testing!
That's a good point. In practice this comes from our implementation in C++ (Lines 82 to 86 in 434f872), but we should probably clarify in the spec whether that's allowed / expected. I see that DLPack actually specifies it should be 0 for CPU: https://dmlc.github.io/dlpack/latest/c_api.html#c.DLDevice.device_id
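For comparison, the DLPack convention can be observed from NumPy, whose `__dlpack_device__` method reports CPU arrays with device type 1 (`kDLCPU`) and device id 0:

```python
import numpy as np

# __dlpack_device__ returns a (device_type, device_id) tuple;
# DLPack specifies device_id 0 for CPU, not -1
device_type, device_id = np.arange(3).__dlpack_device__()
print(device_type, device_id)
```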
…yView, and the CArray (#409)

When device support was first added, the `CArrayView` was device-aware but the `CArray` was not. This worked well until it was clear that `__arrow_c_array__` needed to error if it did not represent a CPU array (and the `CArray` had no way to check). Now, the `CArray` has a `device_type` and `device_id`. A nice side-effect of this is that we get back the `view()` method (whose removal @jorisvandenbossche had lamented!).

This also implements the device array protocol to help test apache/arrow#40717. This protocol isn't finalized yet and I could remove that part until it is (although it doesn't seem likely to change). The non-cpu case is still hard to test without real-world CUDA support... this PR is just trying to get the right information in the right place as early as possible.

```python
import nanoarrow as na

array = na.c_array([1, 2, 3], na.int32())
array.device_type, array.device_id
#> (1, 0)
```

Co-authored-by: Dane Pitkin <48041712+danepitkin@users.noreply.github.com>
I updated this PR to include the … I think in the meantime, we also have stream support for the device interface, so that could be added as well (although given this is already quite big, I can also do that in a separate follow-up PR). I should also still explicitly test this on CUDA.
python/pyarrow/array.pxi (outdated)

```python
target_type = DataType._import_from_c_capsule(requested_schema)

if target_type != self.type:
    # TODO should protect from trying to cast non-CPU data
```
Is this check easy to do? (If the failure mode is a crash maybe this would be good to do?)
Yes, we actually expose a device_type on Array and RecordBatch, so we can easily validate this and raise an informative error when trying to cast to requested_schema for non-CPU data.
Thanks for the review! Going to merge this then, and open follow-up issues for expanding to the stream interface and CUDA tests.
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 1815a67. There was 1 benchmark result indicating a performance regression:
The full Conbench report has more details. It also includes information about 42 possible false positives for unstable benchmarks that are known to sometimes produce them.
Rationale for this change
PyArrow implementation for the specification additions being proposed in #40708
What changes are included in this PR?
New `__arrow_c_device_array__` method to `pyarrow.Array` and `pyarrow.RecordBatch`, and support in the `pyarrow.array(..)`, `pyarrow.record_batch(..)` and `pyarrow.table(..)` functions to consume objects that have those methods.

Are these changes tested?
Yes (for CPU only for now, #40385 is a prerequisite to test this for CUDA)