Skip to content

Conversation

@abhikran-quic
Copy link
Contributor

  • While introducing dma operations at graph level, relax KillAfterLastUse pass introduces kill_tensor operation after dma_copy. This leads to memory being deallocated when asynchronous copy operation is in progress. Hence, moving the input/output buffers to dma_wait to ensure kill_tensor is introduced after dma_wait at the graph level.
  • Also, the logic for size calculation is updated to use GetDataSize function.
  • The test case is updated to use offsets instead of allocating different storage in VTCM.

While introducing dma operations at graph level, relax KillAfterLastUse pass introduces kill_tensor operation after dma_copy.
This leads to memory being deallocated when asynchronous copy operation is in progress.
Hence, moving the input/output buffers to dma_wait to ensure kill_tensor is
introduced after dma_wait at the graph level.
Also, the logic for size calculation is updated to use GetDataSize function.
The test case is updated to use offsets instead of allocating different storage in VTCM.
@abhikran-quic
Copy link
Contributor Author

cc: @Hzfengsy @quic-sanirudh

Copy link
Contributor

@quic-sanirudh quic-sanirudh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, apart from the minor comment for unused argument. Thanks.

Comment on lines 55 to 56
.set_body_typed([](TVMArgValue vm_ptr, int queue_id, int inflight_dma, NDArray src_arr,
NDArray dst_arr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For src_arr and dst_arr, perhaps we should add the [[maybe_unused]].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @quic-sanirudh ! I have fixed this in the latest patch.

@Hzfengsy Hzfengsy merged commit 94866f7 into apache:main Mar 15, 2024
@abhikran-quic abhikran-quic deleted the abhikran/dma_builtin_fix branch March 15, 2024 08:41
thaisacs pushed a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024
* [VM] [Hexagon] Add buffers to dma_wait builtin
While introducing dma operations at graph level, relax KillAfterLastUse pass introduces kill_tensor operation after dma_copy.
This leads to memory being deallocated when asynchronous copy operation is in progress.
Hence, moving the input/output buffers to dma_wait to ensure kill_tensor is
introduced after dma_wait at the graph level.
Also, the logic for size calculation is updated to use GetDataSize function.
The test case is updated to use offsets instead of allocating different storage in VTCM.

* Fix review comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants