@yuej-jin
Contributor

@yuej-jin yuej-jin commented Dec 19, 2021

This PR fixes #5630
Starting with the Volta architecture, the warp shuffle intrinsics need the ".sync" suffix:
https://docs.nvidia.com/cuda/volta-tuning-guide/index.html
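
For illustration, here is a minimal sketch of the naming rule described above. It is not the code from this PR: `shuffle_mnemonic` is a hypothetical helper, and it assumes the compute capability is encoded as major*10 + minor (so 70 is Volta and 80 is Ampere). It only builds the PTX instruction mnemonic as a string.

```cpp
#include <string>

// Hypothetical helper for illustration only, not taken from the PR.
// Returns the PTX warp shuffle mnemonic for a shuffle mode
// ("idx", "up", "down" or "bfly").
// Assumption: capability is encoded as major*10 + minor,
// e.g. 61 = Pascal, 70 = Volta, 80 = Ampere.
std::string shuffle_mnemonic(const std::string &mode, int capability) {
    // From Volta onwards the ".sync" form is required. The ".sync" form
    // also takes an extra member mask operand, which this sketch does
    // not model.
    const std::string sync_suffix = (capability >= 70) ? ".sync" : "";
    return "shfl" + sync_suffix + "." + mode + ".b32";
}
```

With these assumptions, a target at capability 6.1 would get `shfl.down.b32`, while 7.0 and 8.0 would get `shfl.sync.down.b32`.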

@yuej-jin yuej-jin changed the title Support new warp shuffle intrinsics after CUDA volta architecture Support new warp shuffle intrinsics after CUDA Volta architecture Dec 19, 2021
@abadams
Member

abadams commented Dec 19, 2021

Thanks for this. I just had one very minor comment.

@yuej-jin
Contributor Author

yuej-jin commented Dec 21, 2021

I think the one buildbot failure is not related to this PR. Could you help take a look? @abadams @jrk

@yuej-jin yuej-jin requested a review from abadams December 21, 2021 02:07
@yuej-jin yuej-jin requested a review from abadams December 21, 2021 11:10
@yuej-jin
Contributor Author

Hi @abadams, the code is updated. Could you take a look? (BTW, I think the one buildbot failure is not related to this PR.)

@abadams
Member

abadams commented Dec 23, 2021

Thanks for the PR! Merging.

@abadams abadams merged commit 1d1f06a into halide:master Dec 23, 2021
Development

Successfully merging this pull request may close these issues.

warp shuffle intrinsics no longer work with cuda compute capability 8.0

2 participants