Skip to content

Conversation

@noelchalmers
Copy link
Contributor

@noelchalmers noelchalmers commented Jun 21, 2020

Description

This PR implements a new API, hostMalloc which lets users malloc occa::memory regions in pinned host memory, if the backend device supports this, i.e. in CUDA, HIP, and somewhat OpenCL.

Edit: After consideration, we don't need to introduce a separate device::hostMalloc API for a functionality that is only present on some backends. We'll instead trigger the hostMalloc behavior by passing a "{'host': true}" prop in the device::malloc.

This PR also somewhat changes how the device/pinned host pointers are managed in backends in order to let users obtain the correct underlying pointer via the memory::ptr() method automatically, rather than having to pass a particular occa::properties setting. Users can also directly pass these hostMalloc'd buffers into kernels since GPU devices can directly read & write pinned host memory regions*. Finally, the correct copyTo/copyFrom behaviors are determined automatically based on the location of the underlying pointers.

Some examples of the resulting API change in user applications can be seen in this draft PR: paranumal/libparanumal#28

*The OpenCL spec unfortunately does not explicitly guarantee that GPU kernels will directly access pinned host memory. For now, users will continue to have to explicitly copy back the contents of hostMalloc'd buffers in order to guarantee coherency. Potentially, OCCA could also expose some kind of sync wherein a pinned host buffer would be re-mapped.

P.S. I have no idea if there is a similar functionality in Metal. If someone wants to take a look at the corresponding tweaks in the Metal backend, that would be greatly appreciated.

@codecov
Copy link

codecov bot commented Jun 21, 2020

Codecov Report

Merging #376 (21534c4) into main (83194cc) will decrease coverage by 0.13%.
The diff coverage is 80.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #376      +/-   ##
==========================================
- Coverage   75.29%   75.15%   -0.14%     
==========================================
  Files         254      254              
  Lines       19438    19416      -22     
==========================================
- Hits        14635    14593      -42     
- Misses       4803     4823      +20     
Impacted Files Coverage Δ
include/occa/core/memory.hpp 100.00% <ø> (ø)
include/occa/core/memory.tpp 100.00% <ø> (ø)
src/core/memory.cpp 71.06% <50.00%> (+0.69%) ⬆️
src/c/memory.cpp 100.00% <100.00%> (ø)
src/occa/internal/core/memory.cpp 100.00% <100.00%> (ø)
include/occa/types/json.tpp 41.66% <0.00%> (-28.34%) ⬇️
include/occa/types/json.hpp 66.50% <0.00%> (-10.68%) ⬇️
src/occa/internal/lang/specialMacros.cpp 58.62% <0.00%> (-0.87%) ⬇️
src/occa/internal/utils/sys.cpp 71.88% <0.00%> (-0.13%) ⬇️
include/occa/core/base.tpp 100.00% <0.00%> (ø)

@dmed256
Copy link
Member

dmed256 commented Jun 24, 2020

Thanks for adding this! I'll take a look at it this weekend, it's a busy week sorry 😞

@dmed256 dmed256 changed the base branch from master to main July 12, 2020 22:16
Copy link
Member

@dmed256 dmed256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you so much for making these changes! I made a few styling review suggestions so I could batch commit them before merging.

@dmed256 dmed256 merged commit 6d31efd into libocca:main Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants