@Technici4n

Hi, this PR is the result of trying to profile some Julia code on LUMI (default ROCm version 6.0.something, there is a build of 6.2.4 available, planned upgrade to 6.3 soon), and eventually succeeding after multiple days. Please let me know if any information is wrong and I will happily correct it. It would also be helpful if others with more experience could run the various commands on other versions of ROCm and see if the situation is different there.

Here is a test script that I used:

using AMDGPU

function rangePush(message)
    @ccall "libroctx64".roctxRangePushA(message::Ptr{Cchar})::Cint
end

function rangePop()
    @ccall "libroctx64".roctxRangePop()::Cint
end

N = 10000
mat = ROCArray(randn(N, N))
vec = ROCArray(randn(N))

tot = 0.0
for i in 1:10
    rangePush("Iteration $i")
    global tot += sum(mat .* vec)  # `global` is required when this runs as a script (top-level loop scope)
    rangePop()
end
println(tot)
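One thing the script does not handle is an exception between the push and the pop, which would leave the range open. A minimal sketch of a wrapper that always pops, assuming the `rangePush` and `rangePop` functions defined above; `with_range` is a hypothetical name, not something provided by AMDGPU.jl or ROCTX.jl:

```julia
# Hypothetical helper (not part of AMDGPU.jl or ROCTX.jl).
# Runs `f` inside a named ROCTX range and always pops the range,
# even if `f` throws. `push` and `pop` default to the ccall wrappers
# from the script above; they are keyword arguments only so that the
# helper can be exercised on a machine without libroctx64.
function with_range(f, message; push = rangePush, pop = rangePop)
    push(message)
    try
        return f()
    finally
        pop()
    end
end

# Possible usage, mirroring the loop in the script above:
#
# for i in 1:10
#     global tot += with_range("Iteration $i") do
#         sum(mat .* vec)
#     end
# end
```

Taking the function as the first argument is what makes the `do`-block form work; whether such a helper belongs in the docs or in a package is a separate question.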

PS: Given the existence of #801, I suppose that rocprofv3 is not expected to be working yet?

@luraess
Member

luraess commented Dec 5, 2025

Thanks for the update! I will try it out on the CI machines and may add some information about profiling with MPI as well.

WRT ROCTX, given that only these 2 functions seem to work for now, I wonder whether it would make sense to have them exposed by AMDGPU instead of relying on ROCTX.jl just for that?

@Technici4n
Author

given that only these 2 functions seem to work for now

Have you tried the others? I haven't myself as range push and pop were sufficient for my needs :)

whether it would make sense to have them exposed by AMDGPU

I don't know why NVTX is a separate package from CUDA; maybe the same reasoning applies here. Possibly NVTX is a very light dependency (and a hard one if the code is annotated), whereas package authors prefer to keep CUDA support in an ext module?


[rocprofv2](https://github.com/ROCm/rocprofiler?tab=readme-ov-file#rocprofiler-v2)
allows profiling both HSA & HIP API calls (rocprof being deprecated).
[rocprof](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocprofiler)
Member

Suggested change
[rocprof](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocprofiler)
[rocprof](https://rocm.docs.amd.com/projects/rocprofiler/en/latest/index.html)

Member

Links to the doc

allows profiling both HSA & HIP API calls (rocprof being deprecated).
[rocprof](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocprofiler)
allows profiling HSA & HIP API calls, kernel launches, and more...
Multiple major versions are available: `rocprof`, `rocprofv2` and `rocprofv3`.
Member

It may be worth stating upfront here that rocprofv3 should now be used, since the other versions are deprecated (unless one is on a system where it may not work).

While [ROCTX.jl](https://github.com/JuliaGPU/ROCTX.jl) aims to offer a Julia wrapper around it,
it does not seem to be working yet. PRs welcome!
(Note: the `ccall`s above do _not_ require ROCTX.jl to be loaded!)

Member

Thanks for this addition!

@luraess
Member

luraess commented Jan 6, 2026

I checked the various profiling tool versions on the CI machine: with ROCm 6.4, all of them seem to work and report basic tracing info.

Maybe, before merging, one could further simplify and streamline the doc, focusing on v3 while still showing how to launch profiling with v2 and v1. Also, one could consistently add -o prof (or another output name) for each version. Some duplicated links to the Perfetto UI could also be removed.

After this, I would be happy to merge - thanks!
