rocprofiler is the profiler utility for GPGPU programs written in HIP.
dev-util/rocprofiler pulls in the HIP toolchain (provided by dev-util/hip) along with profiling/tracing libraries, headers, and scripts.
emerge --ask dev-util/rocprofiler
Detailed usage can be found in the rocprof documentation.
To simply profile GPU kernels in a program, run the following command:
rocprof --stats <program-to-be-profiled>
This will generate results.csv, which contains the execution duration of each kernel.
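As a rough illustration of post-processing results.csv, the following Python sketch sums the duration per kernel. The column names KernelName and DurationNs are assumptions, as is the made-up sample data; check the header row of your own results.csv and pass the matching names.

```python
import csv
import io
from collections import defaultdict


def summarize_kernels(csv_file, name_col="KernelName", duration_col="DurationNs"):
    """Return {kernel name: (calls, total duration)} from a rocprof-style CSV.

    The default column names are assumptions; adjust them to your file's header.
    """
    totals = defaultdict(lambda: [0, 0])
    for row in csv.DictReader(csv_file):
        entry = totals[row[name_col]]
        entry[0] += 1                       # number of launches
        entry[1] += int(row[duration_col])  # accumulated duration
    return {name: (calls, total) for name, (calls, total) in totals.items()}


def print_summary(summary):
    """Print kernels sorted by total duration, longest first."""
    ordered = sorted(summary.items(), key=lambda kv: kv[1][1], reverse=True)
    for name, (calls, total) in ordered:
        print(f"{name}: {calls} call(s), {total} ns total")


# Made-up sample data standing in for a real results.csv.
SAMPLE = """KernelName,DurationNs
vector_add,1200
matmul,5300
vector_add,1100
"""

if __name__ == "__main__":
    print_summary(summarize_kernels(io.StringIO(SAMPLE)))
```

For a real file, open it with open("results.csv", newline="") and pass the file object instead of the sample. The csv module is used rather than splitting on commas because kernel names may be quoted and contain commas themselves.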
To simply trace a program, collecting HIP/HSA API calls and kernel execution details, run the following command:
rocprof --sys-trace <program-to-be-profiled>
This will additionally generate results.json, a trace file that can be loaded by the viewers listed below.
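Besides graphical viewers, results.json can be inspected with a few lines of Python. The sketch below assumes the file follows the Chrome Trace Event format, that is, a "traceEvents" list (or a bare list) whose complete events carry "ph": "X" and a "dur" field in microseconds. The sample events are made up for illustration and do not come from a real rocprof run.

```python
import json
from collections import Counter


def load_events(text):
    """Parse trace JSON; accept either {"traceEvents": [...]} or a bare list."""
    data = json.loads(text)
    return data["traceEvents"] if isinstance(data, dict) else data


def total_duration_by_name(events):
    """Sum 'dur' (microseconds) of complete events (ph == "X") per event name."""
    totals = Counter()
    for event in events:
        # Metadata and other event types have no duration and are skipped.
        if event.get("ph") == "X" and "dur" in event:
            totals[event.get("name", "<unnamed>")] += event["dur"]
    return totals


# Made-up sample standing in for a real results.json.
SAMPLE = """{"traceEvents": [
  {"name": "hipMemcpy", "ph": "X", "ts": 10, "dur": 40, "pid": 1, "tid": 1},
  {"name": "vector_add", "ph": "X", "ts": 60, "dur": 5, "pid": 2, "tid": 1},
  {"name": "hipMemcpy", "ph": "X", "ts": 70, "dur": 30, "pid": 1, "tid": 1},
  {"name": "process_name", "ph": "M", "pid": 1, "args": {"name": "HIP API"}}
]}"""

if __name__ == "__main__":
    for name, dur in total_duration_by_name(load_events(SAMPLE)).most_common():
        print(f"{name}: {dur} us")
```

To use it on an actual trace, read the file contents with open("results.json").read() and pass them to load_events.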
Gentoo has stripped out the proprietary AQL profiler library. AQL may refer to architected queuing language or asynchronous queuing language.
Full command line arguments can be viewed by running:
rocprof -h
Various viewers can open the trace file, including Chromium and Perfetto.
Opening chrome://tracing in Chromium to load and view trace files is probably the simplest method, and it does not require network access.
Note that Debian has stripped tracing from its chromium package.
- AMD ROCProfiler User Guide v5.1 – The official user guide