From Gentoo Wiki

rocprofiler is the profiler utility for GPGPU programs written in HIP.


dev-util/rocprofiler pulls in the HIP toolchain (provided by dev-util/hip) along with profiling/tracing libraries, headers, and scripts.


Install dev-util/rocprofiler:

root #emerge --ask dev-util/rocprofiler


Detailed usage can be found in the rocprof documentation.


To simply profile GPU kernels in a program, run the following command:

user $rocprof <program-to-be-profiled>

This generates a results.csv file containing the execution duration of each kernel.
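
To rank kernels by total time, results.csv can be post-processed with a short script. The following is a minimal sketch and not part of rocprof. It assumes the file has a kernel-name column and a duration column, called KernelName and DurationNs here; both names are assumptions and may differ between rocprofiler versions, so check the header line of your own file first.

```python
import csv
import io
from collections import defaultdict


def total_duration_by_kernel(csv_file, name_col="KernelName", dur_col="DurationNs"):
    """Sum the duration of every dispatch per kernel name.

    name_col and dur_col are assumed column names; adjust them to match
    the header of the results.csv produced by your rocprof version.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row[name_col]] += int(row[dur_col])
    # Longest-running kernels first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Synthetic sample standing in for a real results.csv.
    sample = io.StringIO(
        "Index,KernelName,DurationNs\n"
        "0,vector_add,1200\n"
        "1,matmul,5000\n"
        "2,vector_add,800\n"
    )
    for name, total_ns in total_duration_by_kernel(sample):
        print(f"{name}: {total_ns} ns")
```

For a real run, pass open("results.csv", newline="") instead of the synthetic sample.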

To simply trace a program, collecting HIP/HSA API calls and kernel execution details, run the following command:

user $rocprof --sys-trace <program-to-be-profiled>

In addition to results.csv, this generates results.json, which is a standard trace file.
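
The trace can also be inspected without a viewer. The sketch below counts how often each event name occurs. It assumes results.json follows the Chrome Trace Event Format (the format chrome://tracing loads), that is, either a bare list of events or an object with a "traceEvents" list. The event names in the sample are synthetic and only illustrate the shape of the data.

```python
import json
from collections import Counter


def count_events(trace):
    """Count trace events by name.

    Accepts either layout of the Trace Event Format: a bare list of
    events, or an object with a "traceEvents" list.
    """
    events = trace["traceEvents"] if isinstance(trace, dict) else trace
    # Skip entries that are not objects or that carry no name.
    return Counter(e["name"] for e in events if isinstance(e, dict) and "name" in e)


if __name__ == "__main__":
    # Synthetic sample standing in for a real results.json.
    sample = json.loads(
        '{"traceEvents": ['
        '{"name": "hipMemcpy", "ph": "X", "ts": 0, "dur": 12},'
        '{"name": "hipLaunchKernel", "ph": "X", "ts": 20, "dur": 3},'
        '{"name": "hipMemcpy", "ph": "X", "ts": 30, "dur": 9}]}'
    )
    for name, calls in count_events(sample).most_common():
        print(name, calls)
```

For a real trace, load the file with json.load() and pass the result to count_events().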

Full command line arguments can be viewed by running rocprof --help.

Viewing results

Various viewers are available, including Chromium and Perfetto.

Loading the trace file in chrome://tracing is likely the simplest method, and it does not require a network connection.

Note that Debian has stripped the tracing feature [1] from its chromium package.

Performance Counters

Gentoo has stripped out the proprietary AQL profiler library. AQL may refer to architected queuing language or asynchronous queuing language. Use the following workaround only if you're willing to accept AMD's proprietary EULA at /opt/rocm-5.5.0/share/doc/hsa-amd-aqlprofile after extracting the .deb package.

Accessing performance counters requires the use of the proprietary AQL profiler library, which has been stripped from Gentoo. Thus, rocprofiler will crash when one attempts to list or record performance counters.

user $rocprof --list-basic
RPL: on '230607_034945' from '/usr' in '/root':
Basic HW counters:
/usr/bin/rocprof: line 389:   574 Segmentation fault      (core dumped) /usr/bin/rocprof-ctrl

To restore performance counters, edit dev-libs/rocr-runtime and remove "${FILESDIR}/${PN}-4.3.0_no-aqlprofiler.patch", then edit dev-util/rocprofiler and remove "${FILESDIR}/${PN}-4.3.0-no-aqlprofile.patch" and "${FILESDIR}/${PN}-5.3.3-remove-aql-in-cmake.patch". Next, obtain a copy of the proprietary hsa-amd-aqlprofile library by extracting the apt (.deb) package. Finally, copy the library to /usr/lib64/ and create a symlink to the versioned library.

According to Gentoo developer Marek Szuba:

For the record, this profiler has long since been deprecated in favour of RCP. Between that and it being proprietary, I would very much advise against adding it to the tree. And yes, candrews and I will eventually get to packaging RCP for Gentoo :-)

While this is generally true, if raw performance counters are needed, there is no alternative to using rocprofiler together with hsa-amd-aqlprofile. Thus, the writer of this Gentoo Wiki article believes that the long-term solution is to make the proprietary library opt-in through a USE flag.

Legacy GPUs

Ebuilds for historical hsa-amd-aqlprofile versions can be found in a 3rd-party overlay. Polaris (gfx803), for example, is supported by hsa-amd-aqlprofile-4.3.0.

These older versions matter because older graphics cards are not supported by recent hsa-amd-aqlprofile releases. Due to the library's proprietary nature, it is not possible to patch the source to re-enable support (unlike how pre-Vega hardware is re-enabled in dev-libs/rocm-opencl-runtime), but it is possible to install an older version of the library. On an unsupported GPU, listing counters fails with an error such as:

user $rocprof --list-basic
RPL: on '230607_065133' from '/usr' in '/usr/local/lib'
Basic HW counters:
ERROR: rocprofiler_iterate_info(), Translate(), ImportMetrics: bad block name 'GRBM', GFXIP is not supported(gfx803)
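
When scripting around rocprof, the two failure outputs shown in this article can be told apart by simple text matching. The following is a rough sketch; the function name diagnose and the returned messages are hypothetical, and the patterns are taken only from the outputs quoted here, so other versions may word their errors differently.

```python
import re


def diagnose(output):
    """Map rocprof output to a likely cause, based on the failures quoted in this article."""
    # Example: "... GFXIP is not supported(gfx803)"
    match = re.search(r"GFXIP is not supported\((\w+)\)", output)
    if match:
        return f"installed hsa-amd-aqlprofile does not support {match.group(1)}"
    if "Segmentation fault" in output:
        return "crash, possibly because hsa-amd-aqlprofile is missing"
    return "no known failure pattern found"


if __name__ == "__main__":
    line = ("ERROR: rocprofiler_iterate_info(), Translate(), ImportMetrics: "
            "bad block name 'GRBM', GFXIP is not supported(gfx803)")
    print(diagnose(line))
```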

External resources