Rocprofiler

From Gentoo Wiki
Jump to:navigation Jump to:search
This article is a stub. Please help out by expanding it - how to get started.
Resources

rocprofiler is the profiler utility for GPGPU programs written in HIP.

Installation

dev-util/rocprofiler pulls HIP toolchain (provided by dev-util/hip) along with profiling/tracing libraries, headers, and scripts.

Emerge

Install dev-util/rocprofiler:

root #emerge --ask dev-util/rocprofiler

Usage

Detailed usage can be found in rocprof document.

Profiling

To simply profile GPU kernels in a program, run the following command:

user $rocprof <program-to-be-profiled>

This will generate a results.csv containing execution duration of kernels.

To simply trace a program, collecting HIP/HSA API calls and kernel execution details, run the following command:

user $rocprof --sys-trace <program-to-be-profiled>

This will further generates results.json, which is a standard trace file.

Full command line arguments can be viewed by running rocprof --help.

Viewing results

There are various viewers, including Chromium and perfetto.

Using chrome://tracing to load and view trace files should be the simplest method, and does not require network.

Note that Debian has stripped the tracing [1] of its chromium package.

Performance Counters

Warning
Gentoo has stripped out the proprietary AQL profiler library. AQL may refer to architected queuing language or asynchronous queuing language. Use the following workaround only if you're willing to accept AMD's proprietary EULA at /opt/rocm-5.5.0/share/doc/hsa-amd-aqlprofile after extracting the .deb package.

Accessing performance counters requires the use of the proprietary AQL profiler library, which has been stripped from Gentoo. Thus, rocprofiler will crash when one attempts to list or record performance counters.

user $rocprof --list-basic
RPL: on '230607_034945' from '/usr' in '/root':
Basic HW counters:
/usr/bin/rocprof: line 389:   574 Segmentation fault      (core dumped) /usr/bin/rocprof-ctrl

To restore performance counters, edit dev-libs/rocr-runtime and remove "${FILESDIR}/${PN}-4.3.0_no-aqlprofiler.patch", then edit dev-util/rocprofiler and remove "${FILESDIR}/${PN}-4.3.0-no-aqlprofile.patch" and "${FILESDIR}/${PN}-5.3.3-remove-aql-in-cmake.patch. Finally, obtain a copy of libhsa-amd-aqlprofile64.so (as of writing, it's libhsa-amd-aqlprofile64.so.1.0.50500) from https://repo.radeon.com/rocm/apt/debian/pool/main/h/hsa-amd-aqlprofile/ by extracting the apt package. Finally, copy libhsa-amd-aqlprofile64.so.1.0.50500 to /usr/lib64/ and create a symlink libhsa-amd-aqlprofile64.so to the versioned library.

According to Gentoo developer Marek Szuba:

For the record, this profiler has long since been deprecated in favour of RCP (https://github.com/GPUOpen-Tools/radeon_compute_profiler). Between that and it being proprietary, I would very much advise against adding it to the tree. And yes, candrews and I will eventually get to packaging RCP for Gentoo :-)

While it's generally true, if raw performance counters are needed, there's no alternative way than using rocprofile and hsa-amd-aqlprofile. Thus, the writer of this Gentoo Wiki article believes that the long-term solution is making proprietary library an opt-in option by a USE flag.

Legacy GPUs

Note
One can find ebuilds for historical hsa-amd-aqlprofile versions at this 3rd-party overlay: https://github.com/justxi/rocm/tree/master/media-libs/hsa-amd-aqlprofile. For Polaris (gfx803), it's supported by hsa-amd-aqlprofile-4.3.0.

Furthermore, older graphics cards are not supported by hsa-amd-aqlprofile. Due to its proprietary nature, it's not possible to patch the source to re-enable support (unlike how pre-VEGA hardware is re-enabled in dev-libs/rocm-opencl-runtime). But it's possible to install older versions of the library.

user $rocprof --list-basic
RPL: on '230607_065133' from '/usr' in '/usr/local/lib'
Basic HW counters:
ERROR: rocprofiler_iterate_info(), Translate(), ImportMetrics: bad block name 'GRBM', GFXIP is not supported(gfx803)

External resources

References