PAPI Component ROCM
The ROCM component exposes numerous performance events on AMD GPUs. The component is an adapter to the ROCm profiling library (ROC-profiler), which is included in a standard ROCm release.
To read ROCM events, the user must link against a PAPI library that was configured with the ROCM component enabled. For example, the command ./configure --with-components="rocm" is sufficient to enable the component.
The utility papi_components_avail (available in papi/src/utils/papi_components_avail) displays the components available to the user, whether each one is disabled, and, if so, the reason it is disabled.
For ROCM, PAPI requires one environment variable: PAPI_ROCM_ROOT.
Typically on Linux one would export this variable (examples are shown below), but some systems use software to manage environment variables (such as modules or spack), so consult your sysadmin if you have such management software.
Besides the PAPI_ROCM_ROOT environment variable, four more environment variables are required at runtime. The component is just an interface to an AMD utility called rocprofiler, and rocprofiler uses these variables in its operation.
These additional environment variables are typically set as follows, after PAPI_ROCM_ROOT has been exported. The example below sets PAPI_ROCM_ROOT to a typical standard value:
export PAPI_ROCM_ROOT=/opt/rocm
export ROCP_METRICS=$PAPI_ROCM_ROOT/rocprofiler/lib/metrics.xml
export ROCPROFILER_LOG=1
export HSA_VEN_AMD_AQLPROFILE_LOG=1
export AQLPROFILE_READ_API=1
The first of these, ROCP_METRICS, must point at a file containing the descriptions of metrics; the standard location is shown above. The final three are fixed settings.
On a standard installation, these are the only environment variables that need to be set, at both compile time and runtime.
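Before running an application, it can help to confirm that all five variables are present. The following is a minimal sketch: the function name check_rocm_env is hypothetical and not part of PAPI, and it only tests that each variable is non-empty, not that its value is correct.

```shell
# Hypothetical helper (not part of PAPI): report any of the five
# variables named above that are unset or empty.
check_rocm_env() {
    missing=0
    for name in PAPI_ROCM_ROOT ROCP_METRICS ROCPROFILER_LOG \
                HSA_VEN_AMD_AQLPROFILE_LOG AQLPROFILE_READ_API; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    # Returns 0 when everything is set, 1 otherwise.
    return "$missing"
}
```

A possible use is check_rocm_env || echo "export the variables listed above first".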
Within PAPI_ROCM_ROOT, we expect the following standard directories:
PAPI_ROCM_ROOT/include
PAPI_ROCM_ROOT/include/hsa
PAPI_ROCM_ROOT/lib
PAPI_ROCM_ROOT/rocprofiler/lib
PAPI_ROCM_ROOT/rocprofiler/include
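The directory layout above can be checked with a short script. This is only a sketch under the assumption that the five directories listed are the ones that matter; check_rocm_dirs is a hypothetical name, not a PAPI utility.

```shell
# Hypothetical helper (not part of PAPI): list which of the expected
# subdirectories are missing under a given ROCm root directory.
check_rocm_dirs() {
    root=$1
    rc=0
    for sub in include include/hsa lib rocprofiler/lib rocprofiler/include; do
        if [ ! -d "$root/$sub" ]; then
            echo "missing directory: $root/$sub"
            rc=1
        fi
    done
    # Returns 0 when all five directories exist, 1 otherwise.
    return "$rc"
}
```

It could be called as check_rocm_dirs "$PAPI_ROCM_ROOT" once that variable has been exported.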
For the ROCM component to be operational, it must find the dynamic libraries libhsa-runtime64.so and librocprofiler64.so. These are normally found in the standard directories above, or in one of the Linux default directories listed by /etc/ld.so.conf, usually /usr/lib64, /lib64, /usr/lib and /lib. If these libraries are not found (or are not functional), the component is listed as "disabled" with a reason explaining the problem. A "not found" reason means the libraries are not in any of these expected places.
The system also searches the directories listed in LD_LIBRARY_PATH, which are separated by colons (:). This variable can be set using export, e.g.:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/WhereALibraryCanBeFound
Be careful to retain the existing value (the $LD_LIBRARY_PATH part), because it may contain paths needed by other packages. The current value can be viewed with echo $LD_LIBRARY_PATH.
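When LD_LIBRARY_PATH starts out empty, the plain export form leaves a leading colon in the value, and repeated use adds duplicate entries. The sketch below shows one way to avoid both; append_ld_path is a hypothetical helper name, and the paths in the usage note are only examples.

```shell
# Hypothetical helper: append a directory to LD_LIBRARY_PATH, keeping
# the existing entries and avoiding a leading colon or a duplicate.
append_ld_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;                               # already listed: no change
        "::") LD_LIBRARY_PATH=$dir ;;                # empty or unset: no colon
        *) LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$dir ;;  # keep existing entries
    esac
    export LD_LIBRARY_PATH
}
```

For instance, append_ld_path /WhereALibraryCanBeFound has the same effect as the export shown above when the variable already holds other paths.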
- Only sets of metrics and events that can be gathered in a single pass are supported.
- Although AMD metrics may be floating point, all values are recast and returned as long long integers. The binary image of a double is intact, but users must reinterpret the returned bits as a double (rather than convert the integer numerically) for display purposes.