resolve contentious AMD components #548
I am reviewing this PR.
sbkathpe
left a comment
Yes, I believe this follows the spirit and intent of this env variable.
I would do a free(penv) though since this allocated space is no longer needed.
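A minimal sketch of the pattern this comment describes: tokenize a private copy of the variable's value, then release the copy once parsing is done. Only the name penv comes from the review comment; the helper component_is_disabled, its signature, and the choice to split on commas and spaces are assumptions for illustration, not the PR's actual code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the PR): returns 1 if `name` is one of the
 * entries in the comma-separated `list_value`, 0 otherwise. */
int component_is_disabled(const char *list_value, const char *name)
{
    if (list_value == NULL || name == NULL)
        return 0;

    /* strtok() modifies its input, so tokenize a private copy rather than
     * the string returned by getenv(). */
    size_t len = strlen(list_value);
    char *penv = malloc(len + 1);
    if (penv == NULL)
        return 0;
    memcpy(penv, list_value, len + 1);

    int found = 0;
    /* Assumption: entries are separated by commas, optionally with spaces. */
    for (char *tok = strtok(penv, ", "); tok != NULL; tok = strtok(NULL, ", ")) {
        if (strcmp(tok, name) == 0) {
            found = 1;
            break;
        }
    }

    /* The copy is no longer needed once parsing is finished. */
    free(penv);
    return found;
}
```

A caller could use it as component_is_disabled(getenv("PAPI_DISABLE_COMPONENTS"), "rocm"). Tokens are compared exactly, so disabling rocm_smi does not also disable rocm.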
sbkathpe
left a comment
Just to make sure, the "PAPI_DISABLE_COMPONENTS" code should NOT be compile-time dependent on "#if defined(DEFAULT_TO_ROCP_SDK)" but be unconditionally supported.
It is hard for me to tell if this condition is part of the final code or not.
The code that parses However, there is other code, which is contained within the
Example: Suppose the Summary:
This change introduces an environment variable to allow the user to disable components at runtime.

Example Usage:
export PAPI_DISABLE_COMPONENTS=rocm,rocm_smi

These changes have been tested using ROCm 7.0.2 on the Frontier supercomputer, which contains the AMD MI250X architecture.

Signed-off-by: Daniel Barry <dbarry@vols.utk.edu>
Allow the user to configure contentious component pairs (e.g., rocm & rocp_sdk, rocm_smi & amd_smi), but only allow one from each pair to be active at runtime. The ROCm version determines which components are active by default. This can be overridden by the PAPI_DISABLE_COMPONENTS environment variable. These changes have been tested using ROCm 7.0.2 on the Frontier supercomputer, which contains the AMD MI250X architecture.
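The selection rule in this commit message could be sketched roughly as follows. The struct, the function resolve_pair, and the exact precedence (an explicit disable always wins, and the version-based default only breaks ties) are assumptions made for illustration; the PR's real implementation may differ.

```c
#include <stddef.h>

/* Hypothetical description of one contentious pair,
 * e.g. {"rocm", "rocp_sdk"} or {"rocm_smi", "amd_smi"}. */
struct contentious_pair {
    const char *older;  /* component active by default on older ROCm */
    const char *newer;  /* component active by default on newer ROCm */
};

/* Returns the name of the single component left active for this pair,
 * or NULL if the user disabled both.
 *   prefer_newer   : version-based default (nonzero selects `newer`)
 *   older_disabled : nonzero if the user disabled `older` at runtime
 *   newer_disabled : nonzero if the user disabled `newer` at runtime */
const char *resolve_pair(const struct contentious_pair *pair,
                         int prefer_newer,
                         int older_disabled,
                         int newer_disabled)
{
    if (older_disabled && newer_disabled)
        return NULL;             /* neither component may run */
    if (newer_disabled)
        return pair->older;      /* user override beats the default */
    if (older_disabled)
        return pair->newer;
    /* No override: fall back to the version-based default. */
    return prefer_newer ? pair->newer : pair->older;
}
```

Under these assumptions, a system whose default is rocp_sdk would fall back to rocm when the user sets PAPI_DISABLE_COMPONENTS=rocp_sdk, and at most one member of each pair is ever returned.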
Pull Request Description
This pull request resolves issues #416 and #478.
ROCm version >= 6.3.2 was chosen as the "cutoff" for making rocp_sdk active by default over rocm due to known bugs in the ROCProfiler SDK in prior releases.

These changes have been tested using ROCm 7.0.2 on the Frontier supercomputer, which contains the AMD MI250X architecture.
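As an illustration of the cutoff check, a version comparison might look like the sketch below. The function rocm_version_at_least and the assumption that the version is available as a "major.minor.patch" string are hypothetical; how the PR actually obtains the ROCm version is not shown here.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if `version` (e.g. "7.0.2") is greater than
 * or equal to req_major.req_minor.req_patch, 0 otherwise. Missing minor or
 * patch fields are treated as 0; an unparsable string returns 0. */
int rocm_version_at_least(const char *version,
                          int req_major, int req_minor, int req_patch)
{
    int major = 0, minor = 0, patch = 0;

    if (version == NULL ||
        sscanf(version, "%d.%d.%d", &major, &minor, &patch) < 1)
        return 0;

    /* Compare field by field, most significant first. */
    if (major != req_major)
        return major > req_major;
    if (minor != req_minor)
        return minor > req_minor;
    return patch >= req_patch;
}
```

With the cutoff described above, rocm_version_at_least(version, 6, 3, 2) would select rocp_sdk as the default for the tested ROCm 7.0.2 and keep rocm as the default for a 6.3.1 installation.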
Author Checklist
Why this PR exists. Reference all relevant information, including background, issues, test failures, etc
Commits are self contained and only do one thing
Commits have a header of the form: module: short description
Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
The PR needs to pass all the tests