amd_smi: String description for each device for string meta data events #550
I am reviewing this PR.
int can_collapse = 1;
int dev = device_next(event->device_map, first);
for (; dev >= 0; dev = device_next(event->device_map, dev)) {
    const char *line = fallback;
@Treece-Burgess Right now, if a per-device description isn't present, we just fall back to event->descr (best effort), which feels like reasonable behavior for this description-formatting feature. I'm thinking we keep line = fallback and add a SUBDBG for the case where device_map includes a device but pd->descrs[dev] is NULL, so we can catch that inconsistency while debugging. What do you think?
Yes, that sounds good to me!
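The fallback agreed on above could look roughly like the following minimal sketch. It is not the PR's code: `per_device_descr_t` and `pick_description` are hypothetical names invented for illustration, and the `SUBDBG` definition is only a stand-in so the snippet compiles outside PAPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in so the sketch compiles outside PAPI (assumption);
 * inside PAPI the real SUBDBG debug macro would be used instead. */
#ifndef SUBDBG
#define SUBDBG(...) fprintf(stderr, __VA_ARGS__)
#endif

/* Hypothetical container: descrs[dev] holds the description for
 * device `dev`, or NULL if none was captured at initialization. */
typedef struct {
    const char **descrs;
    int num_devices;
} per_device_descr_t;

/* Return the per-device description when one exists; otherwise log
 * the inconsistency for debugging and return the generic fallback. */
const char *pick_description(const per_device_descr_t *pd, int dev,
                             const char *fallback)
{
    assert(fallback != NULL);
    if (pd != NULL && pd->descrs != NULL &&
        dev >= 0 && dev < pd->num_devices && pd->descrs[dev] != NULL)
        return pd->descrs[dev];

    SUBDBG("device %d is in device_map but has no per-device description; "
           "using fallback\n", dev);
    return fallback;
}
```

Keeping the fallback as the return value means the listing still shows a description for every device, while the debug message surfaces the mismatch between the device map and the stored strings.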
5ae910f to e774181
Treece-Burgess left a comment
I tested these changes on Odyssey at Oregon with ROCm 7.2.0 pre-release; the results are below.
- PAPI build: ✅
- PAPI utilities*: ✅
- amd_smi component tests: ✅ (I lack the permissions for amdsmi_set_test.c)
For papi_native_avail, we now see the string description for each device, or a single collapsed description when all devices report the same value:
# Collapsed version
6325 --------------------------------------------------------------------------------
6326 | amd_smi:::board_product_name_hash |
6327 | Hash of Device 0,1,2,3 board product name string 'Aqua Vanjaram [I|
6328 | nstinct MI300A]' |
6329 | :device=0 |
6330 | Mandatory device qualifier [0,1,2,3] |
6331 --------------------------------------------------------------------------------
# Not collapsed version
6499 --------------------------------------------------------------------------------
6500 | amd_smi:::uuid_hash |
6501 | Hash of Device 0 UUID string 'f6ff74a0-0000-1000-808e-1753dc12b715|
6502 | ' |
6503 | Hash of Device 1 UUID string '5bff74a0-0000-1000-80d3-148dc76b831a|
6504 | ' |
6505 | Hash of Device 2 UUID string 'a5ff74a0-0000-1000-807d-04507c371187|
6506 | ' |
6507 | Hash of Device 3 UUID string 'cdff74a0-0000-1000-80cf-b00895cff553|
6508 | ' |
6509 | :device=0 |
6510 | Mandatory device qualifier [0,1,2,3] |
6511 --------------------------------------------------------------------------------
* PAPI utilities tested: papi_component_avail, papi_native_avail, papi_command_line
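The collapsed and non-collapsed listings above suggest a simple decision rule: collapse only when every device in the map carries the same string. The sketch below illustrates that rule under assumptions of mine, not the PR's actual implementation: the device map is modeled as a 64-bit bitmask, and `next_device`, `can_collapse`, and `format_device_list` are hypothetical helpers.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: index of the next set bit in `map` strictly
 * after `prev` (pass -1 to start). Returns -1 when no device is left. */
int next_device(uint64_t map, int prev)
{
    for (int d = prev + 1; d < 64; ++d)
        if (map & (UINT64_C(1) << d))
            return d;
    return -1;
}

/* Returns 1 when every device in `map` has the same non-NULL string,
 * so one description line can stand in for all of them. */
int can_collapse(uint64_t map, const char *const *strs)
{
    int first = next_device(map, -1);
    if (first < 0 || strs[first] == NULL)
        return 0;
    for (int d = next_device(map, first); d >= 0; d = next_device(map, d))
        if (strs[d] == NULL || strcmp(strs[first], strs[d]) != 0)
            return 0;
    return 1;
}

/* Writes a "0,1,2,3"-style list of the devices in `map` into `buf`.
 * Returns the length written, or -1 if `buf` is too small. */
int format_device_list(uint64_t map, char *buf, size_t size)
{
    size_t used = 0;
    if (size == 0)
        return -1;
    buf[0] = '\0';
    for (int d = next_device(map, -1); d >= 0; d = next_device(map, d)) {
        int n = snprintf(buf + used, size - used, "%s%d", used ? "," : "", d);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

With a map of 0xF and four identical strings, `can_collapse` returns 1 and `format_device_list` produces `0,1,2,3`, matching the shape of the collapsed board_product_name_hash entry; distinct UUIDs make `can_collapse` return 0, so each device keeps its own line.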
e774181 to 0e6d64e
Pull Request Description
Previously, the event description for string events for amd_smi in papi_native_avail only displayed the string for Device 0.
This PR implements per-device descriptions. Initialization now captures and stores the specific strings (UUIDs, serials, versions) for every device index, and amds_evt_code_to_info and amds_evt_code_to_descr now display the source string for the specific device variant being queried. Each :device=N variant therefore shows its own string in the description, providing the necessary context for the hash value.
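As a rough illustration of what a per-device description line contains, the sketch below builds text in the same shape as the papi_native_avail output shown in the review ("Hash of Device N UUID string '...'"). The function `format_hash_descr` and its parameters are hypothetical; the PR's actual formatting code in amds_evt_code_to_info and amds_evt_code_to_descr may differ.

```c
#include <stdio.h>

/* Hypothetical formatter: builds one description line for device `dev`.
 * `kind` names the source string (for example "UUID") and `value` is the
 * string that was hashed. Returns the length written, or -1 if `buf`
 * is too small or formatting fails. */
int format_hash_descr(char *buf, size_t size, int dev,
                      const char *kind, const char *value)
{
    int n = snprintf(buf, size, "Hash of Device %d %s string '%s'",
                     dev, kind, value ? value : "");
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

Returning -1 on truncation lets a caller decide whether to fall back to the generic event description rather than print a cut-off string.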
Author Checklist
Why this PR exists. Reference all relevant information, including background, issues, test failures, etc
Commits are self contained and only do one thing
Commits have a header of the form: module: short description
Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
The PR needs to pass all the tests