amd_smi: String description for each device for string meta data events #550

Merged
Treece-Burgess merged 1 commit into icl-utk-edu:master from djwoun:string_descrp
Feb 13, 2026

@djwoun djwoun commented Feb 1, 2026

Pull Request Description

Previously, papi_native_avail showed only the Device 0 string in the description of amd_smi string events.

This PR implements per-device descriptions. Initialization now captures and stores the specific strings (UUIDs, serials, versions) for every device index. amds_evt_code_to_info and amds_evt_code_to_descr now display the source string for the specific device variant being queried, so each :device=N variant shows its own string in the description, which gives the hash value the context it needs.
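To illustrate the idea of storing one description per device index and looking it up by device, here is a minimal sketch. It is not the code from this PR: `per_device_descr_t`, `per_device_set`, `per_device_get`, and `MAX_DEVICES` are hypothetical names chosen for the example, and only the "Hash of Device N ... string '...'" wording is taken from the papi_native_avail output shown later in this conversation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEVICES 64  /* hypothetical limit for this sketch */

/* Hypothetical per-event table: one description string per device index. */
typedef struct {
    char *descrs[MAX_DEVICES];  /* NULL when nothing was captured for that device */
} per_device_descr_t;

/* Build and store a private copy of the description for device `dev`.
 * Returns 0 on success, -1 on a bad index or allocation failure. */
int per_device_set(per_device_descr_t *pd, int dev, const char *what, const char *value)
{
    char buf[256];
    char *copy;

    if (dev < 0 || dev >= MAX_DEVICES) return -1;
    snprintf(buf, sizeof(buf), "Hash of Device %d %s string '%s'", dev, what, value);
    copy = malloc(strlen(buf) + 1);
    if (copy == NULL) return -1;
    strcpy(copy, buf);
    free(pd->descrs[dev]);      /* replace any earlier entry */
    pd->descrs[dev] = copy;
    return 0;
}

/* Return the description for `dev`, or `generic` when none was stored. */
const char *per_device_get(const per_device_descr_t *pd, int dev, const char *generic)
{
    if (dev < 0 || dev >= MAX_DEVICES || pd->descrs[dev] == NULL) return generic;
    return pd->descrs[dev];
}
```

In this sketch the table would be filled once during initialization, and a description lookup for a `:device=N` variant would call `per_device_get` with the event's generic description as the last argument.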

Author Checklist

  • Description
    Why this PR exists. Reference all relevant information, including background, issues, test failures, etc
  • Commits
    Commits are self contained and only do one thing
    Commits have a header of the form: module: short description
    Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
  • Tests
    The PR needs to pass all the tests

@djwoun djwoun requested a review from dbarry9 February 1, 2026 22:47
@Treece-Burgess Treece-Burgess self-requested a review February 3, 2026 16:57
@Treece-Burgess

I am reviewing this PR.

int can_collapse = 1;
int dev = device_next(event->device_map, first);
for (; dev >= 0; dev = device_next(event->device_map, dev)) {
    const char *line = fallback;

@djwoun djwoun Feb 6, 2026

@Treece-Burgess Right now, if a per-device description isn't present, we just fall back to event->descr (best-effort), which feels like reasonable behavior for this description-formatting feature. I'm thinking we keep line = fallback and add a SUBDBG when device_map includes a device but pd->descrs[dev] is NULL, so we can catch that inconsistency during debugging. What do you think?

@Treece-Burgess

Yes, that sounds good to me!

@djwoun

cool
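The fallback behavior agreed on in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `next_device`, `pick_line`, and `count_fallbacks` are hypothetical helpers, the device map is assumed to be a 64-bit bitmask, and a plain `fprintf` to stderr stands in for PAPI's SUBDBG macro.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_DEVICES 64  /* hypothetical: one bit per device in the map */

/* Hypothetical stand-in for the iterator in the diff:
 * index of the next set bit after `after`, or -1 when there is none. */
int next_device(uint64_t map, int after)
{
    for (int d = after + 1; d < MAX_DEVICES; ++d)
        if (map & (UINT64_C(1) << d)) return d;
    return -1;
}

/* Choose the description line for one device. When the per-device string
 * is missing, log the inconsistency, bump *missing, and use the fallback. */
const char *pick_line(const char *const *descrs, int dev,
                      const char *fallback, int *missing)
{
    if (descrs[dev] != NULL) return descrs[dev];
    /* Stand-in for SUBDBG in this sketch. */
    fprintf(stderr, "debug: device %d is in device_map but has no description\n", dev);
    if (missing != NULL) ++*missing;
    return fallback;
}

/* Walk every device in the map and report how many used the fallback. */
int count_fallbacks(uint64_t map, const char *const *descrs, const char *fallback)
{
    int missing = 0;
    for (int dev = next_device(map, -1); dev >= 0; dev = next_device(map, dev))
        (void)pick_line(descrs, dev, fallback, &missing);
    return missing;
}
```

The point of the design is that formatting stays best-effort: a missing per-device string never fails the description lookup, but it is still visible in debug output.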

@Treece-Burgess Treece-Burgess left a comment

I tested these changes on Odyssey at Oregon with ROCm 7.2.0 pre-release; the results are below.

  • PAPI build: ✅
  • PAPI utilities*: ✅
  • amd_smi component tests: ✅ (I lack the permissions for amdsmi_set_test.c)

For papi_native_avail, we now see the string description for each device, or a single collapsed description when all devices report the same value:

# Collapsed version
--------------------------------------------------------------------------------
| amd_smi:::board_product_name_hash                                            |
|            Hash of Device 0,1,2,3 board product name string 'Aqua Vanjaram [I|
|            nstinct MI300A]'                                                  |
|     :device=0                                                                |
|            Mandatory device qualifier [0,1,2,3]                              |
--------------------------------------------------------------------------------

# Not collapsed version
--------------------------------------------------------------------------------
| amd_smi:::uuid_hash                                                          |
|            Hash of Device 0 UUID string 'f6ff74a0-0000-1000-808e-1753dc12b715|
|            '                                                                 |
|            Hash of Device 1 UUID string '5bff74a0-0000-1000-80d3-148dc76b831a|
|            '                                                                 |
|            Hash of Device 2 UUID string 'a5ff74a0-0000-1000-807d-04507c371187|
|            '                                                                 |
|            Hash of Device 3 UUID string 'cdff74a0-0000-1000-80cf-b00895cff553|
|            '                                                                 |
|     :device=0                                                                |
|            Mandatory device qualifier [0,1,2,3]                              |
--------------------------------------------------------------------------------

* - papi_component_avail, papi_native_avail, papi_command_line
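The collapsed and non-collapsed forms in the output above could be produced by logic along these lines. This is a sketch only: `all_same`, `appendf`, and `format_descr` are hypothetical names, device indices are assumed to be contiguous from 0, and the actual collapsing code in the PR may differ.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Return 1 when all n strings are non-NULL and identical, else 0. */
int all_same(const char *const *values, int n)
{
    if (n <= 0 || values[0] == NULL) return 0;
    for (int i = 1; i < n; ++i)
        if (values[i] == NULL || strcmp(values[i], values[0]) != 0) return 0;
    return 1;
}

/* Append formatted text at out[*len]; returns -1 if it would not fit. */
int appendf(char *out, size_t cap, size_t *len, const char *fmt, ...)
{
    va_list ap;
    int w;

    va_start(ap, fmt);
    w = vsnprintf(out + *len, cap - *len, fmt, ap);
    va_end(ap);
    if (w < 0 || (size_t)w >= cap - *len) return -1;
    *len += (size_t)w;
    return 0;
}

/* Write one collapsed line when every device reports the same string,
 * otherwise one line per device, separated by newlines. */
int format_descr(char *out, size_t cap, const char *what,
                 const char *const *values, int n)
{
    size_t len = 0;

    if (cap == 0) return -1;
    out[0] = '\0';

    if (all_same(values, n)) {
        if (appendf(out, cap, &len, "Hash of Device ") != 0) return -1;
        for (int i = 0; i < n; ++i)
            if (appendf(out, cap, &len, "%s%d", i ? "," : "", i) != 0) return -1;
        return appendf(out, cap, &len, " %s string '%s'", what, values[0]);
    }
    for (int i = 0; i < n; ++i) {
        const char *v = values[i] ? values[i] : "";  /* missing entry prints empty */
        if (appendf(out, cap, &len, "%sHash of Device %d %s string '%s'",
                    i ? "\n" : "", i, what, v) != 0) return -1;
    }
    return 0;
}
```

With four devices that all report the same board product name, this yields a single "Hash of Device 0,1,2,3 ..." line; with distinct UUIDs it yields one line per device, matching the two shapes shown in the test output.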

@Treece-Burgess Treece-Burgess merged commit 90e919b into icl-utk-edu:master Feb 13, 2026
4 checks passed