Skip to content

Comments

rocp_sdk: enable delayed initialization#533

Open
dbarry9 wants to merge 1 commit intoicl-utk-edu:masterfrom
dbarry9:2026.02.13_rocp-sdk-delayed-init
Open

rocp_sdk: enable delayed initialization#533
dbarry9 wants to merge 1 commit intoicl-utk-edu:masterfrom
dbarry9:2026.02.13_rocp-sdk-delayed-init

Conversation

@dbarry9
Copy link
Contributor

@dbarry9 dbarry9 commented Jan 13, 2026

Pull Request Description

Revert the specific change in commit 258caab that removed delayed initialization from the rocp_sdk component.

This functionality is convenient for tools and applications which use PAPI internally.

These changes have been tested on the Frontier supercomputer, which contains the AMD MI250X architecture.

Author Checklist

  • Description
    Why this PR exists. Reference all relevant information, including background, issues, test failures, etc
  • Commits
    Commits are self contained and only do one thing
    Commits have a header of the form: module: short description
    Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
  • Tests
    The PR needs to pass all the tests

@dbarry9 dbarry9 added the status-work-in-progress PR is still being worked on label Jan 13, 2026
@dbarry9 dbarry9 force-pushed the 2026.02.13_rocp-sdk-delayed-init branch from 348ca81 to bf0276f Compare January 13, 2026 19:51
Revert the specific change in commit 258caab that removed delayed
initialization from the rocp_sdk component.

This functionality is convenient for tools and applications which
use PAPI internally.

These changes have been tested on the Frontier supercomputer, which
contains the AMD MI250X architecture.
@dbarry9 dbarry9 force-pushed the 2026.02.13_rocp-sdk-delayed-init branch from bf0276f to 539b198 Compare January 13, 2026 20:04
@dbarry9 dbarry9 requested a review from tokey-tahmid January 13, 2026 20:07
@dbarry9 dbarry9 added status-ready-for-review PR is ready to be reviewed and removed status-work-in-progress PR is still being worked on labels Jan 13, 2026
@Treece-Burgess
Copy link
Contributor

I am reviewing this PR.

// This component needs to be fully initialized from the beginning,
// because interleaving hip calls and PAPI calls leads to errors.
return check_n_initialize();
sprintf(_rocp_sdk_vector.cmp_info.disabled_reason, "Not initialized. Access component events to initialize it.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change sprintf to snprintf and then check the returned value, i.e.

int strLen = snprintf(_rocp_sdk_vector.cmp_info.disabled_reason, PAPI_HUGE_STR_LEN, "%s",  "Not initialized. Access component events to initialize it.");
if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) {
    SUBDBG(...)
    return PAPI_EBUF;
}


* In dispatch mode, PAPI may read zeros if reading takes place immediately after the return of a GPU kernel. This is not a PAPI bug. It may occur because calls such as hipDeviceSynchronize() do not guarantee that ROCprofiler has been called and all counter buffers have been flushed. Therefore, it is recommended that the user code adds a delay between the return of a kernel and calls to PAPI_read(), PAPI_stop(), etc.
* If an application is linked against the static PAPI library libpapi.a, then the application must call PAPI_library_init() before calling any hip routines (e.g. hipInit(), hipGetDeviceCount(), hipLaunchKernelGGL(), etc). If the application is linked against the dynamic library libpapi.so, then the order of operations does not matter.
* If an application is linked against the static PAPI library libpapi.a, then the application must call PAPI_library_init() through PAPI_add_named_event()/PAPI_add_event() before calling any hip routines (e.g. hipInit(), hipGetDeviceCount(), hipLaunchKernelGGL(), etc). If the application is linked against the dynamic library libpapi.so, then the order of operations does not matter.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the added "through PAPI_add_named_event()/PAPI_add_event()" was added due to the PAPI_EDELAY_INIT support and doing this workflow would initialize the component.

However, you can also call PAPI_enum_cmp_event(int *EventCode, PAPI_ENUM_FIRST, int cidx) to initialize the component. I would make that aware to users as well.

sprintf(_rocp_sdk_vector.cmp_info.disabled_reason, "Not initialized. Access component events to initialize it.");
_rocp_sdk_vector.cmp_info.disabled = PAPI_EDELAY_INIT;
return PAPI_EDELAY_INIT;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my testing, if I have the following workflow:

PAPI_library_init(PAPI_VER_CURRENT);
int cidx = API_get_component_index("rocp_sdk");

int EventCode = 0 | PAPI_NATIVE_MASK;
PAPI_enum_cmp_event(&EventCode, PAPI_ENUM_FIRST, cidx);

const PAPI_component_info_t *cmpinfo = PAPI_get_component_info(cidx);

The member variable of PAPI_component_info_t disabled_reason will still show "Not initialized. Access component events to initialize it." This should not be the case anymore as I have initialized rocp_sdk.

To resolve this you need to set disabled_reason to an empty string inside rocp_sdk_init_private (this is what I did in the cuda component see here).

@Treece-Burgess
Copy link
Contributor

In rocp_sdk.c the following block of code exists:

int papi_errno = rocprofiler_sdk_init_pre();
    if (papi_errno != PAPI_OK) {
        _rocp_sdk_vector.cmp_info.initialized = 1;
        _rocp_sdk_vector.cmp_info.disabled = papi_errno;
        const char *err_string;
        rocprofiler_sdk_err_get_last(&err_string);
        snprintf(_rocp_sdk_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s", err_string);
        return papi_errno;
    }

The call to rocprofiler_sdk_init_pre() should never fail as it just returns PAPI_OK, but for semantics the variable _rocp_sdk_vector.cmp_info.initialized should be set to 0 instead of 1 inside the conditional block. As if we did enter the block rocp_sdk would not be initialized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status-ready-for-review PR is ready to be reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants