cuda: Verify Cuda Toolkit is Supported by the NVIDIA Architecture #567

Open
Treece-Burgess wants to merge 2 commits into icl-utk-edu:master from Treece-Burgess:02-12-verify-cuda-toolkit-supports-arch

@Treece-Burgess
Contributor

Pull Request Description

This PR adds a check that the user's Cuda Toolkit can be used with the NVIDIA architecture on the machine and, if it cannot, disables the cuda component with a helpful message. This is important because:

  1. Older Cuda Toolkits do not support the newer architectures
  2. Newer Cuda Toolkits do not support the older architectures
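
As a rough illustration of the two cases above, the check can be thought of as a range test on the device's compute capability. The sketch below is not the code in this PR: the table `toolkit_ranges`, the function `toolkit_supports_cc`, and the minimum/maximum values are all hypothetical and illustrative, and real bounds should be taken from the CUDA release notes. In practice the toolkit version and compute capability would be queried at runtime (for example with `cudaRuntimeGetVersion` and `cudaDeviceGetAttribute`) rather than hard-coded.

```c
#include <stddef.h>

/* Hypothetical support table. Compute capabilities are encoded as
 * major * 10 + minor (for example 9.0 -> 90). The bounds are illustrative
 * assumptions, not values taken from this PR. */
typedef struct {
    int toolkit_major;
    int toolkit_minor;
    int min_cc; /* oldest architecture assumed to be supported */
    int max_cc; /* newest architecture assumed to be supported */
} toolkit_range_t;

static const toolkit_range_t toolkit_ranges[] = {
    { 11, 5, 35,  87 }, /* assumed: predates Hopper (9.0) */
    { 12, 9, 50, 121 }, /* assumed: covers Hopper (9.0) */
    { 13, 0, 75, 121 }, /* assumed: no Pascal (6.x) or Volta (7.0) */
};

/* Returns 1 if the toolkit is assumed to support the compute capability,
 * 0 if it does not, and -1 if the toolkit version is not in the table. */
int toolkit_supports_cc(int tk_major, int tk_minor, int cc_major, int cc_minor)
{
    int cc = cc_major * 10 + cc_minor;
    size_t n = sizeof(toolkit_ranges) / sizeof(toolkit_ranges[0]);

    for (size_t i = 0; i < n; i++) {
        const toolkit_range_t *r = &toolkit_ranges[i];
        if (r->toolkit_major == tk_major && r->toolkit_minor == tk_minor) {
            /* Case 1: toolkit too old  -> cc > max_cc.
             * Case 2: toolkit too new  -> cc < min_cc. */
            return (cc >= r->min_cc && cc <= r->max_cc) ? 1 : 0;
        }
    }
    return -1; /* unknown toolkit: the caller decides how to handle it */
}
```

With these assumed bounds, `toolkit_supports_cc(11, 5, 9, 0)` reports an H100 as unsupported (case 1), and `toolkit_supports_cc(13, 0, 7, 0)` reports a V100 as unsupported (case 2).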

Testing

Setup

Testing was done on Illyad and Voltar at Oregon.

Illyad:
OS: RHEL 8.10
CPU: AMD EPYC 7402
GPU: 1 * H100
Cuda Toolkit: 11.5.2 and 12.9.0

Voltar:
OS: RHEL 8.10
CPU: Intel Xeon Gold 6226R
GPU: 1 * A100, 1 * V100, and 1 * P100
Cuda Toolkit: 13.0.0

Results

Illyad:

  • With Cuda Toolkit 11.5.2 the cuda component is disabled. This is the correct behavior, as Cuda Toolkit 11.5.2 does not support the H100.
  • With Cuda Toolkit 12.9.0:
    • PAPI Utilities*: ✅
    • Component tests: ✅

Voltar:
With Cuda Toolkit 13.0, the V100 and P100 cause the cuda component to be disabled, since Cuda Toolkit 13.0 no longer supports the Volta and Pascal architectures. Setting CUDA_VISIBLE_DEVICES=1,2 allows the component to be active, as this only "shows" the A100.

  • PAPI Utilities*: ✅

* - papi_component_avail, papi_native_avail, and papi_command_line
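
The Voltar result suggests that a single unsupported visible device is enough to disable the component, and that hiding such devices re-enables it. A minimal sketch of that rule follows, assuming a hypothetical `device_t` record and a hypothetical `check_visible_devices` helper. The minimum compute capability passed in (7.5 for Cuda Toolkit 13.0 in the usage note) and the message wording are assumptions, not the PR's actual implementation or output.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical description of one visible device. */
typedef struct {
    const char *name;
    int cc_major;
    int cc_minor;
} device_t;

/* Returns 0 if every visible device meets the minimum compute capability.
 * Otherwise writes a disabled message for the first offending device into
 * msg and returns -1. */
int check_visible_devices(const device_t *devs, size_t n,
                          int min_cc_major, int min_cc_minor,
                          char *msg, size_t msg_len)
{
    for (size_t i = 0; i < n; i++) {
        int below_min = devs[i].cc_major < min_cc_major ||
                        (devs[i].cc_major == min_cc_major &&
                         devs[i].cc_minor < min_cc_minor);
        if (below_min) {
            snprintf(msg, msg_len,
                     "Device %zu (%s, compute capability %d.%d) is not "
                     "supported by the installed Cuda Toolkit.",
                     i, devs[i].name, devs[i].cc_major, devs[i].cc_minor);
            return -1;
        }
    }
    if (msg_len > 0) {
        msg[0] = '\0'; /* no disabled message needed */
    }
    return 0;
}
```

Called with the A100 (8.0), V100 (7.0), and P100 (6.0) and an assumed minimum of 7.5, the helper returns -1 and names the V100; called with only the A100, it returns 0. The device order here is arbitrary and does not reflect the CUDA_VISIBLE_DEVICES indices on Voltar.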

Author Checklist

  • Description
    Why this PR exists. Reference all relevant information, including background, issues, test failures, etc.
  • Commits
    Commits are self-contained and only do one thing
    Commits have a header of the form: module: short description
    Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
  • Tests
    The PR needs to pass all the tests

@Treece-Burgess added the component-cuda, status-ready-for-review, and type-maintenance labels on Feb 20, 2026