Add support for autodetection of gres resources #181

jovial · 2025-04-23T17:07:54Z

Adds support for setting the AutoDetection property on gres resources. This prevents the need to manually specify File in the gres dictionary. You can only use one auto-detection mechanism per node, otherwise slurm will complain - hence why it is a per-nodegroup option and not a per-gres option.

Example:

# group_vars/all/openhpc.yml

openhpc_nodegroups:
    - name: cpu
    - name: gpu
      gres_autodetect: nvml
      gres:
        - conf: "gpu:nvidia_h100_80gb_hbm3:2"
        - conf: "gpu:nvidia_h100_80gb_hbm3_4g.40gb:2"
        - conf: "gpu:nvidia_h100_80gb_hbm3_1g.10gb:6"

NB: autodetection requires rebuild of the OpenHPC packages - this is not provided by this role

templates/gres.conf.j2

README.md

sjpb

Have some concerns

README.md

jovial · 2025-05-08T16:02:03Z

Ready for review but merge #183 first (this PR targets that branch to avoid noise in diff)

sjpb

Few comments, but looks pretty good to me.

README.md

templates/gres.conf.j2

sjpb

Looks great.

jovial requested a review from a team as a code owner April 23, 2025 17:07

jovial marked this pull request as draft April 23, 2025 20:20

jovial commented Apr 24, 2025

View reviewed changes

templates/gres.conf.j2 Outdated Show resolved Hide resolved

jovial commented Apr 24, 2025

View reviewed changes

README.md Outdated Show resolved Hide resolved

jovial marked this pull request as ready for review April 24, 2025 09:04

jovial commented Apr 24, 2025

View reviewed changes

README.md Outdated Show resolved Hide resolved

sjpb requested changes Apr 24, 2025

View reviewed changes

README.md Show resolved Hide resolved

README.md Outdated Show resolved Hide resolved

jovial commented Apr 28, 2025

View reviewed changes

README.md Outdated Show resolved Hide resolved

jovial requested a review from sjpb April 28, 2025 08:39

sjpb mentioned this pull request May 7, 2025

Add group.node_params to partitions/groups. #182

Merged

jovial marked this pull request as draft May 8, 2025 14:18

jovial changed the base branch from master to feat/nodegroups May 8, 2025 14:27

jovial marked this pull request as ready for review May 8, 2025 15:59

Base automatically changed from feat/nodegroups to master May 13, 2025 08:19

jovial changed the base branch from master to feat/nodegroups-v2 May 16, 2025 12:48

jovial force-pushed the feature/gres-autodetect branch 2 times, most recently from e3f58ad to 1ca4a4e Compare May 16, 2025 13:24

sjpb requested changes May 16, 2025

View reviewed changes

README.md Show resolved Hide resolved

README.md Outdated Show resolved Hide resolved

README.md Show resolved Hide resolved

templates/gres.conf.j2 Outdated Show resolved Hide resolved

templates/gres.conf.j2 Outdated Show resolved Hide resolved

Base automatically changed from feat/nodegroups-v2 to master May 16, 2025 14:01

jovial force-pushed the feature/gres-autodetect branch 3 times, most recently from 4ed9a81 to e8c09aa Compare May 20, 2025 12:08

Add support for autodetection of gres resources

facef75

jovial force-pushed the feature/gres-autodetect branch from e8c09aa to facef75 Compare May 21, 2025 09:44

jovial requested a review from sjpb May 21, 2025 09:49

sjpb approved these changes May 21, 2025

View reviewed changes

sjpb merged commit 3354f7f into master May 21, 2025
29 checks passed

sjpb deleted the feature/gres-autodetect branch May 21, 2025 12:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add support for autodetection of gres resources #181

Add support for autodetection of gres resources #181

Uh oh!

jovial commented Apr 23, 2025 •

edited by sjpb

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sjpb left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jovial commented May 8, 2025

Uh oh!

sjpb left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sjpb left a comment

Uh oh!

Uh oh!

Uh oh!

Add support for autodetection of gres resources #181

Add support for autodetection of gres resources #181

Uh oh!

Conversation

jovial commented Apr 23, 2025 • edited by sjpb Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sjpb left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jovial commented May 8, 2025

Uh oh!

sjpb left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sjpb left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jovial commented Apr 23, 2025 •

edited by sjpb

Loading