feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.19.0 ) by parsec-renovate[bot] · Pull Request #68 · parsec/home-ops

parsec-renovate · 2025-03-22T12:02:20Z

This PR contains the following updates:

Package	Update	Change
nvidia-device-plugin	minor	`0.14.5` → `0.19.0`
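
The `minor` label above covers a wide jump. A minimal sketch of how such a label can be derived from the two version strings follows; `classify_update` is a hypothetical helper for illustration and is not Renovate's actual implementation, which handles many more versioning schemes.

```python
def classify_update(old: str, new: str) -> str:
    """Classify a plain MAJOR.MINOR.PATCH bump (simplified illustration)."""
    o = tuple(int(part) for part in old.split("."))
    n = tuple(int(part) for part in new.split("."))
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"


def minor_releases_crossed(old: str, new: str) -> int:
    """Number of minor versions between two releases that share a major version."""
    return int(new.split(".")[1]) - int(old.split(".")[1])


print(classify_update("0.14.5", "0.19.0"))         # minor
print(minor_releases_crossed("0.14.5", "0.19.0"))  # 5
```

Because the chart is still on a `0.x` major version, a single `minor` update here spans five minor release lines, so the release notes below are worth reading in full before merging.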

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

Release Notes

NVIDIA/k8s-device-plugin (nvidia-device-plugin)

`v0.19.0`

Compare Source

Add --sleep-interval=infinite support to GFD for running as a pod (#1603)
Fix image tag in static deployment (#1604)
Add ownerReference to NodeFeature CRs for garbage collection (#1597)
Change default value for gds, gdrcopy and mofed flags (#1550)
Fix healthchecking on old devices (#1562)
Enable NodeFeature API by default in GFD (#1504)
Build multiarch images on native GitHub runners (#1468)

`v0.18.2`

Compare Source

Ensure that cdi.FeatureFlags are passed to CDI library
Fix race condition in config-manager when label is unset
Fix nested container use cases by ensuring that IPC sockets are not mounted readonly
Bump NVIDIA Container Toolkit to v1.18.2
Bump distroless base image to v3.2.2-dev

`v0.18.1`

Compare Source

Allow CDI feature flags to be set
Pass driver root to nvinfo.New in device plugin main
Bump NVIDIA Container Toolkit to v1.18.1
Bump distroless base image to v3.2.1-dev
Bump github.com/opencontainers/selinux from 1.12.0 to 1.13.1 (#1506)

`v0.18.0`

Compare Source

Rename getHealthCheckXids and clarify documentation
Add support for explicitly enabling XIDs in health checks
Deduplicate requested device IDs
Check for nil before reading boolean config values
Make gated modes (GDS, MOFED, GDRCOPY) optional in CDI
Add support for setting gdrcopyEnabled
Ignore errors getting device memory using NVML
Ensure that directory volumes have Directory type
Switch to plain golang image for builds
Remove unneeded intermediate container
Update CI definitions
Switch to distroless golang image
Update README.md with RuntimeClass
Pass a single context throughout the device-plugin method call stack (#1284)
Remove internal logger in favour of klog (#1277)
Remove FAIL_ON_INIT_ERROR from static examples
Detect Blackwell architecture
Updated .release:staging to stage device-plugin images in nvstaging
Use MiB instead of MB for gpu-memory
Ignore XID error 109
Update README.md adjust set docker runtime default
Remove nvidia.com/gpu.imex-domain label
Fix containerd runc config error when creating a kind cluster
Use stable nvidia-container-toolkit repo when creating a kind cluster
Switch to context package in go stdlib
Raise a warning instead of an error if GPU mode labeler fails
Add ada-lovelace architecture label for compute capability 8.9
Ensure FAIL_ON_INIT_ERROR boolean env is quoted
Honor fail-on-init-error when no resources are found
Enable hostPID in the mps-control-daemon pod (#1045)

`v0.17.4`

Compare Source

What's Changed

Bump slackapi/slack-github-action from 2.1.0 to 2.1.1 by @dependabot[bot] in #1317
Bump github.com/NVIDIA/go-nvlib from 0.7.2 to 0.7.4 by @dependabot[bot] in #1346
Bump golang from 1.23.11 to 1.23.12 in /deployments/devel by @dependabot[bot] in #1355
Ensure that directory volumes have Directory type by @elezar in #1368
Bump nvidia/cuda from 12.9.1-base-ubi9 to 13.0.0-base-ubi9 in /deployments/container by @dependabot[bot] in #1369
Ignore errors getting device memory using NVML by @elezar in #1374
Bump project version to v0.17.4 by @cdesiniotis in #1402
[no-relnote] update ngc publishing logic for release pipelines by @cdesiniotis in #1406

Full Changelog: NVIDIA/k8s-device-plugin@v0.17.3...v0.17.4

`v0.17.3`

Compare Source

What's Changed

Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.6 to 1.17.8 by @dependabot[bot] in #1275
Bump nvidia/cuda from 12.9.0-base-ubi9 to 12.9.1-base-ubi9 in /deployments/container by @dependabot[bot] in #1300
Bump github.com/NVIDIA/go-nvml from 0.12.4-1 to 0.12.9-0 by @dependabot[bot] in #1287
Bump golang from 1.23.9 to 1.23.10 in /deployments/devel by @dependabot[bot] in #1283
Bump golang from 1.23.10 to 1.23.11 in /deployments/devel by @dependabot[bot] in #1318
Bump release v0.17.3 by @elezar in #1326
Backport: Bump golang.org/x/oauth2 from 0.23.0 to 0.27.0 by @cdesiniotis in #1328
Updated .release:staging to stage device-plugin images in nvstaging by @elezar in #1329

Full Changelog: NVIDIA/k8s-device-plugin@v0.17.2...v0.17.3

`v0.17.2`

Compare Source

What's Changed

Update nvidia.com/gpu.product label to include Blackwell architectures
Update documentation to indicate that nvidia.com/gpu.memory label is in MiB instead of MB

Full Changelog: NVIDIA/k8s-device-plugin@v0.17.1...v0.17.2
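
The MiB versus MB distinction noted for `nvidia.com/gpu.memory` matters to anything that parses the label. The arithmetic below is a small illustration only; the 24 GiB device is a hypothetical example and the helper names are not part of the plugin.

```python
MIB = 1024 ** 2  # bytes in one mebibyte (binary unit)
MB = 1000 ** 2   # bytes in one megabyte (decimal unit)


def bytes_to_mib(n_bytes: int) -> int:
    """Whole mebibytes contained in n_bytes."""
    return n_bytes // MIB


def bytes_to_mb(n_bytes: int) -> int:
    """Whole megabytes contained in n_bytes."""
    return n_bytes // MB


total_bytes = 24 * 1024 ** 3  # hypothetical GPU with 24 GiB of memory

print(bytes_to_mib(total_bytes))  # 24576
print(bytes_to_mb(total_bytes))   # 25769
```

The two readings differ by roughly 5%, so a threshold written against the label value should be expressed in MiB.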

`v0.17.1`

Compare Source

Ensure that generated CDI specs do not contain enable-cuda-compat hooks
Remove nvidia.com/gpu.imex-domain label
Ignore XID error 109
Add ada-lovelace architecture label for compute capability 8.9
Ensure FAIL_ON_INIT_ERROR boolean env is quoted
Honor fail-on-init-error when no resources are found

`v0.17.0`

Compare Source

Promote v0.17.0-rc.1 to GA

`v0.16.2`

Compare Source

Add CAP_SYS_ADMIN if volume-mounts list strategy is included (fixes #856)
Remove unneeded DEVICE_PLUGIN_MODE envvar
Fix applying SELinux label for MPS

`v0.16.1`

Compare Source

Bump nvidia-container-toolkit to v1.16.1 to fix a bug with CDI spec generation for MIG devices

`v0.16.0`

Compare Source

Fixed logic of atomic writing of the feature file
Replaced WithDialer with WithContextDialer
Fixed SELinux context of MPS pipe directory.
Changed behavior for empty MIG devices to issue a warning instead of an error when the mixed strategy is selected
Added a GFD node label for the GPU mode.
Update CUDA base image version to 12.5.1

`v0.15.1`

Compare Source

Changelog

Fix inconsistent usage of hasConfigMap helm template. This addresses cases where certain resources (roles and service accounts) would be created even if they were not required.
Raise an error in GFD when MPS is used with MIG. This ensures that the behavior across GFD and the Device Plugin is consistent.
Remove provenance information from published images.
Use half of total memory for size of MPS tmpfs by default.

`v0.15.0`

Compare Source

Moved nvidia-device-plugin.yml static deployment at the root of the repository to deployments/static/nvidia-device-plugin.yml.
Simplify PCI device classes in NFD worker configuration.
Update CUDA base image version to 12.4.1.
Switch to Ubuntu22.04-based CUDA image for default image.
Add new CUDA driver and runtime version labels to align with other NFD version labels.
Update NFD dependency to v0.15.3.

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.

If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.

github-actions · 2025-03-22T12:02:57Z

--- kubernetes/apps/kube-system/nvidia/device-plugin/app Kustomization: kube-system/nvidia-device-plugin HelmRelease: kube-system/nvidia-device-plugin

+++ kubernetes/apps/kube-system/nvidia/device-plugin/app Kustomization: kube-system/nvidia-device-plugin HelmRelease: kube-system/nvidia-device-plugin

@@ -13,13 +13,13 @@

       chart: nvidia-device-plugin
       interval: 15m
       sourceRef:
         kind: HelmRepository
         name: nvidia-dvp
         namespace: flux-system
-      version: 0.14.5
+      version: 0.19.0
   install:
     createNamespace: true
     remediation:
       retries: 3
   interval: 15m
   maxHistory: 2

github-actions · 2025-03-22T12:02:58Z

--- HelmRelease: kube-system/nvidia-device-plugin DaemonSet: kube-system/nvidia-device-plugin

+++ HelmRelease: kube-system/nvidia-device-plugin DaemonSet: kube-system/nvidia-device-plugin

@@ -44,13 +44,13 @@

           value: nvidia.com/device-plugin.config
         - name: CONFIG_FILE_SRCDIR
           value: /available-configs
         - name: CONFIG_FILE_DST
           value: /config/config.yaml
         - name: DEFAULT_CONFIG
-          value: ''
+          value: null
         - name: FALLBACK_STRATEGIES
           value: named,single
         - name: SEND_SIGNAL
           value: 'false'
         - name: SIGNAL
           value: ''
@@ -79,13 +79,13 @@

           value: nvidia.com/device-plugin.config
         - name: CONFIG_FILE_SRCDIR
           value: /available-configs
         - name: CONFIG_FILE_DST
           value: /config/config.yaml
         - name: DEFAULT_CONFIG
-          value: ''
+          value: null
         - name: FALLBACK_STRATEGIES
           value: named,single
         - name: SEND_SIGNAL
           value: 'true'
         - name: SIGNAL
           value: '1'
@@ -100,39 +100,84 @@

           capabilities:
             add:
             - SYS_ADMIN
       - image: nvcr.io/nvidia/k8s-device-plugin:v0.17.1
         imagePullPolicy: IfNotPresent
         name: nvidia-device-plugin-ctr
+        command:
+        - nvidia-device-plugin
         env:
+        - name: MPS_ROOT
+          value: /run/nvidia/mps
         - name: CONFIG_FILE
           value: /config/config.yaml
         - name: NVIDIA_MIG_MONITOR_DEVICES
           value: all
+        - name: NVIDIA_VISIBLE_DEVICES
+          value: all
+        - name: NVIDIA_DRIVER_CAPABILITIES
+          value: compute,utility
         securityContext:
           capabilities:
             add:
             - SYS_ADMIN
         volumeMounts:
-        - name: device-plugin
+        - name: kubelet-device-plugins-dir
           mountPath: /var/lib/kubelet/device-plugins
+        - name: mps-shm
+          mountPath: /dev/shm
+        - name: mps-root
+          mountPath: /mps
+        - name: cdi-root
+          mountPath: /var/run/cdi
         - name: available-configs
           mountPath: /available-configs
         - name: config
           mountPath: /config
       volumes:
-      - name: device-plugin
+      - name: kubelet-device-plugins-dir
         hostPath:
           path: /var/lib/kubelet/device-plugins
+          type: Directory
+      - name: mps-root
+        hostPath:
+          path: /run/nvidia/mps
+          type: DirectoryOrCreate
+      - name: mps-shm
+        hostPath:
+          path: /run/nvidia/mps/shm
+      - name: cdi-root
+        hostPath:
+          path: /var/run/cdi
+          type: DirectoryOrCreate
       - name: available-configs
         configMap:
           name: nvidia-device-plugin-configs
       - name: config
         emptyDir: {}
       nodeSelector:
         nvidia.feature.node.kubernetes.io/gpu: 'true'
+      affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+            - matchExpressions:
+              - key: feature.node.kubernetes.io/pci-10de.present
+                operator: In
+                values:
+                - 'true'
+            - matchExpressions:
+              - key: feature.node.kubernetes.io/cpu-model.vendor_id
+                operator: In
+                values:
+                - NVIDIA
+            - matchExpressions:
+              - key: nvidia.com/gpu.present
+                operator: In
+                values:
+                - 'true'
       tolerations:
       - key: CriticalAddonsOnly
         operator: Exists
       - effect: NoSchedule
         key: nvidia.com/gpu
         operator: Exists
--- HelmRelease: kube-system/nvidia-device-plugin DaemonSet: kube-system/nvidia-device-plugin-mps-control-daemon

+++ HelmRelease: kube-system/nvidia-device-plugin DaemonSet: kube-system/nvidia-device-plugin-mps-control-daemon

@@ -0,0 +1,181 @@

+---
+apiVersion: apps/v1
+kind: DaemonSet
+metadata:
+  name: nvidia-device-plugin-mps-control-daemon
+  namespace: kube-system
+  labels:
+    app.kubernetes.io/name: nvidia-device-plugin
+    app.kubernetes.io/instance: nvidia-device-plugin
+    app.kubernetes.io/managed-by: Helm
+spec:
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: nvidia-device-plugin
+      app.kubernetes.io/instance: nvidia-device-plugin
+  updateStrategy:
+    type: RollingUpdate
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: nvidia-device-plugin
+        app.kubernetes.io/instance: nvidia-device-plugin
+    spec:
+      priorityClassName: system-node-critical
+      runtimeClassName: nvidia
+      securityContext: {}
+      serviceAccountName: nvidia-device-plugin-service-account
+      hostPID: true
+      initContainers:
+      - image: nvcr.io/nvidia/k8s-device-plugin:v0.17.1
+        name: mps-control-daemon-mounts
+        command:
+        - mps-control-daemon
+        - mount-shm
+        securityContext:
+          privileged: true
+        volumeMounts:
+        - name: mps-root
+          mountPath: /mps
+          mountPropagation: Bidirectional
+      - image: nvcr.io/nvidia/k8s-device-plugin:v0.17.1
+        name: mps-control-daemon-init
+        command:
+        - config-manager
+        env:
+        - name: ONESHOT
+          value: 'true'
+        - name: KUBECONFIG
+          value: ''
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: spec.nodeName
+        - name: NODE_LABEL
+          value: nvidia.com/device-plugin.config
+        - name: CONFIG_FILE_SRCDIR
+          value: /available-configs
+        - name: CONFIG_FILE_DST
+          value: /config/config.yaml
+        - name: DEFAULT_CONFIG
+          value: null
+        - name: FALLBACK_STRATEGIES
+          value: named,single
+        - name: SEND_SIGNAL
+          value: 'false'
+        - name: SIGNAL
+          value: ''
+        - name: PROCESS_TO_SIGNAL
+          value: ''
+        volumeMounts:
+        - name: available-configs
+          mountPath: /available-configs
+        - name: config
+          mountPath: /config
+      containers:
+      - image: nvcr.io/nvidia/k8s-device-plugin:v0.17.1
+        name: mps-control-daemon-sidecar
+        command:
+        - config-manager
+        env:
+        - name: ONESHOT
+          value: 'false'
+        - name: KUBECONFIG
+          value: ''
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: spec.nodeName
+        - name: NODE_LABEL
+          value: nvidia.com/device-plugin.config
+        - name: CONFIG_FILE_SRCDIR
+          value: /available-configs
+        - name: CONFIG_FILE_DST
+          value: /config/config.yaml
+        - name: DEFAULT_CONFIG
+          value: null
+        - name: FALLBACK_STRATEGIES
+          value: named,single
+        - name: SEND_SIGNAL
+          value: 'true'
+        - name: SIGNAL
+          value: '1'
+        - name: PROCESS_TO_SIGNAL
+          value: /usr/bin/mps-control-daemon
+        volumeMounts:
+        - name: available-configs
+          mountPath: /available-configs
+        - name: config
+          mountPath: /config
+      - image: nvcr.io/nvidia/k8s-device-plugin:v0.17.1
+        imagePullPolicy: IfNotPresent
+        name: mps-control-daemon-ctr
+        command:
+        - mps-control-daemon
+        env:
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              apiVersion: v1
+              fieldPath: spec.nodeName
+        - name: CONFIG_FILE
+          value: /config/config.yaml
+        - name: NVIDIA_MIG_MONITOR_DEVICES
+          value: all
+        - name: NVIDIA_VISIBLE_DEVICES
+          value: all
+        - name: NVIDIA_DRIVER_CAPABILITIES
+          value: compute,utility
+        securityContext:
+          privileged: true
+        volumeMounts:
+        - name: mps-shm
+          mountPath: /dev/shm
+        - name: mps-root
+          mountPath: /mps
+        - name: available-configs
+          mountPath: /available-configs
+        - name: config
+          mountPath: /config
+      volumes:
+      - name: mps-root
+        hostPath:
+          path: /run/nvidia/mps
+          type: DirectoryOrCreate
+      - name: mps-shm
+        hostPath:
+          path: /run/nvidia/mps/shm
+      - name: available-configs
+        configMap:
+          name: nvidia-device-plugin-configs
+      - name: config
+        emptyDir: {}
+      nodeSelector:
+        nvidia.com/mps.capable: 'true'
+        nvidia.feature.node.kubernetes.io/gpu: 'true'
+      affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+            - matchExpressions:
+              - key: feature.node.kubernetes.io/pci-10de.present
+                operator: In
+                values:
+                - 'true'
+            - matchExpressions:
+              - key: feature.node.kubernetes.io/cpu-model.vendor_id
+                operator: In
+                values:
+                - NVIDIA
+            - matchExpressions:
+              - key: nvidia.com/gpu.present
+                operator: In
+                values:
+                - 'true'
+      tolerations:
+      - key: CriticalAddonsOnly
+        operator: Exists
+      - effect: NoSchedule
+        key: nvidia.com/gpu
+        operator: Exists
+
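
Both DaemonSets in the diff above now carry a `nodeSelector` and a `nodeAffinity` block. In Kubernetes, a pod must satisfy the `nodeSelector` and at least one entry of `nodeSelectorTerms`, while the expressions inside a single term are combined with AND. The sketch below models that rule for the device-plugin DaemonSet so the effect on scheduling can be checked by hand. It is a simplified illustration, not scheduler code: it handles only the `In` operator used here, and the sample node label sets are hypothetical.

```python
from typing import Dict, List

Labels = Dict[str, str]


def matches_expression(labels: Labels, expr: dict) -> bool:
    """Evaluate one matchExpression; only the 'In' operator is modeled."""
    if expr["operator"] != "In":
        raise ValueError(f"unsupported operator: {expr['operator']}")
    return labels.get(expr["key"]) in expr["values"]


def node_is_eligible(labels: Labels, node_selector: Labels, terms: List[dict]) -> bool:
    """nodeSelector AND (term_1 OR term_2 OR ...), expressions in a term ANDed."""
    selector_ok = all(labels.get(k) == v for k, v in node_selector.items())
    affinity_ok = any(
        all(matches_expression(labels, e) for e in term["matchExpressions"])
        for term in terms
    )
    return selector_ok and affinity_ok


# Values copied from the device-plugin DaemonSet in the diff.
NODE_SELECTOR = {"nvidia.feature.node.kubernetes.io/gpu": "true"}
TERMS = [
    {"matchExpressions": [{"key": "feature.node.kubernetes.io/pci-10de.present",
                           "operator": "In", "values": ["true"]}]},
    {"matchExpressions": [{"key": "feature.node.kubernetes.io/cpu-model.vendor_id",
                           "operator": "In", "values": ["NVIDIA"]}]},
    {"matchExpressions": [{"key": "nvidia.com/gpu.present",
                           "operator": "In", "values": ["true"]}]},
]

# Hypothetical nodes for illustration.
both_labels = {"nvidia.feature.node.kubernetes.io/gpu": "true",
               "feature.node.kubernetes.io/pci-10de.present": "true"}
selector_only = {"nvidia.feature.node.kubernetes.io/gpu": "true"}
affinity_only = {"nvidia.com/gpu.present": "true"}

print(node_is_eligible(both_labels, NODE_SELECTOR, TERMS))    # True
print(node_is_eligible(selector_only, NODE_SELECTOR, TERMS))  # False
print(node_is_eligible(affinity_only, NODE_SELECTOR, TERMS))  # False
```

The second case is the one to watch when reviewing this upgrade: a node that previously matched on the `nodeSelector` label alone would no longer be eligible unless it also carries one of the three affinity labels.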

parsec-renovate bot added renovate/helm type/minor labels Mar 22, 2025

github-actions bot added the area/kubernetes label Mar 22, 2025

parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from 200da91 to 154a81f Compare June 23, 2025 00:09

parsec-renovate bot changed the title ~~feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.1 )~~ feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.2 ) Jun 23, 2025

parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from 154a81f to 1b7d41a Compare July 28, 2025 00:10

parsec-renovate bot changed the title ~~feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.2 )~~ feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.3 ) Jul 28, 2025

parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from 1b7d41a to e31c24b Compare September 15, 2025 00:08

parsec-renovate bot changed the title ~~feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.3 )~~ feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.4 ) Sep 15, 2025

parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from e31c24b to 55d4dbb Compare October 27, 2025 00:08

parsec-renovate bot changed the title ~~feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.4 )~~ feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.18.0 ) Oct 27, 2025

parsec-renovate bot changed the title ~~feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.18.0 )~~ feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.0 ) Dec 9, 2025

parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from 55d4dbb to 1aa5cbd Compare December 9, 2025 23:05

parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from 1aa5cbd to 1bc3f91 Compare January 23, 2026 15:15

parsec-renovate bot changed the title ~~feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.0 )~~ feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.2 ) Jan 23, 2026

parsec-renovate bot changed the title ~~feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.2 )~~ feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.2 ) - autoclosed Mar 17, 2026

parsec-renovate bot closed this Mar 17, 2026

parsec-renovate bot deleted the renovate/nvidia-device-plugin-0.x branch March 17, 2026 19:33

feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.19.0 )

899d7db

parsec-renovate bot changed the title ~~feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.2 ) - autoclosed~~ feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.19.0 ) Mar 17, 2026

parsec-renovate bot reopened this Mar 17, 2026

parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch 2 times, most recently from 1bc3f91 to 899d7db Compare March 17, 2026 20:19

parsec-renovate[bot] wants to merge 1 commit into main from renovate/nvidia-device-plugin-0.x
