feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.19.0 ) #68

Open
parsec-renovate[bot] wants to merge 1 commit into main from renovate/nvidia-device-plugin-0.x

Conversation

parsec-renovate bot commented Mar 22, 2025

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| nvidia-device-plugin | minor | 0.14.5 ➔ 0.19.0 |
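
The "minor" classification follows from the first version component that differs between 0.14.5 and 0.19.0. A minimal sketch of that rule, assuming plain three-part numeric versions; the `update_type` helper is hypothetical and is not Renovate's actual implementation:

```python
def update_type(current: str, new: str) -> str:
    """Classify a bump by the first semver component that differs."""
    cur = [int(part) for part in current.split(".")]
    nxt = [int(part) for part in new.split(".")]
    for label, a, b in zip(("major", "minor", "patch"), cur, nxt):
        if a != b:
            return label
    return "none"


# The bump proposed in this PR: the major part is unchanged, the minor part differs.
print(update_type("0.14.5", "0.19.0"))  # minor
```

Because the chart is still on a 0.x version, a minor bump may carry breaking changes, so the rendered diff is worth reading in full before merging.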

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.


Release Notes

NVIDIA/k8s-device-plugin (nvidia-device-plugin)

v0.19.0

  • Add --sleep-interval=infinite support to GFD for running as a pod (#1603)
  • Fix image tag in static deployment (#1604)
  • Add ownerReference to NodeFeature CRs for garbage collection (#1597)
  • Change default value for gds, gdrcopy and mofed flags (#1550)
  • Fix healthchecking on old devices (#1562)
  • Enable NodeFeature API by default in GFD (#1504)
  • Build multiarch images on native GitHub runners (#1468)

v0.18.2

  • Ensure that cdi.FeatureFlags are passed to CDI library
  • Fix race condition in config-manager when label is unset
  • Fix nested container use cases by ensuring that IPC sockets are not mounted readonly
  • Bump NVIDIA Container Toolkit to v1.18.2
  • Bump distroless base image to v3.2.2-dev

v0.18.1

  • Allow CDI feature flags to be set
  • Pass driver root to nvinfo.New in device plugin main
  • Bump NVIDIA Container Toolkit to v1.18.1
  • Bump distroless base image to v3.2.1-dev
  • Bump github.com/opencontainers/selinux from 1.12.0 to 1.13.1 (#1506)

v0.18.0

  • Rename getHealthCheckXids and clarify documentation
  • Add support for explicitly enabling XIDs in health checks
  • Deduplicate requested device IDs
  • Check for nil before reading boolean config values
  • Make gated modes (GDS, MOFED, GDRCOPY) optional in CDI
  • Add support for setting gdrcopyEnabled
  • Ignore errors getting device memory using NVML
  • Ensure that directory volumes have Directory type
  • Switch to plain golang image for builds
  • Remove unneeded intermediate container
  • Update CI definitions
  • Switch to distroless golang image
  • Update README.md with RuntimeClass
  • Pass a single context throughout the device-plugin method call stack (#1284)
  • Remove internal logger in favour of klog (#1277)
  • Remove FAIL_ON_INIT_ERROR from static examples
  • Detect blackwell architecture
  • Updated .release:staging to stage device-plugin images in nvstaging
  • Use MiB instead of MB for gpu-memory
  • Ignore XID error 109
  • Update README.md adjust set docker runtime default
  • Remove nvidia.com/gpu.imex-domain label
  • Fix containerd runc config error when creating a kind cluster
  • Use stable nvidia-container-toolkit repo when creating a kind cluster
  • Switch to context package in go stdlib
  • Raise a warning instead of an error if GPU mode labeler fails
  • Add ada-lovelace architecture label for compute capability 8.9
  • Ensure FAIL_ON_INIT_ERROR boolean env is quoted
  • Honor fail-on-init-error when no resources are found
  • Enable hostPID in the mps-control-daemon pod (#1045)

v0.17.4

What's Changed

Full Changelog: NVIDIA/k8s-device-plugin@v0.17.3...v0.17.4

v0.17.3

What's Changed

Full Changelog: NVIDIA/k8s-device-plugin@v0.17.2...v0.17.3

v0.17.2

What's Changed

  • Update nvidia.com/gpu.product label to include blackwell architectures
  • Update documentation to indicate that nvidia.com/gpu.memory label is in MiB instead of MB
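
The MiB versus MB distinction matters to anything that converts the nvidia.com/gpu.memory label into bytes. A small worked example, assuming a hypothetical label value of 24576 (not taken from this cluster); the `label_to_bytes` helper is illustrative only:

```python
MIB = 1024 * 1024  # one mebibyte in bytes
MB = 1000 * 1000   # one megabyte in bytes


def label_to_bytes(gpu_memory_label: str) -> int:
    """Interpret the label value as MiB, as the release notes describe."""
    return int(gpu_memory_label) * MIB


value = "24576"  # hypothetical nvidia.com/gpu.memory label value
print(label_to_bytes(value))  # 25769803776 bytes, i.e. 24 GiB
print(int(value) * MB)        # 24576000000 bytes if misread as MB
```

Reading the label as MB undercounts the memory by nearly 5%, which can skew capacity checks that compare the label against byte-based limits.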

Full Changelog: NVIDIA/k8s-device-plugin@v0.17.1...v0.17.2

v0.17.1

  • Ensure that generated CDI specs do not contain enable-cuda-compat hooks
  • Remove nvidia.com/gpu.imex-domain label
  • Ignore XID error 109
  • Add ada-lovelace architecture label for compute capability 8.9
  • Ensure FAIL_ON_INIT_ERROR boolean env is quoted
  • Honor fail-on-init-error when no resources are found

v0.17.0

  • Promote v0.17.0-rc.1 to GA

v0.16.2

  • Add CAP_SYS_ADMIN if volume-mounts list strategy is included (fixes #856)
  • Remove unneeded DEVICE_PLUGIN_MODE envvar
  • Fix applying SELinux label for MPS

v0.16.1

  • Bump nvidia-container-toolkit to v1.16.1 to fix a bug with CDI spec generation for MIG devices

v0.16.0

  • Fixed logic of atomic writing of the feature file
  • Replaced WithDialer with WithContextDialer
  • Fixed SELinux context of MPS pipe directory.
  • Changed behavior for empty MIG devices to issue a warning instead of an error when the mixed strategy is selected
  • Added a GFD node label for the GPU mode.
  • Update CUDA base image version to 12.5.1

v0.15.1

Changelog

  • Fix inconsistent usage of hasConfigMap helm template. This addresses cases where certain resources (roles and service accounts) would be created even if they were not required.
  • Raise an error in GFD when MPS is used with MIG. This ensures that the behavior across GFD and the Device Plugin is consistent.
  • Remove provenance information from published images.
  • Use half of total memory for size of MPS tmpfs by default.

v0.15.0

  • Moved nvidia-device-plugin.yml static deployment at the root of the repository to deployments/static/nvidia-device-plugin.yml.
  • Simplify PCI device classes in NFD worker configuration.
  • Update CUDA base image version to 12.4.1.
  • Switch to Ubuntu 22.04-based CUDA image for default image.
  • Add new CUDA driver and runtime version labels to align with other NFD version labels.
  • Update NFD dependency to v0.15.3.

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.

github-actions bot commented Mar 22, 2025

--- kubernetes/apps/kube-system/nvidia/device-plugin/app Kustomization: kube-system/nvidia-device-plugin HelmRelease: kube-system/nvidia-device-plugin

+++ kubernetes/apps/kube-system/nvidia/device-plugin/app Kustomization: kube-system/nvidia-device-plugin HelmRelease: kube-system/nvidia-device-plugin

@@ -13,13 +13,13 @@

       chart: nvidia-device-plugin
       interval: 15m
       sourceRef:
         kind: HelmRepository
         name: nvidia-dvp
         namespace: flux-system
-      version: 0.14.5
+      version: 0.19.0
   install:
     createNamespace: true
     remediation:
       retries: 3
   interval: 15m
   maxHistory: 2

github-actions bot commented Mar 22, 2025

--- HelmRelease: kube-system/nvidia-device-plugin DaemonSet: kube-system/nvidia-device-plugin

+++ HelmRelease: kube-system/nvidia-device-plugin DaemonSet: kube-system/nvidia-device-plugin

@@ -44,13 +44,13 @@

           value: nvidia.com/device-plugin.config
         - name: CONFIG_FILE_SRCDIR
           value: /available-configs
         - name: CONFIG_FILE_DST
           value: /config/config.yaml
         - name: DEFAULT_CONFIG
-          value: ''
+          value: null
         - name: FALLBACK_STRATEGIES
           value: named,single
         - name: SEND_SIGNAL
           value: 'false'
         - name: SIGNAL
           value: ''
@@ -79,13 +79,13 @@

           value: nvidia.com/device-plugin.config
         - name: CONFIG_FILE_SRCDIR
           value: /available-configs
         - name: CONFIG_FILE_DST
           value: /config/config.yaml
         - name: DEFAULT_CONFIG
-          value: ''
+          value: null
         - name: FALLBACK_STRATEGIES
           value: named,single
         - name: SEND_SIGNAL
           value: 'true'
         - name: SIGNAL
           value: '1'
@@ -100,39 +100,84 @@

           capabilities:
             add:
             - SYS_ADMIN
       - image: nvcr.io/nvidia/k8s-device-plugin:v0.17.1
         imagePullPolicy: IfNotPresent
         name: nvidia-device-plugin-ctr
+        command:
+        - nvidia-device-plugin
         env:
+        - name: MPS_ROOT
+          value: /run/nvidia/mps
         - name: CONFIG_FILE
           value: /config/config.yaml
         - name: NVIDIA_MIG_MONITOR_DEVICES
           value: all
+        - name: NVIDIA_VISIBLE_DEVICES
+          value: all
+        - name: NVIDIA_DRIVER_CAPABILITIES
+          value: compute,utility
         securityContext:
           capabilities:
             add:
             - SYS_ADMIN
         volumeMounts:
-        - name: device-plugin
+        - name: kubelet-device-plugins-dir
           mountPath: /var/lib/kubelet/device-plugins
+        - name: mps-shm
+          mountPath: /dev/shm
+        - name: mps-root
+          mountPath: /mps
+        - name: cdi-root
+          mountPath: /var/run/cdi
         - name: available-configs
           mountPath: /available-configs
         - name: config
           mountPath: /config
       volumes:
-      - name: device-plugin
+      - name: kubelet-device-plugins-dir
         hostPath:
           path: /var/lib/kubelet/device-plugins
+          type: Directory
+      - name: mps-root
+        hostPath:
+          path: /run/nvidia/mps
+          type: DirectoryOrCreate
+      - name: mps-shm
+        hostPath:
+          path: /run/nvidia/mps/shm
+      - name: cdi-root
+        hostPath:
+          path: /var/run/cdi
+          type: DirectoryOrCreate
       - name: available-configs
         configMap:
           name: nvidia-device-plugin-configs
       - name: config
         emptyDir: {}
       nodeSelector:
         nvidia.feature.node.kubernetes.io/gpu: 'true'
+      affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+            - matchExpressions:
+              - key: feature.node.kubernetes.io/pci-10de.present
+                operator: In
+                values:
+                - 'true'
+            - matchExpressions:
+              - key: feature.node.kubernetes.io/cpu-model.vendor_id
+                operator: In
+                values:
+                - NVIDIA
+            - matchExpressions:
+              - key: nvidia.com/gpu.present
+                operator: In
+                values:
+                - 'true'
       tolerations:
       - key: CriticalAddonsOnly
         operator: Exists
       - effect: NoSchedule
         key: nvidia.com/gpu
         operator: Exists
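
Review note on the new affinity block in the DaemonSet above: in Kubernetes scheduling, a pod must satisfy both its nodeSelector and its required nodeAffinity; the nodeSelectorTerms are ORed, while the expressions inside one term are ANDed. The sketch below models only that rule for the `In` operator, using the label keys from the diff. The helper names and the sample node labels are hypothetical, and this is not the scheduler's implementation:

```python
# Values copied from the rendered DaemonSet diff.
NODE_SELECTOR = {"nvidia.feature.node.kubernetes.io/gpu": "true"}
AFFINITY_TERMS = [
    [{"key": "feature.node.kubernetes.io/pci-10de.present", "values": ["true"]}],
    [{"key": "feature.node.kubernetes.io/cpu-model.vendor_id", "values": ["NVIDIA"]}],
    [{"key": "nvidia.com/gpu.present", "values": ["true"]}],
]


def term_matches(labels, term):
    """Every expression in a single term must match (logical AND)."""
    return all(labels.get(expr["key"]) in expr["values"] for expr in term)


def schedulable(labels, node_selector=NODE_SELECTOR, terms=AFFINITY_TERMS):
    """nodeSelector AND (term 1 OR term 2 OR term 3)."""
    selector_ok = all(labels.get(k) == v for k, v in node_selector.items())
    affinity_ok = any(term_matches(labels, term) for term in terms)
    return selector_ok and affinity_ok


# Hypothetical node carrying only the custom nodeSelector label.
only_selector = {"nvidia.feature.node.kubernetes.io/gpu": "true"}
# Hypothetical node that also carries an NFD PCI label.
with_nfd = dict(only_selector, **{"feature.node.kubernetes.io/pci-10de.present": "true"})

print(schedulable(only_selector))  # False
print(schedulable(with_nfd))       # True
```

If the GPU nodes carry only the custom nvidia.feature.node.kubernetes.io/gpu label and none of the three affinity labels, the upgraded DaemonSet would stop scheduling there, so it is worth checking the node labels before merging.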
--- HelmRelease: kube-system/nvidia-device-plugin DaemonSet: kube-system/nvidia-device-plugin-mps-control-daemon

+++ HelmRelease: kube-system/nvidia-device-plugin DaemonSet: kube-system/nvidia-device-plugin-mps-control-daemon

@@ -0,0 +1,181 @@

+---
+apiVersion: apps/v1
+kind: DaemonSet
+metadata:
+  name: nvidia-device-plugin-mps-control-daemon
+  namespace: kube-system
+  labels:
+    app.kubernetes.io/name: nvidia-device-plugin
+    app.kubernetes.io/instance: nvidia-device-plugin
+    app.kubernetes.io/managed-by: Helm
+spec:
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: nvidia-device-plugin
+      app.kubernetes.io/instance: nvidia-device-plugin
+  updateStrategy:
+    type: RollingUpdate
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: nvidia-device-plugin
+        app.kubernetes.io/instance: nvidia-device-plugin
+    spec:
+      priorityClassName: system-node-critical
+      runtimeClassName: nvidia
+      securityContext: {}
+      serviceAccountName: nvidia-device-plugin-service-account
+      hostPID: true
+      initContainers:
+      - image: nvcr.io/nvidia/k8s-device-plugin:v0.17.1
+        name: mps-control-daemon-mounts
+        command:
+        - mps-control-daemon
+        - mount-shm
+        securityContext:
+          privileged: true
+        volumeMounts:
+        - name: mps-root
+          mountPath: /mps
+          mountPropagation: Bidirectional
+      - image: nvcr.io/nvidia/k8s-device-plugin:v0.17.1
+        name: mps-control-daemon-init
+        command:
+        - config-manager
+        env:
+        - name: ONESHOT
+          value: 'true'
+        - name: KUBECONFIG
+          value: ''
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: spec.nodeName
+        - name: NODE_LABEL
+          value: nvidia.com/device-plugin.config
+        - name: CONFIG_FILE_SRCDIR
+          value: /available-configs
+        - name: CONFIG_FILE_DST
+          value: /config/config.yaml
+        - name: DEFAULT_CONFIG
+          value: null
+        - name: FALLBACK_STRATEGIES
+          value: named,single
+        - name: SEND_SIGNAL
+          value: 'false'
+        - name: SIGNAL
+          value: ''
+        - name: PROCESS_TO_SIGNAL
+          value: ''
+        volumeMounts:
+        - name: available-configs
+          mountPath: /available-configs
+        - name: config
+          mountPath: /config
+      containers:
+      - image: nvcr.io/nvidia/k8s-device-plugin:v0.17.1
+        name: mps-control-daemon-sidecar
+        command:
+        - config-manager
+        env:
+        - name: ONESHOT
+          value: 'false'
+        - name: KUBECONFIG
+          value: ''
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: spec.nodeName
+        - name: NODE_LABEL
+          value: nvidia.com/device-plugin.config
+        - name: CONFIG_FILE_SRCDIR
+          value: /available-configs
+        - name: CONFIG_FILE_DST
+          value: /config/config.yaml
+        - name: DEFAULT_CONFIG
+          value: null
+        - name: FALLBACK_STRATEGIES
+          value: named,single
+        - name: SEND_SIGNAL
+          value: 'true'
+        - name: SIGNAL
+          value: '1'
+        - name: PROCESS_TO_SIGNAL
+          value: /usr/bin/mps-control-daemon
+        volumeMounts:
+        - name: available-configs
+          mountPath: /available-configs
+        - name: config
+          mountPath: /config
+      - image: nvcr.io/nvidia/k8s-device-plugin:v0.17.1
+        imagePullPolicy: IfNotPresent
+        name: mps-control-daemon-ctr
+        command:
+        - mps-control-daemon
+        env:
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              apiVersion: v1
+              fieldPath: spec.nodeName
+        - name: CONFIG_FILE
+          value: /config/config.yaml
+        - name: NVIDIA_MIG_MONITOR_DEVICES
+          value: all
+        - name: NVIDIA_VISIBLE_DEVICES
+          value: all
+        - name: NVIDIA_DRIVER_CAPABILITIES
+          value: compute,utility
+        securityContext:
+          privileged: true
+        volumeMounts:
+        - name: mps-shm
+          mountPath: /dev/shm
+        - name: mps-root
+          mountPath: /mps
+        - name: available-configs
+          mountPath: /available-configs
+        - name: config
+          mountPath: /config
+      volumes:
+      - name: mps-root
+        hostPath:
+          path: /run/nvidia/mps
+          type: DirectoryOrCreate
+      - name: mps-shm
+        hostPath:
+          path: /run/nvidia/mps/shm
+      - name: available-configs
+        configMap:
+          name: nvidia-device-plugin-configs
+      - name: config
+        emptyDir: {}
+      nodeSelector:
+        nvidia.com/mps.capable: 'true'
+        nvidia.feature.node.kubernetes.io/gpu: 'true'
+      affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+            - matchExpressions:
+              - key: feature.node.kubernetes.io/pci-10de.present
+                operator: In
+                values:
+                - 'true'
+            - matchExpressions:
+              - key: feature.node.kubernetes.io/cpu-model.vendor_id
+                operator: In
+                values:
+                - NVIDIA
+            - matchExpressions:
+              - key: nvidia.com/gpu.present
+                operator: In
+                values:
+                - 'true'
+      tolerations:
+      - key: CriticalAddonsOnly
+        operator: Exists
+      - effect: NoSchedule
+        key: nvidia.com/gpu
+        operator: Exists
+
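
Review note on DEFAULT_CONFIG changing from '' to null in the rendered DaemonSets above: the Kubernetes API documents the EnvVar value field as optional with a default of "", so a null value should reach the container as an empty string. A minimal sketch of that reading, assuming null is treated the same as an omitted field and ignoring valueFrom; the `effective_env` helper is hypothetical:

```python
import json


def effective_env(env_entries):
    """Map each env var name to the string the container would see."""
    return {entry["name"]: (entry.get("value") or "") for entry in env_entries}


# The env entry before and after the chart upgrade, as in the diff.
before = json.loads('[{"name": "DEFAULT_CONFIG", "value": ""}]')
after = json.loads('[{"name": "DEFAULT_CONFIG", "value": null}]')

print(effective_env(before) == effective_env(after))  # True
```

Under that assumption the change is cosmetic and config-manager still sees an empty DEFAULT_CONFIG.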

parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from 200da91 to 154a81f on June 23, 2025 00:09
parsec-renovate bot changed the title from "feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.1 )" to "feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.2 )" on Jun 23, 2025
parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from 154a81f to 1b7d41a on July 28, 2025 00:10
parsec-renovate bot changed the title from "feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.2 )" to "feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.3 )" on Jul 28, 2025
parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from 1b7d41a to e31c24b on September 15, 2025 00:08
parsec-renovate bot changed the title from "feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.3 )" to "feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.4 )" on Sep 15, 2025
parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from e31c24b to 55d4dbb on October 27, 2025 00:08
parsec-renovate bot changed the title from "feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.17.4 )" to "feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.18.0 )" on Oct 27, 2025
parsec-renovate bot changed the title from "feat(helm): update chart nvidia-device-plugin ( 0.14.5 → 0.18.0 )" to "feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.0 )" on Dec 9, 2025
parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from 55d4dbb to 1aa5cbd on December 9, 2025 23:05
parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch from 1aa5cbd to 1bc3f91 on January 23, 2026 15:15
parsec-renovate bot changed the title from "feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.0 )" to "feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.2 )" on Jan 23, 2026
parsec-renovate bot changed the title from "feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.2 )" to "feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.2 ) - autoclosed" on Mar 17, 2026
parsec-renovate bot closed this on Mar 17, 2026
parsec-renovate bot deleted the renovate/nvidia-device-plugin-0.x branch on March 17, 2026 19:33
parsec-renovate bot changed the title from "feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.18.2 ) - autoclosed" to "feat(helm): update chart nvidia-device-plugin ( 0.14.5 ➔ 0.19.0 )" on Mar 17, 2026
parsec-renovate bot reopened this on Mar 17, 2026
parsec-renovate bot force-pushed the renovate/nvidia-device-plugin-0.x branch 2 times, most recently from 1bc3f91 to 899d7db on March 17, 2026 20:19