Releases: oracle-quickstart/oci-hpc-oke

OKE RDMA Quickstart Resource Manager template v26.3.0

31 Mar 23:22
8d4360f

Full Changelog: v26.2.0...v26.3.0

v26.3.0-rc1

17 Mar 22:14
6f6283f

Pre-release

Full Changelog: v26.2.0...v26.3.0-rc1

OKE RDMA Quickstart Resource Manager template v26.2.0

19 Feb 23:51
bbfb72b

Full Changelog: v25.11.0...v26.2.0

OKE RDMA Quickstart Resource Manager template v25.11.0

05 Nov 08:00
8cff2cd

  • Add option to install OCIR credential helper
  • Fix for Metrics Server
  • Add support to use image URIs

Full Changelog: v25.10.0...v25.11.0

OKE RDMA Quickstart Resource Manager template v25.10.0

30 Oct 19:01
781eaba

  • Kubernetes upgrade: Added support for Kubernetes v1.34
  • Documentation: New guide — Deploying Prometheus & Grafana Stack with Dashboards and Alerts manually
  • Health checks:
    • Added RCCL tests
    • Added ROCm Validation Suite (RVS) gst_single for AMD validation
  • Grafana access link: Default domain updated to endpoint.oci-hpc.ai, configurable for custom domains
  • Component updates: Refreshed dependencies and minor fixes across the stack

Full Changelog: v25.9.0...v25.10.0

OKE RDMA Quickstart Resource Manager template v25.9.0

25 Sep 19:33
185aceb

  • Option to provision a shared Lustre file system and a PV backed by the Lustre file system
  • Fully private clusters using Resource Manager Private Endpoint for deployment
  • Same dashboards and notifications as the Slurm stack
  • Option to use Oracle Linux for non-RDMA pools
  • Component updates

OKE RDMA Quickstart Resource Manager template v25.5.1

18 Jun 23:13
a3a2d04

This hotfix release addresses the breaking change in the Helm provider.

More info about the change here: hashicorp/terraform-provider-helm#1637
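
For readers hitting the same provider breakage in their own Terraform code, one common mitigation is to constrain the provider version. The fragment below is only a sketch: it assumes the breaking change arrived with the 3.x major release of the hashicorp/helm provider, and it is not confirmed to be how this hotfix resolved the issue (the template may instead have been migrated to the new provider syntax).

```hcl
terraform {
  required_providers {
    helm = {
      source = "hashicorp/helm"
      # Assumption: the breaking change shipped in 3.0.0, so stay on 2.x.
      # Adjust the bound to match the versions you have actually tested.
      version = "< 3.0.0"
    }
  }
}
```

A version constraint like this keeps `terraform init` from selecting a newer major release until the configuration has been updated for it.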

OKE RDMA Quickstart Resource Manager template v25.5.0

16 May 05:32
0ce2acc

  • Added AMD Device Metrics Exporter
  • Added AMD dashboards

OKE RDMA Quickstart Resource Manager template v25.4.0

22 Apr 04:20
3fa53ef

  • Added Kubernetes v1.32
  • Changed the default maximum number of pods per node to 110

OKE RDMA Quickstart Resource Manager template v25.3.1

31 Mar 04:54
6bac725

  • Enabled the OKE AMD GPU device plugin for the BM.GPU.MI300X.8 shape
  • Disabled the OKE DCGM Exporter (the upstream DCGM Exporter is deployed instead)
  • Helm fix for Grafana load balancer not being deleted properly on Terraform destroy
  • Updated the health checks for Node Problem Detector
  • Updated Grafana dashboards
  • Added the required policies for Oracle Cloud Agent GPU/RDMA monitoring