mahidhar/pearc24_k8s_tutorial

AI and Scientific Research Computing with Kubernetes

An upcoming tutorial at PEARC24, to be presented by Mahidhar Tatineni and Dmitry Mishin.

Overview

Kubernetes has emerged as the leading container orchestration solution over the past few years, with a diverse and active development community. In this tutorial, attendees will learn how to run AI and scientific research computing workloads on Kubernetes clusters, with particular attention to the storage and I/O needs of such workloads. The program includes an overview of the Kubernetes architecture, an overview of typical job and workflow submission procedures, and examples of the options available for making effective use of storage resources. Theoretical material is paired with hands-on sessions on the Prototype National Research Platform (PNRP) production Kubernetes cluster, which features a variety of compute and storage resources. The tutorial also includes a demonstration of Kubernetes on the innovative AI system Voyager.

Agenda

* Introduction and Welcome
* An overview of the Kubernetes architecture
  - Basic Kubernetes Hands On
* Kubernetes resource scheduling
  - Scheduling Hands On
* AI and scientific research applications with Kubernetes
  - AI and Scientific Research Application Examples with Hands On
    a. AI training using PyTorch example
    b. Text generation inference example
    c. RAG example using Ollama
    d. Helm-based deployment of LLM as a service (H2O)
    e. LAMMPS (molecular dynamics application) example
* Storage
  - Storage Hands On
* Monitoring your work
* Closeout
