
Migrate cuda based images out of rocker-versioned2 #903

Closed
eitsupi opened this issue Jan 25, 2025 · 16 comments · Fixed by #905
Labels
CI, pre-built images

Comments

@eitsupi
Member

eitsupi commented Jan 25, 2025

I am fed up with the number of questions about cuda and Python setup and the maintenance burden, and strongly believe that users should install whatever version of Python they want using uv on the version of the cuda image they want to use (and then use rig to install and use any version of R).

The situation has changed dramatically from a few years ago when there was no rig or uv, and I think the significance of the old kind of pre-built image is declining.

@cboettig @noamross Thoughts?
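For concreteness, the pattern described above could look roughly like the following Dockerfile sketch. The base tag, versions, and install commands are recalled from the uv and rig docs and should be verified; this is not a tested recipe.

```dockerfile
# Sketch: bring-your-own Python (uv) and R (rig) on a plain CUDA base.
# Tags and versions are illustrative.
FROM nvidia/cuda:12.8.0-cudnn-devel-ubuntu24.04

RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates curl \
    && rm -rf /var/lib/apt/lists/*

# uv: install it, then pick whichever Python version the project needs
RUN curl -LsSf https://astral.sh/uv/install.sh | sh \
    && ~/.local/bin/uv python install 3.12

# rig: install it, then pick whichever R version the project needs
RUN curl -Ls https://github.com/r-lib/rig/releases/download/latest/rig-linux-latest.tar.gz \
    | tar xz -C /usr/local \
    && rig add release
```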

@eitsupi added the CI and pre-built images labels Jan 25, 2025
@eitsupi pinned this issue Jan 25, 2025
@cboettig
Member

makes sense to me -- that's what I've been doing for my needs, e.g. building on top of the jupyterhub cuda images. (e.g. https://github.com/boettiger-lab/k8s/blob/main/images/Dockerfile.gpu#L1 is my current gpu setup)

@cboettig
Member

@eitsupi I'm thinking I'll drop a JupyterHub-based image into the old https://github.com/rocker-org/ml repo.

@eitsupi
Member Author

eitsupi commented Jan 29, 2025

Thanks, that might make sense.

However, looking here, there are multiple images for ML use. Which one should we agree on as the base image?
https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html

Since it is not practical to cover all of these, I imagine it would probably be easiest to provide documentation and sample Dockerfiles explaining how to install R and RStudio on these images.
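A minimal sample Dockerfile in that spirit might look like the following. The base tag and package choice are illustrative (Ubuntu's r-base rather than CRAN binaries, just to show the shape), not an official recipe.

```dockerfile
# Sketch: add R on top of a Jupyter Docker Stacks image.
FROM quay.io/jupyter/pytorch-notebook:cuda12-ubuntu-24.04

USER root
# Ubuntu's R here for brevity; CRAN's apt repo (or rig) gives newer builds,
# and RStudio Server would need its own .deb install step.
RUN apt-get update \
    && apt-get install -y --no-install-recommends r-base \
    && rm -rf /var/lib/apt/lists/*
USER ${NB_UID}
```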

@cboettig
Member

cboettig commented Jan 29, 2025

Yes, great points, thanks for raising these issues! I'll document these things, and I won't attempt to cover all of those images. You've probably noticed that actually very few of those images include the NVIDIA CUDA libraries.

I do intend to provide a pre-built image with my recommended configuration as well, which will use the CUDA image on the latest Ubuntu, as I indicate above. Jupyter's tensorflow image only provides the latest cuda, while tagged versions exist only for their pytorch base image. In my experience, and in surveys I have seen from colleagues at computing centers, pytorch is far more widely used at this time. So while I agree with you that the full set of images discussed there looks intimidating to users, I think the choice I indicated above, quay.io/jupyter/pytorch-notebook:cuda12-ubuntu-24.04, makes sense.

I completely agree that we want to document how to customize this. Given the recent introduction of JupyterHub's Fancy Profiles that can build directly from a Dockerfile, it is easier than ever to bring-your-own Dockerfile (which is a natural pattern for codespaces and gitlab use as well).

There are obviously a lot of ways to set these things up, and, just as Rocker has always done, the rocker/ml repo will show just one opinionated way to go about it rather than something comprehensive or overly flexible; experienced users will always be able to adapt. E.g. I will go with Dirk's r2u approach, since for users writing their own Dockerfiles for a binder/jupyter experience, having apt dependencies solved automatically is a significant win.

I know you've grown weary of all the python and cuda issues over here, so it sounds like addressing these in a different repo would be helpful too. For simplicity, the ml/ cuda image will not attempt the strong versioning promises we try to make here.

@benz0li
Contributor

benz0li commented Feb 3, 2025

The situation has changed dramatically from a few years ago when there was no rig or uv, and I think the significance of the old kind of pre-built image is declining.

As a user, pre-built images are easier to work with than using a base image + a virtual environment manager.

IMHO containers [like the ones here] + rig/uv/other [virtual environment manager] are not meant for each other.

makes sense to me -- that's what I've been doing for my needs, e.g. building on top of the jupyterhub cuda images. (e.g. https://github.com/boettiger-lab/k8s/blob/main/images/Dockerfile.gpu#L1 is my current gpu setup)

You could also use b-data's/my CUDA-based JupyterLab R docker stack.

[...] that can build directly from a Dockerfile, it is easier than ever to bring-your-own Dockerfile (which is a natural pattern for codespaces and gitlab use as well).

Most people are simply building on existing Rocker or Jupyter images.
ℹ Like almost all of the few GPU-accelerated [Jupyter-based] images available.

@cboettig
Member

cboettig commented Feb 3, 2025

Thanks @benz0li ! Your work is excellent as well. And yes, I totally get where you're coming from on containers vs virtual envs. I think that's definitely true for 'production containers', but perhaps a bit different for these 'dev containers' in which the goal is to support an end user customizing the environment further using patterns with which they are already familiar.

e.g. conda can be pretty cumbersome, especially when it comes to packages that require conda's 'activation' mechanism of shell shims and global env vars (e.g. as in rasterio and other gdal-binding conda packages).

However, as you already know, the official jupyter stacks are conda based, the python geospatial pangeo community is deeply conda based, and users know and expect conda. Hence the design I proposed above. This provides a concise Dockerfile that transparently extends the base Jupyter cuda image. Python installs are handled by conda. Meanwhile, R installs are handled by Dirk's excellent r2u / bspm approach -- again based on user considerations. None of us think conda is a nice solution for installing R packages, but bspm handles the binary dependencies nicely at container build time (at runtime I switch to binary installs from r-universe). In this way, a user can extend the environment with environment.yaml and install.r scripts without manually resolving lib deps.
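As a sketch, the user-facing extension pattern described here might look like the following. The base tag and file names are illustrative; this is not the actual rocker/ml recipe.

```dockerfile
# Sketch: user extends the image; conda and r2u/bspm resolve dependencies.
FROM rocker/ml:latest

# Python side: conda resolves the user's environment.yaml
COPY environment.yaml /tmp/environment.yaml
RUN conda env update -n base -f /tmp/environment.yaml

# R side: install.r runs with bspm enabled, so apt system deps
# are resolved automatically during package installation
COPY install.r /tmp/install.r
RUN Rscript /tmp/install.r
```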

As I noted above, this is certainly an opinionated setup, a bit different from existing setups but closely aligned with the official Jupyter images. I've tested it in a range of classroom and research settings over the past year or so, alongside the other images discussed above. Moreover, I think this provides a good way forward to maintain some cuda options in a separate repo in the rocker project while avoiding the headaches @eitsupi noted at the top. Big thanks to you both!

@benz0li
Contributor

benz0li commented Feb 4, 2025

However, as you already know, the official jupyter stacks are conda based

Yes. That was one reason I created my own docker stacks.

Other reasons: Rocker images' use of s6-overlay and Jupyter images' handling of the user's home directory1.

the python geospatial pangeo community is deeply conda based, and users know and expect conda.

People may install Conda / Mamba at user level.


Both the Version-stable Rocker images and Jupyter Docker Stacks are very popular and @eitsupi as well as @mathbunnyru do a great job improving and maintaining them.

Footnotes

  1. b-data's/my docker stacks allow for a persistent home directory that may be shared among all JupyterLab R/Python/Mojo/MAX/Julia docker images.

@benz0li
Contributor

benz0li commented Feb 4, 2025

Regarding dev containers: (CUDA-based) Data Science dev containers
ℹ Available for R, Python, Mojo/MAX and Julia

(I am trying to serve a larger community with a unified setup)

@hute37

hute37 commented Feb 7, 2025

I built my working project as a custom extension of the base ml-verse rocker image, with these features:

  • pyenv-based source Python setup, from the version in .python-version
  • pipx-installed, poetry-based project definition, with an automated lock/install phase and pyproject.toml support
  • rustup Rust/Cargo availability
  • fnm Node.js setup, used for a full JupyterLab installation from the poetry virtual env, with IRkernel support
  • renv-based automated R package installation, based on a DESCRIPTION project specification

All running in a "rootless" Podman container, with "NVIDIA Container Toolkit" support.

References:


But ...

What I need now is an image based on the current NVIDIA/CUDA images:

FROM nvidia/cuda:12.8.0-cudnn-devel-ubuntu24.04

or

FROM nvidia/cuda:12.8.0-cudnn-devel-ubuntu22.04

Instead of the ml-verse version

FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04

In the past, because I was using an obsolete GPU, I had to downgrade the CUDA version in the base image with a patching script that ran an apt-based reinstallation of the whole CUDA stack.

But now that I run an NVIDIA V100 (560 driver) on Ubuntu 24 with CUDA 12 installed (I also use python natively, without containers), what is the best approach?

I would strongly like to avoid forking the rocker project just to patch one line, but it is a "very heavy line of code".

(I fear getting lost in cuDNN version compatibility with the tensorflow/pytorch python libs under the R keras lib.)
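One way to avoid a fork for a single line is Docker's support for build arguments declared before FROM, which lets the base image be overridden at build time without touching the Dockerfile:

```dockerfile
# Default matches the current ml-verse base; override at build time with e.g.
#   docker build --build-arg BASE_IMAGE=nvidia/cuda:12.8.0-cudnn-devel-ubuntu24.04 .
ARG BASE_IMAGE=nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
FROM ${BASE_IMAGE}
# ...rest of the recipe unchanged...
```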

@eitsupi
Member Author

eitsupi commented Feb 8, 2025

Thank you.

Anyway, I would like to remove all files associated with cuda from this repository, as it appears that @cboettig has started work on https://github.com/rocker-org/ml and pushed the new rocker/ml image to Docker Hub.
https://hub.docker.com/layers/rocker/ml/latest/images/sha256-1bfc8ec2179054ffc7d1ed87d2af1990067b85bdfbb846d13775b316f2499967

@eitsupi
Member Author

eitsupi commented Feb 8, 2025

The situation has changed dramatically from a few years ago when there was no rig or uv, and I think the significance of the old kind of pre-built image is declining.

As a user, pre-built images are easier to work with than using a base image + a virtual environment manager.

IMHO containers [like the ones here] + rig/uv/other [virtual environment manager] are not meant for each other.

I did not say to use uv as a virtual environment manager.
I simply recommended installing whatever version of Python you want.
I just don't want to guarantee that the version of Python you want is installed on the pre-built image.

@cboettig
Member

Thanks everyone for the discussion!

@hute37 let me know if you test out our setup in rocker-org/ml; the rocker/cuda image there builds on the jupyterhub cuda12 / ubuntu 24.04 image. The recipe can easily be swapped out for one of the other official Jupyter base images (note their pytorch series includes versioned tags for different cuda and python versions in the base images).

And of course if you want a solution outside of rocker-org @eitsupi & @benz0li have great suggestions above too.

@cboettig changed the title Remove cuda based images → Migrate cuda based images out of rocker-versioned2 Feb 10, 2025
@cboettig
Member

I think we should close this thread to redirect further discussion of python and cuda issues over to the rocker-org/ml repo, as @eitsupi has requested. @benz0li I'll be sure to link to your stack and these other options there; I'm still fleshing out the readme, and always appreciate your contributions!

eitsupi added a commit that referenced this issue Feb 10, 2025
@hute37

hute37 commented Feb 10, 2025

@cboettig

In my current environment I think I'll consider another path.
For my needs, the major value of rocker images is in their complete, full-featured R images.
Not only base+tidyverse, but also full rstudio, knitr (LaTeX+pandoc) and geospatial support (very tricky to set up because of system deps).

I need to treat everything related to python, jupyter and CUDA support as a transversal mixin.

It is too fragile to put these stacks in a "base" image.

My job is to assist researchers in a statistics faculty with R and python project setup. Every project has its own needs in python (ML: tensorflow, pytorch, RL, etc.) or R (stan, sparklyr, etc.). In this context, full dependency control and long-term reproducibility are strong requirements.
Project definition based on poetry/renv, with pyproject/DESCRIPTION specifications, is a good solution, including for unofficial packages on GitHub/GitLab, outside the standard PyPI/CRAN distribution.

In my current setup, I've already included scripts for pyenv/pipx based python bootstrap and virtualenv based "jupyter lab" setup (with node-js front-end packages).

The missing point here is NVIDIA GPU support.

But this is very problematic one, if considered as a "base" image.

Some projects are CPU-only, others require GPUs of different architectures and different GPU capabilities. Older projects (tensorflow/RL) need to link against an older CUDA version (11), while new projects (pytorch) require CUDA 12.

I think I'll standardize my containers on the latest (and greatest) geospatial image, and I'll try to add a (configurable) script for an (apt-based?) CUDA setup.
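For reference, an apt-based CUDA setup on top of a non-CUDA image usually follows NVIDIA's network-repo instructions. A rough sketch follows; the repo path must match the base image's Ubuntu release, and the keyring and toolkit versions shown are illustrative, not verified.

```dockerfile
# Sketch: add the CUDA toolkit via NVIDIA's apt repo on a geospatial base.
FROM rocker/geospatial:latest

RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates wget \
    # cuda-keyring registers NVIDIA's repo and signing key; the path must
    # match the base Ubuntu release (e.g. ubuntu2204 vs ubuntu2404)
    && wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb \
    && dpkg -i cuda-keyring_1.1-1_all.deb \
    && apt-get update \
    && apt-get install -y --no-install-recommends cuda-toolkit-12-4 \
    && rm -rf /var/lib/apt/lists/* cuda-keyring_1.1-1_all.deb
```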

@cboettig
Member

Thanks @hute37, this maps closely to my own use cases. I think the approach I have put in rocker/ml can address this quite well (though of course there are other ways). Rather than adding the CUDA support 'on top' of geospatial, I think it's now easier to swap the base image of the 'recipe' for the desired configuration (e.g. cuda-12, cuda-11, tensorflow, cpu-only, etc.).

Are your users accessing JupyterLab through a hosted jupyterhub system or downloading the docker images to their laptops? I'd love to hear more about your setup.

Want to continue the discussion over in rocker/ml ?

@hute37

hute37 commented Feb 12, 2025

@cboettig

I'll follow up on this discussion here:

I'll write some notes related to this different layout.
