Xarray GPU optimization #771

Open: wants to merge 69 commits into base: main

@negin513 (Contributor) commented May 1, 2025

Contributors: @negin513, @weiji14, @TomAugspurger, @maxrjoes, @akshaysubr, @kafitzgerald

vercel bot commented May 1, 2025

@negin513 is attempting to deploy a commit to the xarray Team on Vercel.

A member of the Team first needs to authorize it.

@TomAugspurger left a comment

Thanks for writing this up!

- name: Katelyn Fitzgerald
github: kafitzgerald

summary: 'How to accelerate AI/ML workflows in Earth Sciences with GPU-native Xarray and Zarr.'
Contributor

Can we make this more direct? "X% speedup" or "XMBps throughput"?
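
One way to get the kind of direct figure asked for here is to time each data-loading path over the same batches and report throughput in MB/s together with the relative speedup. The sketch below is a minimal illustration using only the standard library. The names `measure_throughput` and `speedup_percent` are hypothetical and are not taken from the blog post or any library, and the numbers in the usage lines are made up for illustration, not results from this PR.

```python
import time
from typing import Callable, Iterable


def measure_throughput(load_batches: Callable[[], Iterable[bytes]]) -> float:
    """Return throughput in MB/s for one full pass over a loader (hypothetical helper)."""
    start = time.perf_counter()
    total_bytes = sum(len(batch) for batch in load_batches())
    # Guard against a zero elapsed time on very fast passes.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total_bytes / 1e6 / elapsed


def speedup_percent(baseline_mbps: float, optimized_mbps: float) -> float:
    """Relative throughput gain of the optimized path over the baseline, in percent."""
    return (optimized_mbps - baseline_mbps) / baseline_mbps * 100.0


# Illustrative usage with an in-memory stand-in for a real data loader.
dummy_mbps = measure_throughput(lambda: [b"x" * 1_000_000 for _ in range(8)])
gain = speedup_percent(baseline_mbps=100.0, optimized_mbps=250.0)
print(f"dummy loader: {dummy_mbps:.1f} MB/s, illustrative gain: {gain:.0f}%")
```

For GPU paths, the timing window would also need to include a device synchronization before the clock is stopped; that step is omitted here to keep the sketch runnable without a GPU.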

Comment on lines 158 to 165
(TODO: ongoing work) Once this [cupy-xarray pull request](https://github.com/xarray-contrib/cupy-xarray/pull/70) is merged (it builds on earlier work described at https://xarray.dev/blog/xarray-kvikio), this can be simplified to:

```python
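import cupy as cp
import cupy_xarray  # noqa: F401  (assumed to provide the "kvikio" engine, per the linked PR)
import xarray as xr

# Open the Zarr store through the proposed "kvikio" engine so the data lands on the GPU.
ds = xr.open_dataset(filename_or_obj="/tmp/air-temp.zarr", engine="kvikio")
# The variable's data should be a CuPy (device) array rather than a NumPy array.
assert isinstance(ds.air.data, cp.ndarray)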
```
Contributor

This could go in a future work section at the end

Member

Yeah, I'm not sure if this API is feasible or even desirable (have tried to implement this in xarray-contrib/cupy-xarray#70, but no luck yet patching the buffer protocol). So ok to move this towards the end.

- Consider using GPU Direct Storage (GDS) for optimal performance, but be aware of the setup and configuration required.
- GPU Direct Storage (GDS) can be an improvement for data-intensive workflows, but requires some setup and configuration.
- NVIDIA DALI is a powerful tool for optimizing data loading, but requires some effort to integrate into existing workflows.
- GPU-based decompression is a promising area for future work, but requires further development and testing.
Contributor

Icechunk!

@@ -0,0 +1,223 @@
---
title: 'Accelerating AI/ML Workflows in Earth Sciences with GPU-Native Xarray and Zarr (and more!)'
@dcherian (Contributor) commented May 8, 2025

Suggested change
- title: 'Accelerating AI/ML Workflows in Earth Sciences with GPU-Native Xarray and Zarr (and more!)'
+ title: 'GPU-Native Earth Science AI/ML Workflows Xarray, Zarr, DALI, and nvcomp'

better SEO this way?

@weiji14 (Member) left a comment

Awesome work, this is coming along really nicely already! Just some minor nitpicks, but hope that we can publish this next month!

@kafitzgerald left a comment

Thanks so much for putting this together!

Mostly just a few minor suggestions from my end beyond the existing comments / questions.


## TL;DR

Earth science AI/ML workflows are often bottlenecked by slow data loading, leaving GPUs underutilized while CPUs struggle to feed large climate datasets like ERA5. In this blog post, we discuss how to build a GPU-native pipeline using Zarr v3, CuPy, KvikIO, and NVIDIA DALI to accelerate data throughput. We walk through profiling results, chunking strategies, direct-to-GPU data reads, and GPU-accelerated preprocessing, all aimed at maximizing GPU usage and minimizing I/O overhead.

Suggested change
- Earth science AI/ML workflows are often bottlenecked by slow data loading, leaving GPUs underutilized while CPUs struggle to feed large climate datasets like ERA5. In this blog post, we discuss how to build a GPU-native pipeline using Zarr v3, CuPy, KvikIO, and NVIDIA DALI to accelerate data throughput. We walk through profiling results, chunking strategies, direct-to-GPU data reads, and GPU-accelerated preprocessing, all aimed at maximizing GPU usage and minimizing I/O overhead.
+ Earth science AI/ML workflows are often limited by slow data loading, leaving GPUs underutilized while CPUs struggle to feed large climate datasets like ERA5. In this blog post, we discuss how to build a GPU-native pipeline using Zarr v3, CuPy, KvikIO, and NVIDIA DALI to accelerate data throughput. We walk through profiling results, chunking strategies, direct-to-GPU data reads, and GPU-accelerated preprocessing, all aimed at maximizing GPU usage and minimizing I/O overhead.

Not committed to this - just trying to vary the language a bit.

netlify bot commented Jun 12, 2025

Deploy Preview for xarraydev ready!

Name Link
🔨 Latest commit 17352a3
🔍 Latest deploy log https://app.netlify.com/projects/xarraydev/deploys/684aa60b8599730008229284
😎 Deploy Preview https://deploy-preview-771--xarraydev.netlify.app
pre-commit-ci bot and others added 23 commits June 12, 2025 07:40
6 participants