Commit ff3760b

[docs] uv configuration best practices (#118)
* move managed-python to toml block instead of flag
* include example/docs for customizing uv behavior
* add custom index example to nav
* clean up docslop
1 parent 9410022 commit ff3760b

11 files changed

Lines changed: 472 additions & 4 deletions

docs/api/environment_variables.md

Lines changed: 45 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -62,3 +62,48 @@ GROUNDHOG_PROXYSTORE_DIR=/scratch/username/proxystore python script.py
6262

6363
!!! warning "Under Construction 👷🚧"
6464
Proxystore integration is currently `.local`-only; this variable does not (yet) have any effect on `.remote` or `.submit` calls.
65+
66+
## GROUNDHOG_CACHE_DIR
67+
68+
**Type:** path string
69+
70+
**Default:** Falls back to `$SCRATCH`, then `$TMPDIR`, then `/tmp`
71+
72+
Directory where uv caches packages and Python installations on remote endpoints. This is used to set `UV_CACHE_DIR` and `UV_PYTHON_INSTALL_DIR` in the remote environment if they are not already set.
73+
74+
**Example:**
```bash
export GROUNDHOG_CACHE_DIR=/gpfs/shared/uv-cache
```
78+
79+
**Why this matters:** HPC clusters often have NFS-mounted home directories that can cause file locking issues or have limited quotas. Using fast scratch storage or a shared cache directory improves performance and avoids these issues.
80+
81+
**Precedence:** Existing `UV_CACHE_DIR` and `UV_PYTHON_INSTALL_DIR` environment variables take precedence over `GROUNDHOG_CACHE_DIR`. If they are not set, Groundhog chooses the directory using this fallback chain:
82+
1. `$GROUNDHOG_CACHE_DIR` (if set)
83+
2. `$SCRATCH` (HPC scratch space)
84+
3. `$TMPDIR` (temporary directory)
85+
4. `/tmp` (system temp)
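
The fallback chain above can be sketched in Python. This is an illustrative model of the documented behavior, not Groundhog's actual implementation: `resolve_cache_base` and `uv_vars_to_set` are hypothetical names, and treating an empty variable as unset is an assumption.

```python
from collections.abc import Mapping

# Documented fallback order; "/tmp" is the final default.
FALLBACK_VARS = ("GROUNDHOG_CACHE_DIR", "SCRATCH", "TMPDIR")
UV_VARS = ("UV_CACHE_DIR", "UV_PYTHON_INSTALL_DIR")


def resolve_cache_base(env: Mapping[str, str]) -> str:
    """Return the first non-empty directory in the documented fallback chain."""
    for name in FALLBACK_VARS:
        value = env.get(name)
        if value:  # assumption: an empty string counts as unset
            return value
    return "/tmp"


def uv_vars_to_set(env: Mapping[str, str]) -> list[str]:
    """List the uv variables that are not already set in the environment."""
    return [name for name in UV_VARS if not env.get(name)]
```

Passing `os.environ` to both functions on a login node shows which directory would be chosen there and which of the two `UV_*` variables are still unset.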
86+
87+
## `uv` Environment Variables
88+
89+
Groundhog uses `uv` to manage Python environments on remote endpoints. Any `UV_*` environment variable can be used to override `[tool.uv]` configuration in your script.
90+
91+
**Example - Per-endpoint package index:**
```toml
[tool.hog.cpu_endpoint]
endpoint = "..."
worker_init = """
export UV_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cpu
"""

[tool.hog.gpu_endpoint]
endpoint = "..."
worker_init = """
export UV_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cu121
"""
```
105+
106+
**See also:**
107+
108+
- [`uv` environment variable reference](https://docs.astral.sh/uv/reference/environment/) - Official documentation of `UV_*` env vars
109+
- [PEP 723 Concepts](../concepts/pep723.md#configuring-uv-via-tooluv) - Configuring uv via `[tool.uv]`

docs/concepts/pep723.md

Lines changed: 86 additions & 3 deletions
@@ -90,13 +90,96 @@ PEP 723 defines standard fields:
9090

9191
Tools can add their own sections under `[tool.*]`:
9292

93-
- `[tool.uv]` - uv-specific settings (e.g., `exclude-newer` for reproducibility)
93+
- `[tool.uv]` - uv package manager configuration (see below)
9494
- `[tool.hog.*]` - Groundhog endpoint configurations
9595

9696
Standard fields control the Python environment. Tool-specific fields configure behavior.
9797

98+
## Configuring `uv` via `[tool.uv]`
99+
100+
Groundhog uses `uv` to manage Python environments on remote endpoints. You can configure `uv`'s behavior through the `[tool.uv]` section in your PEP 723 metadata.
101+
102+
### Common `[tool.uv]` settings
103+
```python
# /// script
# requires-python = ">=3.11"
# dependencies = ["numpy", "torch"]
#
# [tool.uv]
# exclude-newer = "2025-12-19T00:00:00Z"  # Lock packages to a point in time
# python-preference = "managed"           # Use uv-managed Python
# extra-index-url = [                     # Additional package indexes
#   "https://download.pytorch.org/whl/cpu"
# ]
# ///
```
117+
118+
**See also**: Any [uv settings](https://docs.astral.sh/uv/reference/settings/) can be used in `[tool.uv]` - the configuration is passed through to uv when creating the remote environment.
119+
120+
### Custom package sources with `[tool.uv.sources]`
121+
122+
For finer control over where specific packages come from, use `[tool.uv.sources]`:
123+
```python
# /// script
# requires-python = ">=3.11"
# dependencies = ["torch==2.5.1", "my-internal-lib", "my-github-dependency"]
#
# [[tool.uv.index]]
# name = "pytorch-cpu"
# url = "https://download.pytorch.org/whl/cpu"
#
# [[tool.uv.index]]
# name = "facility-pypi"
# url = "https://pypi.facility.gov/simple"
#
# [tool.uv.sources] (1)
# torch = { index = "pytorch-cpu" }
# my-internal-lib = { index = "facility-pypi" }
# my-github-dependency = { git = "https://github.com/some-org/my-github-dependency", tag = "1.0.0" }
# ///
```
143+
144+
1. See also: [`uv` sources documentation](https://docs.astral.sh/uv/concepts/projects/dependencies/#dependency-sources)
145+
146+
This is useful for:
147+
148+
- Installing PyTorch CPU/CUDA variants from PyTorch's custom wheel server
149+
- Using private package registries for internal packages
150+
- Pulling specific packages from Git repositories or local paths
151+
152+
See the [PyTorch Custom Index Example](../examples/pytorch_custom_index.md) for a complete example.
153+
154+
### Configuration precedence
155+
156+
`uv` resolves configuration in this order of precedence: **environment variables > `[tool.uv]` in script**
157+
158+
This means:
159+
160+
- Settings in `[tool.uv]` become the baseline for your script
161+
- Environment variables like `UV_INDEX_URL` can override them (useful for endpoint-specific configuration)
162+
163+
You can use environment variables to override `[tool.uv]` settings per endpoint:
164+
```toml
[tool.hog.cpu_cluster]
endpoint = "..."
worker_init = """
export UV_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cpu
"""

[tool.hog.gpu_cluster]
endpoint = "..."
worker_init = """
export UV_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cu121
"""
```
178+
179+
This lets the same script work on both CPU and GPU clusters without code changes.
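
The precedence rule can be modeled in a few lines of Python. This is a simplified sketch, not uv's resolution logic: `env_var_name` and `effective_setting` are hypothetical helpers, and the `UV_` plus upper-snake-case naming is an assumption that matches the variables used in this document (`UV_EXTRA_INDEX_URL`, `UV_INDEX_URL`) but should be checked against uv's environment variable reference for other settings.

```python
from collections.abc import Mapping
from typing import Any


def env_var_name(setting: str) -> str:
    """Map a [tool.uv] key such as 'extra-index-url' to 'UV_EXTRA_INDEX_URL'."""
    return "UV_" + setting.upper().replace("-", "_")


def effective_setting(
    setting: str, tool_uv: Mapping[str, Any], env: Mapping[str, str]
) -> Any:
    """Return the environment value if present, else the script's baseline."""
    override = env.get(env_var_name(setting))
    if override:
        return override  # the environment variable wins over [tool.uv]
    return tool_uv.get(setting)
```

Values are returned as found, so an environment override is a string while the `[tool.uv]` baseline keeps its TOML type (for example a list of URLs).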
180+
98181
## Next Steps
99182

100183
- **[Dependencies Example](../examples/dependencies.md)** - Add and use packages
101-
- **[Configuration Example](../examples/configuration.md)** - What else can `[tool.hog]` do?
102-
- **[`uv` Scripts Guide](https://docs.astral.sh/uv/guides/scripts/)** - Complete `uv` documentation for PEP 723
184+
- **[Configuration Example](../examples/configuration.md)** - What `[tool.hog.*]` config blocks do
185+
- **[`uv` Scripts Guide](https://docs.astral.sh/uv/guides/scripts/)** - Official `uv` reference for PEP 723 scripts

docs/examples/index.md

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ Examples showing how to handle typical workflows:
1717

1818
- **[Parallel Execution](parallel-execution.md)** - Using `.submit()` for concurrent remote execution
1919
- **[Endpoint Configuration](configuration.md)** - How the configuration system merges settings from multiple sources (PEP 723, decorators, call-time overrides)
20+
- **[PyTorch from Custom Sources](pytorch_custom_index.md)** - Configuring uv to install packages from cluster-specific indexes, local paths, or internal mirrors
2021
- **[Importing Groundhog Functions](imported_function.md)** - Calling Groundhog functions from regular Python scripts, REPLs, and notebooks (includes import safety and `mark_import_safe()`)
2122

2223
## Running the Examples

docs/examples/pytorch_custom_index.md

Lines changed: 178 additions & 0 deletions
@@ -0,0 +1,178 @@
1+
# PyTorch from Custom Package Sources
2+
3+
This example demonstrates how to configure uv to install PyTorch from cluster-specific package sources, such as internal mirrors, pre-built wheels on shared filesystems, or custom builds optimized for specific hardware.
4+
5+
## Common HPC Use Cases
6+
7+
- **Cluster-optimized builds**: System admins provide PyTorch wheels optimized for cluster hardware
8+
- **Internal mirrors**: Packages hosted on internal servers for air-gapped or bandwidth-restricted clusters
9+
- **Shared filesystem wheels**: Pre-built wheels on `/gpfs` or `/scratch` to avoid repeated downloads
10+
- **Custom PyTorch builds**: Modified PyTorch with cluster-specific patches or optimizations
11+
12+
## Full Example
13+
```python title="pytorch_custom_index.py"
# /// script
# requires-python = ">=3.11,<3.13"
# dependencies = [
#   "torch==2.5.1",
#   "torchvision==0.20.1",
# ]
#
# [tool.uv]
# exclude-newer = "2025-12-19T00:00:00Z"
# python-preference = "managed"
#
# [[tool.uv.index]] (1)
# name = "pytorch-cpu"
# url = "https://download.pytorch.org/whl/cpu"
#
# [tool.uv.sources] (2)
# torch = { index = "pytorch-cpu" }
# torchvision = { index = "pytorch-cpu" }
#
# [tool.hog.my_endpoint]
# endpoint = "your-endpoint-uuid"
# ///

import groundhog_hpc as hog


@hog.function(endpoint="my_endpoint")
def check_pytorch() -> dict[str, str]:
    """Check PyTorch installation details."""
    import torch

    return {
        "version": torch.__version__,
        "cuda_available": str(torch.cuda.is_available()),
        "device": str(torch.device("cuda" if torch.cuda.is_available() else "cpu")),
    }


@hog.function(endpoint="my_endpoint")
def matrix_multiply(size: int = 1000) -> dict[str, int | float | str]:
    """Simple PyTorch matrix multiplication benchmark."""
    import time

    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    start = time.time()
    a = torch.randn(size, size, device=device)
    b = torch.randn(size, size, device=device)
    c = torch.mm(a, b)
    if device.type == "cuda":
        # CUDA kernels run asynchronously; wait for them before stopping the timer.
        torch.cuda.synchronize()
    elapsed = time.time() - start

    return {
        "size": size,
        "device": str(device),
        "time_seconds": elapsed,
        "mean": float(c.mean()),
    }


@hog.harness()
def main():
    """Run PyTorch functions remotely."""
    info = check_pytorch.remote()
    print(f"PyTorch {info['version']} on {info['device']}")

    result = matrix_multiply.remote(500)
    print(f"{result['size']}x{result['size']} matmul: {result['time_seconds']:.3f}s")
```
85+
86+
1. Define a named index pointing to your package source. This example uses PyTorch's public index for CPU wheels; replace it with your cluster's internal index URL.
87+
88+
2. Specify which packages should use which source. This tells uv to fetch `torch` and `torchvision` from the custom index instead of PyPI.
89+
90+
## Configuration Options
91+
92+
### Custom Package Index
93+
94+
For internal PyPI mirrors or cluster-specific package servers:
95+
```toml
[[tool.uv.index]]
name = "cluster-pypi"
url = "https://pypi.internal.mylab.edu/simple"

[tool.uv.sources]
torch = { index = "cluster-pypi" }
```
104+
105+
### Local Filesystem Path
106+
107+
For pre-built wheels on shared storage:
108+
```toml
[tool.uv.sources]
torch = { path = "/gpfs/shared/wheels/torch-2.5.1+cu121-cp311-cp311-linux_x86_64.whl" }
```
113+
114+
Or for a local package directory:
115+
```toml
[tool.uv.sources]
torch = { path = "/gpfs/shared/pytorch-build", editable = true }
```
120+
121+
### Direct URL
122+
123+
For wheels hosted on a web server:
124+
```toml
[tool.uv.sources]
torch = { url = "https://internal.server.edu/wheels/torch-2.5.1+custom-cp311-cp311-linux_x86_64.whl" }
```
129+
130+
### Git Repository
131+
132+
For custom builds from Git:
133+
```toml
[tool.uv.sources]
torch = { git = "https://github.com/myorg/pytorch", tag = "v2.5.1-custom" }
```
138+
139+
## Per-Endpoint Configuration
140+
141+
Different endpoints may need different PyTorch builds. Use environment variables to override per endpoint:
142+
```toml
[tool.hog.cluster_a]
endpoint = "cluster-a-uuid"
worker_init = """
# Cluster A has PyTorch wheels on shared storage
export UV_FIND_LINKS=/gpfs/cluster-a/wheels
"""

[tool.hog.cluster_b]
endpoint = "cluster-b-uuid"
worker_init = """
# Cluster B uses an internal PyPI mirror
export UV_INDEX_URL=https://pypi.cluster-b.edu/simple
"""
```
158+
159+
See also: [Environment Variables](../api/environment_variables.md#uv-environment-variables)
160+
161+
## Running the Example
162+
```bash
hog run pytorch_custom_index.py
```
166+
167+
Example output (timing varies by endpoint):
168+
169+
```
170+
PyTorch 2.5.1 on cuda
171+
500x500 matmul: 0.015s
172+
```
173+
174+
## Next Steps
175+
176+
- **[PEP 723 Concepts](../concepts/pep723.md#configuring-uv-via-tooluv)** - Complete uv configuration reference
177+
- **[Environment Variables](../api/environment_variables.md#uv-environment-variables)** - Override uv settings per endpoint
178+
- **[uv Dependencies](https://docs.astral.sh/uv/concepts/projects/dependencies/)** - Full uv dependency configuration docs

docs/getting-started/quickstart.md

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ This creates `hello.py` with the following structure:
2424
#
2525
# [tool.uv]
2626
# exclude-newer = "2025-12-10T00:00:00Z"
27+
# python-preference = "managed"
2728
#
2829
# [tool.hog.my-endpoint]
2930
# endpoint = "your-endpoint-uuid"
