4 changes: 2 additions & 2 deletions _uw-research-computing/apptainer-htc.md
Original file line number Diff line number Diff line change
Expand Up @@ -138,7 +138,7 @@ exit
Once you are satisfied that your container is built correctly, copy your `.sif` file to your staging directory.

```
mv my-container.sif /staging/$USER
mv my-container.sif /staging/u/username
```
{:.term}

Expand Down Expand Up @@ -314,7 +314,7 @@ Since Apptainer `.sif` files are routinely more than 1GB in size, we recommend t
It is usually easiest to move the container file directly to staging while still in the interactive build job:

```
mv my-container.sif /staging/$USER
mv my-container.sif /staging/u/username
```
{:.term}

14 changes: 7 additions & 7 deletions _uw-research-computing/check-quota.md
Expand Up @@ -53,9 +53,9 @@ This will print a table with your `/home` and `/staging` quotas. An example outp

```
[user@ap2002 ~]$ get_quotas
Path Disk_Used(GB) Disk_Limit(GB) Files_Used File_Limit
/home/user 16.0711 40 8039 N/A
/staging/user 13.4731 100 12 1000
Path Disk_Used(GB) Disk_Limit(GB) Files_Used File_Limit
/home/username 16.0711 40 8039 N/A
/staging/u/username 13.4731 100 12 1000
```
{:.term}

Expand Down Expand Up @@ -119,10 +119,10 @@ _____________________________________________________________________
== NOTICE: THIS NODE IS ON PUPPET ENVIRONMENT "puppet8" ==

Filesystem quota report (last updated 10:33 AM)
Storage Used (GB) Limit (GB) Files (#) File Cap (#) Quota (%)
------------------ ----------- ------------ ----------- -------------- -----------
/home/user 29.38 40 94 0 73.46
/staging/user 50.23 1000 110 10000 5.02
Storage Used (GB) Limit (GB) Files (#) File Cap (#) Quota (%)
------------------ ----------- ------------ ----------- -------------- -----------
/home/username 29.38 40 94 0 73.46
/staging/u/username 50.23 1000 110 10000 5.02
```
{:.term}

28 changes: 17 additions & 11 deletions _uw-research-computing/file-avail-largedata.md
Expand Up @@ -26,7 +26,7 @@ When submitting jobs to the HTC system, large data needs to be stored and handle
* [Intended use](#intended-use)
* [User responsibilities](#user-responsibilities)
- [Stage large data](#stage-large-data)
* [Request a `/staging` directory](#request-a-staging-directory)
* [Your personal `/staging` directory](#your-personal-staging-directory)
* [Reduce file counts](#reduce-file-counts)
* [Use the transfer server](#use-the-transfer-server)
* [Remove files after jobs complete](#remove-files-after-jobs-complete)
Expand Down Expand Up @@ -67,18 +67,24 @@ CHTC staff reserve the right to remove data from our large data staging location

In order to stage large data for use on CHTC's HTC system, users must:

1. **Request a `/staging` directory**: Use our quota request form.
1. **Reduce file counts**: Combine and compress files that are used together.
1. **Transfer files to the HTC system via the transfer server**: Upload your data via our dedicated file transfer server.
1. **Remove files after jobs complete**: Our data staging space is quota-controlled and not backed up.

### Request a `/staging` directory
### Your personal `/staging` directory

Any one with a CHTC account whose data meets the intended use above can request space in our large data staging area by filling out a quota request form. The default quota is 100 GB / 1000 items; if a larger quota is needed, request a higher quota. The created directory will exist at this path: `/staging/username`
Each user should have a personal `/staging` directory. It is located in an alphabetized subdirectory named for the **first letter** of your NetID. For example:

| NetID | Path to your personal `/staging` directory |
| --- | --- |
| `alice` | `/staging/a/alice` |
| `bucky` | `/staging/b/bucky` |
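
The mapping in the table can be sketched as a small shell function. This is only an illustration of the naming rule: `staging_path` is a hypothetical helper name, not a command provided by CHTC, and it assumes the subdirectory is always the first character of the NetID exactly as written.

```shell
# Sketch: derive a personal /staging path from a NetID.
# staging_path is a hypothetical helper, not a CHTC-provided command.
staging_path() {
    # cut -c1 keeps only the first character of the NetID
    first_letter=$(printf '%s' "$1" | cut -c1)
    printf '/staging/%s/%s\n' "$first_letter" "$1"
}

staging_path alice
staging_path bucky
```

Run as written, this should print the two paths shown in the table for `alice` and `bucky`.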

We can also create group or shared spaces by request.

<p style="text-align: center; margin-bottom: 0; font-weight: bold;">Need a <code>/staging</code> directory or higher quota?</p>
The default quota is 100 GB and 1000 items; if you need more, submit a quota request.

<p style="text-align: center; margin-bottom: 0; font-weight: bold;">Need a group <code>/staging</code> directory or higher quota?</p>
<div class="d-flex mb-3">
<div class="p-3 m-auto">
<a class="btn btn-primary" style="text-align: center" href="quota-request">Quota request form</a>
Expand All @@ -88,7 +94,7 @@ We can also create group or shared spaces by request.

### Reduce file counts

The file system backing our `/staging`space is optimized to handle small numbers of large files. If your job requires many small files, we recommend placing these files in the `/home` directory or compressing multiple files into a single zip file or tarball. See [this table](htc-job-file-transfer#data-storage-locations) for more information on the differences between `/staging` and `/home`.
The file system backing our `/staging` space is optimized to handle small numbers of large files. If your job requires many small files, we recommend placing these files in the `/home` directory or compressing multiple files into a single zip file or tarball. See [this table](htc-job-file-transfer#data-storage-locations) for more information on the differences between `/staging` and `/home`.

Data placed in our large data `/staging` location should be stored in as few files as possible (ideally, one file per job), and will be used by a job only after being copied from `/staging` into the job working directory. Similarly, large output should first be written to the job's working directory, then compressed into a single file before being copied to `/staging` at the end of the job.
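
As a minimal sketch of combining many small files into one tarball, the commands below build a throwaway example in a temporary directory. The directory name `inputs`, the `part_N.txt` files, and the archive name `inputs.tar.gz` are all hypothetical placeholders; substitute your own data.

```shell
# Sketch: bundle a directory of many small files into a single tarball.
# All file and directory names here are made-up placeholders.
workdir=$(mktemp -d)
mkdir "$workdir/inputs"
for i in 1 2 3; do
    printf 'sample %s\n' "$i" > "$workdir/inputs/part_$i.txt"
done

# -c create, -z gzip-compress, -f archive name; -C changes into $workdir first
# so the archive stores the relative path "inputs/..." rather than a full path.
tar -czf "$workdir/inputs.tar.gz" -C "$workdir" inputs

# List the archive contents to confirm everything was captured.
tar -tzf "$workdir/inputs.tar.gz"
```

The single `inputs.tar.gz` file is what you would then place in `/staging`, and a job would unpack it with `tar -xzf` after copying it into its working directory.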

Expand All @@ -105,7 +111,7 @@ Uploading or downloading data to `/staging` should only be performed via CHTC's

For example, you can use `scp` to transfer files into your `/staging` directory:
```
$ scp large.file netid@transfer.chtc.wisc.edu:/staging/netid/
$ scp large.file username@transfer.chtc.wisc.edu:/staging/u/username/
```
{:.term}

Expand All @@ -123,7 +129,7 @@ Staged files should be specified in the job submit file using the `osdf:///` or
depending on the size of the files to be transferred. [See this table for more information](htc-job-file-transfer#transfer-input-data-to-jobs-with-transfer_input_files).

```
transfer_input_files = osdf:///chtc/staging/username/file1, file:///staging/username/file2, file3
transfer_input_files = osdf:///chtc/staging/u/username/file1, file:///staging/u/username/file2, file3
```
{:.sub}

Expand All @@ -136,7 +142,7 @@ Large outputs should be transferred to staging using the same file transfer prot

```
transfer_output_files = file1, file2, file3
transfer_output_remaps = "file1 = osdf:///chtc/staging/username/file1; file2 = file:///staging/username/file2"
transfer_output_remaps = "file1 = osdf:///chtc/staging/u/username/file1; file2 = file:///staging/u/username/file2"
```
{:.sub}

Expand Down Expand Up @@ -166,14 +172,14 @@ within the user's `/home` directory:
``` {.sub}
### Example submit file for a single job that stages large data
# Files for the below lines MUST all be somewhere within /home/username,
# and not within /staging/username
# and not within /staging/u/username

executable = run_myprogram.sh
log = myprogram.log
output = $(Cluster).out
error = $(Cluster).err

transfer_input_files = osdf:///chtc/staging/username/myprogram, file:///staging/username/largedata.tar.gz
transfer_input_files = osdf:///chtc/staging/u/username/myprogram, file:///staging/u/username/largedata.tar.gz

# IMPORTANT! Require execute servers that can access /staging
Requirements = (Target.HasCHTCStaging == true)
9 changes: 3 additions & 6 deletions _uw-research-computing/high-memory-jobs.md
Expand Up @@ -174,16 +174,13 @@ Altogether, a sample submit file may look something like this:
``` {.sub}
### Example submit file for a single staging-dependent job

universe = vanilla

# Files for the below lines will all be somewhere within /home/username,
# and not within /staging/username
# and not within /staging/u/username
log = run_myprogram.log
executable = run_Trinity.sh
output = $(Cluster).out
error = $(Cluster).err
transfer_input_files = trinityrnaseq-2.0.1.tar.gz
should_transfer_files = YES

# Require execute servers that have large data staging
Requirements = (Target.HasCHTCStaging == true)
Expand Down Expand Up @@ -236,7 +233,7 @@ Altogether, a sample script may look something like this (perhaps called
#!/bin/bash
# Copy input data from /staging to the present directory of the job
# and un-tar/un-zip them.
cp /staging/username/reads.tar.gz ./
cp /staging/u/username/reads.tar.gz ./
tar -xzvf reads.tar.gz
rm reads.tar.gz

Expand All @@ -255,7 +252,7 @@ Trinity --seqType fq --left reads_1.fq \
# Trinity will write output to the working directory by default,
# so when the job finishes, it needs to be moved back to /staging
tar -czvf trinity_out_dir.tar.gz trinity_out_dir
cp trinity_out_dir.tar.gz trinity_stdout.txt /staging/username/
cp trinity_out_dir.tar.gz trinity_stdout.txt /staging/u/username/
rm reads_*.fq trinity_out_dir.tar.gz trinity_stdout.txt

### END
2 changes: 1 addition & 1 deletion _uw-research-computing/htc-docker-to-apptainer.md
Expand Up @@ -78,7 +78,7 @@ INFO: Build complete: container.sif
Because container images are generally large, we require users to move these images into their staging directories. While you are still in your interactive job, move the image to your staging directory.

```
mv container.sif /staging/username/
mv container.sif /staging/u/username/
```
{:.term}

19 changes: 6 additions & 13 deletions _uw-research-computing/htc-job-file-transfer.md
Expand Up @@ -42,13 +42,6 @@ The HTC system has two primary locations where users can place their files:

The data management mechanisms behind `/home` and `/staging` are different and are optimized to handle different file sizes and numbers of files. It's important to place your files in the correct location to improve the efficiency at which your data is handled and maintain the stability of the HTC file systems.

<p style="text-align: center; margin-bottom: 0; font-weight: bold;">Need a <code>/staging</code> directory?</p>
<div class="d-flex mb-3">
<div class="p-3 m-auto">
<a class="btn btn-primary" style="text-align: center" href="quota-request">Request one here</a>
</div>
</div>


## Transfer input data to jobs with `transfer_input_files`

Expand All @@ -57,8 +50,8 @@ To transfer files to jobs, we must specify these files with `transfer_input_file
| Input File Size (Per File)* | File Location | Submit File Syntax to Transfer to Jobs |
| ----------- | ----------- | ----------- |
| 0 - 1 GB | `/home` | `transfer_input_files = input.txt` |
| 1 - 30 GB | `/staging` | `transfer_input_files = osdf:///chtc/staging/NetID/input.txt` |
| 30 - 100 GB | `/staging` | `transfer_input_files = file:///staging/NetID/input.txt` |
| 1 - 30 GB | `/staging` | `transfer_input_files = osdf:///chtc/staging/u/username/input.txt` |
| 30 - 100 GB | `/staging` | `transfer_input_files = file:///staging/u/username/input.txt` |
| 1 - 100 GB | `/staging/groups`<sup>†</sup> | `transfer_input_files = file:///staging/groups/group_dir/input.txt` |
| 100 GB+ | | Contact the facilitation team about the best strategy to stage your data |

Expand All @@ -73,7 +66,7 @@ Multiple input files and file transfer protocols can be specified and delimited
```
# My job submit file

transfer_input_files = file1, osdf:///chtc/staging/username/file2, file:///staging/username/file3, dir1, dir2/
transfer_input_files = file1, osdf:///chtc/staging/u/username/file2, file:///staging/u/username/file3, dir1, dir2/

requirements = (HasCHTCStaging == true)

Expand Down Expand Up @@ -120,7 +113,7 @@ transfer_output_files = output_file, output/output_file2, output/output_file3

To transfer files back to `/staging` or a specific directory in `/home`, you will need an additional line in your HTCondor submit file, with each item separated by a semicolon (;):
```
transfer_output_remaps = "output_file = osdf:///chtc/staging/NetID/output1.txt; output_file2 = /home/netid/outputs/output_file2"
transfer_output_remaps = "output_file = osdf:///chtc/staging/u/username/output1.txt; output_file2 = /home/username/outputs/output_file2"
```
{:.sub}

Expand All @@ -133,7 +126,7 @@ Make sure to only include one set of quotation marks that wraps around the infor
If you want to transfer *all* files to a specific destination, use `output_destination`:

```
output_destination = osdf:///chtc/staging/netid/
output_destination = osdf:///chtc/staging/u/username/
```
{:.sub}

Expand All @@ -146,7 +139,7 @@ The `osdf:///` file transfer plugin is powered by the [Pelican Platform](https:/
To transfer and unpack files, append a `?pack=auto` at the end of the plugin path of the compressed object to be transferred.

```
transfer_input_files = osdf:///chtc/staging/netid/filename.tar.gz?pack=auto, input1.txt, input2.txt
transfer_input_files = osdf:///chtc/staging/u/username/filename.tar.gz?pack=auto, input1.txt, input2.txt
```

This feature is only available for Pelican-based plugins (`osdf://`, `pelican://`) and is not available for `file://` or normal file transfers. This feature is also not recommended for compressed files larger than 30 GB.
8 changes: 4 additions & 4 deletions _uw-research-computing/htc-overview.md
Expand Up @@ -163,9 +163,9 @@ Each of the disk space values are given in megabytes (MB), which can be converte

### Check `/staging` Quota and Usage

To see your `/staging` quota and usage, use the `get_quotas <NetID>` command. For example,
To see your `/staging` quota and usage, use the `get_quotas` command. For example,
```
[NetID@ap2001 ~]$ get_quotas /staging/NetID
[NetID@ap2001 ~]$ get_quotas
```
{:.term}

Expand All @@ -178,8 +178,8 @@ Alternatively, the `ncdu` command can also be used to see how many
files and directories are contained in a given path:

```
[NetID@ap2001 ~]$ ncdu /home/NetID
[NetID@ap2001 ~]$ ncdu /staging/NetID
[username@ap2001 ~]$ ncdu /home/username
[username@ap2001 ~]$ ncdu /staging/u/username
```
{:.term}

2 changes: 1 addition & 1 deletion _uw-research-computing/htc-uwdf-researchdrive.md
Expand Up @@ -84,7 +84,7 @@ transfer_output_files = outputfile1.txt, outputfile2.txt, outputfile3.txt
You can use `transfer_output_remaps` to place files in different locations:

```
transfer_output_remaps = "outputfile1.txt = pelican://chtc.wisc.edu/researchdrive/<PI NetID>/CHTC/outputfile1.txt; outputfile2.txt = osdf:///chtc/staging/<NetID>/outputfile2.txt"
transfer_output_remaps = "outputfile1.txt = pelican://chtc.wisc.edu/researchdrive/<PI NetID>/CHTC/outputfile1.txt; outputfile2.txt = osdf:///chtc/staging/u/username/outputfile2.txt"
```

The example above remaps the output files such that only `outputfile1.txt` is placed in ResearchDrive, `outputfile2.txt` is placed in `/staging`, and `outputfile3.txt` is placed in the submit directory on `/home`.
71 changes: 71 additions & 0 deletions _uw-research-computing/staging-transition.md
@@ -0,0 +1,71 @@
---
highlighter: none
layout: guide
title: "Staging directory transition"
guide:
category: General
tag:
- htc
---


<p style="text-align:center"><img src="/images/staging-transition.svg" alt="Illustration of the `/staging` directory structure transition. In the left-hand panel, the `/staging` directory has the subdirectories `alice`, `alison`, `bucky`, and `cathy`. An arrow points to the right-hand panel, which shows the `/staging` directory with the subdirectories `a`, `b`, and `c`. Subdirectories `alice` and `alison` are in the `a` subdirectory, `bucky` is in the `b` subdirectory, and `cathy` is in the `c` subdirectory." width=500px></p>
<p style="text-align:center"><caption>Illustration of the <code>/staging</code> directory structure transition.</caption></p>


## Transition to a new `/staging` directory structure

Starting Thursday, September 11, 2025, we are transitioning to a new directory structure for personal staging directories. **This affects all users on the HTC system**.

Personal staging directories will now be located in **alphabetized subdirectories** based on the first letter of your NetID. For example:


| Previous `/staging` directory path | New `/staging` directory path |
| --- | --- |
| `/staging/netid` | `/staging/n/netid` |
| `/staging/bucky` | `/staging/b/bucky` |

Group `/staging` directories are not affected and will remain in the `/staging/groups` subdirectory.

## Transition process

CHTC will:

1. Copy your files to your new `/staging` directory.
2. Create a [symlink](https://en.wikipedia.org/wiki/Symbolic_link) at your previous `/staging` directory path that points to the new `/staging` directory path.

## Timeline

* **September 11, 2025**. Transition begins. Users may begin using their new `/staging` directory.
* **December 31, 2025**. Transition ends. Symlinks at previous `/staging` directory paths will be deleted.

## What you should do

If you have an existing `/staging` directory, please review all files that reference it **between September 11 and December 31, 2025**. This may include but is not limited to:

* HTCondor submit files
* Executables and scripts
* Environment variables
* DAGMan files

Please change any reference to your personal `/staging` directory to the new path.

For example, in an HTCondor submit file, change:

```
container_image = osdf:///chtc/staging/netid/my-container.sif
```

to:

```
container_image = osdf:///chtc/staging/n/netid/my-container.sif
```
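
The same substitution can be applied to a whole file with `sed`. The function below is a rough sketch, not a CHTC-provided tool: `rewrite_staging_refs` is a hypothetical name, it assumes NetIDs contain only letters and digits, and it only rewrites literal paths (a reference written as `/staging/$USER` would still need to be updated by hand).

```shell
# Sketch: rewrite old-style personal /staging paths in text read from stdin.
# rewrite_staging_refs is a hypothetical helper, not a CHTC-provided tool.
rewrite_staging_refs() {
    netid="$1"
    first_letter=$(printf '%s' "$netid" | cut -c1)
    new_path="/staging/$first_letter/$netid"
    # First expression: old path followed by a character that cannot be part
    # of a longer directory name (such as "/", a comma, or a quote).
    # Second expression: old path at the very end of a line.
    sed -e "s|/staging/$netid\([^A-Za-z0-9_.-]\)|$new_path\1|g" \
        -e "s|/staging/$netid\$|$new_path|"
}

echo 'container_image = osdf:///chtc/staging/netid/my-container.sif' | rewrite_staging_refs netid
```

To find files that still mention the old path, something like `grep -rn "/staging/netid" .` run from your project directory is a reasonable starting point. Review the rewritten output before replacing any original file, since paths that are already in the new form are left untouched but unusual references may be missed.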

## Why we are transitioning

Our `/staging` directories are backed by the Ceph File System, which has [slower performance when it must load very large directories](https://docs.ceph.com/en/reef/cephfs/app-best-practices/#very-large-directories). To mitigate large loads on the file system, we are sorting users' personal `/staging` directories into alphabetized subdirectories.

## Get support

If you have questions or concerns, please email us at [[email protected]](mailto:[email protected]).