The raw data (in ~16 MB files) and the pipeline products produce too many small files. It would be useful to have some additional logic to, e.g., tar the raw .dat files, and ideally to be able to find and read individual files from within the tar archives. We don't want to (and won't be able to) store more than a few million files long-term.
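As a rough sketch of the kind of logic meant here (the paths and file names below are hypothetical), Python's standard tarfile module can bundle one beam/day's raw .dat files into a single archive and read individual members back without unpacking the rest:

```python
import tarfile
from pathlib import Path

def tar_raw_files(dat_dir: Path, archive: Path) -> None:
    """Bundle one beam/day's raw .dat files into a single tar archive."""
    with tarfile.open(archive, "w") as tar:  # uncompressed; the raw data is already binary
        for dat in sorted(dat_dir.glob("*.dat")):
            tar.add(str(dat), arcname=dat.name)

def read_member(archive: Path, name: str) -> bytes:
    """Read one raw file back out of the archive without extracting everything."""
    with tarfile.open(archive, "r") as tar:
        member = tar.extractfile(name)  # raises KeyError if `name` is not in the archive
        if member is None:              # non-regular members have no file object
            raise ValueError(f"{name} is not a regular file in {archive}")
        with member:
            return member.read()

# Hypothetical usage:
# tar_raw_files(Path("raw/beam012/2024-01-01"), Path("archives/beam012_2024-01-01.tar"))
# data = read_member(Path("archives/beam012_2024-01-01.tar"), "pointing_000123.dat")
```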
Some estimates of file counts (specced for 256 beams = 1/4 of the sky); a rough accumulation estimate is sketched after the list:
- Raw files: ~4000 / beam / day ≈ 1,000,000 / day
- Candidates: ~2x number of pointings, ~80,000 / day
- Logs: ~number of pointings, ~40,000 / day
- Stacks: ~number of pointings, ~160,000 perpetual
- Folding products: very roughly ~number of pointings / 10
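To set these rates against a "few million files" budget, a quick back-of-the-envelope calculation (rates taken from the list above; the exact budget value is an assumption for illustration):

```python
# Rough daily file counts from the estimates above (256 beams = 1/4 sky).
RAW_PER_DAY = 4_000 * 256        # ~1,000,000 raw files / day
CANDIDATES_PER_DAY = 80_000      # ~2x number of pointings
LOGS_PER_DAY = 40_000            # ~1 per pointing
FOLDS_PER_DAY = 40_000 // 10     # ~number of pointings / 10
STACKS_TOTAL = 160_000           # perpetual, not accumulating per day

FILE_BUDGET = 3_000_000          # assumed "few million files" long-term limit

daily_small_products = CANDIDATES_PER_DAY + LOGS_PER_DAY + FOLDS_PER_DAY
print(f"Raw files alone hit the budget in ~{FILE_BUDGET / RAW_PER_DAY:.1f} days")
print(f"Small pipeline products hit it in ~{FILE_BUDGET / daily_small_products:.0f} days")
```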
I guess the raw files and the pipeline products are two separate problems. The raw files are too cumbersome to keep in large quantities, but we may want to keep, e.g., a few days / weeks of them in long-term storage without that taking tens of millions of files.
The candidates, logs, and folds are small enough to store long-term, perhaps tarred up once per day.
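A minimal sketch of such daily tarring, assuming the products land in per-day directories like candidates/2024-01-01/ (the layout is an assumption):

```python
import tarfile
from pathlib import Path

def tar_daily_products(root: Path, date: str) -> None:
    """Roll one day's candidates / logs / folds into one tar per product type."""
    for product in ("candidates", "logs", "folds"):
        day_dir = root / product / date
        if not day_dir.is_dir():
            continue
        with tarfile.open(root / product / f"{date}.tar", "w") as tar:
            tar.add(str(day_dir), arcname=date)  # keep the per-day directory name inside the tar

# e.g. tar_daily_products(Path("/archive"), "2024-01-01")
```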
The stacks are borderline problematic; splitting them based on RA or Dec ranges would perhaps clean the structure up.
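One possible way to do that splitting (the bucket size and path layout are assumptions, just to illustrate):

```python
from pathlib import Path

def stack_dir(root: Path, ra_deg: float, dec_deg: float,
              ra_step: float = 15.0, dec_step: float = 15.0) -> Path:
    """Map a pointing's coordinates to a bucketed directory, e.g. stacks/ra_030/dec_-15."""
    ra_bucket = int((ra_deg // ra_step) * ra_step)
    dec_bucket = int((dec_deg // dec_step) * dec_step)
    return root / f"ra_{ra_bucket:03d}" / f"dec_{dec_bucket:+03d}"

# e.g. stack_dir(Path("stacks"), 42.7, -8.3) -> stacks/ra_030/dec_-15
```

With 15° buckets this gives 24 x 12 = 288 directories, i.e. roughly 500-600 stacks per directory for the ~160,000 total.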