Fix api
rcannood committed Aug 29, 2024
1 parent 7787759 commit 2b49102
Showing 2 changed files with 70 additions and 59 deletions.
127 changes: 69 additions & 58 deletions README.md
@@ -36,28 +36,56 @@ should convince readers of the significance and relevance of your task.

``` mermaid
flowchart LR
file_singlecell("SC Dataset")
comp_data_loader_sc[/"SC Data Loader"/]
file_common_singlecell("Raw SC Dataset")
file_common_spatialdata("Raw Spatial Dataset")
comp_data_preprocessor[/"Data preprocessor"/]
file_singlecell("SC Dataset")
file_spatialdata("Spatial Dataset")
comp_data_loader_sc[/"SC Data Loader"/]
file_common_spatialdata("Raw Spatial Dataset")
comp_data_loader_sp[/"iST Data Loader"/]
comp_data_preprocessor[/"Data preprocessor"/]
comp_data_loader_sc-->file_common_singlecell
file_common_singlecell---comp_data_preprocessor
comp_data_preprocessor-->file_singlecell
comp_data_preprocessor-->file_spatialdata
file_common_spatialdata---comp_data_preprocessor
comp_data_loader_sp-->file_common_spatialdata
```
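
The flowchart in this hunk can be read as a dependency graph. The following is a minimal Python sketch, not part of the repository, that encodes the diagram's edges and derives one valid execution order with the standard library's `graphlib`. The node identifiers are copied from the diagram; the `deps` mapping, and the reading of the undirected `---` links as "file is an input to the component", are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Map each node to the set of nodes it depends on (its predecessors).
# Node names come from the mermaid diagram; treating `---` links as
# "file feeds into component" is an assumption.
deps = {
    "file_common_singlecell": {"comp_data_loader_sc"},
    "file_common_spatialdata": {"comp_data_loader_sp"},
    "comp_data_preprocessor": {
        "file_common_singlecell",
        "file_common_spatialdata",
    },
    "file_singlecell": {"comp_data_preprocessor"},
    "file_spatialdata": {"comp_data_preprocessor"},
}

# static_order() yields every node after all of its predecessors.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Under this reading, both loaders run first, the preprocessor runs once both raw datasets exist, and the two preprocessed datasets come last.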

## File format: SC Dataset
## Component type: SC Data Loader

A single-cell reference dataset, preprocessed for this benchmark.
A component to download and store single-cell data.

Arguments:

<div class="small">

| Name | Type | Description |
|:---|:---|:---|
| `--output` | `file` | (*Output*) An unprocessed dataset as output by a dataset loader. |
| `--dataset_id` | `string` | NA. |
| `--dataset_name` | `string` | NA. |
| `--dataset_url` | `string` | (*Optional*) NA. |
| `--dataset_reference` | `string` | (*Optional*) NA. |
| `--dataset_summary` | `string` | NA. |
| `--dataset_description` | `string` | NA. |
| `--dataset_organism` | `string` | (*Optional*) NA. |

</div>

## File format: Raw SC Dataset

An unprocessed dataset as output by a dataset loader.

Example file:
`resources_test/preprocessing_imagingbased_st/2023_yao_mouse_brain_scrnaseq_10xv2/dataset.h5ad`
`resources_test/common/2023_yao_mouse_brain_scrnaseq_10xv2/dataset.h5ad`

Description:

This dataset contains preprocessed counts and metadata for single-cell
RNA-seq data.
This dataset contains raw counts and metadata as output by a dataset
loader.

The format of this file is mainly derived from the [CELLxGENE schema
v4.0.0](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/4.0.0/schema.md).

Format:

@@ -118,20 +146,34 @@ Data structure:

</div>

## File format: Raw SC Dataset
## Component type: Data preprocessor

An unprocessed dataset as output by a dataset loader.
Preprocess a common dataset for the benchmark.

Arguments:

<div class="small">

| Name | Type | Description |
|:---|:---|:---|
| `--input_sp` | `file` | An unprocessed spatial imaging dataset stored as a zarr file. |
| `--input_sc` | `file` | An unprocessed dataset as output by a dataset loader. |
| `--output_sp` | `file` | (*Output*) A spatial transcriptomics dataset, preprocessed for this benchmark. |
| `--output_sc` | `file` | (*Output*) A single-cell reference dataset, preprocessed for this benchmark. |

</div>
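
As a rough illustration of the argument table above, the four flags could be mirrored in a plain `argparse` interface. This is a sketch under assumptions, not the component's actual entry point: the flag names and help texts are taken from the table, while `build_parser`, marking every flag as required, and the output paths `output_sp.zarr` and `output_sc.h5ad` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI mirroring the Data preprocessor argument table."""
    parser = argparse.ArgumentParser(
        description="Preprocess a common dataset for the benchmark."
    )
    # Treating all four flags as required is an assumption.
    parser.add_argument(
        "--input_sp", required=True,
        help="An unprocessed spatial imaging dataset stored as a zarr file.",
    )
    parser.add_argument(
        "--input_sc", required=True,
        help="An unprocessed dataset as output by a dataset loader.",
    )
    parser.add_argument(
        "--output_sp", required=True,
        help="(Output) A spatial transcriptomics dataset, preprocessed for this benchmark.",
    )
    parser.add_argument(
        "--output_sc", required=True,
        help="(Output) A single-cell reference dataset, preprocessed for this benchmark.",
    )
    return parser


# Input paths are the example files named in this README;
# the output paths are placeholders.
args = build_parser().parse_args([
    "--input_sp", "resources_test/common/2023_10x_mouse_brain_xenium/dataset.zarr",
    "--input_sc", "resources_test/common/2023_yao_mouse_brain_scrnaseq_10xv2/dataset.h5ad",
    "--output_sp", "output_sp.zarr",
    "--output_sc", "output_sc.h5ad",
])
print(vars(args))
```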

## File format: SC Dataset

A single-cell reference dataset, preprocessed for this benchmark.

Example file:
`resources_test/common/2023_yao_mouse_brain_scrnaseq_10xv2/dataset.h5ad`
`resources_test/preprocessing_imagingbased_st/2023_yao_mouse_brain_scrnaseq_10xv2/dataset.h5ad`

Description:

This dataset contains raw counts and metadata as output by a dataset
loader.

The format of this file is mainly derived from the [CELLxGENE schema
v4.0.0](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/4.0.0/schema.md).
This dataset contains preprocessed counts and metadata for single-cell
RNA-seq data.

Format:

@@ -192,17 +234,17 @@ Data structure:

</div>

## File format: Raw Spatial Dataset
## File format: Spatial Dataset

An unprocessed spatial imaging dataset stored as a zarr file.
A spatial transcriptomics dataset, preprocessed for this benchmark.

Example file:
`resources_test/common/2023_10x_mouse_brain_xenium/dataset.zarr`
`resources_test/preprocessing_imagingbased_st/2023_10x_mouse_brain_xenium/dataset.zarr`

Description:

This dataset contains raw images, labels, points, shapes, and tables as
output by a dataset loader.
This dataset contains preprocessed images, labels, points, shapes, and
tables for spatial transcriptomics data.

Format:

@@ -216,17 +258,17 @@ Data structure:

</div>

## File format: Spatial Dataset
## File format: Raw Spatial Dataset

A spatial transcriptomics dataset, preprocessed for this benchmark.
An unprocessed spatial imaging dataset stored as a zarr file.

Example file:
`resources_test/preprocessing_imagingbased_st/2023_10x_mouse_brain_xenium/dataset.zarr`
`resources_test/common/2023_10x_mouse_brain_xenium/dataset.zarr`

Description:

This dataset contains preprocessed images, labels, points, shapes, and
tables for spatial transcriptomics data.
This dataset contains raw images, labels, points, shapes, and tables as
output by a dataset loader.

Format:

@@ -240,27 +282,6 @@ Data structure:

</div>

## Component type: SC Data Loader

A component to download and store single-cell data.

Arguments:

<div class="small">

| Name | Type | Description |
|:---|:---|:---|
| `--output` | `file` | (*Output*) An unprocessed dataset as output by a dataset loader. |
| `--dataset_id` | `string` | NA. |
| `--dataset_name` | `string` | NA. |
| `--dataset_url` | `string` | (*Optional*) NA. |
| `--dataset_reference` | `string` | (*Optional*) NA. |
| `--dataset_summary` | `string` | NA. |
| `--dataset_description` | `string` | NA. |
| `--dataset_organism` | `string` | (*Optional*) NA. |

</div>

## Component type: iST Data Loader

A component to download and store iST data.
@@ -282,13 +303,3 @@ Arguments:

</div>

## Component type: Data preprocessor

Preprocess a common dataset for the benchmark.

Arguments:

<div class="small">

</div>

2 changes: 1 addition & 1 deletion src/api/comp_data_preprocessor.yaml
@@ -6,7 +6,7 @@ info:
description: |
This component processes a common single-cell and a common spatial transcriptomics
dataset for the benchmark.
arguments:
arguments:
- name: "--input_sp"
__merge__: file_common_spatialdata.yaml
direction: input