wf fixed
janursa committed Feb 6, 2025
1 parent 3ff38ca commit 2c98e0f
Showing 24 changed files with 145 additions and 103 deletions.
13 changes: 8 additions & 5 deletions README.md
@@ -71,7 +71,8 @@ flowchart TB

Chromatin accessibility data

- Example file: `resources_test/grn_benchmark/inference_datasets//op_atac.h5ad`
+ Example file:
+ `resources_test/grn_benchmark/inference_data//op_atac.h5ad`

Format:

@@ -120,7 +121,7 @@ Arguments:

File indicating the inferred GRN.

- Example file: `resources/grn_models/op/collectri.h5ad`
+ Example file: `resources_test/grn_models/op/collectri.h5ad`

Format:

@@ -139,7 +140,7 @@ Data structure:
|:---|:---|:---|
| `uns["dataset_id"]` | `string` | A unique identifier for the dataset. |
| `uns["method_id"]` | `string` | A unique identifier for the inference method. |
- | `uns["prediction"]` | `DataFrame` | Inferred GRNs in the format of source, target, weight. |
+ | `uns["prediction"]` | `object` | Inferred GRNs in the format of source, target, weight. |

</div>
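
The table above describes `uns["prediction"]` as records of source, target, and weight. A minimal sketch in plain Python of what such records might look like and how they could be sanity-checked; the gene names and the `validate_prediction` helper are hypothetical, and the real files store this inside an `.h5ad` object rather than a list of dicts.

```python
# Hypothetical illustration only: the real prediction lives in
# uns["prediction"] of an .h5ad file, not in a plain Python list.
REQUIRED_COLUMNS = ("source", "target", "weight")


def validate_prediction(rows):
    """Return a list of problems found in source/target/weight records."""
    problems = []
    for i, row in enumerate(rows):
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            problems.append(f"row {i}: missing {missing}")
            continue
        if not isinstance(row["weight"], (int, float)):
            problems.append(f"row {i}: weight is not numeric")
    return problems


# Made-up edges: regulator -> target gene with a signed weight.
prediction = [
    {"source": "TF_A", "target": "GENE_1", "weight": 0.8},
    {"source": "TF_A", "target": "GENE_2", "weight": -0.3},
]
```

An empty list from `validate_prediction(prediction)` means every record carries all three fields and a numeric weight.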

@@ -245,7 +246,8 @@ Data structure:

Perturbation dataset for benchmarking.

- Example file: `resources_test/grn_benchmark/evaluation_data//op.h5ad`
+ Example file:
+ `resources_test/grn_benchmark/evaluation_data/op_bulk.h5ad`

Format:

@@ -275,7 +277,8 @@ Data structure:

RNA expression data.

- Example file: `resources_test/grn_benchmark/inference_datasets//op_rna.h5ad`
+ Example file:
+ `resources_test/grn_benchmark/inference_data/op_rna.h5ad`

Format:

2 changes: 1 addition & 1 deletion _viash.yaml
@@ -59,7 +59,7 @@ info:
```bash
viash run src/control_methods/pearson_corr/config.vsh.yaml -- \
- --rna resources/grn_benchmark/inference_datasets/norman_rna.h5ad --prediction output/net.h5ad
+ --rna resources/grn_benchmark/inference_data/norman_rna.h5ad --prediction output/net.h5ad
```
## Evaluate a GRN
156 changes: 94 additions & 62 deletions runs.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion scripts/add_a_method.sh
@@ -27,7 +27,7 @@ viash run src/methods/$method_id/config.vsh.yaml -- \

# run the inference using the method for op dataset using only RNA data. Add more arguments if needed.
viash run src/methods/$method_id/config.vsh.yaml -- \
- --rna "resources/grn_benchmark/inference_datasets/op_rna.h5ad" \
+ --rna "resources/grn_benchmark/inference_data/op_rna.h5ad" \
--prediction "output/prediction.h5ad"

# run evaluation metrics
2 changes: 1 addition & 1 deletion scripts/download_resources.sh
@@ -4,4 +4,4 @@ set -e

# common/scripts/sync_resources

- aws s3 sync s3://openproblems-data/resources/grn/grn_benchmark resources/grn_benchmark --delete --no-sign-request
+ aws s3 sync s3://openproblems-data/resources/grn/grn_benchmark resources/grn_benchmark --delete --no-sign-request
4 changes: 2 additions & 2 deletions scripts/run_benchmark_all.sh
@@ -22,8 +22,8 @@ param_list:
metric_ids: $metric_ids
method_ids: $method_ids
evaluation_data: ${resources_dir}/evaluation_data/${dataset}.h5ad
- rna: ${resources_dir}/inference_datasets/${dataset}_rna.h5ad
- atac: ${resources_dir}/inference_datasets/${dataset}_atac.h5ad
+ rna: ${resources_dir}/inference_data/${dataset}_rna.h5ad
+ atac: ${resources_dir}/inference_data/${dataset}_atac.h5ad
reg_type: $reg_type
subsample: $subsample
num_workers: $num_workers
13 changes: 7 additions & 6 deletions scripts/run_grn_evaluation.sh
@@ -57,15 +57,16 @@ append_entry() {
cat >> $param_file << HERE
- id: ${reg_type}_${1}
metric_ids: ${metric_ids}
- evaluation_data: ${resources_dir}/evaluation_data/${dataset}.h5ad
- evaluation_data_sc: ${resources_dir}/evaluation_data/${dataset}_sc_counts.h5ad
+ evaluation_data: ${resources_dir}/grn_benchmark/evaluation_data/${dataset}.h5ad
+ evaluation_data_sc: ${resources_dir}/grn_benchmark/evaluation_data/${dataset}_sc.h5ad
reg_type: $reg_type
method_id: $1
+ dataset_id: $dataset
num_workers: $num_workers
- tf_all: ${resources_dir}/prior/tf_all.csv
- regulators_consensus: ${resources_dir}/prior/regulators_consensus_${dataset}.json
- ws_consensus: ${resources_dir}/prior/ws_consensus_${dataset}.json
- ws_distance_background: ${resources_dir}/prior/ws_distance_background_${dataset}.json
+ tf_all: ${resources_dir}/grn_benchmark/prior/tf_all.csv
+ regulators_consensus: ${resources_dir}/grn_benchmark/prior/regulators_consensus_${dataset}.json
+ ws_consensus: ${resources_dir}/grn_benchmark/prior/ws_consensus_${dataset}.csv
+ ws_distance_background: ${resources_dir}/grn_benchmark/prior/ws_distance_background_${dataset}.csv
prediction: ${grn_models_folder}/${dataset}/$1.h5ad
layer: "X_norm"
HERE
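
The heredoc above writes one parameter entry per method; on the new side of the diff, the resource paths are rooted at `${resources_dir}/grn_benchmark/`. A rough Python equivalent of that entry, written as a sketch: the `build_entry` helper and the argument values in the usage line are hypothetical, while the keys and path patterns follow the updated heredoc.

```python
def build_entry(resources_dir, grn_models_folder, dataset, method_id,
                reg_type, metric_ids, num_workers):
    """Build one evaluation parameter entry with the post-commit path layout."""
    # All benchmark resources sit under <resources_dir>/grn_benchmark.
    bench = f"{resources_dir}/grn_benchmark"
    return {
        "id": f"{reg_type}_{method_id}",
        "metric_ids": metric_ids,
        "evaluation_data": f"{bench}/evaluation_data/{dataset}.h5ad",
        "evaluation_data_sc": f"{bench}/evaluation_data/{dataset}_sc.h5ad",
        "reg_type": reg_type,
        "method_id": method_id,
        "dataset_id": dataset,
        "num_workers": num_workers,
        "tf_all": f"{bench}/prior/tf_all.csv",
        "regulators_consensus": f"{bench}/prior/regulators_consensus_{dataset}.json",
        "ws_consensus": f"{bench}/prior/ws_consensus_{dataset}.csv",
        "ws_distance_background": f"{bench}/prior/ws_distance_background_{dataset}.csv",
        "prediction": f"{grn_models_folder}/{dataset}/{method_id}.h5ad",
        "layer": "X_norm",
    }


# Hypothetical argument values, chosen only to show the resulting paths.
entry = build_entry("resources", "resources/grn_models", "op",
                    "pearson_corr", "ridge", "[regression_1]", 10)
```

Building the entry as a dict keeps every path derived from one `bench` prefix, so a later directory rename would touch a single line.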
2 changes: 1 addition & 1 deletion src/api/file_atac_h5ad.yaml
@@ -1,5 +1,5 @@
type: file
- example: resources_test/grn_benchmark/inference_datasets//op_atac.h5ad
+ example: resources_test/grn_benchmark/inference_data//op_atac.h5ad
label: chromatin accessibility data
summary: "Chromatin accessibility data"
info:
2 changes: 1 addition & 1 deletion src/api/file_rna_h5ad.yaml
@@ -1,5 +1,5 @@
type: file
- example: resources_test/grn_benchmark/inference_datasets/op_rna.h5ad
+ example: resources_test/grn_benchmark/inference_data/op_rna.h5ad
label: gene expression data
summary: "RNA expression data."
info:
6 changes: 3 additions & 3 deletions src/helper.py
@@ -40,7 +40,7 @@ def analyse_meta_cells(task_grn_inference_dir):


par = {
- 'rna': f'{task_grn_inference_dir}/resources/grn_benchmark/inference_datasets/{dataset}_rna.h5ad',
+ 'rna': f'{task_grn_inference_dir}/resources/grn_benchmark/inference_data/{dataset}_rna.h5ad',
"evaluation_data": f"{task_grn_inference_dir}/resources/grn_benchmark/evaluation_data//{dataset}.h5ad",

'layer': 'X_norm',
@@ -123,7 +123,7 @@ def analyse_imputation(task_grn_inference_dir):


par = {
- 'rna': f'{task_grn_inference_dir}/resources/grn_benchmark/inference_datasets/{dataset}_rna.h5ad',
+ 'rna': f'{task_grn_inference_dir}/resources/grn_benchmark/inference_data/{dataset}_rna.h5ad',
"evaluation_data": f"{task_grn_inference_dir}/resources/grn_benchmark/evaluation_data//{dataset}.h5ad",

'layer': 'X_norm',
@@ -204,7 +204,7 @@ def analyse_imputation(task_grn_inference_dir):
def analyse_corr_vs_tfmasked_corr(task_grn_inference_dir):
for i_run, dataset in enumerate(['op', 'replogle', 'nakatake', 'norman', 'adamson']):
par = {
- 'rna': f'{task_grn_inference_dir}/resources/grn_benchmark/inference_datasets/{dataset}_rna.h5ad',
+ 'rna': f'{task_grn_inference_dir}/resources/grn_benchmark/inference_data/{dataset}_rna.h5ad',
"evaluation_data": f"{task_grn_inference_dir}/resources/grn_benchmark/evaluation_data//{dataset}.h5ad",

'layer': 'X_norm',
4 changes: 2 additions & 2 deletions src/methods/multi_omics/celloracle/script.py
@@ -6,8 +6,8 @@

## VIASH START
par = {
- "rna": "resources/grn_benchmark/inference_datasets/op_rna.h5ad",
- "atac": "resources/grn_benchmark/inference_datasets/op_atac.h5ad",
+ "rna": "resources/grn_benchmark/inference_data/op_rna.h5ad",
+ "atac": "resources/grn_benchmark/inference_data/op_atac.h5ad",
"base_grn": 'output/celloracle/base_grn.csv',
"temp_dir": 'output/celloracle/',
"num_workers": 10,
2 changes: 1 addition & 1 deletion src/methods/single_omics/scgpt/run.sh
@@ -1,4 +1,4 @@
viash run src/methods/single_omics/scgpt/config.vsh.yaml -- \
- --rna resources_test/grn_benchmark/inference_datasets//op_rna.h5ad \
+ --rna resources_test/grn_benchmark/inference_data//op_rna.h5ad \
--tf_all resources/grn_benchmark/prior/tf_all.csv \
--prediction output/prediction.h5ad
4 changes: 2 additions & 2 deletions src/methods/single_omics/scprint/run.sh
@@ -13,13 +13,13 @@


# viash run src/methods/single_omics/scprint/config.vsh.yaml -- \
- # --rna resources_test/grn_benchmark/inference_datasets//op_rna.h5ad \
+ # --rna resources_test/grn_benchmark/inference_data//op_rna.h5ad \
# --tf_all resources/grn_benchmark/prior/tf_all.csv \
# --prediction output/prediction.h5ad


# python src/methods/single_omics/scprint/script.py \
- # --rna resources/grn_benchmark/inference_datasets/op_rna.h5ad \
+ # --rna resources/grn_benchmark/inference_data/op_rna.h5ad \
# --tf_all resources/grn_benchmark/prior/tf_all.csv \
# --prediction output/prediction.h5ad

2 changes: 1 addition & 1 deletion src/methods/single_omics/scprint/script.py
@@ -28,7 +28,7 @@

## VIASH START
par = {
- 'rna': 'resources/grn_benchmark/inference_datasets/op_rna.h5ad',
+ 'rna': 'resources/grn_benchmark/inference_data/op_rna.h5ad',
'tf_all': 'resources/grn_benchmark/prior/tf_all.csv',
'prediction': 'output/grn.h5ad',
'filtration': 'top-k',
@@ -11,7 +11,7 @@ library(tibble)

## VIASH START
par <- list(
- multiomics_atac = "resources/grn_benchmark/inference_datasets/op_atac.h5ad",
+ multiomics_atac = "resources/grn_benchmark/inference_data/op_atac.h5ad",
annot_peak_database = "resources/grn_benchmark/prior/peak_annotation.csv"
)
## VIASH END
4 changes: 2 additions & 2 deletions src/process_data/op_multiomics/format_data/script.py
@@ -4,8 +4,8 @@
## VIASH START
par = {
'multiome_counts': 'resources/datasets_raw/op_multiome_sc_counts.h5ad',
- 'multiomics_rna': 'resources/grn_benchmark/inference_datasets/op_rna.h5ad',
- 'multiomics_atac': 'resources/grn_benchmark/inference_datasets/op_atac.h5ad'
+ 'multiomics_rna': 'resources/grn_benchmark/inference_data/op_rna.h5ad',
+ 'multiomics_atac': 'resources/grn_benchmark/inference_data/op_atac.h5ad'
}
## VIASH END

2 changes: 1 addition & 1 deletion src/process_data/pereggrn/script.py
@@ -105,7 +105,7 @@ def process_dataset(file_name):
adata.write(f'resources/grn_benchmark/evaluation_data/{file_name}_sc.h5ad')

adata_bulked.write(f'resources/extended_data/{file_name}_bulk.h5ad')
- adata_train.write(f'resources/grn_benchmark/inference_datasets/{file_name}_rna.h5ad')
+ adata_train.write(f'resources/grn_benchmark/inference_data/{file_name}_rna.h5ad')
adata_test.write(f'resources/grn_benchmark/evaluation_data/{file_name}_bulk.h5ad')


2 changes: 1 addition & 1 deletion src/process_data/replogle_k562_gwps/run.sh
@@ -18,5 +18,5 @@ python src/process_data/replogle_k562_gwps/script.py \
--adata_bulk resources/extended_data/replogle_bulk.h5ad \
--adata_test_sc resources/grn_benchmark/evaluation_data/replogle_sc.h5ad \
--adata_test_bulk resources/grn_benchmark/evaluation_data/replogle_bulk.h5ad \
- --adata_train_bulk resources/grn_benchmark/inference_datasets/replogle_rna.h5ad \
+ --adata_train_bulk resources/grn_benchmark/inference_data/replogle_rna.h5ad \
--adata_train_sc resources/extended_data/replogle_train_sc.h5ad \
4 changes: 2 additions & 2 deletions src/process_data/test_data/run.sh
@@ -1,6 +1,6 @@
viash run src/process_data/test_data/config.novsh.yaml -- \
- --rna resources/grn_benchmark/inference_datasets/op_rna.h5ad --rna_test resources_test/grn_benchmark/inference_datasets//op_rna.h5ad \
- --atac resources/grn_benchmark/inference_datasets/op_atac.h5ad --atac_test resources_test/grn_benchmark/inference_datasets//op_atac.h5ad \
+ --rna resources/grn_benchmark/inference_data/op_rna.h5ad --rna_test resources_test/grn_benchmark/inference_data//op_rna.h5ad \
+ --atac resources/grn_benchmark/inference_data/op_atac.h5ad --atac_test resources_test/grn_benchmark/inference_data//op_atac.h5ad \
--perturbation_data resources/grn_benchmark/evaluation_data/op.h5ad --perturbation_data_test resources_test/grn_benchmark/evaluation_data/op.h5ad \
--multiomics_counts resources/grn_benchmark/datasets_raw/op_multiome_sc_counts.h5ad --multiomics_counts_test resources_test/grn_benchmark/datasets_raw/op_multiome_sc_counts.h5ad \
# --perturbation_counts resources/datasets_raw/op_perturbation_sc_counts.h5ad --perturbation_counts_test resources_test/datasets_raw/op_perturbation_sc_counts.h5ad \
8 changes: 4 additions & 4 deletions src/process_data/test_data/script.py
@@ -12,11 +12,11 @@
## VIASH START

par = {
- 'rna': 'resources/grn_benchmark/inference_datasets/op_rna.h5ad',
- 'rna_test': 'resources_test/grn_benchmark/inference_datasets//op_rna.h5ad',
+ 'rna': 'resources/grn_benchmark/inference_data/op_rna.h5ad',
+ 'rna_test': 'resources_test/grn_benchmark/inference_data//op_rna.h5ad',

- 'atac': 'resources/grn_benchmark/inference_datasets/op_atac.h5ad',
- 'atac_test': 'resources_test/grn_benchmark/inference_datasets//op_atac.h5ad',
+ 'atac': 'resources/grn_benchmark/inference_data/op_atac.h5ad',
+ 'atac_test': 'resources_test/grn_benchmark/inference_data//op_atac.h5ad',

'perturbation_data': 'resources/grn_benchmark/evaluation_data//op.h5ad',
'perturbation_data_test': 'resources_test/grn_benchmark/evaluation_data//op.h5ad',
4 changes: 4 additions & 0 deletions src/workflows/run_grn_evaluation/config.vsh.yaml
@@ -25,6 +25,10 @@ argument_groups:
type: string
required: true
direction: input
+ - name: --dataset_id
+ type: string
+ required: true
+ direction: input
- name: --tf_all
type: file
direction: input
2 changes: 2 additions & 0 deletions src/workflows/run_grn_evaluation/main.nf
@@ -39,11 +39,13 @@ workflow run_wf {
// use 'fromState' to fetch the arguments the component requires from the overall state
fromState: [
evaluation_data: "evaluation_data",
+ evaluation_data_sc: "evaluation_data_sc",
prediction: "prediction",
ws_distance_background: "ws_distance_background",
subsample: "subsample",
reg_type: "reg_type",
method_id: "method_id",
+ dataset_id: "dataset_id",
num_workers: "num_workers",
regulators_consensus: "regulators_consensus",
ws_consensus: "ws_consensus",
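
The `fromState` map above pairs each component argument with the state entry it is read from, and the pairs are expected to share a name here (an assumption based on the surrounding entries). A small check like the following could flag entries whose key and value differ, which is often a copy-paste slip. It is a sketch in Python rather than Nextflow, `find_renamed_entries` is a hypothetical helper, and a differing value can also be intentional.

```python
def find_renamed_entries(from_state):
    """Return (argument, state_key) pairs whose names differ."""
    return [(arg, key) for arg, key in from_state.items() if arg != key]


# Hypothetical mapping for illustration, not copied from main.nf.
from_state = {
    "prediction": "prediction",
    "method_id": "method_id",
    "dataset_id": "method_id",  # deliberate mismatch to show what gets flagged
}

mismatches = find_renamed_entries(from_state)
```

Here `mismatches` holds the single pair whose argument name and state key disagree; each reported pair still needs a human decision on whether the rename was intended.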
4 changes: 2 additions & 2 deletions src/workflows_local/benchmark/methods/script.py
@@ -20,8 +20,8 @@
def run_grn_inference(par, dataset='op', subsample=None):
par_local = {
'models_dir': f'resources/grn_models/{dataset}/',
- 'rna': f'resources/grn_benchmark/inference_datasets/{dataset}_rna.h5ad',
- 'atac': f'resources/grn_benchmark/inference_datasets/{dataset}_atac.h5ad',
+ 'rna': f'resources/grn_benchmark/inference_data/{dataset}_rna.h5ad',
+ 'atac': f'resources/grn_benchmark/inference_data/{dataset}_atac.h5ad',
'rna_positive_control': f'resources/datasets_raw/{dataset}.h5ad',
'num_workers': 10,
'tmp_dir': 'output/grn_inference'
2 changes: 1 addition & 1 deletion test.ipynb
@@ -50,7 +50,7 @@
}
],
"source": [
- "adata = ad.read_h5ad('resources/grn_benchmark/inference_datasets/replogle_rna.h5ad')\n",
+ "adata = ad.read_h5ad('resources/grn_benchmark/inference_data/replogle_rna.h5ad')\n",
"adata[adata.obs['is_control']]"
]
}
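
This commit renames the `inference_datasets` directory to `inference_data` across scripts, configs, and notebooks. A sketch of how leftover references to the old name might be found after such a rename; `find_stale_references` and the suffix list are assumptions for illustration, not part of the repository.

```python
from pathlib import Path

OLD_NAME = "inference_datasets"
# Assumed set of text file types worth scanning; adjust as needed.
TEXT_SUFFIXES = {".py", ".sh", ".yaml", ".md", ".nf", ".R", ".ipynb"}


def find_stale_references(root, old_name=OLD_NAME):
    """Return (path, line_number) for every line still mentioning old_name."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        # Ignore undecodable bytes so one odd file does not abort the scan.
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if old_name in line:
                hits.append((str(path), lineno))
    return hits
```

Calling `find_stale_references(".")` from the repository root would list each file and line that still uses the old directory name.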