submission prep script

Overview

prepare_C2M2_submission.py (previously build_term_tables.py) is a Python script that automatically builds controlled-vocabulary (CV) term usage tables for C2M2 datapackage preparation and performs some pre-submission data integrity checks.

The following files are built automatically by this script and should not be hand-created or edited; submit them along with the other required TSVs as part of your datapackage.

analysis_type.tsv
anatomy.tsv
assay_type.tsv
biofluid.tsv
compound.tsv
data_type.tsv
disease.tsv
file_format.tsv
gene.tsv
ncbi_taxonomy.tsv
phenotype.tsv
phenotype_disease.tsv
phenotype_gene.tsv
protein.tsv
protein_gene.tsv
sample_prep_method.tsv
substance.tsv
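
As a rough illustration of what a term usage table is, the sketch below collects the distinct CV term IDs used in a column and looks each one up in a reference mapping. This is not the script's actual implementation: the function names, the `id`/`name`/`description` column layout, and the `EX:` term IDs are all hypothetical and exist only for this example.

```python
import csv
import io


def build_term_table(used_ids, reference):
    """Return a header row plus one row per distinct, non-empty term ID (sorted)."""
    rows = [("id", "name", "description")]  # assumed column layout
    for term_id in sorted({t for t in used_ids if t}):
        # Terms missing from the reference get empty name/description fields.
        name, description = reference.get(term_id, ("", ""))
        rows.append((term_id, name, description))
    return rows


def to_tsv(rows):
    """Serialize rows as tab-separated text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical term IDs as they might appear in one column of a core table.
used = ["EX:0002", "", "EX:0001", "EX:0002"]
# Hypothetical reference data standing in for the CV reference files.
reference = {
    "EX:0001": ("first term", "example description one"),
    "EX:0002": ("second term", "example description two"),
}

term_table = build_term_table(used, reference)
term_tsv = to_tsv(term_table)
print(term_tsv)
```

Because each table is derived entirely from the terms your other TSVs use, hand edits would be overwritten or fall out of sync the next time the script runs.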

The following pre-submission validation checks are currently performed:

Ensure that for any file with a non-null persistent ID, a checksum is also provided.
Ensure that all (non-null) persistent IDs are unique (both within and across tables).
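
The two checks above can be sketched as follows if you want to test your tables before running the script. This is a minimal illustration, not the script's own code: it assumes the column names `local_id`, `persistent_id`, `sha256` and `md5`, and the helper names and sample rows are invented for the example.

```python
import csv
import io
from collections import Counter


def read_tsv(text):
    """Parse TSV text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def cell(row, column):
    """Return a stripped cell value, treating a missing column as empty."""
    return (row.get(column) or "").strip()


def files_missing_checksum(file_rows):
    """Return local_ids of file rows with a persistent ID but no checksum."""
    return [
        cell(row, "local_id")
        for row in file_rows
        if cell(row, "persistent_id")
        and not (cell(row, "sha256") or cell(row, "md5"))
    ]


def duplicate_persistent_ids(tables):
    """Return persistent IDs used more than once, within or across tables."""
    counts = Counter()
    for rows in tables.values():
        for row in rows:
            pid = cell(row, "persistent_id")
            if pid:  # null (empty) persistent IDs are ignored
                counts[pid] += 1
    return sorted(pid for pid, n in counts.items() if n > 1)


# Invented sample data for illustration only.
file_rows = read_tsv(
    "local_id\tpersistent_id\tsha256\tmd5\n"
    "f1\tpid:a\tabc123\t\n"
    "f2\tpid:b\t\t\n"
    "f3\t\t\t\n"
)
biosample_rows = read_tsv(
    "local_id\tpersistent_id\n"
    "b1\tpid:a\n"
    "b2\t\n"
)

missing = files_missing_checksum(file_rows)
duplicates = duplicate_persistent_ids(
    {"file.tsv": file_rows, "biosample.tsv": biosample_rows}
)
print("files missing a checksum:", missing)
print("duplicated persistent IDs:", duplicates)
```

Here `f2` is flagged because it has a persistent ID but neither checksum, and `pid:a` is flagged because it appears in both tables.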

Usage

First build your dcc.tsv, id_namespace.tsv, project.tsv, project_in_project.tsv, file.tsv, file_describes_biosample.tsv, file_describes_collection.tsv, file_describes_subject.tsv, file_in_collection.tsv, biosample.tsv, biosample_disease.tsv, biosample_from_subject.tsv, biosample_gene.tsv, biosample_in_collection.tsv, biosample_substance.tsv, subject.tsv, subject_disease.tsv, subject_in_collection.tsv, subject_phenotype.tsv, subject_race.tsv, subject_role_taxonomy.tsv, subject_substance.tsv, collection.tsv, collection_anatomy.tsv, collection_biofluid.tsv, collection_compound.tsv, collection_defined_by_project.tsv, collection_disease.tsv, collection_gene.tsv, collection_in_collection.tsv, collection_phenotype.tsv, collection_protein.tsv, collection_ptm.tsv, collection_substance.tsv, collection_taxonomy.tsv, biosample_ptm.tsv, ptm.tsv, ptm_type.tsv, ptm_subtype.tsv and domain_location.tsv tables. Some of these can be left empty (as header-only TSVs) if desired: see the C2M2 table wiki for requirements. A zipped folder containing empty core (and core-associated) tables can be downloaded from OSF.
Download the script [Last updated 25 Feb 2025] at OSF
Download the CV reference files [Last updated 27 Nov 2024] at OSF (select external_CV_reference_files and then 'Download as zip'.)
Unzip the external_CV_reference_files folder
Put external_CV_reference_files and prepare_C2M2_submission.py into the same folder
Create a subdirectory containing your pre-built file.tsv, biosample.tsv, etc., then edit line 44 of prepare_C2M2_submission.py so that it refers to that subdirectory.
Use the command line to run the script: python prepare_C2M2_submission.py
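
Before running the script, it can help to confirm that the folder layout from the steps above is in place. The sketch below is a hedged pre-flight check, not part of prepare_C2M2_submission.py: the function name `missing_inputs`, the subdirectory name `my_c2m2_tables`, and the short list of required tables are assumptions made for the example (the demo builds a throwaway layout in a temporary directory).

```python
import tempfile
from pathlib import Path


def missing_inputs(base_dir, data_subdir, required_tables):
    """Return the expected paths (relative to base_dir) that do not exist."""
    base = Path(base_dir)
    problems = []
    if not (base / "prepare_C2M2_submission.py").is_file():
        problems.append("prepare_C2M2_submission.py")
    if not (base / "external_CV_reference_files").is_dir():
        problems.append("external_CV_reference_files/")
    for name in required_tables:
        if not (base / data_subdir / name).is_file():
            problems.append(f"{data_subdir}/{name}")
    return problems


# Demo: build a throwaway layout that is missing biosample.tsv.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "prepare_C2M2_submission.py").write_text("# placeholder\n")
    (base / "external_CV_reference_files").mkdir()
    tables_dir = base / "my_c2m2_tables"  # hypothetical subdirectory name
    tables_dir.mkdir()
    (tables_dir / "file.tsv").write_text("id_namespace\tlocal_id\n")

    problems = missing_inputs(base, "my_c2m2_tables", ["file.tsv", "biosample.tsv"])

print("missing:", problems)
```

To check your own working folder, call `missing_inputs(".", "<your subdirectory>", [...])` with the table names you have prepared; an empty list means everything the check looks for was found.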

This script is under active development. Please contact us with any questions by emailing the helpdesk at [email protected] or by posting to Discussions.
