This repository is a fork of Nextstrain ncov. It contains a collection of tools for customized phylogenomic analysis of Illinois COVID-19 viral sequences hosted at the Chicagoland Pandemic Response Commons (PRC). The visualization of the workflow output can be deployed at the PRC data commons using gen3-auspice.
Detailed instructions on workflow setup, data preparation, analysis customization, and results interpretation can be found HERE.
For faster installation, update Conda to the latest version and install Mamba.
conda update -n base conda
conda install -n base -c conda-forge mamba
Create a conda virtual environment. The command below installs all the Nextstrain tools as well as the gen3-augur query tools.
# change directory to the gen3-ncov folder
cd gen3-ncov
mamba env create -n {env_name} -f ./environment.yml
Confirm that the installation works:
conda activate {env_name}
nextstrain check-setup --set-default
Edit the file set_env_var.sh and set the variable GEN3_API_KEY to the path of your PRC credentials file. The PRC credentials (JSON format) can be downloaded from the profile page after logging in to the PRC data commons. After saving the file, run the command below.
source set_env_var.sh
# Confirm the environment variables
echo $GEN3_API_KEY
echo $project_id
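The echo commands only print the values. As a minimal sketch, the function below also checks that GEN3_API_KEY points to a readable file. The name check_gen3_env is hypothetical and not part of this repository, and the sketch assumes (as the echo commands suggest) that set_env_var.sh sets project_id as well.

```shell
# Hypothetical helper (not part of this repo): sanity-check the variables
# that set_env_var.sh is expected to define.
check_gen3_env() {
    # GEN3_API_KEY must be set and non-empty
    if [ -z "${GEN3_API_KEY:-}" ]; then
        echo "GEN3_API_KEY is not set" >&2
        return 1
    fi
    # It must point to a readable credentials file
    if [ ! -r "$GEN3_API_KEY" ]; then
        echo "Credentials file not readable: $GEN3_API_KEY" >&2
        return 1
    fi
    # Assumption: set_env_var.sh also defines project_id
    if [ -z "${project_id:-}" ]; then
        echo "project_id is not set" >&2
        return 1
    fi
    echo "Environment looks OK"
}
```

After sourcing set_env_var.sh, call check_gen3_env. A non-zero exit status indicates which variable needs attention.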
To build the phylogenetic tree including all Illinois COVID-19 strains hosted at the PRC data commons, simply run:
bash build_il_siu_tree.sh
- This bash script uses the IL_SIU_tree profile under the ./my_profiles folder.
- This script uses the gen3-client command-line tool to download object files from the PRC commons. The tool included in this repo is compatible with Linux systems. To get the gen3-client for Windows and OSX, visit cdis-data-client.
- This workflow analyzes all IL COVID-19 strains submitted to the PRC data commons, without a subsampling scheme.
- To run a quicker analysis with a subsampling scheme, run the command below after the download step is done.
nextstrain build . --configfile my_profiles/IL_SIU_tree_subsampling/builds.yaml
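If the config path is mistyped, the build fails only after Nextstrain starts up. As a rough sketch, a small wrapper can check that the file exists first. The function run_build and the NEXTSTRAIN_CMD override are hypothetical and not part of this repository. Only the `nextstrain build . --configfile` invocation comes from the command above.

```shell
# Hypothetical wrapper (not part of this repo): run a build only if the
# given config file exists.
run_build() {
    config="$1"
    if [ ! -f "$config" ]; then
        echo "Config file not found: $config" >&2
        return 1
    fi
    # NEXTSTRAIN_CMD is an assumed override, useful for dry runs;
    # it defaults to the real nextstrain CLI.
    ${NEXTSTRAIN_CMD:-nextstrain} build . --configfile "$config"
}
```

For example, `run_build my_profiles/IL_SIU_tree_subsampling/builds.yaml` starts the subsampled build, and setting `NEXTSTRAIN_CMD=echo` beforehand prints the command instead of running it.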