Pipeline that visualizes the location of virulence genes in a taxonomic and phylogenetic tree

Introduction

This pipeline has two workflows. The first workflow, USE_PATRIC, makes a taxonomic tree with a phylogenetic profile next to it. It uses gene names (which can be changed in the config file) to search the PATRIC database. It takes all the entries that contain a gene name and makes a tree from them. It then checks which species have the input gene and which do not, and uses this to make the phylogenetic profile. The second workflow, USE_BLAST, makes a taxonomic tree and a phylogenetic tree of the input gene(s). It also has the option to make a taxonomic tree with a phylogenetic profile if the input file contains more than one gene.

Author: Aldo Vree
Date last updated: 16 June 2021
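As a rough illustration of the phylogenetic profile described in the introduction, the sketch below marks, for each species, whether each gene is present (1) or absent (0). It is not the pipeline's own code: the function name `build_profile`, the species list, and the gene names are all hypothetical and chosen only for the example.

```python
from typing import Dict, List, Set


def build_profile(species: List[str], hits: Dict[str, Set[str]]) -> Dict[str, Dict[str, int]]:
    """Return {species: {gene: 1 or 0}}, marking which species carry which gene."""
    genes = sorted(hits)
    return {
        sp: {gene: int(sp in hits[gene]) for gene in genes}
        for sp in species
    }


# Hypothetical example data: the species in the tree and, per gene,
# the set of species in which that gene was found.
species = ["Escherichia coli", "Salmonella enterica", "Vibrio cholerae"]
hits = {
    "geneA": {"Escherichia coli", "Salmonella enterica"},
    "geneB": {"Vibrio cholerae"},
}

profile = build_profile(species, hits)

# Print one row per species, one presence/absence column per gene.
for sp in species:
    row = " ".join(str(profile[sp][gene]) for gene in sorted(hits))
    print(f"{sp}\t{row}")
```

Each row of this table corresponds to one leaf of the taxonomic tree, which is why the profile can be drawn next to the tree.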

Technologies

Python 3.6
Snakemake 5.10
R 4.0.3
ETE3

How to launch the pipeline?

The pipeline runs in Snakemake. To run it, type 'snakemake' on the command line:

snakemake # Make sure you are in the directory with the Snakefile.

Note: The first time you run the USE_PATRIC workflow, it takes a very long time (up to 10 hours) because the whole PATRIC database has to be downloaded. If you are on the alive server of the UU, the PATRIC database is already on the server and a run then takes about 10 minutes.

Config file

The config file contains all the variables that can be changed. For the USE_PATRIC workflow, the names of the search genes can be changed in this file, as can the columns in which the program searches. The directory with all the scripts also contains a file named config.ini, which lists all the variables that can be changed. To change a variable, edit only the part after the = sign. For example:

searchword   =   transduction  #Only change transduction into what you want.
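To show why only the part after the = sign matters, here is a minimal sketch that reads such a line with Python's standard `configparser` module. How the pipeline actually reads config.ini is not stated here, so the use of `configparser`, the `[settings]` section name, and the inline file content are assumptions made for illustration.

```python
import configparser

# Hypothetical file content; the real config.ini may use other sections and keys.
EXAMPLE = """
[settings]
searchword   =   transduction  #Only change transduction into what you want.
"""

# inline_comment_prefixes tells the parser to drop the trailing "#..." comment,
# which configparser does not do by default.
parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
parser.read_string(EXAMPLE)

# Whitespace around the key and the value is stripped automatically.
searchword = parser["settings"]["searchword"]
print(searchword)
```

The variable name on the left of the = sign is the key the scripts look up, so renaming it would break the lookup, while the value on the right can be changed freely.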

USE_BLAST workflow

For the USE_BLAST workflow, one adjustment needs to be made in the Snakefile before running it. The first time you run the workflow, "expand("tree_total_tax.png")" on line 11 of the Snakefile needs to be commented out. The workflow then does not make the total taxonomic tree with the phylogenetic profile. When you want the taxonomic tree with the phylogenetic profile to be made, "expand("tree_total_tax.png")" on line 11 needs to be uncommented. You also need to make a file named 'total_tax_genes.txt'. This file must contain the contents of the output file '{genename}_gene_patric.txt' for every gene that you want in the total taxonomic tree. So, if you want to put 3 genes in the tree, you have to make the file manually by appending '{genename_1}_gene_patric.txt', '{genename_2}_gene_patric.txt' and '{genename_3}_gene_patric.txt' to the 'total_tax_genes.txt' file. The following commands can be used:

cat {genename_1}_gene_patric.txt >> 'total_tax_genes.txt'
cat {genename_2}_gene_patric.txt >> 'total_tax_genes.txt'
cat {genename_3}_gene_patric.txt >> 'total_tax_genes.txt'
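When many genes are involved, the same concatenation can be scripted. The sketch below is a Python equivalent of the cat commands above and is not part of the pipeline. The function name `build_total_tax_genes` and the gene names are hypothetical, and the demo uses placeholder files in a temporary directory so that it can run on its own.

```python
import tempfile
from pathlib import Path


def build_total_tax_genes(gene_names, workdir, out_name="total_tax_genes.txt"):
    """Append every '<gene>_gene_patric.txt' in workdir to out_name, in order."""
    workdir = Path(workdir)
    out_path = workdir / out_name
    with out_path.open("a") as out:  # "a" mirrors the shell's >> (append)
        for gene in gene_names:
            out.write((workdir / f"{gene}_gene_patric.txt").read_text())
    return out_path


# Demo with placeholder files (the gene names are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    for gene in ("geneA", "geneB"):
        Path(tmp, f"{gene}_gene_patric.txt").write_text(f"{gene}_entry\n")
    result = build_total_tax_genes(["geneA", "geneB"], tmp)
    combined = result.read_text()
    print(combined)
```

Like `>>`, the function appends to an existing file, so delete an old 'total_tax_genes.txt' before rebuilding it to avoid duplicate entries.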

conda environment

For the USE_BLAST workflow, it is necessary to use a conda environment if BLAST 2.10 is not installed. To make the environment, use the following command:

conda create -n blast10 -c bioconda blast=2.10.1

To activate the conda environment use the following command:

conda activate blast10

To deactivate the conda environment use the following command:

conda deactivate

IQ-TREE must be installed in the conda environment as well. With the environment activated, use the following command to do that:

conda install -c bioconda iqtree

Install dependencies

This section explains how to install the required dependencies.

Snakemake

To install Snakemake, run the following commands on the command line (Miniconda must be installed first):

conda install -c conda-forge mamba

mamba create -c conda-forge -c bioconda -n snakemake snakemake

conda activate snakemake

ETE3

To install ETE3, run the following commands on the command line (Miniconda must be installed first):

conda install -c etetoolkit ete3 ete_toolchain

# To check if it is installed:
ete3 build check
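Before launching the pipeline, it can help to confirm that the command-line tools installed above are on the PATH of the active environment. The following is a minimal sketch, not part of the pipeline. The executable names in `TOOLS` are assumptions based on the usual names of these programs and may differ per installation (for example, newer IQ-TREE releases ship an `iqtree2` executable).

```python
import shutil


def missing_tools(tools):
    """Return the tools from the list that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Assumed executable names; adjust them to match your installation.
TOOLS = ["snakemake", "ete3", "blastp", "iqtree"]

missing = missing_tools(TOOLS)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All tools found.")
```

Run the check inside the conda environment you intend to use, since each environment has its own PATH entries.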
