
Setup and customize TransVar

Wanding Zhou edited this page Apr 8, 2016 · 4 revisions

Use environment variables to direct download and configuration

TRANSVAR_CFG

Stores the path to transvar.cfg.

export TRANSVAR_CFG=path_to_transvar.cfg

TRANSVAR_DOWNLOAD_DIR

Stores the path to the directory where automatically downloaded annotations and references are placed.

export TRANSVAR_DOWNLOAD_DIR=path_to_transvar_download_directory
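
As a minimal sketch, both variables can be set together, for example in a shell startup file. The TRANSVAR_HOME variable and the directory layout below are assumptions made for illustration. Only TRANSVAR_CFG and TRANSVAR_DOWNLOAD_DIR are described above.

```shell
# TRANSVAR_HOME is a hypothetical helper variable used only by this sketch.
TRANSVAR_HOME="${TRANSVAR_HOME:-${HOME:-/tmp}/transvar_data}"

# Create the download directory up front so that it exists before any download.
mkdir -p "$TRANSVAR_HOME/download"

# The two variables described above.
export TRANSVAR_CFG="$TRANSVAR_HOME/transvar.cfg"
export TRANSVAR_DOWNLOAD_DIR="$TRANSVAR_HOME/download"
```

Keeping the configuration file and the download directory under one parent makes it easy to move or back up the whole setup.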

Install and specify reference genome assembly

For some genome assemblies (currently hg18, hg19, hg38, mm9 and mm10), we provide download via transvar config --download_ref --refversion [reference name]. See transvar config -h for all choices of [reference name]. For other genome assemblies, one can download the genome and index it manually with transvar index --reference [fasta]. Under the hood, TransVar uses samtools faidx, so any existing faidx index can be used as is. Once downloaded and indexed, the genome can be used through the --reference option followed by the path to the genome.
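
Because an existing faidx index can be reused, a wrapper can skip indexing when one is already present. The sketch below assumes the index sits next to the FASTA file as [fasta].fai, which is where samtools faidx writes it. The function name ensure_indexed is hypothetical and not part of TransVar.

```shell
# ensure_indexed is a hypothetical helper, not part of TransVar.
# It indexes a FASTA file only when no faidx index (<fasta>.fai) sits next to it.
ensure_indexed() {
    fasta="$1"
    if [ -f "$fasta.fai" ]; then
        echo "reusing existing index: $fasta.fai"
    else
        # Command taken from the paragraph above.
        transvar index --reference "$fasta"
    fi
}
```

Calling ensure_indexed ./hg19.fa then either reports the existing index or builds a new one.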

To set the default location of the genome file for a reference version, say to ./hg19.fa,

transvar config -k reference -v ./hg19.fa --refversion hg19

will create in transvar.cfg an entry

[hg19]
reference = hg19.fa

so that there is no need to specify the location of reference on subsequent usages.
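
To check from a script which value is stored for a key, the entry can be read back out of transvar.cfg. This is a rough sketch that assumes the file keeps the simple "[section]" and "key = value" layout shown above. The helper cfg_get is hypothetical and not part of TransVar.

```shell
# cfg_get is a hypothetical helper, not part of TransVar.
# Usage: cfg_get FILE SECTION KEY
# Prints the value of KEY inside [SECTION], or nothing if it is absent.
cfg_get() {
    awk -F ' *= *' -v section="[$2]" -v key="$3" '
        # A header line switches the current section on or off.
        /^\[/ { in_section = ($0 == section) }
        # Inside the wanted section, print the first matching key and stop.
        in_section && $1 == key { print $2; exit }
    ' "$1"
}
```

For example, cfg_get "$TRANSVAR_CFG" hg19 reference prints the stored reference path for hg19, provided TRANSVAR_CFG points at the configuration file.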

Install and specify transcript annotations

TransVar provides automatic download of transcript annotations. For example, transvar config --download_anno --refversion hg19 automatically downloads annotations from Ensembl, RefSeq, etc. to the [installdir]/lib/transvar/transvar.download directory, or to your local ~/.transvar.download if the installation directory is inaccessible. See transvar config -h for all version names. The download also creates default mappings under the corresponding reference version section of transvar.cfg, such as

[hg19]
ucsc = /home/wzhou1/download/hg19.ucsc.txt.gz
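
To see which annotation files a download directory holds for one reference version, a small listing helper may be enough. The sketch assumes files are named [refversion].[database].[extension], as in hg19.ucsc.txt.gz. The helper list_annotations is hypothetical and not part of TransVar.

```shell
# list_annotations is a hypothetical helper, not part of TransVar.
# Usage: list_annotations DIR REFVERSION
# Assumes downloaded files are named <refversion>.<database>.<extension>.
list_annotations() {
    dir="$1"
    refversion="$2"
    for f in "$dir/$refversion".*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] || continue
        basename "$f"
    done
}
```

The pattern also matches any other file in the directory whose name starts with the reference version, so the output is a superset of the annotation files.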

One also has the option of downloading from the Ensembl collection.

transvar config --download_ensembl --refversion mus_musculus

If the refversion is not specified, the user is prompted with a collection of options to choose from.

View current configuration

One can read the transvar.cfg file for the information. Alternatively one may run

transvar current

which returns information about the setup for the current reference selection, including the location of the reference file and the database files.

Current reference version: mm10
reference: /home/wzhou/genomes_link/mm10/mm10.fa
Available databases:
refseq: /home/wzhou/tools/transvar/transvar/transvar.download/mm10.refseq.gff.gz
ccds: /home/wzhou/tools/transvar/transvar/transvar.download/mm10.ccds.txt
ensembl: /home/wzhou/tools/transvar/transvar/transvar.download/mm10.ensembl.gtf.gz

Specifying --refversion displays the information under that reference version (without changing the default reference version setup).
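
The listing can also be checked mechanically, for instance to confirm that every configured database file still exists. The sketch below assumes the output is written to standard output in the "name: path" format shown above. The helper check_databases is hypothetical and not part of TransVar.

```shell
# check_databases is a hypothetical helper, not part of TransVar.
# It reads "transvar current"-style output on stdin and reports, for each
# "name: path" line after "Available databases:", whether the path exists.
check_databases() {
    awk -F ': ' '
        /^Available databases:/ { in_db = 1; next }
        in_db && NF == 2 { print $1 "\t" $2 }
    ' | while IFS="$(printf '\t')" read -r name path; do
        if [ -e "$path" ]; then
            echo "$name: ok"
        else
            echo "$name: missing"
        fi
    done
}
```

A typical call would be transvar current | check_databases, which prints one line per database.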