tongshiyuan/VarTools

VarTools

Requirements

  • Python 3.x
  • numpy
  • pandas
  • scipy
  • Java 1.8 (for VarDict and GATK)
  • R (version >= 3.6.3, for VarDict)
  • samtools

Functions

  1. f2v: runs the analysis from FASTQ to gVCF.
  2. tGT: genotypes gVCFs created by GATK into a VCF in trio mode.
  3. sGT: genotypes a gVCF created by GATK into a VCF in single-sample mode.
  4. cc: case-control analysis, with two approaches available.
  5. fp: builds a false-positive database for filtering.
  6. bqc: BAM quality check.
  7. gd: gender identification.

f2v

python3 VarTools.py f2v -i in_dir -o out_dir -b bed -p prefix --vcf --fastqc --qualimap --keep_tmp -t 6

trio_genotype

python3 VarTools.py tGT -p .gvcf -f .gvcf -m .gvcf -s .gvcf,.gvcf -o ./result/outName

single_genotype

python3 VarTools.py sGT -p .gvcf -o ./result/outName

case-control

python3 VarTools.py cc --case case_dir --control control_dir --mode AD -o result

build false positive database

python3 VarTools.py fp -i indir -o ./result --snvdb false_positive.txt --overlap_rate 0.3 --file_type vcf

bam quality check

python3 VarTools.py bqc -b in.bam --bed bed

gender identification

python3 VarTools.py gd -b in.bam -d bed

anno

python3 VarTools.py anno -i cohort.vcf.gz -o anno_result/ --prefix cohort -t 8

software

database

some files in lib

  • Hg19.genome.bed/Hg38.genome.bed
# Hg38 from GATK
cut -f 1,2 Hg38.fasta.fai | head -n 24 | sort -k1,1  > Hg38.genome
# Hg19 from GATK
cut -f 1,2 Hg19.fasta.fai | sed -n '2,25p' | sort -k1,1  > Hg19.genome
zcat gap.txt.gz | awk -F"\t" '{if ($2~/^.{4,5}$/) print $2"\t"$3"\t"$4 }' | bedtools sort -i stdin > gap.bed
bedtools complement -i gap.bed -g Hg38.genome > Hg38.genome.bed
  • VarDict_assembly19_fromBroad_5k_150bpOL_seg.bed
# a bed file for VarDict WGS calling 
bedtools makewindows -g human.hg38.fa.fai -w 50150 -s 50000 > hg38.wgs.bed
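
The two BED recipes above rely on `bedtools complement` (everything in the genome that is not a gap) and `bedtools makewindows` (50,150 bp windows stepping by 50,000 bp, so neighbouring windows overlap by 150 bp). The plain-Python sketch below only illustrates that interval logic as I understand it; it is not part of VarTools, the function names `complement` and `make_windows` are hypothetical, and the chromosome names and sizes in the example are made up.

```python
from typing import Dict, Iterator, List, Tuple

# (chrom, start, end) using 0-based, half-open coordinates, as in BED files.
Interval = Tuple[str, int, int]


def complement(gaps: List[Interval], sizes: Dict[str, int]) -> List[Interval]:
    """Return the parts of each chromosome that are not covered by any gap."""
    out: List[Interval] = []
    for chrom in sorted(sizes):
        cursor = 0  # first position not yet known to be inside a gap
        chrom_gaps = sorted((s, e) for c, s, e in gaps if c == chrom)
        for start, end in chrom_gaps:
            if start > cursor:
                out.append((chrom, cursor, start))
            cursor = max(cursor, end)
        if cursor < sizes[chrom]:
            out.append((chrom, cursor, sizes[chrom]))
    return out


def make_windows(sizes: Dict[str, int], window: int, step: int) -> Iterator[Interval]:
    """Yield windows of length `window` every `step` bp, clipped at the chromosome end."""
    for chrom, size in sizes.items():
        for start in range(0, size, step):
            yield (chrom, start, min(start + window, size))


# Made-up toy genome, only to show the shape of the output.
toy_sizes = {"chrA": 120000, "chrB": 1000}
toy_gaps = [("chrA", 0, 10000), ("chrA", 60000, 70000)]

non_gap = complement(toy_gaps, toy_sizes)
windows = list(make_windows({"chrA": 120000}, window=50150, step=50000))
```

With `-w 50150 -s 50000`, each window ends 150 bp after the next one starts, so a variant near a window boundary falls entirely inside at least one window.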

some databases built

  • ReVe
zcat hg19_ReVe.txt.gz | cut -f 1-5,8 > hg19_ReVe_tmp1.txt
perl index_annovar.pl hg19_ReVe_tmp1.txt -outfile hg19_ReVe.txt -comment comment.txt
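
The `cut -f 1-5,8` step above keeps tab-separated columns 1 to 5 plus column 8 of every line before the file is indexed. A minimal Python sketch of the same column selection follows; the helper name `cut_fields` is hypothetical and the sample row is invented, since the actual column layout of hg19_ReVe.txt.gz is not described here.

```python
from typing import Iterable, Iterator, Sequence


def cut_fields(lines: Iterable[str],
               fields: Sequence[int] = (1, 2, 3, 4, 5, 8)) -> Iterator[str]:
    """Keep the given 1-based tab-separated fields of each line, like `cut -f`."""
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        # Fields beyond the end of a short line are silently skipped.
        yield "\t".join(cols[i - 1] for i in fields if i <= len(cols))


# Invented example row with eight tab-separated columns.
row = "1\t100\t100\tA\tG\t0.10\t0.20\t0.93\n"
kept = list(cut_fields([row]))
```

For a real file, the same generator could be fed from `gzip.open(path, "rt")` and its output written line by line to the temporary file.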

About

Tools for whole genome sequence analysis.
