TCS.rb -The general script to construct TCS
Dr.rb -The script to construct TCS for the MPID-HIVDR MiSeq sequencing. Post-TCS QC will check if the TCSs are in the correct sequencing regions
log_multi.rb -Format and sort TCSs and libraries after TCS.rb or DR.rb
SDRM.rb -Surveillance Drug Resistance Mutation (SDRM) analysis using TCSs from the Dr.rb pipeline. Also generates N-J trees and calculates Pi and the first quintile of pairwise comparisons.
viral_seq.rb -viral_seq version 0.1.0. Core functions for TCS/DR pipeline. Latest RubyGem 'viral_seq' at
end_join.rb -Join paired-end sequences after the TCS pipeline. A consensus model is used.
mut_table.rb -Use sample sequence consensus as a reference to calculate mutation types and mutation rate.
- fix a bug reading raw FASTQ sequence files named under a different MiSeq naming system.
- fix another bug for reading R1 and R2 files
- bug fix for reading R1 and R2 files
1. Add rescue for :sequence_locator, in case of rare alignment issues.
1. Improved performance.
1. Input files can be either .fastq or .fastq.gz; .gz files are unzipped automatically.
2. Minor improvement of efficiency
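The .fastq / .fastq.gz input handling noted above can be illustrated with a minimal sketch. This is not the pipeline's own code: the helper names `fastq_lines` and `read_fastq` are hypothetical, and the sketch assumes plain 4-line FASTQ records. It uses only `Zlib::GzipReader` from the Ruby standard library.

```ruby
require "zlib"

# Hypothetical helper (not part of TCS.rb): returns all lines of a FASTQ
# file, decompressing on the fly when the file name ends in ".gz".
def fastq_lines(path)
  lines = []
  if path.end_with?(".gz")
    Zlib::GzipReader.open(path) do |gz|
      gz.each_line { |line| lines << line.chomp }
    end
  else
    File.foreach(path) { |line| lines << line.chomp }
  end
  lines
end

# Hypothetical helper: groups lines into 4-line FASTQ records and returns
# a hash of read name => sequence. Assumes well-formed 4-line records.
def read_fastq(path)
  seqs = {}
  fastq_lines(path).each_slice(4) do |header, seq, _plus, _qual|
    next if header.nil? || seq.nil?
    seqs[header.sub(/\A@/, "")] = seq
  end
  seqs
end
```

Reading through a gzip stream avoids writing an unzipped copy to disk; the actual scripts may instead decompress the file first, as the note above suggests.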
1. Remove the temp_dir if fail to create TCS
1. Fix a bug in method #sequence_locator. Refine the alignment if the ref sequence starts and/or ends with "-"
1. Fix a bug of method #sequence_locator
1. Update new V3 DR primer.
1. If the forward primer does not contain "N"s, the whole sequence will be used as the biological forward primer.
1. Consensus cut-off model based on 3 levels of error rate (0.02, 0.01, 0.005). The default is 0.02.
1. Adapted to TCS website
2. Compress output directory in .tar.gz file
Patch Notes:
1. Compare PIDs whose consensus sequences are identical.
2. PIDs that differ by 1 base will be recognized. If PID1 is x times more abundant than PID2, PID2 will be discarded.
3. PID factor x is 10 by default.
4. The PID filter only applies when the number of potential consensus sequences is less than 0.3% of the maximum capacity of the PID.
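The four rules above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's implementation: the method names, the input hash layout, and the reading of "maximum capacity" as 4^(PID length) are all assumptions, and "x times" is interpreted here as `count1 >= x * count2`.

```ruby
# Number of positions at which two equal-length strings differ.
def hamming(a, b)
  a.each_char.zip(b.each_char).count { |x, y| x != y }
end

# Hypothetical filter. pids has the (assumed) shape:
#   { primer_id => { seq: consensus_sequence, count: raw_read_count } }
def filter_pids(pids, factor: 10, capacity_fraction: 0.003)
  return pids if pids.empty?

  # Assumption: maximum PID capacity is 4**length (4 bases per position).
  max_capacity = 4**pids.keys.first.length
  # Rule 4: only filter when consensus count is below 0.3% of capacity.
  return pids unless pids.size < capacity_fraction * max_capacity

  discarded = []
  # Rule 1: only compare PIDs whose consensus sequences are identical.
  pids.group_by { |_pid, info| info[:seq] }.each_value do |group|
    group.combination(2) do |(pid1, info1), (pid2, info2)|
      # Rule 2: the two PIDs must differ by exactly 1 base.
      next unless pid1.length == pid2.length && hamming(pid1, pid2) == 1

      big, small = [[pid1, info1], [pid2, info2]].sort_by { |_p, i| -i[:count] }
      # Rules 2 and 3: drop the rarer PID when the other is >= factor times larger.
      discarded << small[0] if big[1][:count] >= factor * small[1][:count]
    end
  end
  pids.reject { |pid, _info| discarded.include?(pid) }
end
```

The pairwise comparison is quadratic within each group of identical sequences, which is acceptable for a sketch; a production version would likely index PIDs by their 1-base neighbors instead.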
Patch Notes:
1. Add Primer ID filter after consensus creation. Compare PIDs whose consensus sequences are identical. PIDs that differ by 1 base will be recognized. If PID1 is x times more abundant than PID2, PID2 will be discarded. PID factor x is 10 by default.
Patch Notes:
1. Allow ambiguities of bases in the gene-specific sequences.
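One common way to tolerate ambiguous bases in a gene-specific primer sequence is to expand the standard IUPAC nucleotide codes into a regular expression. The sketch below shows that technique; the constant and method names are hypothetical, and the pipeline's own matching logic may differ.

```ruby
# Standard IUPAC nucleotide ambiguity codes expanded to regex fragments.
IUPAC_PATTERNS = {
  "A" => "A", "C" => "C", "G" => "G", "T" => "T",
  "R" => "[AG]", "Y" => "[CT]", "S" => "[CG]", "W" => "[AT]",
  "K" => "[GT]", "M" => "[AC]",
  "B" => "[CGT]", "D" => "[AGT]", "H" => "[ACT]", "V" => "[ACG]",
  "N" => "[ACGT]"
}.freeze

# Hypothetical helper: converts a primer containing ambiguity codes into
# a Regexp that matches every concrete sequence the primer can represent.
# Raises KeyError on characters that are not IUPAC nucleotide codes.
def primer_to_regexp(primer)
  pattern = primer.upcase.each_char.map { |base| IUPAC_PATTERNS.fetch(base) }.join
  Regexp.new(pattern)
end
```

For example, `primer_to_regexp("ACRT")` matches both "ACAT" and "ACGT" but not "ACCT".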
Patch Notes:
1. Allow multiplexed Primer ID sequencing systems. Input primers in pairs for all sets.
2. Add option to ignore the 1st nucleotide of the Primer ID.
Create Primer ID template consensus sequences from raw MiSeq FASTQ files
Input = directory of raw sequences of two ends (R1 and R2 FASTQ files)
Required parameters:
List of primer sequences (cDNA primer and 1st round PCR forward primer), including a tag for each pair name
ignore the first nucleotide of Primer ID: Yes/No (default: Yes)
Patch Notes:
1. Consensus cut-off is calculated using the average abundance of the 5 most abundant Primer IDs.
2. Add 'resampling indicator' = consensus sequences without ambiguities / all consensus sequences, including those with ambiguities.
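The two quantities named above can be computed as in the following sketch. The method names are hypothetical, an "ambiguity" is assumed to mean any character other than A, C, G, or T, and how the cut-off itself is derived from the top-5 average is not stated in these notes, so it is not shown.

```ruby
# Hypothetical helper: mean read count of the n most abundant Primer IDs.
# counts is an array of per-PID read counts; returns nil when it is empty.
def top_abundance_mean(counts, n = 5)
  top = counts.sort.reverse.first(n)
  return nil if top.empty?
  top.sum.to_f / top.size
end

# Hypothetical helper: fraction of consensus sequences with no ambiguity.
# Assumption: a sequence is unambiguous when it contains only A, C, G, T.
def resampling_indicator(consensus_seqs)
  return nil if consensus_seqs.empty?
  unambiguous = consensus_seqs.count { |seq| seq.match?(/\A[ACGT]+\z/i) }
  unambiguous.to_f / consensus_seqs.size
end
```

With four consensus sequences of which two contain an ambiguity code, `resampling_indicator` returns 0.5.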
Create Primer ID template consensus sequences from raw MiSeq FASTQ files
Input = directory of raw sequences of two ends (R1 and R2 FASTQ files)
Required parameters:
Length of Primer ID
Primer Sequence of cDNA primer and 1st round PCR forward Primer