valX files vs trimmed files? diff output same code? #162

desmodus1984 · 2023-04-27T15:35:11Z

Hi,

I want to trim EM-SEQ fastq files.
I used the same code, first for a single pair, and then for a batch.
The code for the first pair was:

trim_galore --2colour 20 --illumina -o trim --paired V00001_R1.fastq.gz V00001_R2.fastq.gz

and the output was:
V00001_R1_val_1.fq.gz
V00001_R2_val_2.fq.gz

The summary stated trimming mode - paired end:

SUMMARISING RUN PARAMETERS

Input filename: V00001_R1.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.10
Cutadapt version: 1.18
Number of cores used for trimming: 1
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; user defined)
2-colour high quality G-trimming enabled, with quality cutoff: --nextseq-trim=20
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Output file will be GZIP compressed

Then, for a second pair I used the code:

trim_galore --2colour 20 --illumina --output_dir=trim -j 4 --paired V00021_R1.fastq.gz V00021_R2.fastq.gz

The output files were:
V00021_R1_trimmed.fq.gz
V00021_R2_trimmed.fq.gz

And the summary:

SUMMARISING RUN PARAMETERS

Input filename: V00021_R1.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.10
Cutadapt version: 1.18
Python version: could not detect
Number of cores used for trimming: 4
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; user defined)
2-colour high quality G-trimming enabled, with quality cutoff: --nextseq-trim=20
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Output file will be GZIP compressed

Why did the first pair get the val_* suffix while the second only got _trimmed?

Is there something about the command that I missed, or is this an effect of using multithreaded mode?

Thanks.

FelixKrueger · 2023-04-27T18:57:25Z

If you still have files called *trimmed.fq.gz around in paired-end mode, it is likely that the run hasn't completely finished. Once the validation process is complete, both intermediate trimmed.fq.gz files will be deleted.
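A quick way to tell a finished run from an interrupted one is to look for leftover intermediates. The following is only a minimal sketch, not part of Trim Galore: `check_trim_done` is a hypothetical helper, and it assumes the file naming seen in this thread (`<sample>_R1_trimmed.fq.gz`, `<sample>_R1_val_1.fq.gz`, and the R2 equivalents).

```shell
# Hypothetical helper (not part of Trim Galore): report whether a paired-end
# run looks finished, assuming the file naming shown in this thread.
# Usage: check_trim_done <output_dir> <sample_prefix>
check_trim_done() {
    dir=$1
    sample=$2

    # Leftover *_trimmed.fq.gz intermediates suggest validation did not finish.
    if [ -e "$dir/${sample}_R1_trimmed.fq.gz" ] || [ -e "$dir/${sample}_R2_trimmed.fq.gz" ]; then
        echo "incomplete"
        return 1
    fi

    # Both validated files must exist and be non-empty.
    if [ -s "$dir/${sample}_R1_val_1.fq.gz" ] && [ -s "$dir/${sample}_R2_val_2.fq.gz" ]; then
        echo "complete"
        return 0
    fi

    echo "missing"
    return 1
}
```

For example, `check_trim_done trim V00021` prints `incomplete`, `complete`, or `missing` depending on which files are present in `trim/`. If it reports `incomplete`, rerunning the trimming for that pair is probably the safest option.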

As a side note, if this trimming is for methylation alignments, I would recommend the trimming settings described here: http://felixkrueger.github.io/Bismark/bismark/library_types/#em-seq-neb

tamuanand · 2023-05-18T23:38:30Z

Hi @FelixKrueger

Related questions specific to EM-Seq:

1. I assume one has to run trim_galore explicitly on the R1/R2 files first and then pass the trimmed R1/R2 files to bismark?
2. Based on your comment above, should I explicitly set --clip_R1 10 --clip_R2 10 --three_prime_clip_R1 10 --three_prime_clip_R2 10 when using trim_galore, or not? The legend below the table at https://felixkrueger.github.io/Bismark/bismark/library_types/ suggests "Default settings (nothing in particular is required, just use Trim Galore or Bismark default parameters)".
3. If it's OK with you, would you know the equivalent command for bbduk.sh? Given that bbduk is Java-based, I would expect this step to be much faster.

Thanks.

FelixKrueger · 2023-05-19T08:39:57Z

You don't necessarily have to use Trim Galore, but yes, some trimming is recommended. The nf-core/methylseq pipeline has an EM-seq switch which should work equally well:

--EM-seq

tamuanand · 2023-05-19T08:44:56Z

> the nf-core/methylseq pipeline has an EM-seq switch which should work equally:

I think this still uses Trim Galore under the hood.
