Baltica with human genome from GENCODE #28

Open
sailorjupiter87 opened this issue Feb 18, 2025 · 0 comments

Hi everyone,

I am new to the forum and still getting familiar with Baltica (Ubuntu 22.04.5 LTS, Python 3.7.12, baltica 1.2.4). I ran into a problem in the final stage of the analysis of my RNA-seq data (second-strand library prep), which I aligned with STAR against the human GENCODE GRCh38.p4 release 47 primary assembly (gencode.v47.primary_assembly.annotation.gtf and GRCh38.primary_assembly.genome.fa). I was able to obtain the outputs from the DJU method and from StringTie, but Baltica then stopped with the following error message:

"Activating singularity image /home/cappelli/.baltica/singularity/8c8a3574e99286bac514a846a13eea14.simg
Processing DJU method output
Processing de novo annotation
Warning message:
In .Seqinfo.mergexy(x, y) :
Each of the 2 combined objects has sequence levels not in the other:

  • in 'x': KI270438.1, KI270512.1, KI270709.1, KI270729.1, KI270732.1, KI270747.1
  • in 'y': GL000216.2, KI270710.1, KI270713.1, KI270714.1, KI270719.1, KI270720.1, KI270726.1, KI270746.1, KI270749.1, KI270753.1, KI270755.1
Make sure to always combine/compare objects based on the same reference
genome (use suppressWarnings() to suppress this warning).
Preparing annotation
Proceding with integration
Error in validObject(x) : invalid class “IRanges” object:
'width(x)' cannot contain negative integers
Calls: lapply ... end<- -> end<- -> end<- -> .set_IRanges_end -> validObject
In addition: Warning message:
In .Seqinfo.mergexy(x, y) :
Each of the 2 combined objects has sequence levels not in the other:
  • in 'x': chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM, GL000008.2, GL000009.2, GL000195.1, GL000205.2, GL000213.1, GL000214.1, GL000216.2, GL000218.1, GL000219.1, GL000221.1, GL000224.1, GL000225.1, KI270442.1, KI270706.1, KI270710.1, KI270711.1, KI270712.1, KI270713.1, KI270714.1, KI270717.1, KI270718.1, KI270719.1, KI270720.1, KI270722.1, KI270726.1, KI270727.1, KI270728.1, KI270731.1, KI270734.1, KI270741.1, KI270742.1, KI270743.1, KI270744.1, KI270745.1, KI270746.1, KI270748.1, KI270749.1, KI270750.1, KI270751.1, KI270753.1, KI270755.1
  • in 'y': 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, M
Make sure to always combine/compare objects based on the same reference
genome (use suppressWarnings() to suppress this warning).
Execution halted"
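
The sequence-level warnings above amount to a set difference between the sequence names used by two inputs. As a rough way to reproduce that comparison outside of Baltica, here is a minimal Python sketch that reads the first column of two GTF-like inputs and reports the names unique to each. The function names `seqnames` and `compare_seqnames` are hypothetical helpers, not part of Baltica or StringTie, and the two example records are made up for illustration.

```python
def seqnames(gtf_lines):
    """Collect the sequence names (column 1) from GTF lines, skipping comments."""
    names = set()
    for line in gtf_lines:
        # GTF header/comment lines start with '#'; blank lines carry no record.
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_seqnames(x_lines, y_lines):
    """Return (names only in x, names only in y), each sorted."""
    x, y = seqnames(x_lines), seqnames(y_lines)
    return sorted(x - y), sorted(y - x)


# Made-up records: one with a "chr" prefix, one without.
reference = ['chr1\tHAVANA\tgene\t100\t200\t.\t+\t.\tgene_id "g1";']
assembled = ['1\tStringTie\ttranscript\t100\t200\t.\t+\t.\tgene_id "s1";']
only_x, only_y = compare_seqnames(reference, assembled)
print("only in x:", only_x)
print("only in y:", only_y)
```

With real files, the lines could come from `open(path)` for the annotation GTF and for merged.combined.gtf; any name that appears in only one of the two lists points at a naming mismatch like the one in the second warning.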

When I reopened the outputs (junction.csv), I noticed that in some files the chromosomes were labeled with numbers only, while in others they carried the "chr" prefix. I standardized the names by adding the "chr" prefix to all chromosomes in every file and reran the Baltica analysis, but I got the same error. Since the error also referred to StringTie, I looked at the merged.combined.gtf file in the folder generated by StringTie and compared it with the one from a Baltica run on the same RNA-seq data aligned to a human genome downloaded from Ensembl (it is hg38, but I don't know the release because it was provided by the company that sequenced my samples). The two merged.combined.gtf files differ in the presence of additional columns and in that the file generated from the GENCODE genome also includes "scaffolds".
How can I solve this problem? Is it possible to use a genome downloaded from GENCODE for the assembly used in Baltica?
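
The renaming step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: it adds the "chr" prefix to the primary chromosomes (1 to 22, X, Y, M), maps the Ensembl-style "MT" to "chrM" (an assumption about the input), and leaves scaffold names such as KI270438.1 and already-prefixed names untouched. The helper names `to_chr_style` and `harmonize_gtf_line` are hypothetical and not part of Baltica.

```python
# Primary chromosome names without a prefix (assumption: "MT" may also occur).
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y", "M", "MT"}


def to_chr_style(name):
    """Add the 'chr' prefix to primary chromosomes; leave other names as-is."""
    if name in PRIMARY:
        return "chrM" if name == "MT" else "chr" + name
    return name


def harmonize_gtf_line(line):
    """Rewrite the sequence name (column 1) of one GTF line."""
    # Comment lines and lines without tab-separated fields are passed through.
    if line.startswith("#") or "\t" not in line:
        return line
    seqname, rest = line.split("\t", 1)
    return to_chr_style(seqname) + "\t" + rest


print(harmonize_gtf_line('1\tStringTie\ttranscript\t100\t200\t.\t+\t.\tgene_id "s1";'))
```

Renaming only changes labels; it does not add or remove scaffolds, so inputs built on different sequence sets can still disagree after this step.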

Many thanks in advance,

Sara
