three different results for the same canno input - which one is correct? #30

Open
orthoceros opened this issue Feb 27, 2019 · 0 comments

orthoceros commented Feb 27, 2019

Dear TransVar developers,
I have found a canno annotation problem with version 2.4.8.20190122 that did not occur with 2.4.1.20180815 (same reference FASTA and annotation files). After downgrading, annotation works again.

With the old version, variants 'NM_001760.3:c.786_796dup' and 'NM_001760.3:c.758_759dupAG' from a SangerSeq experiment annotated without any errors, for example:
columns = input transcript gene strand coordinates(gDNA/cDNA/protein) region info.
values = 'NM_001760.3:c.758_759dupAG' 'NM_001760 (protein_coding)' 'CCND3' '-' 'chr6:g.41936061_41936062dupTC/c.758_759dupAG/p.S254Rfs*103' 'inside_[cds_in_exon_5]' 'CSQN=Frameshift;left_align_gDNA=g.41936057_41936058insCT;unalign_gDNA=g.41936058_41936059dupCT;left_align_cDNA=c.756_757insGA;unalign_cDNA=c.758_759dupAG;dbxref=GeneID:896,HGNC:HGNC:1585,MIM:123834;aliases=NP_001751;source=RefSeq'
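
To compare outputs between versions field by field, the coordinates and info columns can be split programmatically. The following is a minimal sketch in Python. It assumes only the layout visible in the row above (gDNA/cDNA/protein separated by '/', info tags separated by ';' as key=value). The function name parse_transvar_fields is hypothetical and not part of TransVar.

```python
def parse_transvar_fields(coordinates, info):
    """Split the coordinates and info columns of one output row into dicts.

    Assumes the layout seen in the row above; this is not a TransVar API.
    """
    gdna, cdna, protein = coordinates.split("/")
    chrom, gdna_change = gdna.split(":", 1)
    coords = {"chrom": chrom, "gDNA": gdna_change, "cDNA": cdna, "protein": protein}
    # Each info tag has the form key=value; values may contain ':' and ','.
    tags = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    return coords, tags


# Values copied from the 2.4.1.20180815 output row above.
coordinates = "chr6:g.41936061_41936062dupTC/c.758_759dupAG/p.S254Rfs*103"
info = (
    "CSQN=Frameshift;left_align_gDNA=g.41936057_41936058insCT;"
    "unalign_gDNA=g.41936058_41936059dupCT;left_align_cDNA=c.756_757insGA;"
    "unalign_cDNA=c.758_759dupAG;dbxref=GeneID:896,HGNC:HGNC:1585,MIM:123834;"
    "aliases=NP_001751;source=RefSeq"
)

coords, tags = parse_transvar_fields(coordinates, info)
print(coords["cDNA"], tags["unalign_cDNA"], tags["left_align_cDNA"])
```

With the fields separated like this, the reported cDNA name can be checked directly against the unalign_cDNA and left_align_cDNA tags of the same row.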

With the new version, both dup variants result in no_valid_transcript_found. Querying the SNVs 'NM_001760.3:c.758A>C' and 'NM_001760.3:c.759G>T' works with the new version, so the individual reference bases seem to be correct (while, e.g., 'NM_001760.3:c.759C>G' or 'NM_001760.3:c.759A>G' fail as expected). Interestingly, querying the equivalent of the dup, 'NM_001760.3:c.758_759AG>AGAG', still works with the new version:
'NM_001760.3:c.758_759AG>AGAG' 'NM_001760.4 (protein_coding)' 'CCND3' '-' 'chr6:g.41936061_41936062dupTC/c.760_761dupAG/p.S254Rfs*103' 'inside_[cds_in_exon_5]' 'CSQN=Frameshift;left_align_gDNA=g.41936057_41936058insCT;unalign_gDNA=g.41936058_41936059dupCT;left_align_cDNA=c.756_757insGA;unalign_cDNA=c.758_759dupAG;dbxref=GeneID:896,HGNC:HGNC:1585,MIM:123834;aliases=NP_001751;source=RefSeq'
This output suggests that after the transcript update from NM_001760.3 to NM_001760.4, this variant should now be named 'NM_001760.4:c.760_761dupAG'. However, using this TransVar output as input (same new version, 2.4.8.20190122) again results in no_valid_transcript_found.
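
To make the discrepancy between the two reported cDNA names concrete, here is a small Python sketch that extracts the positions from both dup descriptions and computes the shift. The regular expression covers only the simple 'c.START_ENDdupSEQ' form used in this issue, and the helper name dup_span is hypothetical. It quantifies the difference but does not decide which name is correct.

```python
import re

# Matches only the simple form used here, e.g. "c.758_759dupAG".
DUP_RE = re.compile(r"c\.(\d+)_(\d+)dup([ACGT]*)")


def dup_span(cdna):
    """Return (start, end, duplicated_sequence) for a simple cDNA dup."""
    match = DUP_RE.fullmatch(cdna)
    if match is None:
        raise ValueError(f"not a simple cDNA dup: {cdna!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)


old_start, old_end, old_seq = dup_span("c.758_759dupAG")  # 2.4.1.20180815, NM_001760.3
new_start, new_end, new_seq = dup_span("c.760_761dupAG")  # 2.4.8.20190122, NM_001760.4

shift = new_start - old_start
same_sequence = old_seq == new_seq
print(f"shift={shift}, same duplicated sequence={same_sequence}")
```

Both rows report the same gDNA change (g.41936061_41936062dupTC), and the new row still lists unalign_cDNA=c.758_759dupAG in its info column, so the shift appears only in the cDNA part of the coordinates column.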

Which of the three results is correct? Thanks.
