Hello,
I saw that pyrodigal can use different translation tables (issue #34), which is great!
Do you know if pyrodigal can correctly identify selenocysteine insertions? (And pyrrolysine too)
I have multiple MAGs containing tRNA-Sec and the SelA protein, but prokka/prodigal gene prediction of these MAGs is giving a lot of partial ORFs. I would love to be able to annotate the ORFs of these MAGs with a tool that can output the selenoprotein ORFs correctly (requires recognising the UGA stop codon AND also the bacterial SEC insertion sequence (SECIS), which can be a bit divergent among different taxa/genes).
Any help or suggestions would be great :)
Would also love to see this on the gene prediction side. However, I can imagine this is a non-trivial task.
A bit of #shameless-plug since I'm the main developer but FYI:
We have implemented such a feature for selenocysteine proteins in Bakta. Bakta detects cis-regulatory recoding stimulation ncRNA regions. If two adjacent, proximate, in-frame CDSs are also detected, it merges both ORFs and recodes the stop codon of the upstream ORF as a selenocysteine codon. It can thus predict and annotate such proteins, also for MAGs.
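To make the merging idea concrete, here is a rough Python sketch of the general approach described above. It is not Bakta's actual code: the `CDS` class, the `merge_across_uga` function, and the `max_gap` threshold are all hypothetical names and values chosen for illustration. It also skips the SECIS/ncRNA detection step and only handles forward-strand coordinates.

```python
from dataclasses import dataclass
from typing import Optional

# Standard/bacterial genetic code (NCBI table 11 amino acids), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


@dataclass
class CDS:
    """Hypothetical forward-strand CDS: 0-based start, exclusive end, stop codon included."""
    start: int
    end: int


def merge_across_uga(seq: str, upstream: CDS, downstream: CDS, max_gap: int = 30) -> Optional[str]:
    """Merge two in-frame CDSs split by a TGA stop and recode that stop as 'U'.

    Returns the merged protein, or None if the pair does not qualify.
    max_gap is an arbitrary illustrative threshold, not a value taken from any tool.
    """
    gap = downstream.start - upstream.end
    if seq[upstream.end - 3:upstream.end] != "TGA":
        return None  # only UGA (TGA in DNA) can encode selenocysteine
    if gap < 0 or gap > max_gap or gap % 3 != 0:
        return None  # overlapping, too far apart, or out of frame
    merged = seq[upstream.start:downstream.end]
    if len(merged) % 3 != 0:
        return None
    recode_index = (upstream.end - 3 - upstream.start) // 3
    protein = []
    for i in range(0, len(merged) - 3, 3):  # skip the terminal stop codon
        aa = CODON_TABLE[merged[i:i + 3]]
        if aa == "*":
            if i // 3 != recode_index:
                return None  # another in-frame stop interrupts the merged ORF
            aa = "U"  # one-letter code for selenocysteine
        protein.append(aa)
    return "".join(protein)


# Toy example: ATG GCT TGA | ATG AAA TAA
print(merge_across_uga("ATGGCTTGAATGAAATAA", CDS(0, 9), CDS(9, 18)))  # MAUMK
```

In a real pipeline the merge would presumably only be attempted when a recoding element has been found near the upstream stop, as described above; without that check this sketch would also merge ordinary neighbouring genes that happen to be in frame.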
At the moment Prodigal (and thus Pyrodigal) doesn't support selenoproteins at all. It would require some extra work to recognize and score the SECIS that would probably warrant some fundamental changes in the Prodigal node scoring algorithm :(