Inconsistent gene predictions for Pyrodigal and *fixed* Prodigal #21
Comments
Hi Oliver, thanks a lot for reaching out, I'm happy to hear you're considering Pyrodigal for super useful tools like these! I've been advocating to Bakta for some time 😉 I'm very thankful for the test set and the gene prediction comparison analysis; to be fair, this is something I should have done myself earlier. I'm not yet entirely sure whether it's related to #19 or not, because #19 affects the already computed genes by recording the wrong scores for display, but it shouldn't change the prediction of the genes themselves. Nevertheless I'll have a look at these genomes; it could be that I made some mistakes in the v1 refactor while adapting the original code. Just so that I can reproduce, is this in meta or single mode? I also recently found out that Prodigal was doing some out-of-bounds reads of sequence data while training, so I'll see about patching that, but I don't think this is causing the discrepancies you've been seeing. |
Just found the comparison repository, thanks @jhahnfeld for setting this up! |
Hi Martin, I just wanted to provide you with the link to the comparison repository, but you already found it :)
We used the single mode. |
I just tested the comparison by using Pyrodigal where instead of training I loaded the training file from Prodigal: the results matched 100%. So it looks like Pyrodigal has some discrepancies in the training process, I'll have a look ASAP! |
I updated the prediction mismatches and removed the '-c' flag since the tested genomes are closed. Now there are some more mismatches and completely missing predictions. |
I just want to +1 this effort. Finding a few [minor] differences in our tests as well in meta mode. |
@zdk123 : If you have some example sequences with inconsistencies in meta mode, please share them here! I have identified a bug in the training process that I am working on fixing, but meta mode shouldn't be affected, so if you see discrepancies there too, that's another bug that I have to fix 😅 |
Just shipped pre-release Still seeing some mismatches in |
Okay, found and patched a new bug (#22) which fixes the remaining issues on |
Thanks a lot for this. We'll have a look at this ASAP. |
@oschwengers : There are breaking changes coming from #18 because I'm going to add a mandatory argument to all In general choosing the versioning scheme is always complicated for bindings, but I think in here I'm just going to let Pyrodigal have its own versioning scheme following semver :) And actually Prodigal is at version |
Ahh, I see. OK, that makes total sense. Thanks! |
Hi, thank you for looking into the issues. |
@jhahnfeld : I'm not seeing any mismatch in |
Sorry, I forgot to delete the |
Now that we all see the same gene predictions between Prodigal and Pyrodigal, I'll close this issue. Just in case I've missed something, please do not hesitate to re-open it. Great effort and results! Thanks a lot everyone! Can't wait to use Pyrodigal in our projects... |
Thanks a lot for the extensive bug report and the comparison repository! |
Hi and thanks so much for working on this. A patched, accelerated and potentially multi-threaded Prodigal alternative is very much needed, and your effort is very much appreciated!
We considered Pyrodigal as a promising alternative for Prodigal within Bakta and Platon. To add an extra layer of confidence, we started a gene prediction comparison on 49 bacterial genomes. The large majority of genes are predicted identically, but unfortunately some predictions differ slightly and some genes are not predicted by Pyrodigal at all.
We waited to double-check everything, but as I saw your issue #19 today, I thought this could maybe help to further debug it. @jhahnfeld has conducted the comparison and could provide more info on this.
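For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how the three cases above (identical, slightly different, not predicted at all) could be told apart. It is not the script from the comparison repository; it assumes both tools were run with GFF output, and the names `parse_gff`, `stop_key` and `compare` are made up for illustration. Genes are matched by their stop position, so a prediction with the same stop but another start counts as "slightly different".

```python
from collections import namedtuple

# One predicted gene: contig name, 1-based start/end, and strand ("+" or "-").
Gene = namedtuple("Gene", ["seqid", "start", "end", "strand"])


def parse_gff(text):
    """Collect CDS features from GFF text, skipping comments and other lines."""
    genes = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        genes.append(Gene(cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return genes


def stop_key(gene):
    """Key a gene by its stop position: the end on '+', the start on '-'."""
    stop = gene.end if gene.strand == "+" else gene.start
    return (gene.seqid, gene.strand, stop)


def compare(reference, candidate):
    """Classify reference genes against candidate genes sharing the same stop."""
    ref = {stop_key(g): g for g in reference}
    cand = {stop_key(g): g for g in candidate}
    report = {"identical": [], "different_start": [], "missing": [], "extra": []}
    for key, gene in ref.items():
        other = cand.get(key)
        if other is None:
            report["missing"].append(gene)            # not predicted at all
        elif other == gene:
            report["identical"].append(gene)          # same coordinates
        else:
            report["different_start"].append((gene, other))  # same stop only
    report["extra"] = [g for k, g in cand.items() if k not in ref]
    return report
```

Calling `compare(parse_gff(prodigal_text), parse_gff(pyrodigal_text))` on the two outputs for one genome gives the per-category lists; matching on the stop position is a design choice here, chosen because start-site differences are the usual form of "slightly different" predictions.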
Here are the results using a locally compiled and patched Prodigal version: