false negatives? #35
Hi Gulya, your reasoning makes total sense. Something does seem to be going wrong at the clustering step, and I suspect that is where the problem lies. If the three EfeUOB genes are encoded next to each other, FeGenie should definitely pick them up without the --all_results flag. Are you by chance running FeGenie in --orfs or --gbk mode? And which MAG is it? Welcome to the iron metabolism world :) It gets confusing at times, but everyone gets along and helps each other out. Plus, we have good coffee. Arkadiy
I was using --orfs (which, now that I think about it, would mean that FeGenie might not know that these genes are next to each other?). And I looked into private_T1916_metawrap_bin.6
Thanks, Gulya. You are exactly right. When given the --orfs flag, FeGenie skips the step where it clusters genes based on where they are encoded on the genome/contig. I need to make this clear in the README, or implement some way for FeGenie to guess coordinates from the order in which ORFs are listed in the FASTA file. The latter could run into issues, though, if the provided ORFs come from a highly fragmented assembly. If you provide GenBank files along with the --gbk flag, FeGenie should be able to keep track of the relative positions of ORFs on each contig. Contigs are another possible input, but in that case FeGenie will run Prodigal and generate new gene calls. From the MAGs that you emailed me, it seems that you annotated with Prokka? Prokka also uses Prodigal for ORF prediction, so the gene calls should be the same, just with different names. In any case, it wouldn't be very difficult to consolidate the two sets of ORFs. Let me know if you have any other questions, or if anything here doesn't make sense!
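To illustrate why ORF positions matter here, below is a minimal Python sketch of position-based grouping. It is not FeGenie's actual clustering code. It assumes ORF IDs follow Prodigal's default naming, where each ID is the contig name followed by an underscore and the ORF's ordinal on that contig (e.g. `k141_5_10`); Prokka locus tags do not carry the contig name, so they would need to be mapped back first. The function names, the example IDs, and the `max_gap` parameter are hypothetical.

```python
from collections import defaultdict


def parse_prodigal_id(orf_id):
    """Split a Prodigal-style ORF ID like 'k141_5_10' into ('k141_5', 10).

    Assumption: the text after the last underscore is the ORF's ordinal
    position on its contig, as in Prodigal's default output.
    """
    contig, _, index = orf_id.rpartition("_")
    if not contig or not index.isdigit():
        raise ValueError(f"not a Prodigal-style ORF ID: {orf_id!r}")
    return contig, int(index)


def adjacent_groups(hit_ids, max_gap=1):
    """Group hit ORFs that sit next to each other on the same contig.

    max_gap is a hypothetical tolerance: two hits land in the same group
    when their ordinals differ by at most max_gap.
    """
    by_contig = defaultdict(list)
    for orf_id in hit_ids:
        contig, index = parse_prodigal_id(orf_id)
        by_contig[contig].append(index)

    groups = []
    for contig, indices in sorted(by_contig.items()):
        indices.sort()
        current = [indices[0]]
        for idx in indices[1:]:
            if idx - current[-1] <= max_gap:
                current.append(idx)
            else:
                groups.append((contig, current))
                current = [idx]
        groups.append((contig, current))
    return groups


# Hypothetical hits: three neighbouring ORFs on one contig, one lone ORF on another.
hits = ["k141_5_10", "k141_5_11", "k141_5_12", "k141_9_2"]
for contig, members in adjacent_groups(hits):
    print(contig, members)
```

With only a bare list of ORFs and no positional information, this kind of grouping cannot be done, which matches the behaviour described above for --orfs mode.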
Hey! I do have another question!
After annotating my MAGs, I saw that FeGenie didn't find any transport-related clusters in any of them, which wouldn't make sense biologically (I have, among others, several cyanobacterial MAGs, and they must get their iron somewhere, right?). If I use the --all_results flag, I get some transport genes, but I'm not sure I should use them, since you mention in a different thread that this flag can create false positives.
I imagine something goes wrong during the clustering step? I looked into one MAG specifically. According to the output produced by --all_results, it has the three EfeUOB genes, all next to each other, but they don't show up when I run the same MAG in strict mode. Are there other genes that should be present for the cluster to be complete?
Sorry for the basic question, I'm very new to the iron metabolism world :)
I can send you the MAG I looked into, or the output files, if needed.
Thanks!