Segmentation error with --ont and --ul #777
Comments
Hi @VanessaUg, thanks for letting me know. Could you please share the entire log file? It seems that hifiasm crashed after nearly completing everything, which I haven’t encountered before. The read file for each sample doesn’t seem too large. Would you be able to share one of the read files with me for debugging? That would help me quickly identify the issue. I can provide Globus endpoints if that works for you. I’m also curious: have you tried assembling all reads directly using the --ont option without --ul?
Thank you @chhylp123! Yes, I have also tried assembling all reads with only --ont, which ran successfully. This is the log file for a failed --ont --ul run: And the log file for a --ont run:
Can you use Globus? Globus is easier and faster.
@chhylp123 Yes, Globus will work. Could you please send me your Globus endpoints?
Which email are you using? I could add your email to our endpoints.
Could I send you the details privately?
I have sent you our Globus ID via email.
Hello,
I'm trying to run hifiasm v0.23.0 on multiple species, using --ont and --ul, with expected genome sizes ranging from 350 Mb to 1.3 Gb. The run completed successfully for one of my species and produced a t2t assembly, which had not been possible before, but for the other species I encountered segmentation faults during the final steps. For --ont, I use ONT reads ranging from 15 to 40 kb, and for --ul I use ONT data filtered for reads >40 kb. Here is the error message:
The same error occurs using hifiasm v0.24.0.
I investigated the error further and found that it is not caused by a specific read length used in --ul: it did not occur when only longer reads (e.g. only >120 kb reads) or only shorter reads (e.g. 40-120 kb reads) were used for --ul within the same species with the same settings. The issue only occurs if I combine them.
The maximum number of sequences accepted for --ul before a segmentation fault occurs varies by species (e.g. species 1 completed successfully using 62,381 reads as --ul input, while species 2 failed using only 49,992 reads). The error therefore seems to be influenced by the data distribution rather than by a fixed sequence count.
Therefore I investigated the read length distribution of the input used for --ul.
For the species that did not trigger the error, the read length composition of the data used as input for --ul looked as follows:
For a species which did trigger the segmentation fault, the read length composition of the input for --ul looked as follows:
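For anyone wanting to compare their own --ul input in the same way, here is a minimal sketch of how such a read length summary could be computed. It is not the script used for this issue. It assumes a plain 4-line-per-record FASTQ (optionally gzipped), and the function names are hypothetical; the 40 kb and 120 kb bin edges are simply the cutoffs mentioned above.

```python
import gzip

LOW, HIGH = 40_000, 120_000  # bin edges taken from the cutoffs mentioned in this issue


def fastq_lengths(path):
    """Yield the length of each read in a 4-line-per-record FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of each record is the sequence
                yield len(line.strip())


def n50(lengths):
    """Return the read length at which half of all bases are reached."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


def summarize(lengths):
    """Count reads and bases, compute N50, and bin reads by length."""
    lengths = list(lengths)
    bins = {"<40kb": 0, "40-120kb": 0, ">=120kb": 0}
    for length in lengths:
        if length < LOW:
            bins["<40kb"] += 1
        elif length < HIGH:
            bins["40-120kb"] += 1
        else:
            bins[">=120kb"] += 1
    return {
        "reads": len(lengths),
        "bases": sum(lengths),
        "n50": n50(lengths),
        "bins": bins,
    }
```

Calling `summarize(fastq_lengths("ul_reads.fastq.gz"))` (the file name is a placeholder) for a working and a failing species would give directly comparable numbers.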
I also tried using only 80% of the data for --ul to lower the coverage, as shown in the table below. However, I still encountered the same error.
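As an illustration of one way to do this kind of downsampling, the sketch below keeps each read with a fixed probability. It is an assumption about the approach, not the exact procedure used here: it samples about 80% of reads (not exactly 80% of bases), expects an uncompressed 4-line-per-record FASTQ, and all function names are hypothetical.

```python
import random


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) line tuples from a FASTQ handle."""
    while True:
        record = tuple(handle.readline() for _ in range(4))
        if not record[0]:  # empty string means end of file
            return
        yield record


def subsample(records, fraction, seed=0):
    """Keep each record independently with probability `fraction`."""
    rng = random.Random(seed)  # fixed seed makes the subset reproducible
    for record in records:
        if rng.random() < fraction:
            yield record


def subsample_fastq(in_path, out_path, fraction=0.8, seed=0):
    """Write a random subset of the reads in in_path to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for record in subsample(read_fastq(src), fraction, seed):
            dst.writelines(record)
```

Because reads are kept independently, the resulting coverage is only approximately 80% of the original; using different seeds gives different subsets, which can help check whether the crash depends on particular reads.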
I was wondering what could cause this issue.
Thank you very much for your help!