How does metheor handle the two overlapped reads originated from the same fragment? #4

Open
ryansohny opened this issue Nov 11, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@ryansohny

Hi, Dr. Lee

First off, thank you for developing this incredible toolkit for calculating several measures from which we are able to infer DNA methylation heterogeneity of the sample. Just like the paper's title (Lee et al., bioRxiv, 2022), it IS "ultrafast" and is quite easy to use.

Since I am not familiar with Rust, the language metheor is written in, I was wondering if you could tell me how metheor handles two overlapping reads (Read 1 and Read 2 of a paired-end fragment) when it computes, say, epipolymorphism (PM). By overlapping reads, I mean mates that map to essentially the same region because the fragment they originated from is short.
For example, when bismark (Link) extracts DNA methylation information from such overlapping reads (with bismark_methylation_extractor), it ignores the methylation calls of one of the reads, as you are probably aware, because the information they carry is redundant.

I tried to get the answer for this question from the original paper that first introduced Epipolymorphism (Landan et al., 2012), to no avail.

Does metheor take account of this redundancy?

Best regards, 
Sohn

@dohlee
Owner

dohlee commented Nov 11, 2022

Hi, thanks for opening this issue.
Glad to know that metheor works well (and is fast) on your data!

Yes, you're definitely right. Existing CpG-wise methylation level extractors, such as the bismark methylation extractor you mentioned and also MethylDackel (link), deal with overlapping paired-end reads. This is quite straightforward, since each overlapping CpG can simply be counted once in the computation.
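
As a rough illustration of counting each overlapping CpG once, here is a minimal Python sketch. It is not taken from Bismark or MethylDackel; the data layout (a dict mapping CpG position to a boolean methylation call for each mate) and the function name `merge_mate_calls` are assumptions made for this example. Where the two mates cover the same CpG, the sketch keeps the Read 1 call, which is just one possible policy.

```python
from typing import Dict


def merge_mate_calls(read1: Dict[int, bool], read2: Dict[int, bool]) -> Dict[int, bool]:
    """Combine the per-CpG calls of two mates so each position appears once.

    Hypothetical helper: keys are CpG positions, values are True if methylated.
    """
    merged = dict(read2)
    merged.update(read1)  # Read 1 wins at positions covered by both mates
    return merged


# Two mates of a short fragment; CpGs at 105 and 112 are covered by both.
read1 = {100: True, 105: True, 112: False}
read2 = {105: True, 112: True, 120: False}

fragment = merge_mate_calls(read1, read2)
methylated = sum(fragment.values())

print(sorted(fragment.items()))
print(f"{methylated} methylated calls out of {len(fragment)} CpGs")
```

Counting the two mates independently would give six calls for four CpGs, so the overlapping positions would carry double weight in the methylation level.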

Unfortunately, the current version of metheor (and perhaps all other tools that calculate methylation heterogeneity measures) does not handle this problem, but I think I can implement it with a slight modification. I agree that it is a crucial and urgent feature for obtaining accurate results from paired-end data.
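
To make the concern concrete: epipolymorphism for a window of CpGs is 1 minus the sum of squared frequencies of the observed methylation patterns (Landan et al., 2012). If both mates of a short fragment span the same window, that fragment's pattern is counted twice. The sketch below contrasts per-read and per-fragment counting. It is not metheor's code; the input layout (pairs of fragment name and a pattern string such as "1100") and the helper `one_pattern_per_fragment` are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def epipolymorphism(patterns: Iterable[str]) -> float:
    """Return 1 - sum(p_i^2), where p_i is the frequency of pattern i."""
    counts = Counter(patterns)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def one_pattern_per_fragment(reads: Iterable[Tuple[str, str]]) -> List[str]:
    """Keep a single pattern per fragment name (the first mate encountered)."""
    seen = {}
    for name, pattern in reads:
        seen.setdefault(name, pattern)
    return list(seen.values())


# Hypothetical 4-CpG window; both mates of frag1 span the whole window.
reads = [
    ("frag1", "1111"),
    ("frag1", "1111"),
    ("frag2", "0000"),
    ("frag3", "1100"),
]

per_read = epipolymorphism(pattern for _, pattern in reads)
per_fragment = epipolymorphism(one_pattern_per_fragment(reads))

print(f"per-read PM:     {per_read:.3f}")
print(f"per-fragment PM: {per_fragment:.3f}")
```

In this toy window, per-read counting gives a PM of 0.625, while counting each fragment once gives about 0.667, because the duplicated mate inflates the frequency of one pattern.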

I'll fix this in the next update and will keep this issue open until then. Thanks again!

Best regards,
Dohoon

@ryansohny
Author

ryansohny commented Nov 11, 2022

Thank you for the quick response and your willingness to update the program with the new feature which can deal with the redundancy of the CpG information.

I've always wondered whether existing tools take this into account when they compute DNA methylation heterogeneity, and if they don't, it would be great to see how the results change once they do.

Again, thank you very much for your help. This tool is awesome!

Kind regards,
Sohn

dohlee added the enhancement label Apr 14, 2023