diff_methylsig not working on tile_by_windows() object #47
Hello, I'm glad methyldackel helped! I will address your questions a little out of order.
I can't recall whether this was allowed then, but, conceptually, tiling the data is taking local information with a different kernel function (one that is evenly weighted). I don't think it makes sense to both tile the data and explicitly use the local information in the old version.
As to the problem of NAs, when tiling data I usually do the following order of calls:
In other words, I don't throw out any data until I've grouped the CpG-level data into tiles. The presence of regions like
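The effect of that ordering can be sketched in plain Python (this is not methylSig's R API). The per-CpG records, the `min_cov` threshold, and both helper functions are hypothetical and only illustrate why filtering after tiling keeps more data: CpGs that are individually below the coverage threshold still contribute their reads to the tile.

```python
from collections import defaultdict

def tile_counts(cpgs, width):
    """Sum (coverage, methylated) counts of CpGs that fall in the same fixed-width tile."""
    tiles = defaultdict(lambda: [0, 0])
    for pos, cov, meth in cpgs:
        start = (pos - 1) // width * width + 1  # 1-based start of the tile containing pos
        tiles[start][0] += cov
        tiles[start][1] += meth
    return {start: tuple(counts) for start, counts in tiles.items()}

def filter_by_coverage(counts, min_cov):
    """Keep only entries whose total coverage reaches min_cov."""
    return {start: c for start, c in counts.items() if c[0] >= min_cov}

# Hypothetical single-sample data: (position, coverage, methylated reads)
cpgs = [(3, 2, 1), (10, 3, 3), (18, 4, 2), (40, 1, 0)]
min_cov = 5   # hypothetical coverage threshold
width = 25    # tile width in bp

# Filter first, then tile: every CpG is below min_cov, so nothing is left to tile.
filter_first = tile_counts([c for c in cpgs if c[1] >= min_cov], width)

# Tile first, then filter: the first tile pools 9 reads and survives the filter.
tile_first = filter_by_coverage(tile_counts(cpgs, width), min_cov)

print(filter_first)  # {}
print(tile_first)    # {1: (9, 6)}
```

With these made-up numbers, filtering first discards all four CpGs, while tiling first retains one region with pooled coverage of 9.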
It's hard to know without looking at some examples in your data. It might be helpful to look at the distribution of methylation differences in your regions and at volcano plots for CpG-level and region-level results. It might also be helpful to look at the number of significant regions with different thresholds. Thanks.
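Counting significant regions over a grid of thresholds could look like the following minimal Python sketch. The `results` list is invented for illustration, `None` stands in for NA, and `count_significant` is a hypothetical helper, not a methylSig function.

```python
def count_significant(results, fdr_max, diff_min):
    """Count rows with fdr <= fdr_max and |meth_diff| >= diff_min, skipping NA rows."""
    n = 0
    for fdr, meth_diff in results:
        if fdr is None or meth_diff is None:
            continue  # NA rows can never pass a threshold
        if fdr <= fdr_max and abs(meth_diff) >= diff_min:
            n += 1
    return n

# Hypothetical region-level results: (fdr, meth_diff in percentage points)
results = [
    (0.04, 31.0),
    (0.08, -27.5),
    (0.20, 40.0),
    (None, None),   # region that could not be tested
    (0.09, 12.0),
    (0.30, -26.0),
]

# Sweep a small grid of thresholds to see how sensitive the count is.
sweep = {
    (fdr_max, diff_min): count_significant(results, fdr_max, diff_min)
    for fdr_max in (0.05, 0.1, 0.25)
    for diff_min in (10, 25)
}

for (fdr_max, diff_min), n in sorted(sweep.items()):
    print(f"fdr <= {fdr_max}, |meth_diff| >= {diff_min}: {n}")
```

A count that stays at zero across the whole grid points to a problem upstream of the thresholds (for example, mostly NA results), whereas a count that grows quickly as the thresholds relax suggests the regions are simply borderline.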
Thank you for your prompt answer! Regarding the use of local information in the old methylSigCalc function, we have some examples of its utility from older analyses.
You can find attached the file1 with p-value histograms:
"As to the problem of NAs, when tiling data I usually do the following order of calls:"
Good suggestion! We have now rerun the analysis following your suggested order of calls:
However, we are still getting no significant DMRs using fdr ≤ 0.1 and |meth_diff| ≥ 25.
"It's hard to know without looking at some examples in your data. It might be helpful to look at the distribution of methylation differences in your regions and at volcano plots for CpG-level and region-level results. It might also be helpful to look at the number of significant regions with different thresholds."
You can find attached the volcano plots for our CpGs (file2) and DMRs (file3). Thank you very much!
Hi Anair,
I noticed that you used a window size of 200. This is quite large, so you may want to try a shorter length, like 50 or 100. Could the large window be leading to fewer significant regions?
Best, Maureen
Dear Maureen, thank you for your suggestion. We have tried the analysis with a 25 bp tiling window, as we usually did with the old methylSig version. Best, Anair
Dear Maureen and Raymond, we have compared an old (v0.4.5beta) and the new version of methylSig on the same data. As you can see, with the new version we capture fewer differences in DMRs for these specific data. Since I have read that you are planning a new release around October, I would like to ask you to consider allowing local information to be used for the dispersion calculation only, for the methylation only, or for both. It would be very helpful when analyzing data like ours. Thank you very much! Best,
Hi Anair, Indeed, the plan is to add back the option of using local information for methylation, dispersion, or both. However, the sites allowed for use in local information changed from 0.4.4 to the present version, so your results with local information will still likely differ. We do not plan to revert that change. I'll give an example to illustrate how the local information differs; apologies if you have already read this in another issue. Say there is a CpG within a 200bp window of the CpG being tested, but it only has data for 2 samples and min.per.group is 4. In the old version (0.4.4), that local CpG would be used to estimate the methylation/dispersion, but in the current release it is not. In effect, we are constraining local information to higher quality CpGs. Thanks,
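The difference described above can be illustrated with a small Python sketch. It is not methylSig's implementation: `local_cpgs` is a hypothetical helper, the per-group sample counts are invented, and treating the window as ±`window` bp around the tested CpG is an assumption.

```python
def local_cpgs(target, cpgs, window, min_per_group=None):
    """Return positions of neighbouring CpGs eligible as local information.

    cpgs maps position -> (samples with data in group A, samples with data in group B).
    With min_per_group=None every neighbour inside the window is eligible
    (old-style behaviour); otherwise both groups must reach min_per_group.
    """
    selected = []
    for pos, (n_a, n_b) in sorted(cpgs.items()):
        if pos == target or abs(pos - target) > window:
            continue  # skip the tested CpG itself and anything outside the window
        if min_per_group is not None and min(n_a, n_b) < min_per_group:
            continue  # too few samples with data in at least one group
        selected.append(pos)
    return selected

# Hypothetical CpGs; the one at 1050 has data for only 2 samples in total.
cpgs = {1000: (5, 5), 1050: (1, 1), 1120: (4, 6), 1400: (5, 5)}

old_style = local_cpgs(1000, cpgs, window=200)
current_style = local_cpgs(1000, cpgs, window=200, min_per_group=4)

print(old_style)      # [1050, 1120]
print(current_style)  # [1120]
```

In this toy example the sparsely covered CpG at position 1050 contributes to the local estimate only under the old-style rule, which is one reason results with local information would differ between versions even on identical data.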
I ran into exactly the same issue that anairlema described: no DMRs pass the fdr threshold, even though |meth_diff| > 25. The tile size is 25bp, with no local information. The same datasets gave me perfect results with the old version of methylSig (0.4.4).
Hi, Thanks,
Dear Raymond,
Thank you for your previous help on "input methylRaw object #46": methyldackel works very well and fast for us!
I'm sorry, but we need your help again!
We have done the analysis for DMCs and it works well. Here you can find our workflow:
seqinfo: 25 sequences from an unspecified genome; no seqlengths
We have good results for DMCs, in line with the biology of our experiment. Now we would also like to analyse DMRs, so we applied the "diff_methylsig" function to the tiled data, as described in the manual:
If we look at the created object, we get:
seqinfo: 25 sequences from an unspecified genome; no seqlengths
As you can see, almost all values are "NA", and when we filter the data based on fdr and meth_diff (fdr < 0.1 and |meth_diff| ≥ 25) we get no results.
seqinfo: 25 sequences from an unspecified genome; no seqlengths
seqinfo: 25 sequences from an unspecified genome; no seqlengths
What are we missing? Is it possible to have 11482 significant DMCs and no significant DMRs?
Moreover, we have also tried to use local information when analysing the tiled data, but we get an error:
If I understand correctly, in the old version of methylSig (we usually use version 0.4.5beta) we can set "local.disp" and "local.meth" in the "methylSigCalc" function when running it on tiled data, but this does not seem to be possible in the new version. Is that correct?
Thank you in advance for all your support!
Anair