Seurat and genefu #22
Comments
Hi @ksaunders73, This is not a straightforward question to answer. All of the cluster centroids in the I recommend reading the PAM50 subtype paper, specifically the Methods section:
My understanding is that they used log2-transformed expression ratios to conduct their clustering analysis. Therefore, the centroids of their clusters will also be expressed in these units. From the supplementary Methods section of the aforementioned paper, the expression ratios were calculated as:
I was unable to find the definition of the baseline condition in the paper; maybe you can find it? Without knowing what the baseline for the expression ratios was, it is hard to say how to make an analogous metric from counts/TPM. My instinct would be to divide the TPM by the average or median for each gene across your patient cohort, but whether this is scientifically valid is a call you will need to make. It is possible they used a normal sample for their baseline.

Once you decide on how to get a log expression ratio from your Seurat data, you should apply the

Information about different centroids can be found in the

Given that this package was designed for classifying data from Affymetrix microarrays, I am not sure it is optimal to adapt it for use on RNA sequencing data. You may want to consider an RNA-seq based clustering algorithm due to the above technical considerations.

Hopefully that helps.

Best,
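To make the "divide by the per-gene median across the cohort" idea above concrete, here is a minimal sketch in plain Python. It is only an illustration of that suggestion, not the method from the PAM50 paper and not part of genefu. The function name `log2_ratio_to_cohort_median`, the pseudocount of 1, and the toy TPM values are all assumptions made for the example.

```python
import math
from statistics import median


def log2_ratio_to_cohort_median(tpm, pseudocount=1.0):
    """Return log2((x + c) / (median + c)) per gene, where the median is
    taken across all samples in the cohort.

    tpm: dict mapping gene symbol -> list of TPM values (one per sample).
    pseudocount: assumed offset c that avoids log2(0); the value 1.0 is an
    arbitrary choice for this sketch, not something stated in the thread.
    """
    ratios = {}
    for gene, values in tpm.items():
        # Cohort median of this gene serves as the stand-in "baseline".
        baseline = median(values) + pseudocount
        ratios[gene] = [math.log2((v + pseudocount) / baseline) for v in values]
    return ratios


# Hypothetical toy data: two genes measured in three samples.
tpm = {
    "ESR1": [0.0, 3.0, 15.0],
    "ERBB2": [7.0, 7.0, 31.0],
}
ratios = log2_ratio_to_cohort_median(tpm)
print(ratios)
```

Whether the cohort median is an acceptable substitute for the original baseline is still the judgment call described above; swapping `median` for `statistics.mean` gives the average-based variant.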
Thank you very much @ChristopherEeles!
Hi @ksaunders73, I am going to close this issue. If you have further questions, feel free to re-open this thread or file a new issue. Best,
Excuse me, how can I use single-cell data for PAM50 analysis? What should the input expression matrix look like, and which normalization method should be used?
It has come to my attention that the paper I cited above is not the original PAM50 publication. However, the discussion still applies.
Hello!
Thank you for the excellent package! I would like to use genefu's molecular.subtyping() function (using the pam.50.robust model) on my Seurat object, and was wondering whether the Seurat object should be
Thank you for reading!