ggi() and gene70() commands input files #18

Closed
matteo95serra opened this issue Feb 17, 2021 · 2 comments

Comments

@matteo95serra

Hello,

I'm working with several bulk RNA-seq datasets and have tried to compute the GENE70, GGI and PAM50 classifications with "genefu", but I was only able to obtain the PAM50 classification. I could not get either GENE70 or GGI to work.
If I've understood correctly, the commands to compute the PAM50, GGI and GENE70 scores are (respectively) the following:

  • molecular.subtyping(sbt.model = "pam50", data=matrix, annot=annotation, do.mapping=F)
  • ggi(data=matrix, annot=annotation)
  • gene70(data=matrix, annot=annotation)

where "matrix" is my expression matrix with rownames as sample names and colnames as gene names (in my case NCBI gene symbols), and "annotation" is a dataframe with a column containing the NCBI gene symbols and a column containing the respective EntrezGene.ID.

As I've said, the PAM50 classification works, but the other two commands do not.

In particular, the "ggi" command runs, but I guess it's not able to map any of the genes (all the GGI scores are "NA"). If I set "do.mapping = T", I get an error saying:
"Error in data1[, gg.uniq, drop = FALSE] : subscript out of bounds".

For "gene70", if I don't specify "do.mapping = T" I obtain the error:
"Error in gene70(data = t(as.matrix(visium_brain@norm_expr)), annot = sig.ggi) :
object 'res' not found
In addition: Warning message:
In gene70(data = t(as.matrix(visium_brain@norm_expr)), annot = sig.ggi) :
No overalp between the gene signature EntrezGene.IDsand the colnames of your data... Returning all NAs."
If I put "do.mapping = T", I obtain the error:
"Error in data1[, gg.uniq, drop = FALSE] : subscript out of bounds".

The "matrix" and the "annotation" that I've used for PAM50 are exactly the same as the ones used for ggi and gene70.

Could someone help me to solve this issue? I guess there could be something wrong in the "annotation" file, but the weird thing is that it works well with the PAM50 command.

Thank you in advance

@ChristopherEeles
Contributor

Hi @matteo95serra,

The genefu package was designed for use with Affymetrix microarray data, so adapting it for RNA-sequencing data may not be straightforward. See #22 for more information on this.

I suspect that your errors are due to a mismatch between your feature names and those of the corresponding gene signature object. The centroid genes are labelled with the gene symbols from their Affymetrix probe annotations, which may be outdated. When you set do.mapping=TRUE, the gene labels should be Entrez Gene IDs, and the probe gene symbols will be mapped using those.
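
As a rough illustration of the relabelling this implies, the sketch below renames symbol-labelled columns to Entrez Gene IDs using an annotation table. The object names (`expr`, `annot`), the column name `Symbol`, and the toy values are assumptions made for illustration, not data from this issue. It uses base R only and does not call genefu.

```r
# Toy expression matrix: samples in rows, genes (symbols) in columns.
expr <- matrix(
  c(5.1, 6.2, 7.3,
    4.0, 3.9, 8.1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("sample1", "sample2"), c("ESR1", "MKI67", "NOTAGENE"))
)

# Toy annotation table (hypothetical layout: one symbol column, one Entrez ID column).
annot <- data.frame(
  Symbol = c("ESR1", "MKI67"),
  EntrezGene.ID = c("2099", "4288"),
  stringsAsFactors = FALSE
)

# Look up the Entrez ID for each column; unmatched symbols give NA.
entrez <- annot$EntrezGene.ID[match(colnames(expr), annot$Symbol)]

# Keep only the columns that could be mapped, then relabel them.
keep <- !is.na(entrez)
expr_entrez <- expr[, keep, drop = FALSE]
colnames(expr_entrez) <- entrez[keep]

# Annotation aligned to the relabelled matrix, one row per column.
annot_entrez <- annot[match(colnames(expr_entrez), annot$EntrezGene.ID), , drop = FALSE]
rownames(annot_entrez) <- colnames(expr_entrez)
```

Symbols that cannot be mapped (here the made-up `NOTAGENE`) are dropped rather than kept under an `NA` label, so the column names stay unique and usable as an index.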

You can check that your expression matrix has the correct features for a given signature by loading that signature and matching the feature names. For example, with the ggi function:

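library(genefu)
data(sig.ggi)
# Assuming Entrez IDs as column names; compare as character strings.
# sig.ggi is expected to have an EntrezGene.ID column (check with colnames(sig.ggi)).
matching_genes <- intersect(colnames(your_matrix), as.character(sig.ggi[, "EntrezGene.ID"]))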

For the functions to work, you need at least a few features to match.
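
To make "a few features" concrete, one option is to report how much of the signature is covered. The sketch below does this with a hypothetical stand-in table (`signature`) and made-up column names (`expr_genes`). It assumes the signature carries an `EntrezGene.ID` column, which should be confirmed on the real object before reusing it.

```r
# Hypothetical stand-in for a signature table with an Entrez ID column.
signature <- data.frame(
  probe = c("p1", "p2", "p3", "p4"),
  EntrezGene.ID = c("2099", "4288", "891", "7153"),
  stringsAsFactors = FALSE
)

# Hypothetical column names of an expression matrix (Entrez IDs as strings).
expr_genes <- c("2099", "4288", "1000")

# Compare as character so numeric and string IDs match consistently.
sig_ids <- unique(as.character(signature$EntrezGene.ID))
matched <- intersect(expr_genes, sig_ids)
missing <- setdiff(sig_ids, expr_genes)

# Fraction of signature genes present in the expression data.
coverage <- length(matched) / length(sig_ids)
message(sprintf("%d of %d signature genes found (%.0f%%)",
                length(matched), length(sig_ids), 100 * coverage))
```

Listing `missing` alongside the count shows whether the gaps are random or systematic, for example every ID failing because of a prefix or a type mismatch.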

To answer data-specific questions, I need code that reproduces some of your data so I can debug the functions and see what is going wrong. You can provide this by replying here with the output from:

dput(head(your_matrix))
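
If the matrix has many gene columns, that output can be very long. A minimal sketch of trimming it first is shown here, with a toy `your_matrix` whose values are made up; the row and column limits are arbitrary choices.

```r
# Toy matrix standing in for your_matrix (values are made up).
your_matrix <- matrix(
  seq_len(40) / 10, nrow = 8,
  dimnames = list(paste0("sample", 1:8), paste0("gene", 1:5))
)

# Limit both rows and columns so the pasted output stays short.
small <- your_matrix[seq_len(min(6, nrow(your_matrix))),
                     seq_len(min(3, ncol(your_matrix))), drop = FALSE]

# dput() writes R code that recreates the object; capture it as text.
txt <- paste(capture.output(dput(small)), collapse = "\n")

# Anyone can rebuild the same object by evaluating that text.
rebuilt <- eval(parse(text = txt))
```

For debugging a signature mismatch, it helps to choose columns that include some of the genes expected to match rather than simply the first few.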

Best,
Christopher Eeles
Software Developer
BHK Lab | PM-Research | UHN

@ChristopherEeles
Contributor

Hi @matteo95serra,

I am closing this issue due to inactivity. If you have further questions feel free to re-open it.

Best,
Chris
