Calculate ortholog relationships in Reactome #38
Open
cthoyt wants to merge 23 commits into
@bgyori this has been cleaned up and now has a more specific scope. What do you think about adding these ~100K orthology mappings between Reactome pathways? Compared to the 5K we've curated manually, this vastly changes the scale of the data here. Do you think this is a problem? Maybe we should add another category for large-scale calculated mappings that can be trusted automatically (as opposed to predicted mappings, which still need to be checked) so we can keep these separate.
This PR adds a script for identifying orthologous pathways in Reactome based on lexical matching. All pathway identifiers have the form R-{SPECIES CODE}-{PATHWAY CODE}, so any two species' pathways can be matched by splitting on the dash, then matching pathway codes. The KEGG and Reactome matches are correct by definition but do not exist in any primary source, so they could be added directly. This PR doesn't (yet) contain the results because I'm not sure whether this is in scope for the repo. There are quite a few (30K+) to add.
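For reviewers, here is a minimal sketch of the lexical matching described above. It is not the script in this PR: the function names `parse_reactome_id` and `find_orthologous_pathways` are hypothetical, and the identifiers in the usage note are only illustrative. It assumes every identifier has exactly the R-{SPECIES CODE}-{PATHWAY CODE} form.

```python
from collections import defaultdict
from itertools import combinations


def parse_reactome_id(identifier):
    """Split an identifier of the form R-{SPECIES CODE}-{PATHWAY CODE}."""
    prefix, species, code = identifier.split("-", 2)
    if prefix != "R":
        raise ValueError(f"not a Reactome identifier: {identifier}")
    return species, code


def find_orthologous_pathways(identifiers):
    """Return sorted pairs of identifiers that share a pathway code.

    Hypothetical helper: groups identifiers by pathway code, then pairs
    every two identifiers in a group that come from different species.
    """
    by_code = defaultdict(set)
    for identifier in identifiers:
        _species, code = parse_reactome_id(identifier)
        by_code[code].add(identifier)

    pairs = []
    for ids in by_code.values():
        for left, right in combinations(sorted(ids), 2):
            # Skip pairs from the same species; these are not orthologs.
            if parse_reactome_id(left)[0] != parse_reactome_id(right)[0]:
                pairs.append((left, right))
    return sorted(pairs)
```

Calling `find_orthologous_pathways(["R-HSA-109581", "R-MMU-109581", "R-HSA-1640170"])` would pair the first two identifiers and leave the third unmatched, since no other species in the input shares its pathway code. Each pair is emitted once in sorted order, so a symmetric mapping would need the reverse direction added separately if the repo stores both.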