Graph database? #35
Comments
It's a pain partly because of the GFF specification. GFFs only encode trees, so full graph support is not needed, but the format is bad at representing even those. To somewhat get around this, I clean up ("sanitize") all my GFFs to have a field that runs through each … Also, there's some support for iterating over parent-child pairs for canonical hierarchies that might make what you're trying to do easier:
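To illustrate the tree structure being discussed, here is a minimal sketch of pulling parent-child pairs straight out of GFF3 `Parent=` attributes. It uses only the standard library, not gffutils; the toy records and the helper names `parse_attributes` and `parent_child_pairs` are made up for this example.

```python
from collections import defaultdict

# Toy GFF3 records, invented for illustration (columns are tab-separated).
GFF3 = """\
chr1\t.\tgene\t1\t1000\t.\t+\t.\tID=gene1
chr1\t.\tmRNA\t1\t1000\t.\t+\t.\tID=tx1;Parent=gene1
chr1\t.\texon\t1\t200\t.\t+\t.\tID=exon1;Parent=tx1
chr1\t.\texon\t800\t1000\t.\t+\t.\tID=exon2;Parent=tx1
"""


def parse_attributes(column):
    """Split a GFF3 attribute column ('ID=x;Parent=y') into a dict."""
    return dict(item.split("=", 1) for item in column.strip().split(";") if item)


def parent_child_pairs(gff_text):
    """Yield (parent_id, child_id) for every Parent= attribute found."""
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        attrs = parse_attributes(line.split("\t")[8])
        # GFF3 permits a comma-separated list of parents.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                yield parent, attrs["ID"]


# Collect the pairs into a parent -> [children] mapping.
children = defaultdict(list)
for parent, child in parent_child_pairs(GFF3):
    children[parent].append(child)
```

This sketch assumes every child record carries an `ID` attribute, which real-world GFF files often do not guarantee.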
I hadn't heard of graph databases until you brought them up. After reading up on them a little, I'm pretty sure they would provide a substantial performance boost. But I wasn't able to find a file-based implementation either, Python or otherwise. For me, managing a separate graph database and server is currently too much overhead compared to the almost transparent approach of using a file-based database.

As Yarden alluded, yes, the SQL can be awkward. But ideally, as many manipulations as possible would be hidden from the end user. In previous … Anyway, if I hit upon a use case that's not already implemented, then I'll typically add a method to …

And if you ever find a file-based graph db, please let me know!
Apparently there's a graph db implemented in Python as a layer over SQLite: https://github.com/eugene-eeo/graphlite
Thanks, nice find. Using this in gffutils would take some playing around to figure out whether two databases are needed or whether graphlite can work with an existing db (my hunch is the latter, based on the docs). Then any of the logic that touches the current …

Have you run across cases where gffutils currently doesn't work well, or that you think would benefit from a graph db?
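As a rough sketch of the "graph layer inside an existing SQLite file" idea, the example below keeps an edge table next to a feature table in one database and walks it with a recursive query. It uses plain `sqlite3` rather than graphlite's API, and the `features` and `edges` tables and the `descendants` helper are hypothetical names, not taken from gffutils or graphlite.

```python
import sqlite3

# One in-memory database holding both the features and the graph edges.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE features (id TEXT PRIMARY KEY, featuretype TEXT);
    CREATE TABLE edges (parent TEXT, child TEXT);
""")
conn.executemany(
    "INSERT INTO features VALUES (?, ?)",
    [("gene1", "gene"), ("tx1", "mRNA"), ("exon1", "exon"), ("exon2", "exon")],
)
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("gene1", "tx1"), ("tx1", "exon1"), ("tx1", "exon2")],
)


def descendants(conn, root, featuretype=None):
    """Return ids of all features below `root`, optionally filtered by type."""
    sql = """
        WITH RECURSIVE sub(id) AS (
            SELECT child FROM edges WHERE parent = ?
            UNION
            SELECT e.child FROM edges AS e JOIN sub ON e.parent = sub.id
        )
        SELECT f.id
        FROM sub JOIN features AS f ON f.id = sub.id
        WHERE (? IS NULL OR f.featuretype = ?)
        ORDER BY f.id
    """
    return [row[0] for row in conn.execute(sql, (root, featuretype, featuretype))]


exons = descendants(conn, "gene1", featuretype="exon")
```

The recursive query handles any depth of nesting, so the same helper would work for hierarchies deeper than gene, transcript, exon.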
For me specifically, I operate mostly on exons, so getting an exon from a …

On Fri, Aug 14, 2015 at 9:22 AM Ryan Dale [email protected] wrote:
This is not an issue, more a question. It takes some serious SQL-wrangling to get parent-child or grandparent-child information about gene-transcript-exon relationships. Have you thought about using a graph database for gffutils? There doesn't seem to be a SQLite-like, file-based equivalent of Node.js or TitanDB that would avoid opening up a separate port, so that could be a drawback.
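For context, the kind of SQL-wrangling meant here looks roughly like the sketch below: a grandparent-child (gene to exon) lookup needs a self-join on the relations table. The schema is a toy one assumed for illustration; the table and column names are not necessarily those gffutils uses.

```python
import sqlite3

# Toy schema: one row per feature, one row per direct parent-child link.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE features (id TEXT PRIMARY KEY, featuretype TEXT);
    CREATE TABLE relations (parent TEXT, child TEXT);
    INSERT INTO features VALUES
        ('gene1', 'gene'), ('tx1', 'mRNA'), ('tx2', 'mRNA'),
        ('exon1', 'exon'), ('exon2', 'exon'), ('exon3', 'exon');
    INSERT INTO relations VALUES
        ('gene1', 'tx1'), ('gene1', 'tx2'),
        ('tx1', 'exon1'), ('tx1', 'exon2'), ('tx2', 'exon3');
""")

# Join relations to itself: gene -> transcript (gp), transcript -> exon (p).
GENE_EXON_SQL = """
    SELECT gp.parent AS gene, p.child AS exon
    FROM relations AS gp
    JOIN relations AS p ON p.parent = gp.child
    JOIN features AS f ON f.id = p.child
    WHERE f.featuretype = 'exon'
    ORDER BY exon
"""

gene_exon_pairs = conn.execute(GENE_EXON_SQL).fetchall()
```

Each extra level in the hierarchy adds another join, which is where the awkwardness comes from.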