Add support for Jelly input and output #194
Could someone with the power please remove me as a reviewer? The UI isn't letting me do that (maybe because I'm not an official collaborator on this repo).
The UI for this doesn't look great. You end up on the reviewers' list because you left comments and reviewed changes in the past, and there is no way to remove yourself from it. Anyway, you won't be asked to do the final review before the PR is merged.
This PR adds support in the CLI tool for reading and writing files in Jelly, a high-performance binary RDF format.
Jelly is faster to parse and write than Turtle, so it may be useful when working with larger files.
It's a non-indexed streaming binary format, so it can work with arbitrarily large files, using constant memory (unlike HDT).
Implementation
I added the `jelly-jena` dependency, which integrates nicely with the Jena RIOT API. The remaining changes are to autodetect the file format based on the extension, and to add the `-outputFormat` parameter.

From what I can see, adding support for more formats (e.g., JSON-LD, NT, RDF/XML) should be relatively simple, as these are already bundled with Jena.
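To illustrate the idea of extension-based autodetection, here is a minimal sketch in plain Java with no Jena calls. The class name `FormatDetector`, the extension table, and the fallback to Turtle are all assumptions made for illustration; this is not the code from this PR, and the real mapping may differ.

```java
import java.util.Locale;
import java.util.Map;

public class FormatDetector {
    // Hypothetical extension table; the PR's actual mapping may differ.
    private static final Map<String, String> BY_EXTENSION = Map.of(
        "ttl", "TURTLE",
        "jelly", "JELLY"
    );

    // Assumed fallback when the extension is missing or unknown.
    private static final String DEFAULT_FORMAT = "TURTLE";

    /** Returns a format name based on the file extension (case-insensitive). */
    public static String detect(String filename) {
        int dot = filename.lastIndexOf('.');
        // No dot, or a trailing dot, means there is no usable extension.
        if (dot < 0 || dot == filename.length() - 1) {
            return DEFAULT_FORMAT;
        }
        String ext = filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, DEFAULT_FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(detect("data.jelly"));
        System.out.println(detect("shapes.TTL"));
        System.out.println(detect("README"));
    }
}
```

A lookup table like this keeps the detection logic in one place, so supporting another format would only mean adding one more entry. An explicit format parameter such as `-outputFormat` would then take precedence over the detected value.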
Tests
I added a test that runs the infer and validate commands in an end-to-end setting. I don't think there was an end-to-end CLI test before in the suite, so this should help a little bit with test coverage.
I'm testing inference with some RDF data I created in a previous project (it's fine to be included here, I release it under Apache 2.0). The validation test case comes from RiverBench. This code is already under Apache 2.0, but just in case – I wrote the whole thing myself, so it's also fine to be included.
Using Jelly files
Use `jelly-cli` to convert to/from Jelly files.
Dependencies
This effectively only adds two small dependencies:
- `jelly-core`: generic serialization code for Jelly (178 KB JAR)
- `jelly-jena`: integration of `jelly-core` with Jena (33 KB JAR)

`jelly-core` also depends on `protobuf-java`, but that dependency is already included with Jena.

The Jelly libraries are extensively tested (8000+ test cases in the main suite) and have mitigations for known security risks tested in CI. They are production-grade and are currently used, for example, in the nanopublication services for inter-service communication.