
Commit b71e2ae

feat: RDF4J Client (#3306)
* feat: add RDF4JClient, RepositoryManager, and initial Repository implementation

  Also set up and add tests
* test: add test for repo listing format error and repo not healthy error
* chore: add testcontainers comment
* docs: add google style docstring for mkdocs
* test: add rdf4j client test
* chore: add todo to only run rdf4j tests on python 3.9 or greater
* chore: add todo to only run rdf4j tests on python 3.9 or greater
* refactor: organise rdf4j package into rdflib.contrib
* feat: add http_client property
* feat: bootstrap a minimal graphdb client
* chore: remove unused rdf4j testcontainer
* chore: add testcontainer pytest marker, refactor rdf4j test structure, add overwrite and get methods on Repository class and add tests for them
* test: rearrange unit tests
* feat: Repository get method
* feat: Repository delete method
* feat: Repository size method
* test: add e2e tests for the new Repository methods
* feat: add Repository upload and graphs method
* test: fix up and improve existing tests
* test: add tests for Repository.graphs
* chore: add docstring to overwrite and upload methods
* feat: defer repository manager creation until accessed, and also amend some docstring content
* feat: add RDF4J NamespaceManager
* chore: fix mypy issues
* test: add e2e tests for RDF4J Repository NamespaceManager
* chore: doc improvements
* feat: the repository's namespace prefixes are now bound to the return object of Repository.get() method
* feat: add Repository GraphStoreManager
* test: add e2e test for Repository GraphStoreManager

  Also fixes a httpx limitation with key-only query params.
* chore: remove redundant re-raise of httpx exceptions
* docs: add docstring to protocol method
* style: formatting
* feat: add Repository.query method
* feat: add Repository.update method
* feat: add Repository Transaction with ping and commit
* refactor: prep Repository.size method to be used by transactions as well
* test: refactor e2e test file names to avoid clash
* test: fix test
* Revert "refactor: prep Repository.size method to be used by transactions as well"

  This reverts commit 323f303.
* test: fix test
* feat: Add Transaction.size method
* feat: Add Transaction rollback, add, and query methods
* feat: Add Transaction delete, and also Transaction upload test
* feat: Add Transaction update
* feat: Add Transaction get
* test: update transaction tests for upload and delete
* fix: improve error handling
* feat: dynamically import httpx by checking its existence with find_spec. Update tests to skip if httpx is not available
* test: put httpx import behind has_httpx condition
* test: fix test annotations
* fix: add conditional import for GraphDBClient
* test: ignore rdf4j and graphdb client.py for docstring tests
* feat: RDF4J store (#3316)
* fix: handle graph_name when it's a str
* feat: wip RDF4JStore

  Implements:
  - init/open
  - close
  - add
  - addN
  - contexts
  - add_graph
  - remove_graph
  - __len__
* feat: RDF4J Store now supports handling namespaces and prefixes
* feat: RDF4J Store triples and quads querying
* feat: ensure no bnodes are used to cross document/query boundaries
* chore: formatting
* test: improve e2e test speed by reusing the same container and cleaning up the repo between each tests
* feat: add RDF4JStore remove
* feat: add RDF4JStore triples_choices tests
* feat: add RDF4JStore SPARQL query and update tests
* chore: fix mypy issues
* test: error handling on client fixture
* test: mark testcontainer tests and put test imports behind the has_httpx flag
* build: remove upper python bound, bump testcontainers, and revert back to stable v7 poetry.lock
* test: put testcontainer tests behind a flag for unsupported python versions
* test: install rdf4j extras for python 3.9 and above
* ci: skip testcontainer tests on non-linux runners
* chore: rename RDF4J's NamespaceManager class

---------

Co-authored-by: Nicholas Car <[email protected]>
1 parent d1391f2 commit b71e2ae


60 files changed: +6099 -4 lines changed

.github/workflows/validate.yaml

Lines changed: 9 additions & 3 deletions
@@ -12,7 +12,6 @@ env:
   POETRY_CACHE_DIR: ${{ github.workspace }}/.var/cache/pypoetry
   PIP_CACHE_DIR: ${{ github.workspace }}/.var/cache/pip
 
-
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}
   cancel-in-progress: true
@@ -52,7 +51,7 @@ jobs:
             PREPARATION: "sudo apt-get install -y firejail"
             extensive-tests: true
             TOX_TEST_HARNESS: "firejail --net=none --"
-            TOX_PYTEST_EXTRA_ARGS: "-m 'not webtest'"
+            TOX_PYTEST_EXTRA_ARGS: "-m 'not (testcontainer or webtest)'"
     steps:
       - uses: actions/checkout@v4
       - name: Cache XDG_CACHE_HOME
@@ -84,6 +83,13 @@ jobs:
         shell: bash
         run: |
           ${{ matrix.PREPARATION }}
+      - name: Set testcontainer exclusion for non-Linux
+        if: ${{ matrix.os != 'ubuntu-latest' }}
+        shell: bash
+        run: |
+          if [ -z "${{ matrix.TOX_PYTEST_EXTRA_ARGS }}" ]; then
+            echo "TOX_PYTEST_EXTRA_ARGS=-m 'not testcontainer'" >> $GITHUB_ENV
+          fi
       - name: Run validation
         shell: bash
         run: |
@@ -97,7 +103,7 @@ jobs:
             gha:validate
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-          TOX_PYTEST_EXTRA_ARGS: ${{ matrix.TOX_PYTEST_EXTRA_ARGS }}
+          TOX_PYTEST_EXTRA_ARGS: ${{ matrix.TOX_PYTEST_EXTRA_ARGS || env.TOX_PYTEST_EXTRA_ARGS }}
           TOX_TEST_HARNESS: ${{ matrix.TOX_TEST_HARNESS }}
           TOX_EXTRA_COMMAND: ${{ matrix.TOX_EXTRA_COMMAND }}
       - uses: actions/upload-artifact@v4
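
The two workflow changes above interact: the new step only writes a fallback into the job environment when the matrix entry defines no `TOX_PYTEST_EXTRA_ARGS`, and the `||` expression then prefers the matrix value over that fallback. A minimal Python sketch of that precedence follows. The function name `resolve_pytest_args` is hypothetical, and the sketch assumes no other part of the workflow sets `TOX_PYTEST_EXTRA_ARGS` in the environment.

```python
def resolve_pytest_args(os_name: str, matrix_args: str = "") -> str:
    """Model how the workflow picks the extra pytest arguments (illustrative only)."""
    env_args = ""  # assumption: the variable is unset unless the new step writes it
    # "Set testcontainer exclusion for non-Linux": runs only off ubuntu-latest,
    # and only writes the fallback when the matrix value is empty.
    if os_name != "ubuntu-latest" and not matrix_args:
        env_args = "-m 'not testcontainer'"
    # `matrix.X || env.X`: the matrix value wins when it is non-empty.
    return matrix_args or env_args


print(resolve_pytest_args("windows-latest"))
print(resolve_pytest_args("ubuntu-latest", "-m 'not (testcontainer or webtest)'"))
```

Under these assumptions, Linux jobs without a matrix override pass no extra arguments, so testcontainer tests are only deselected on the other runners and in the no-network job.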

Taskfile.yml

Lines changed: 8 additions & 0 deletions
@@ -378,3 +378,11 @@ tasks:
             sys.stderr.write(f"removing {path}\n")
             shutil.rmtree(path, ignore_errors=True)
         ' {{.RIMRAF_TARGET}}
+
+  test:rdf4j:
+    desc: Run fast tests against rdflib.contrib.rdf4j package
+    cmd: '{{.TEST_HARNESS}}{{.RUN_PREFIX}} pytest -m "not (testcontainer or webtest)" test/test_rdf4j'
+
+  test:rdf4j:all:
+    desc: Run all tests against rdflib.contrib.rdf4j package
+    cmd: '{{.TEST_HARNESS}}{{.RUN_PREFIX}} pytest test/test_rdf4j'

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ nav:
       - Container: apidocs/rdflib.container.md
       - Collection: apidocs/rdflib.collection.md
       - Paths: apidocs/rdflib.paths.md
+      - RDF4J: apidocs/rdflib.contrib.rdf4j.md
       - Util: apidocs/rdflib.util.md
       - Plugins:
           - Parsers: apidocs/rdflib.plugins.parsers.md

poetry.lock

Lines changed: 328 additions & 1 deletion
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

pyproject.toml

Lines changed: 6 additions & 0 deletions
@@ -49,6 +49,7 @@ networkx = {version = ">=2,<4", optional = true}
 html5rdf = {version = ">=1.2,<2", optional = true}
 lxml = {version = ">=4.3,<6.0", optional = true}
 orjson = {version = ">=3.9.14,<4", optional = true}
+httpx = {version = "^0.28.1", optional = true}
 
 [tool.poetry.group.dev.dependencies]
 black = "24.8.0"
@@ -63,6 +64,7 @@ coverage = {version = "^7.0.1", extras = ["toml"]}
 types-setuptools = ">=68.0.0.3,<72.0.0.0"
 setuptools = ">=68,<72"
 wheel = ">=0.42,<0.46"
+testcontainers = {version = "^4.13.2", python = ">=3.9.2"}
 
 [tool.poetry.group.docs.dependencies]
 typing-extensions = "^4.11.0"
@@ -85,6 +87,7 @@ html = ["html5rdf"]
 # lxml support is optional, it is used only for parsing XML-formatted SPARQL results
 lxml = ["lxml"]
 orjson = ["orjson"]
+rdf4j = ["httpx"]
 
 [build-system]
 requires = ["poetry-core>=1.4.0"]
@@ -207,6 +210,8 @@ addopts = [
     "--ignore=admin",
     "--ignore=devtools",
     "--ignore=rdflib/extras/external_graph_libs.py",
+    "--ignore=rdflib/contrib/graphdb/client.py",
+    "--ignore=rdflib/contrib/rdf4j/client.py",
     "--ignore-glob=docs/*.py",
     "--ignore-glob=site/*",
     "--strict-markers",
@@ -218,6 +223,7 @@ filterwarnings = [
     "ignore:Code. _pytestfixturefunction is not defined in namespace .*:UserWarning",
 ]
 markers = [
+    "testcontainer: mark a test that uses testcontainer",
     "webtest: mark a test as using the internet",
 ]
 # log_cli = true

rdflib/contrib/__init__.py

Whitespace-only changes.

rdflib/contrib/graphdb/__init__.py

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+from rdflib.contrib.rdf4j import has_httpx
+
+if has_httpx:
+    from .client import GraphDBClient
+
+    __all__ = ["GraphDBClient"]

rdflib/contrib/graphdb/client.py

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+import httpx
+
+import rdflib.contrib.rdf4j
+from rdflib.contrib.rdf4j import RDF4JClient
+from rdflib.contrib.rdf4j.exceptions import (
+    RepositoryNotFoundError,
+    RepositoryNotHealthyError,
+)
+
+
+class Repository(rdflib.contrib.rdf4j.client.Repository):
+    """GraphDB Repository"""
+
+    def health(self, timeout: int = 5) -> bool:
+        """Repository health check.
+
+        Parameters:
+            timeout: A timeout parameter in seconds. If provided, the endpoint attempts
+                to retrieve the repository within this timeout. If not, the passive
+                check is performed.
+
+        Returns:
+            bool: True if the repository is healthy, otherwise an error is raised.
+
+        Raises:
+            RepositoryNotFoundError: If the repository is not found.
+            RepositoryNotHealthyError: If the repository is not healthy.
+            httpx.RequestError: On network/connection issues.
+            httpx.HTTPStatusError: Unhandled status code error.
+        """
+        try:
+            params = {"passive": str(timeout)}
+            response = self.http_client.get(
+                f"/repositories/{self.identifier}/health", params=params
+            )
+            response.raise_for_status()
+            return True
+        except httpx.HTTPStatusError as err:
+            if err.response.status_code == 404:
+                raise RepositoryNotFoundError(
+                    f"Repository {self._identifier} not found."
+                )
+            raise RepositoryNotHealthyError(
+                f"Repository {self._identifier} is not healthy. {err.response.status_code} - {err.response.text}"
+            )
+        except httpx.RequestError:
+            raise
+
+
+class RepositoryManager(rdflib.contrib.rdf4j.client.RepositoryManager):
+    """GraphDB Repository Manager"""
+
+    def get(self, repository_id: str) -> Repository:
+        _repo = super().get(repository_id)
+        return Repository(_repo.identifier, _repo.http_client)
+
+
+class GraphDBClient(RDF4JClient):
+    """GraphDB Client"""
+
+    # TODO: GraphDB specific API methods.
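
The `health` method in this file turns an HTTP error status into one of two domain exceptions: 404 becomes `RepositoryNotFoundError`, and any other error status becomes `RepositoryNotHealthyError`. The sketch below isolates that mapping so it can be read without `httpx` installed. The exception classes here are local stand-ins rather than the ones from `rdflib.contrib.rdf4j.exceptions`, the function `check_health_status` is hypothetical, and treating only 2xx as success is a simplifying assumption about what `raise_for_status()` accepts.

```python
class RepositoryNotFoundError(Exception):
    """Local stand-in for the exception of the same name in the diff."""


class RepositoryNotHealthyError(Exception):
    """Local stand-in for the exception of the same name in the diff."""


def check_health_status(identifier: str, status_code: int, body: str = "") -> bool:
    """Map a health-endpoint status code to True or a domain exception."""
    if 200 <= status_code < 300:
        # Assumption: only 2xx responses count as healthy.
        return True
    if status_code == 404:
        raise RepositoryNotFoundError(f"Repository {identifier} not found.")
    raise RepositoryNotHealthyError(
        f"Repository {identifier} is not healthy. {status_code} - {body}"
    )


try:
    check_health_status("demo", 503, "starting up")
except RepositoryNotHealthyError as err:
    print(err)
```

Keeping the 404 case separate lets callers tell a missing repository apart from one that exists but is failing.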

rdflib/contrib/rdf4j/__init__.py

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+from importlib.util import find_spec
+
+has_httpx = find_spec("httpx") is not None
+
+if has_httpx:
+    from .client import RDF4JClient
+
+    __all__ = ["RDF4JClient", "has_httpx"]
+else:
+    __all__ = ["has_httpx"]
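
This module checks whether `httpx` is installed without importing it, then adjusts its exports to match. The same pattern can be sketched with standard-library modules so it runs anywhere. The helper `is_available`, the name `exported`, and the deliberately missing module name are illustrative and not part of the commit.

```python
from importlib.util import find_spec


def is_available(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None


# "json" ships with Python; the second name is assumed not to exist anywhere.
has_json = is_available("json")
has_missing = is_available("no_such_module_for_this_sketch")

if has_json:
    import json  # the real import happens only after the availability check

    exported = ["json", "has_json"]
else:
    exported = ["has_json"]

print(has_json, has_missing, exported)
```

Callers and tests can then branch on the flag, as the commit message describes for tests that are skipped when `httpx` is unavailable.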
