Commit aa30c2a

feat(trusted_endpoints): support {id} and {path:path} placeholders in registered URLs
A registered URL may now contain FastAPI/Express-style path placeholders so a
single entry covers a family of concrete URLs:

  {name}       - matches exactly one path segment (no '/').
                 e.g. https://api.example.com/customers/{id} matches
                 /customers/42 but NOT /customers/42/orders.
  {name:path}  - matches any subtree, including '/' separators.
                 e.g. https://api.example.com/customers/{rest:path} matches
                 both /customers/42 and /customers/42/orders.

Closes #14.

Why: customer-support-sdk-demo had to enumerate ~70 concrete URLs at startup
for templated routes (/customers/{id}). Runtime-generated ids (e.g. POST
/tickets returning a fresh id) couldn't be trusted until manually registered.
A single placeholder entry replaces the enumeration.

Implementation:
- Plain URLs without '{' keep exact-match semantics. No schema change. No
  migration needed for existing rows. Existing exact-match tests unchanged.
- Pattern matching is auto-detected from URL content. Pattern compilation is
  LRU-cached so repeated lookups don't recompile the regex.
- is_trusted_endpoint uses a two-phase lookup: exact match first (single
  indexed query, fast path), then a pattern-only scan (LIKE '%{%' filter) for
  rows containing placeholders. Plain registries see no perf regression.
- The snapshot tamper-check inside check_claim_endpoints_are_trusted honors
  the same syntax, so a payload built against a pattern entry verifies
  cleanly on the receiver side.

Tests: 12 new (94 total). Ruff clean.
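The LRU-caching point in the commit message can be illustrated with a small standalone sketch. `compile_once` below is a hypothetical, simplified stand-in for the commit's `_compile_pattern` (it handles only single-segment `{name}` placeholders); the point is just that repeated lookups against the same registered entry reuse one compiled regex.

```python
import re
from functools import lru_cache


# Hypothetical, simplified stand-in for the commit's _compile_pattern: it only
# handles single-segment {name} placeholders so the caching is easy to observe.
@lru_cache(maxsize=512)
def compile_once(registered: str) -> "re.Pattern[str]":
    literal_parts = re.split(r"\{[^}/]+\}", registered)
    body = "[^/]+?".join(re.escape(part) for part in literal_parts)
    return re.compile(f"^{body}$")


entry = "https://api.example.com/customers/{id}"
for customer_id in ("1", "2", "3"):
    url = f"https://api.example.com/customers/{customer_id}"
    assert compile_once(entry).match(url) is not None

# One miss for the first call, then cache hits for the two repeats.
info = compile_once.cache_info()
print(f"hits={info.hits} misses={info.misses}")  # hits=2 misses=1
```

Because the cache is keyed on the registered string, an entry that is edited in the registry simply compiles once more under its new text.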
1 parent b889269 commit aa30c2a

4 files changed

Lines changed: 250 additions & 3 deletions

File tree

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 # Changelog

+## Unreleased
+
+- `trusted_endpoints`: registered URLs may now contain FastAPI/Express-style path placeholders. `{id}` matches exactly one path segment, `{rest:path}` matches any subtree. Plain URLs without `{` keep exact-match semantics — no migration needed for existing rows. Both `is_trusted_endpoint` and the snapshot tamper-check inside `evaluate_handoff` honor the new syntax. Closes #14.
+
 ## 0.2.0

 - Added `provably.configure_indexing(enable_indexing: bool)`: one-call bootstrap (`initialize_runtime` + `init_interceptor` + `enable` / `disable`) for sender agents.

README.md

Lines changed: 25 additions & 0 deletions
@@ -346,6 +346,31 @@ URLs are normalized (lowercase scheme + host, default ports collapsed, trailing
 slash dropped) before any read or write so that `https://API.EXAMPLE.COM/x/`
 and `https://api.example.com/x` collide on the same row.

+#### Path-pattern entries
+
+Concrete URLs match exactly. To authorize a family of URLs with a single entry —
+useful for templated routes like `/customers/{id}` or runtime-generated ids —
+register the URL with FastAPI/Express-style placeholders:
+
+| Placeholder | Matches | Example |
+|---|---|---|
+| `{name}` | exactly one path segment (no `/`) | `https://api.example.com/customers/{id}` matches `…/customers/42` but **not** `…/customers/42/orders` |
+| `{name:path}` | any subtree (including `/` separators) | `https://api.example.com/customers/{rest:path}` matches both `…/customers/42` and `…/customers/42/orders` |
+
+The placeholder name (`id`, `rest`, …) is purely descriptive and does not affect
+matching. Plain URLs without `{` characters keep exact-match semantics — no
+behavior change for existing entries.
+
+```sql
+-- Register a templated route once instead of enumerating every concrete id
+INSERT INTO trusted_endpoints (org_id, normalized_url, display_label, entry_type)
+VALUES ('my-org', 'https://api.example.com/customers/{id}', 'Customers (by id)', 'endpoint');
+```
+
+`is_trusted_endpoint` and the snapshot tamper-check inside `evaluate_handoff`
+both honor the same matching rules, so a claim against `…/customers/42` will
+pass both gates when only the templated entry is registered.
+
 ## Public API

 All public symbols are re-exported from the top-level `provably` namespace. See
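To make the placeholder table in the README hunk concrete, here is a minimal standalone sketch of the matching rules. It reuses the placeholder grammar shown in this commit's diff, but `to_regex` and `covers` are hypothetical names for illustration, not part of the `provably` API, and the sketch skips URL normalization.

```python
import re

# Same placeholder grammar as _PLACEHOLDER_RE in this commit's diff.
PLACEHOLDER = re.compile(r"\{[^}/]+(?::path)?\}")


def to_regex(registered: str) -> "re.Pattern[str]":
    """Translate a registered URL with placeholders into an anchored regex."""
    parts, cursor = [], 0
    for m in PLACEHOLDER.finditer(registered):
        parts.append(re.escape(registered[cursor:m.start()]))
        # {name:path} may span '/' separators; {name} stays inside one segment.
        parts.append(".+?" if ":path" in m.group(0) else "[^/]+?")
        cursor = m.end()
    parts.append(re.escape(registered[cursor:]))
    return re.compile("^" + "".join(parts) + "$")


def covers(registered: str, url: str) -> bool:
    """Hypothetical helper: does the registered entry cover this concrete URL?"""
    return to_regex(registered).match(url) is not None


base = "https://api.example.com/customers"
print(covers(f"{base}/{{id}}", f"{base}/42"))                # True
print(covers(f"{base}/{{id}}", f"{base}/42/orders"))         # False
print(covers(f"{base}/{{rest:path}}", f"{base}/42/orders"))  # True
print(covers(f"{base}/{{rest:path}}", base))                 # False
```

The last line shows a consequence of the translation: both placeholder kinds need at least one character, so the bare collection URL is not covered by `{rest:path}` and would need its own entry.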

src/provably/trusted_endpoints.py

Lines changed: 86 additions & 3 deletions
@@ -2,6 +2,8 @@

 from __future__ import annotations

+import re
+from functools import lru_cache
 from typing import TYPE_CHECKING
 from urllib.parse import urlparse

@@ -12,6 +14,58 @@

 _DDL_DONE = False

+# ---------------------------------------------------------------------------
+# Pattern matching
+#
+# A registered URL may contain FastAPI/Express-style path placeholders so a single
+# entry can authorize a family of concrete URLs:
+#
+#   {name}        — matches one path segment (no '/'). E.g. /customers/{id} matches
+#                   /customers/123 but NOT /customers/123/orders.
+#   {name:path}   — matches any subtree, including '/' separators. E.g.
+#                   /customers/{rest:path} matches both /customers/123 and
+#                   /customers/123/orders.
+#
+# Plain URLs (no '{' character) keep exact-match semantics — no behavior change for
+# existing entries.
+# ---------------------------------------------------------------------------
+
+_PLACEHOLDER_RE = re.compile(r"\{[^}/]+(?::path)?\}")
+
+
+@lru_cache(maxsize=512)
+def _compile_pattern(registered: str) -> re.Pattern[str] | None:
+    """Compile a registered URL into a regex if it has placeholders, else return None.
+
+    Cache keeps regex compilation off the hot per-request path.
+    """
+    if "{" not in registered:
+        return None
+    parts: list[str] = []
+    cursor = 0
+    has_placeholder = False
+    for match in _PLACEHOLDER_RE.finditer(registered):
+        parts.append(re.escape(registered[cursor : match.start()]))
+        is_path = ":path" in match.group(0)
+        parts.append(".+?" if is_path else "[^/]+?")
+        cursor = match.end()
+        has_placeholder = True
+    if not has_placeholder:
+        return None
+    parts.append(re.escape(registered[cursor:]))
+    try:
+        return re.compile(f"^{''.join(parts)}$")
+    except re.error:
+        return None
+
+
+def _matches_registered(claim_url: str, registered: str) -> bool:
+    """``True`` when ``claim_url`` exactly matches ``registered`` or matches its pattern."""
+    if claim_url == registered:
+        return True
+    pattern = _compile_pattern(registered)
+    return pattern is not None and pattern.match(claim_url) is not None
+

 def normalize_url_for_trust(url: str) -> str:
     """Return the canonical form of ``url`` used for trust look-ups.
@@ -74,14 +128,21 @@ def ensure_trusted_endpoints_table(conn: psycopg2.extensions.connection) -> None


 def is_trusted_endpoint(url: str, org_id: str, conn: psycopg2.extensions.connection) -> bool:
-    """Return whether ``url`` is currently allowlisted for ``org_id``; normalizes URL before look-up."""
+    """Return whether ``url`` is currently allowlisted for ``org_id``.
+
+    Two-phase lookup: exact match first (fast path, single indexed query), then a
+    pattern-match scan over only the rows containing ``{`` in their ``normalized_url``.
+    Plain URLs without placeholders never enter the slow path, so existing exact-match
+    registries see no perf regression.
+    """
     if not url or not org_id:
         return False
     norm = normalize_url_for_trust(url)
     if not norm:
         return False
     _ensure_trusted_table(conn)
     with conn.cursor() as cur:
+        # Fast path: exact match.
         cur.execute(
             """
             SELECT 1 FROM trusted_endpoints
@@ -90,7 +151,21 @@ def is_trusted_endpoint(url: str, org_id: str, conn: psycopg2.extensions.connect
             """,
             (org_id, norm),
         )
-        return cur.fetchone() is not None
+        if cur.fetchone() is not None:
+            return True
+        # Slow path: pattern entries only.
+        cur.execute(
+            """
+            SELECT normalized_url FROM trusted_endpoints
+            WHERE org_id = %s AND entry_type = 'endpoint' AND revoked_at IS NULL
+              AND normalized_url LIKE '%%{%%'
+            """,
+            (org_id,),
+        )
+        for (registered,) in cur.fetchall():
+            if _matches_registered(norm, str(registered or "")):
+                return True
+        return False


 def list_trusted_endpoints(
@@ -208,7 +283,15 @@ def check_claim_endpoints_are_trusted(

     registry = {n for url in hp.trusted_endpoint_registry if (n := normalize_url_for_trust(str(url)))}
     if registry:
-        missing = list(dict.fromkeys(u for u in claim_urls if u not in registry))
+        pattern_entries = [r for r in registry if "{" in r]
+        missing: list[str] = []
+        for claim_url in claim_urls:
+            if claim_url in registry:
+                continue
+            if any(_matches_registered(claim_url, entry) for entry in pattern_entries):
+                continue
+            missing.append(claim_url)
+        missing = list(dict.fromkeys(missing))
         if missing:
             raise ValueError(f"handoff has endpoints missing from trusted snapshot: {', '.join(missing)}")
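The two-phase lookup in `is_trusted_endpoint` can be tried out without a Postgres instance. The sketch below is an approximation under stated assumptions: it uses the standard-library `sqlite3` module as a stand-in for psycopg2, a table reduced to two columns (no `entry_type` or `revoked_at`), and hypothetical helper names `matches` and `is_trusted`. With sqlite3's `?` parameter style the `LIKE` literal is written `'%{%'`; the diff doubles it to `'%%{%%'` because psycopg2 treats `%` as a parameter marker.

```python
import re
import sqlite3

PLACEHOLDER = re.compile(r"\{[^}/]+(?::path)?\}")


def matches(url: str, registered: str) -> bool:
    """Hypothetical helper: match a concrete URL against a placeholder entry."""
    parts, cursor = [], 0
    for m in PLACEHOLDER.finditer(registered):
        parts.append(re.escape(registered[cursor:m.start()]))
        parts.append(".+?" if ":path" in m.group(0) else "[^/]+?")
        cursor = m.end()
    parts.append(re.escape(registered[cursor:]))
    return re.match("^" + "".join(parts) + "$", url) is not None


def is_trusted(conn: sqlite3.Connection, org_id: str, url: str) -> bool:
    # Phase 1 (fast path): exact match on the stored URL.
    exact = conn.execute(
        "SELECT 1 FROM trusted_endpoints WHERE org_id = ? AND normalized_url = ? LIMIT 1",
        (org_id, url),
    ).fetchone()
    if exact is not None:
        return True
    # Phase 2 (slow path): fetch only rows that contain a '{' and test each one.
    rows = conn.execute(
        "SELECT normalized_url FROM trusted_endpoints "
        "WHERE org_id = ? AND normalized_url LIKE '%{%'",
        (org_id,),
    ).fetchall()
    return any(matches(url, registered) for (registered,) in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trusted_endpoints (org_id TEXT, normalized_url TEXT)")
conn.executemany(
    "INSERT INTO trusted_endpoints VALUES (?, ?)",
    [
        ("org-1", "https://api.example.com/health"),
        ("org-1", "https://api.example.com/customers/{id}"),
    ],
)

print(is_trusted(conn, "org-1", "https://api.example.com/health"))               # True
print(is_trusted(conn, "org-1", "https://api.example.com/customers/42"))         # True
print(is_trusted(conn, "org-1", "https://api.example.com/customers/42/orders"))  # False
```

Note that a URL with no exact row always runs the second query, even when the registry holds no pattern rows; in that case the scan just returns an empty result.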

tests/unit/test_trusted_endpoints.py

Lines changed: 135 additions & 0 deletions
@@ -5,6 +5,8 @@
 import pytest

 from provably.trusted_endpoints import (
+    _compile_pattern,
+    _matches_registered,
     is_trusted_endpoint,
     list_trusted_endpoints,
     normalize_url_for_trust,
@@ -46,6 +48,139 @@ def test_is_trusted_queries_normalized_row(monkeypatch: pytest.MonkeyPatch) -> N
     assert args[1][1] == "https://x.com/a"


+# ---------------------------------------------------------------------------
+# Pattern matching ({name} and {name:path} placeholders)
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.parametrize(
+    "registered",
+    [
+        "https://api.example.com/customers",
+        "https://api.example.com/customers/123",
+        "https://example.com",
+    ],
+)
+def test_compile_pattern_returns_none_for_plain_urls(registered: str) -> None:
+    assert _compile_pattern(registered) is None
+
+
+def test_pattern_single_segment_matches_one_path_segment() -> None:
+    pattern = _compile_pattern("https://api.example.com/customers/{id}")
+    assert pattern is not None
+    assert pattern.match("https://api.example.com/customers/123") is not None
+    assert pattern.match("https://api.example.com/customers/abc-DEF") is not None
+    # Must NOT swallow additional path segments
+    assert pattern.match("https://api.example.com/customers/123/orders") is None
+    # Must NOT match a different prefix
+    assert pattern.match("https://api.example.com/clients/123") is None
+    # Must NOT match the bare prefix without an id segment
+    assert pattern.match("https://api.example.com/customers/") is None
+
+
+def test_pattern_path_placeholder_matches_subtree() -> None:
+    pattern = _compile_pattern("https://api.example.com/customers/{rest:path}")
+    assert pattern is not None
+    assert pattern.match("https://api.example.com/customers/123") is not None
+    assert pattern.match("https://api.example.com/customers/123/orders/456") is not None
+    # Still anchored at the prefix
+    assert pattern.match("https://api.example.com/clients/123") is None
+
+
+def test_pattern_multiple_placeholders() -> None:
+    pattern = _compile_pattern("https://api.example.com/customers/{cust}/orders/{order}")
+    assert pattern is not None
+    assert pattern.match("https://api.example.com/customers/c1/orders/o9") is not None
+    assert pattern.match("https://api.example.com/customers/c1/orders/o9/items/x") is None
+
+
+def test_matches_registered_falls_back_to_exact() -> None:
+    assert _matches_registered("https://x.com/a", "https://x.com/a") is True
+    assert _matches_registered("https://x.com/a", "https://x.com/b") is False
+
+
+def test_matches_registered_uses_pattern_when_present() -> None:
+    assert _matches_registered("https://x.com/customers/9", "https://x.com/customers/{id}") is True
+    assert _matches_registered("https://x.com/customers/9/orders", "https://x.com/customers/{id}") is False
+
+
+def test_is_trusted_endpoint_matches_pattern_entry(monkeypatch: pytest.MonkeyPatch) -> None:
+    """A claim URL matching a registered ``{id}`` pattern is trusted via the slow path."""
+    monkeypatch.setattr("provably.trusted_endpoints._ensure_trusted_table", lambda _c: None)
+    conn = MagicMock()
+    cur = MagicMock()
+    conn.cursor.return_value.__enter__ = lambda *_: cur
+    conn.cursor.return_value.__exit__ = lambda *_: None
+    # First query (exact match) misses; second query (pattern entries) returns one row.
+    cur.fetchone.return_value = None
+    cur.fetchall.return_value = [("https://api.example.com/customers/{id}",)]
+
+    assert is_trusted_endpoint("https://api.example.com/customers/42", "org-1", conn) is True
+    # Exact-then-pattern: two execute calls.
+    assert cur.execute.call_count == 2
+
+
+def test_is_trusted_endpoint_rejects_nonmatching_pattern(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setattr("provably.trusted_endpoints._ensure_trusted_table", lambda _c: None)
+    conn = MagicMock()
+    cur = MagicMock()
+    conn.cursor.return_value.__enter__ = lambda *_: cur
+    conn.cursor.return_value.__exit__ = lambda *_: None
+    cur.fetchone.return_value = None
+    # Registered pattern allows /customers/{id} only — claim hits a deeper path.
+    cur.fetchall.return_value = [("https://api.example.com/customers/{id}",)]
+
+    assert is_trusted_endpoint("https://api.example.com/customers/42/orders", "org-1", conn) is False
+
+
+def test_snapshot_check_accepts_pattern_match(monkeypatch: pytest.MonkeyPatch) -> None:
+    """The snapshot tamper-check must honor pattern entries the same way the live DB check does."""
+    from provably.handoff.types import HandoffClaim, HandoffPayload
+    from provably.trusted_endpoints import check_claim_endpoints_are_trusted
+
+    # Live DB check is exercised separately; stub it as trusting whatever made it past
+    # the snapshot check (returns True).
+    monkeypatch.setattr("provably.trusted_endpoints.is_trusted_endpoint", lambda *_a, **_kw: True)
+    monkeypatch.setattr("psycopg2.connect", lambda *_a, **_kw: MagicMock())
+
+    payload = HandoffPayload(
+        provably_org_id="org-1",
+        trusted_endpoint_registry=["https://api.example.com/customers/{id}"],
+        claims=[
+            HandoffClaim(
+                action_name="get_customer",
+                request_payload={"url": "https://api.example.com/customers/42", "method": "GET"},
+            )
+        ],
+    )
+
+    # Should NOT raise — pattern entry covers the concrete URL.
+    check_claim_endpoints_are_trusted(payload, postgres_url="postgresql://x")
+
+
+def test_snapshot_check_rejects_url_outside_pattern(monkeypatch: pytest.MonkeyPatch) -> None:
+    from provably.handoff.types import HandoffClaim, HandoffPayload
+    from provably.trusted_endpoints import check_claim_endpoints_are_trusted
+
+    monkeypatch.setattr("provably.trusted_endpoints.is_trusted_endpoint", lambda *_a, **_kw: True)
+    monkeypatch.setattr("psycopg2.connect", lambda *_a, **_kw: MagicMock())
+
+    payload = HandoffPayload(
+        provably_org_id="org-1",
+        trusted_endpoint_registry=["https://api.example.com/customers/{id}"],
+        claims=[
+            HandoffClaim(
+                action_name="get_orders",
+                # Goes one segment deeper than {id} permits.
+                request_payload={"url": "https://api.example.com/customers/42/orders", "method": "GET"},
+            )
+        ],
+    )
+
+    with pytest.raises(ValueError, match="missing from trusted snapshot"):
+        check_claim_endpoints_are_trusted(payload, postgres_url="postgresql://x")
+
+
 def test_list_trusted_endpoints_excludes_given_urls(monkeypatch: pytest.MonkeyPatch) -> None:
     monkeypatch.setattr("provably.trusted_endpoints._ensure_trusted_table", lambda _c: None)
     conn = MagicMock()
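The two snapshot tests above exercise the "missing from trusted snapshot" computation end to end. As a rough standalone sketch of just that step, the code below assumes plain lists and sets of already-normalized URLs rather than the real `HandoffPayload`; `matches` and `find_missing` are hypothetical names. It also shows why the diff finishes with `dict.fromkeys`: repeated claims are reported once, in first-seen order.

```python
import re

PLACEHOLDER = re.compile(r"\{[^}/]+(?::path)?\}")


def matches(url: str, registered: str) -> bool:
    """Hypothetical helper: match a concrete URL against a placeholder entry."""
    parts, cursor = [], 0
    for m in PLACEHOLDER.finditer(registered):
        parts.append(re.escape(registered[cursor:m.start()]))
        parts.append(".+?" if ":path" in m.group(0) else "[^/]+?")
        cursor = m.end()
    parts.append(re.escape(registered[cursor:]))
    return re.match("^" + "".join(parts) + "$", url) is not None


def find_missing(claim_urls: list[str], registry: set[str]) -> list[str]:
    """Return claim URLs covered by neither an exact nor a pattern entry."""
    pattern_entries = [entry for entry in registry if "{" in entry]
    missing = [
        url
        for url in claim_urls
        if url not in registry
        and not any(matches(url, entry) for entry in pattern_entries)
    ]
    # dict.fromkeys de-duplicates while preserving first-seen order.
    return list(dict.fromkeys(missing))


registry = {
    "https://api.example.com/health",
    "https://api.example.com/customers/{id}",
}
claims = [
    "https://api.example.com/customers/42",
    "https://api.example.com/customers/42/orders",
    "https://api.example.com/health",
    "https://api.example.com/customers/42/orders",
]
print(find_missing(claims, registry))
# ['https://api.example.com/customers/42/orders']
```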
