Human-facing setup: README.md. This file is for anyone (or any agent)
changing behavior in this SDK: what must stay true, where code lives, and how
pieces talk. Prefer editing code over guessing.
The SDK ships five pillars and only five:
- `init`: one-time runtime bootstrap (`provably.handoff.client.initialize_runtime`).
- `intercept`: global `requests`/`httpx` monkey-patch + storage of intercepted rows + simulation hook.
- `handoff`: Pydantic `HandoffPayload` v2 + JSON transport (`post_handoff`).
- `eval`: deterministic evaluator (`evaluate_handoff`) with four verification modes.
- `trusted_endpoints`: registry DDL, normalization, and the policy edge.
Anything outside these five is out of scope for the SDK and lives in consumer repos. In particular: agent orchestration (LangGraph, OpenAI Agents, etc.), web servers (FastAPI / Flask), dashboards, deployment configuration, and demo audit trails.

    provably-python-sdk/
      pyproject.toml          name="provably-sdk", import path="provably", src layout
      README.md               install + quickstart + public surface
      CONTEXT.md              this file
      CHANGELOG.md
      LICENSE.md
      src/provably/
        __init__.py           public surface; only re-exports the documented API
        log.py                structlog wrapper used SDK-internally
        common/env.py         env-var helpers
        trusted_endpoints.py  registry DDL + normalization + policy check
        intercept/            global requests/httpx monkey-patch + storage
        handoff/              types, transport, evaluator, eval modes, bootstrap
      tests/
        conftest.py           docs the two-layer setup
        unit/                 fast, hermetic; mocks for httpx + psycopg2
        e2e/                  real loopback http.server; real requests + httpx
      docs/                   architecture, per-pillar deep dives, historical plans
      .github/workflows/      ci.yml (active, runs lint + tests + docker), publish.yml (manual stub)
      Dockerfile              multi-stage: builder → test → runtime
      docker-compose.yml      sdk container + Postgres for integration tests
      .dockerignore


| What is allowed in `src/provably/` | What is forbidden |
|---|---|
| stdlib | `fastapi`, `flask`, any web framework |
| `httpx`, `requests` | `langgraph`, `langchain`, `crewai`, `autogen` |
| `pydantic`, `jsonschema` | `openai`, `anthropic`, any LLM-vendor SDK |
| `psycopg2-binary` (see issue #1 for planned optional extras) | `uvicorn`, `gunicorn`, any server |
| `structlog` | `python-dotenv`, app-level config helpers |

CI should fail any PR that adds a forbidden import to `src/provably/`. Until a
ruff rule is configured, a `grep -R "from fastapi\|import langgraph\|import openai\|import anthropic" src/` check is sufficient.
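
The grep pattern covers only four of the forbidden packages. As a stopgap that covers more of the list, a minimal AST-based check might look like the sketch below. The `FORBIDDEN` set and the `find_forbidden_imports` helper are hypothetical names, not part of the SDK; the set should be adjusted to match the forbidden column.

```python
import ast

# Hypothetical list mirroring the "forbidden" column; not shipped with the SDK.
# python-dotenv is imported as `dotenv`.
FORBIDDEN = {
    "fastapi", "flask", "langgraph", "langchain", "crewai", "autogen",
    "openai", "anthropic", "uvicorn", "gunicorn", "dotenv",
}


def find_forbidden_imports(source: str) -> list[str]:
    """Return the forbidden top-level packages imported by `source`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top in FORBIDDEN:
                hits.append(top)
    return sorted(set(hits))


sample = "import httpx\nfrom fastapi import FastAPI\nimport openai.types\n"
print(find_forbidden_imports(sample))  # ['fastapi', 'openai']
```

In CI this could be run over every `*.py` file under `src/provably/`, failing the job when the returned list is non-empty. Because it parses rather than greps, it also catches `import x.y` and multi-name imports.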
The contract lives in `src/provably/__init__.py`. Everything in `__all__` is
public API, and changing its signature is a breaking change. Anything that
starts with an underscore, or any module imported via `from provably.handoff._x`
etc., is internal and may change without notice.
When adding a new public symbol:
- Define it in the relevant subpackage.
- Re-export it from `src/provably/__init__.py` and add it to `__all__`.
- Document it in `README.md` (and the relevant `docs/<pillar>.md`).
- Add at least one unit test and, where the symbol crosses an I/O boundary, one e2e test.
- Add a `CHANGELOG.md` entry under the next unreleased version.
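
One way to keep the re-export step honest is a unit test that checks `__all__` against the names the package actually defines. The sketch below uses a stand-in module built with `types.ModuleType` so it runs anywhere; `check_public_surface` is a hypothetical helper, and in a real suite the argument would be the imported `provably` package.

```python
import types


def check_public_surface(module: types.ModuleType) -> list[str]:
    """Return human-readable problems with the module's `__all__`."""
    problems = []
    exported = list(getattr(module, "__all__", []))
    # Every exported name must resolve to a real attribute.
    for name in exported:
        if not hasattr(module, name):
            problems.append(f"{name!r} is in __all__ but not defined")
    # Duplicates usually indicate a merge mistake.
    if len(exported) != len(set(exported)):
        problems.append("__all__ contains duplicates")
    return problems


# Stand-in for the real package (hypothetical contents).
fake = types.ModuleType("fake_sdk")
fake.post_handoff = lambda payload: None
fake.__all__ = ["post_handoff", "evaluate_handoff"]

print(check_public_surface(fake))
# ["'evaluate_handoff' is in __all__ but not defined"]
```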
The SDK touches the outside world in exactly four places. Every external call must go through one of them. Adding a fifth is a design decision, not an implementation detail.

| Module | What it does | How |
|---|---|---|
| `provably.intercept._storage` | Insert into `provably_intercepts` | `psycopg2.connect(POSTGRES_URL)` |
| `provably.trusted_endpoints` | Read / write `trusted_endpoints` table | `psycopg2.connect(POSTGRES_URL)` (in `check_claim_endpoints_are_trusted`); caller-provided `conn` elsewhere |
| `provably.handoff._preprocess` | One-time intercept-table padding at startup | `psycopg2.connect(POSTGRES_URL)` |
| `provably.handoff.evaluator` + `provably.handoff.transport` + `provably.handoff._bootstrap` | All HTTP egress | `httpx.Client` / `httpx.post` |

The interceptor monkey-patches `requests` and `httpx` for everyone else in
the process; SDK-internal HTTP calls always use `httpx` directly so they are
not double-counted.
The SDK reads from environment variables only. The full set is documented in
`README.md`. Changing this contract (adding, removing, or renaming a variable
the SDK reads) is a breaking change.
A typed `Provably(...)` client that owns configuration explicitly is planned
(issue #2). When that lands, the env-var path should remain functional via a
default singleton.
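
To illustrate how an explicit configuration object could coexist with the env-var path, here is a minimal sketch. Everything in it is hypothetical: the `Config` dataclass, its `from_env` constructor, and the `default_config` accessor are illustrative names, not the design tracked in issue #2. `POSTGRES_URL` is the only variable name taken from this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    """Hypothetical explicit configuration object."""

    postgres_url: str

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Config":
        try:
            return cls(postgres_url=env["POSTGRES_URL"])
        except KeyError as exc:
            raise RuntimeError(f"missing required env var: {exc.args[0]}") from None


_default: Optional[Config] = None


def default_config(env: Mapping[str, str] = os.environ) -> Config:
    """Lazily build one shared Config from the environment, then reuse it."""
    global _default
    if _default is None:
        _default = Config.from_env(env)
    return _default
```

Callers that want explicit control would construct `Config(...)` themselves; callers that rely on the environment would keep working through `default_config()`.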
tests/unit/ is the fast inner loop. It must:
- Run in under one second total.
- Use no real Postgres connection.
- Use no real network sockets.
- Mock at the `httpx.Client` / `psycopg2.connection` boundary, not deeper.
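
A minimal sketch of mocking at the connection boundary, using only `unittest.mock`. The `insert_intercept` function, its SQL, and the `url` column are hypothetical stand-ins for SDK storage code; the point is that the test replaces the `connect` callable itself rather than anything beneath it.

```python
from unittest import mock


def insert_intercept(connect, dsn: str, url: str) -> None:
    """Hypothetical storage helper; `connect` plays the role of psycopg2.connect."""
    conn = connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute("INSERT INTO provably_intercepts (url) VALUES (%s)", (url,))
        conn.commit()
    finally:
        conn.close()


def test_insert_intercept_uses_connection_boundary() -> None:
    fake_connect = mock.MagicMock(name="connect")
    insert_intercept(fake_connect, "postgresql://unused", "https://example.test/a")

    fake_connect.assert_called_once_with("postgresql://unused")
    conn = fake_connect.return_value
    # MagicMock supports the context-manager protocol, so `with conn.cursor()` works.
    cur = conn.cursor.return_value.__enter__.return_value
    cur.execute.assert_called_once()
    assert cur.execute.call_args.args[1] == ("https://example.test/a",)
    conn.commit.assert_called_once()
    conn.close.assert_called_once()
```

No socket or database is touched, which keeps a test like this inside the hermetic constraints listed above.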
tests/e2e/ is the contract layer. It must:
- Drive the real monkey-patched `requests` and `httpx` against a real loopback `http.server`.
- Patch only the storage layer (`provably.intercept.interceptor._insert_row`) to keep the suite Postgres-free.
- Stay deterministic: no real DNS, no public network, no time-dependent assertions.
A new pillar without at least one unit test and one e2e test is not considered done.
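
The loopback pattern used by the e2e layer can be sketched with the standard library alone. The helper below starts an `http.server` on an ephemeral 127.0.0.1 port in a background thread. `loopback_server`, `EchoHandler`, and `fetch_path` are hypothetical names, and `urllib` stands in here for the patched `requests`/`httpx` clients that the real suite would drive.

```python
import contextlib
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        body = json.dumps({"path": self.path}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep test output quiet


@contextlib.contextmanager
def loopback_server():
    # Port 0 asks the OS for a free port, so parallel runs do not collide.
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield f"http://127.0.0.1:{server.server_port}"
    finally:
        server.shutdown()
        server.server_close()
        thread.join()


# Ignore any proxy settings so the request really goes to loopback.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_path(path: str) -> dict:
    with loopback_server() as base_url:
        with _opener.open(base_url + path, timeout=5) as resp:
            return json.load(resp)
```

The response body depends only on the request path, so assertions against it stay deterministic.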
The repo is dockerised so any contributor (or CI) can reproduce the test matrix without a local Python toolchain.
- `Dockerfile`: three stages:
  - `builder` builds the wheel/sdist from `src/` into `/dist`.
  - `test` installs the wheel + dev tools and defaults to `ruff check && pytest -q`.
  - `runtime` is a slim image with only the wheel installed, suitable as a base for services that consume the SDK.
- `docker-compose.yml`: pairs the `test` image with a Postgres 16 service (`db`) and wires `POSTGRES_URL` into the `sdk` container, so future integration tests that exercise the real psycopg2 path can use it without extra plumbing.
- CI: `.github/workflows/ci.yml` has a `docker` job that builds both `test` and `runtime` targets and runs them, so the Docker layout cannot silently rot.
If you change `pyproject.toml` (deps, optional extras, name, license,
build-system), rebuild the Docker layer that copies that file and run
`docker run --rm provably-sdk:test` locally before pushing.
- Removing or renaming a public symbol: bump the minor version and add a deprecation shim under the old name. Document the migration in `CHANGELOG.md`.
- Changing the wire format of `HandoffPayload`: bump `handoff_contract_version` (currently `"2.0"`) and document both versions during the transition.
- Adding a Postgres dependency to a new module: don't. Open an issue blocked on #1; the plan is to invert this so callers pass a connection in.
- Adding an LLM SDK dependency: also don't. The SDK stays vendor-neutral.
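
A deprecation shim for a renamed symbol can be as small as a wrapper that warns and then delegates. The following is a sketch using the standard `warnings` module; `send_handoff` is a hypothetical old name, `deprecated_alias` is a hypothetical helper, and the `post_handoff` body here is a stub rather than the SDK's transport.

```python
import functools
import warnings


def post_handoff(payload: dict) -> dict:
    """Stub standing in for the real function under its new name."""
    return {"accepted": True, "payload": payload}


def deprecated_alias(new_func, old_name: str):
    """Build a wrapper that warns at the caller's location, then delegates."""

    @functools.wraps(new_func)
    def shim(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_func.__name__} instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the shim
        )
        return new_func(*args, **kwargs)

    shim.__name__ = old_name
    return shim


# Hypothetical old name kept alive for one deprecation cycle.
send_handoff = deprecated_alias(post_handoff, "send_handoff")
```

Existing callers keep working and see a `DeprecationWarning` that names the replacement, which gives the `CHANGELOG.md` migration note something concrete to point at.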
- Agent frameworks. The SDK has zero awareness of LangGraph nodes, OpenAI Agents, CrewAI roles, etc. Consumers wire those up themselves.
- Dashboards, runners, simulation orchestrators. These belong to the consumer monorepo (e.g. `verifiable-state-demo`).
- Demo-only audit trails. The `demo_audit` table lives in the consumer monorepo (`agents/pipeline/demo_audit.py`), not in this SDK.
See `docs/historical-plans/`, in particular `split-from-monorepo.md`, which
records why this SDK was carved out of the demo monorepo and what was
deliberately left behind.