Redesign policy callback context (leaky grounding + missing edge identity) #58

## Background

Surfaced during the codebase-audit cleanup pass (branch `chore/audit-cleanup`). Two related concerns about the policy-callback API came up — both touching `PolicyCallContext`. They're being deferred so the audit-cleanup PR can stay narrow, and so the redesign happens as one coherent pass instead of two.

## Concern 1 — Leaky `grounding: GroundingResult`

`PolicyCallContext` currently exposes the full `GroundingResult`:

```python
# src/vre/core/policy/callback.py
class PolicyCallContext(BaseModel):
    tool_name: str
    grounding: GroundingResult            # ← full epistemic trace, gaps, internals
    call_args: tuple[Any, ...]
    call_kwargs: dict[str, Any]
```

This gives user-supplied callbacks access to `.grounding.trace.result.primitives`, `.gaps`, `.pathway`, etc. — the whole internal trace shape. Refactoring grounding (e.g. changing the trace structure) could silently break callbacks.

**No callbacks in src/, tests/, or examples/ currently read `context.grounding`** — the leak is forward-looking. A real callback-side use case did surface in design discussion, though: a policy on edge `E` may want to behave differently depending on which **other** root concepts were grounded alongside it (e.g. allow `Delete→File` unless `protected` was also grounded in the same call). That argues for *some* grounding signal, not zero.

Minimum useful surface, scoped to the actual use case:

```python
class GroundingContext(BaseModel):
    agent_id: UUID | None = None
    resolved_concepts: list[str]   # canonical names grounded in this call
```

Notably **dropped from the audit's original suggestion**: `is_grounded` and `gap_count`. By the time a callback fires, grounding has already succeeded; those fields are invariant and dead weight.

Open question: should the facade also let callbacks inspect *primitive properties* of co-occurring concepts (e.g. `get_primitive(name).depths[3].properties["sensitivity"]`)? My lean is no until a real callback needs it, since adding fields to the facade later is a non-breaking change.

## Concern 2 — Same `call_context` reused across all triggering edges

`PolicyGate._collect_violations` walks `(primitive, depth, relatum, policy)` tuples and invokes the callback once per match — but builds `call_context` once at the top and reuses the same instance for every iteration:

```python
# src/vre/core/policy/gate.py
for primitive in response.result.primitives:
    for depth in primitive.depths:
        for relatum in depth.relata:
            for policy in relatum.policies:
                cb_result = cb(call_context)   # same call_context every time
```

A callback registered on multiple edges has no way to know **which edge** triggered the current invocation. The callback only sees `tool_name`, `call_args`, `call_kwargs`, and (today) the full `GroundingResult`.

Three patterns this blocks:

1. **Edge-aware logging.** An `audit_log` callback on `Delete→File`, `Modify→File`, `Read→File` can fire three times per call and produce three indistinguishable log lines.
2. **Source-specific rate limiting.** A `rate_limit` callback on `Send→Email` (5/hr) and `Send→SMS` (20/hr) can't pick the right counter — forcing two callbacks instead of one parameterized one.
3. **Depth-discriminating policies.** Same callback on a primitive's D2 (CAPABILITIES) edge vs. its D3 (CONSTRAINTS) edge can't tell them apart.

Workaround today: write one callback per edge and let the dotted-path callback string carry the discrimination. Works, but pushes complexity into graph configuration.

## Proposed combined shape (sketch)

```python
class GroundingContext(BaseModel):
    agent_id: UUID | None = None
    resolved_concepts: list[str]

class TriggeringEdge(BaseModel):
    source_name: str
    target_name: str
    source_depth: DepthLevel
    target_depth: DepthLevel

class PolicyCallContext(BaseModel):
    tool_name: str
    grounding: GroundingContext
    call_args: tuple[Any, ...]
    call_kwargs: dict[str, Any]
    triggering_edge: TriggeringEdge   # new — built per iteration in gate.py
```

`PolicyGate._collect_violations` builds a fresh `PolicyCallContext` per `(primitive, depth, relatum)` tuple instead of reusing one.
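A minimal sketch of what per-edge construction could look like, and how it unblocks the edge-aware logging pattern. Everything here is a stand-in: plain dataclasses replace the real models, depths are plain `int`s instead of `DepthLevel`, the `grounding` field is left out for brevity, and `invoke_per_edge` is a hypothetical helper, not the actual `_collect_violations` body.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class TriggeringEdge:
    source_name: str
    target_name: str
    source_depth: int  # stand-in for DepthLevel
    target_depth: int  # stand-in for DepthLevel


@dataclass(frozen=True)
class PolicyCallContext:
    tool_name: str
    call_args: tuple[Any, ...]
    call_kwargs: dict[str, Any]
    triggering_edge: TriggeringEdge


def invoke_per_edge(
    callback: Callable[[PolicyCallContext], Any],
    tool_name: str,
    call_args: tuple[Any, ...],
    call_kwargs: dict[str, Any],
    edges: Iterable[TriggeringEdge],
) -> list[Any]:
    """Build a fresh context for each edge instead of reusing one instance."""
    results = []
    for edge in edges:
        context = PolicyCallContext(tool_name, call_args, call_kwargs, edge)
        results.append(callback(context))
    return results


def audit_log(context: PolicyCallContext) -> str:
    """Hypothetical edge-aware callback: the log line names the triggering edge."""
    edge = context.triggering_edge
    return f"{context.tool_name}: {edge.source_name}->{edge.target_name} (D{edge.source_depth})"


edges = [
    TriggeringEdge("Delete", "File", 2, 1),
    TriggeringEdge("Modify", "File", 2, 1),
    TriggeringEdge("Read", "File", 2, 1),
]
lines = invoke_per_edge(audit_log, "file_tool", ("/tmp/x",), {}, edges)
```

With the edge on the context, the three invocations produce three distinct log lines from a single registered callback.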

## Open design questions

- Should `Policy` itself (or `policy.metadata`) be on the context? Useful for the rate-limiting case ("stuff the limit into `policy.metadata`, read it from the callback").
- Pass triggering edge **via** the context, or as a **second positional arg** to the callback (would change the `PolicyCallback` Protocol signature — breaking for any existing callbacks)?
- Per-iteration `PolicyCallContext` construction cost — cheap, but worth measuring on a graph with many policy edges.
- Backwards-compatibility story: is this a 1.0-blocker breaking change, or do we ship it as a 0.5.x bump with a migration note?
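To make the first question concrete, here is a sketch of the rate-limiting case with one parameterized callback. It assumes a hypothetical `policy_metadata` dict on the context carrying a `"limit"` key; neither exists today. The time window is omitted, so this only counts calls per edge.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class TriggeringEdge:
    source_name: str
    target_name: str


@dataclass(frozen=True)
class PolicyCallContext:
    tool_name: str
    triggering_edge: TriggeringEdge
    # Hypothetical: metadata copied from the policy on the triggering edge.
    policy_metadata: dict[str, Any] = field(default_factory=dict)


class RateLimit:
    """Hypothetical callback: one instance, one counter per triggering edge."""

    def __init__(self) -> None:
        self._counts: dict[tuple[str, str], int] = defaultdict(int)

    def __call__(self, context: PolicyCallContext) -> bool:
        edge = context.triggering_edge
        key = (edge.source_name, edge.target_name)
        limit = context.policy_metadata["limit"]
        if self._counts[key] >= limit:
            return False  # over the limit for this edge
        self._counts[key] += 1
        return True


rate_limit = RateLimit()
email = PolicyCallContext("send", TriggeringEdge("Send", "Email"), {"limit": 5})
sms = PolicyCallContext("send", TriggeringEdge("Send", "SMS"), {"limit": 20})

email_results = [rate_limit(email) for _ in range(6)]
sms_result = rate_limit(sms)
```

The same callback serves both edges: the edge selects the counter and the metadata supplies the limit, which is the combination the current context cannot express.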

## Notes

- `PolicyCallContext` and `PolicyCallback` are exported from `vre.__all__` (public API).
- We're at 0.4.x — pre-1.0 latitude is available but worth using deliberately.
- Policy wizard (`core/policy/wizard.py`) and any UI work (mentioned: a future "VRE Workstation UI" for policy creation) should be considered when shaping the final API.
