Summary
Most JSON-serialized blobs on Primitive nodes and outgoing relationships use the *_json suffix to signal "this is a stringified JSON payload, not a native Neo4j scalar." Two properties, provenance (on nodes and relationships) and policies (on relationships), break this convention. Both were added after the original schema (provenance in 41b9d77, policies later) and did not conform to the existing pattern.
The result is a storage layer where the property name no longer reliably tells a reader whether they're looking at a native value or a JSON blob that needs json.loads() before use.
Problem Statement
Current property naming on stored primitives and relationships:
| Property | Storage form | Suffix follows convention? |
| --- | --- | --- |
| depths_json | JSON string | yes |
| metrics_json | JSON string | yes |
| metadata_json | JSON string (on rel) | yes |
| provenance | JSON string | no (node + rel) |
| policies | JSON string (on rel) | no |
Concretely from src/vre/core/graph.py:
save_primitive() writes p.provenance = $provenance (line 328) where $provenance is the output of _dump_model_json() — a JSON string.
Relationship CREATE writes provenance: $provenance and policies: $policies (lines 360–361), both JSON strings.
The hydration path then has to run _parse_json_field() on these properties to get usable dicts (lines 198–201, 217–220).
Why this matters:
Reader cognitive load — p.metrics_json clearly needs decoding; p.provenance does not visually advertise that it does too. Future contributors querying the graph in Cypher (debugging, ad-hoc reporting) will be surprised.
Schema drift signal — the inconsistency hints that the convention isn't load-bearing, which makes future additions less likely to follow it.
Documentation accuracy — the docstring at the top of the module describes "embedded depth JSON" but doesn't enumerate the schema; users reading the code rely on naming to understand storage shape.
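To illustrate the ambiguity, here is a minimal sketch of what a reader relying on the suffix would see. The `record` dict and the `decode_json_props` helper are hypothetical stand-ins and are not taken from graph.py; the point is only that a suffix-based rule decodes `metrics_json` but silently leaves `provenance` as a string.

```python
import json

# Hypothetical property map, shaped like what a Cypher query might return
# for one Primitive node under the current naming.
record = {
    "name": "file",
    "metrics_json": json.dumps({"hits": 3}),
    "provenance": json.dumps({"source": "seed"}),  # JSON string, but no suffix
}


def decode_json_props(props):
    """Decode every property whose name ends in _json; pass others through."""
    decoded = {}
    for key, value in props.items():
        if key.endswith("_json") and isinstance(value, str):
            decoded[key[: -len("_json")]] = json.loads(value)
        else:
            decoded[key] = value
    return decoded


decoded = decode_json_props(record)
print(type(decoded["metrics"]).__name__)     # dict
print(type(decoded["provenance"]).__name__)  # str
```

With the proposed rename, the same suffix rule would decode both properties without any per-property special casing.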
Proposed Solution
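Standardize on the *_json suffix for every property whose value is a JSON-serialized Pydantic model.
Renames required:
p.provenance → p.provenance_json
r.provenance → r.provenance_json
r.policies → r.policies_json
Files affected (all in src/vre/core/graph.py):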
save_primitive() (lines 326–371) — write side: rename $provenance → $provenance_json and the SET/CREATE clauses to match.
_record_to_node_data() (lines 134–145) — rename the dict key returned to hydration.
_record_to_relationships() (lines 147–166): rename the provenance key, and apply the same *_json naming to policies if we also rename the local dict shape.
_hydrate_primitive() (lines 168–238) — read side: pull from the renamed fields.
find_by_id() Cypher (lines 470–513): RETURN p.provenance_json AS provenance_json, and project r.provenance_json and r.policies_json in the collect({...}).
find_by_name() Cypher (lines 515–559) — same rename.
resolve_subgraph() Cypher (lines 578–685) — same rename in both the per-node projection and the per-edge projection.
No changes outside graph.py are needed: the Pydantic field names on Primitive, Depth, and Relatum remain provenance and policies — only the on-disk Neo4j property names change.
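The split between model field names and on-disk property names can be sketched as a round trip. This is an illustration under assumptions: `RelatumSketch` is a dataclass stand-in for the real Pydantic model, and `to_rel_props` / `from_rel_props` are hypothetical helpers, not functions from graph.py.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RelatumSketch:
    """Stand-in for the real model; field names stay unsuffixed."""
    target: str
    provenance: dict = field(default_factory=dict)
    policies: list = field(default_factory=list)


def to_rel_props(rel):
    """Write side: serialize to the renamed on-disk property names."""
    return {
        "provenance_json": json.dumps(rel.provenance),
        "policies_json": json.dumps(rel.policies),
    }


def from_rel_props(target, props):
    """Read side: decode the renamed properties back into the model."""
    return RelatumSketch(
        target=target,
        provenance=json.loads(props["provenance_json"]),
        policies=json.loads(props["policies_json"]),
    )


original = RelatumSketch("dir", {"source": "seed"}, [{"name": "confirm"}])
restored = from_rel_props("dir", to_rel_props(original))
assert restored == original  # the model surface is unchanged by the rename
```

A persistence round-trip test along these lines would exercise the new property names while leaving callers of the model untouched.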
Migration of existing graphs
This is a breaking change for any existing Neo4j database that was written before the rename. A one-shot Cypher migration is required:
    // Nodes
    MATCH (p:Primitive)
    WHERE p.provenance IS NOT NULL
    SET p.provenance_json = p.provenance
    REMOVE p.provenance;

    // Relationships: applies to every relation type, so iterate via type filter
    MATCH ()-[r]->()
    WHERE r.provenance IS NOT NULL
    SET r.provenance_json = r.provenance
    REMOVE r.provenance;

    MATCH ()-[r]->()
    WHERE r.policies IS NOT NULL
    SET r.policies_json = r.policies
    REMOVE r.policies;
Ship this as a script under scripts/migrate_property_names.py (matching the style of clear_graph.py and seed_all.py) so users can run it once against their graph.
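One possible shape for that script is sketched below. It is an assumption about structure, not a description of clear_graph.py or seed_all.py: `MIGRATION_STATEMENTS` and `run_migration` are hypothetical names, and the function accepts any object exposing a `run(query)` method, such as a session from the official neo4j Python driver.

```python
# Each statement only touches entities that still carry the old property,
# so running the migration a second time changes nothing.
MIGRATION_STATEMENTS = [
    "MATCH (p:Primitive) WHERE p.provenance IS NOT NULL "
    "SET p.provenance_json = p.provenance REMOVE p.provenance",
    "MATCH ()-[r]->() WHERE r.provenance IS NOT NULL "
    "SET r.provenance_json = r.provenance REMOVE r.provenance",
    "MATCH ()-[r]->() WHERE r.policies IS NOT NULL "
    "SET r.policies_json = r.policies REMOVE r.policies",
]


def run_migration(session):
    """Run every rename statement against the given session.

    `session` is assumed to expose run(query), as a neo4j driver session
    does. Connection setup is deployment-specific and left to the caller.
    Returns the number of statements executed.
    """
    for statement in MIGRATION_STATEMENTS:
        session.run(statement)
    return len(MIGRATION_STATEMENTS)
```

Passing the session in, rather than opening a connection inside the function, keeps the rename logic testable with a fake session and leaves URI and credential handling to whatever convention the existing scripts use.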
VRE Design Alignment
This is purely a persistence-layer cleanup. It does not affect:
The agent–VRE contract — agents never see Neo4j property names; they work with Primitive, Depth, Relatum Python objects.
Grounding, policy evaluation, or gap detection — all read hydrated Pydantic models.
Epistemic semantics — provenance still attaches to nodes, depths, and relata exactly as before.
The change reinforces a principle that's already in the codebase: storage-shape information should be visible at the call site. Epistemic honesty isn't directly at stake, but consistency makes the storage layer easier to audit and extend.
Acceptance Criteria
All Cypher writes in save_primitive() and update_metrics() use provenance_json / policies_json for the renamed properties.
All Cypher reads (find_by_id, find_by_name, resolve_subgraph, batch_read_metrics) project the renamed properties.
_hydrate_primitive, _record_to_node_data, and _record_to_relationships read from the renamed keys.
No occurrences of p.provenance, r.provenance, or r.policies remain as Cypher property accesses (Pydantic field accesses like relatum.provenance are unchanged).
A migration script (scripts/migrate_property_names.py) renames properties on existing graphs and is documented in README.md upgrade notes.
All existing tests pass without modification (Pydantic surface unchanged); persistence round-trip tests cover the new property names.
Release notes call out the breaking change and link to the migration script.
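The "no occurrences remain" criterion could be checked mechanically. The following is a heuristic sketch, with `OLD_ACCESS` and `find_old_accesses` as hypothetical names; it assumes the Cypher in graph.py uses the variable names p and r, and it would also flag a Python variable named p or r, so hits need a manual look.

```python
import re

# Matches the old property accesses but not their *_json replacements:
# the trailing \b fails before "_json" because "_" is a word character.
OLD_ACCESS = re.compile(r"\b(?:p\.provenance|r\.provenance|r\.policies)\b")


def find_old_accesses(source):
    """Return (line_number, stripped_line) for each line with an old access."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if OLD_ACCESS.search(line)
    ]


sample = "\n".join([
    "SET p.provenance_json = $provenance_json",
    "SET p.provenance = $provenance",
    "prov = relatum.provenance",
    "CREATE (a)-[r:REL {policies_json: $policies_json}]->(b)",
])
hits = find_old_accesses(sample)
print(hits)  # [(2, 'SET p.provenance = $provenance')]
```

Field accesses such as relatum.provenance are not flagged because the leading \b requires p or r to start a word.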
Open Questions
Should this rename ride along with another schema-touching change (e.g. issue #50, atomic metric updates for concurrent safety) so users only run one migration? Or ship standalone with a clear minor-version bump?
Is there appetite to go further and rename the Pydantic fields too (e.g. drop the .policies list from Relatum in favor of a different shape)? Out of scope for this issue, but worth flagging if any redesign is planned.