Skip to content

Flatten entity CRUD tool inputs; inline base-entity $ref (0.5.0)#22

Merged
mgoldsborough merged 4 commits into
mainfrom
pr1-flatten-entity-crud
Apr 16, 2026
Merged

Flatten entity CRUD tool inputs; inline base-entity $ref (0.5.0)#22
mgoldsborough merged 4 commits into
mainfrom
pr1-flatten-entity-crud

Conversation

@mgoldsborough
Copy link
Copy Markdown
Contributor

Summary

  • Auto-generated create_{name} and update_{name} MCP tools take flat kwargs at the top level. The {data: {...}} wrapper is removed. Tool schemas publish examples for create/update/delete.
  • tools/list no longer makes a live HTTP fetch of https://upjack.dev/schemas/v1/upjack-entity.schema.json. The allOf $ref is inlined at schema-build time — unit test suite drops from 18m to 6s.
  • Bumps to 0.5.0 (breaking tool-interface change).

Why

On a long Synapse CRM / Todo session, ~30% of tool calls failed. The largest single cause: auto-generated entity CRUD wrapped inputs in {data: {...}} while every hand-written tool used flat kwargs. Two shapes in the same tool list had the LLM thrashing — Unexpected keyword argument 'data', Error calling tool 'create_deal': 'data', etc.

Flat kwargs is the LLM default, matches FastMCP's decorator form, and matches every custom tool in the synapse apps. One shape everywhere.

Breaking change

There is no backwards-compat shim. A call passing the old {data: {...}} shape now fails schema validation with a clear error, which is the intended signal to callers (including LLMs that have cached the old pattern). Consumers re-read tool schemas each session, so there's no legitimate population that needs grace.

# before
create_deal({"data": {"title": "Acme pilot", "amount": 1000, "stage": "qualified"}})

# now
create_deal({"title": "Acme pilot", "amount": 1000, "stage": "qualified"})

Bonus fix

The activity-enabled test path was taking 22s per test because tools/list was fetching upjack.dev/schemas/v1/upjack-entity.schema.json over HTTP each time any allOf $ref schema was serialized. The local _REGISTRY in upjack.schema was only wired for jsonschema validation, not FastMCP's tool-listing path. Inlining the $ref at schema-build time closes that hole.

Test plan

  • make check at lib/python — all 407 tests pass in 6s
  • ruff format / check / ty check — clean
  • New regression tests: flat-kwarg success, legacy {data:...} rejected, examples present, activities-enabled path
  • End-to-end against synapse-crm (create_deal/update_deal/delete_deal) — flat works, legacy rejected
  • End-to-end against synapse-todo-board (create_board/update_board/delete_board) — flat works, legacy rejected
  • Tag python-v0.5.0 after merge to publish to PyPI
  • Follow-up PRs bump consuming Synapse apps to pin upjack[mcp]>=0.5.0

Auto-generated create_{name} and update_{name} MCP tools no longer wrap
args in {data: {...}}. Entity fields sit at the top level — matching the
FastMCP idiom and the hand-written tool convention. Mixing the two shapes
in one tool list was measurably confusing LLMs (~30% tool-call failure
rate on the auto-generated surface). Tool schemas also publish `examples`
for create/update/delete as in-context anchors.

Separately: tools/list no longer performs a live HTTP fetch of
upjack.dev/schemas/v1/upjack-entity.schema.json. The allOf $ref is
inlined at schema-build time, eliminating a ~4s-per-call penalty that
hit every activity-enabled app. Full pytest suite drops from 18m to 6s.

Breaking change — bumps to 0.5.0.
Moves the base-entity $ref inlining from build-time (every tool schema
build repeated the work) to load-time (once per schema, inside
load_schema). Downstream consumers now see fully self-contained schemas
— no remote $ref, nothing to fetch. Drops the accidental-coupling
layer between schema serialization and network state.

_prepare_entity_schema simplifies: instead of inline-then-filter-then-
strip, just filter the base-entity allOf entry and strip base fields
from properties/required. One concept per branch.

BASE_ENTITY_REF is now a single module-level export from upjack.schema
— no more duplicated URL constant across modules.

Examples switch from heuristic to pass-through: if the entity schema
has a top-level "examples" key, it's published on create/update. No
more invented values. Authors own the examples. Delete and get always
publish a trivial id-only example.

Tests and hydrate_defaults updated to reflect that schemas arriving
here have already been inlined — no live $ref resolution in the
downstream path.

Net -46 LOC across src + tests.
QA adjudication on pr1-flatten-entity-crud identified three real issues:

1. CHANGELOG promised upjack.schema.inline_base_entity_ref() as public API,
   but the refactor made it private. Replaced the entry with the actual new
   public surface (BASE_ENTITY_REF).

2. Author-supplied schema `examples` containing base entity fields (id,
   type, created_at, status, etc.) leaked into the published tool schema —
   instructing the LLM to send framework-managed fields. Now filter
   examples through the base-entity key set before passing through. Test
   added to lock the behavior.

3. TypeScript lib was pre-bumped to 0.5.0 in an earlier commit but the
   server wrapper still shipped the {data: {...}} contract, and used
   entity_id rather than entity-specific {name}_id. The shared 0.5.0
   version would have meant different things across the two SDKs.

   This ports the full Python refactor to lib/typescript:
   - schema.ts: load-time $ref inlining, BASE_ENTITY_REF / BASE_ENTITY_MARKER
     exports, self-contained schemas downstream (AJV's duplicate-$id issue
     sidestepped via a non-standard marker rather than keeping $id).
   - server.ts: flat kwargs for create/update/delete, {name}_id param,
     pass-through examples with base fields stripped.
   - activity.ts: getActivitySchema() delegates to loadSchema.
   - tests: 40+ call sites migrated from {data: ...} to flat kwargs,
     entity_id → {name}_id, new legacy-rejection test, hydrateDefaults
     test switched to loadSchema path.

One non-blocking note from QA accepted: update tool description now calls
out that unknown fields are silently merged onto the entity. One rejected:
setdefault("type", "object") is load-bearing when an activity-style schema
has no top-level type — kept as-is.

Python: 410 tests, TypeScript: 281 tests. Both suites green.
Second QA pass found three divergences between the Python and TypeScript
SDKs — all pre-existing but now more load-bearing because of the new
examples-stripping path added in the previous commit.

1. BASE_ENTITY_FIELDS: Python's _BASE_ENTITY_KEYS was missing
   'relationships'; _BASE_ENTITY_FIELD_NAMES was missing 'source'.
   TypeScript's sets were correct. Effect: Python published
   relationships in tool input schemas and examples when the author
   declared them, while TypeScript stripped. Python's add_field let
   authors add a field named 'source' (clashing with the base entity's
   source field); TS correctly rejected.

   Fix: hoisted a single canonical BASE_ENTITY_FIELDS frozenset to
   schema.py (mirror Set in schema.ts). Both server.py and server.ts
   import it under their local aliases. Parity test in each SDK pins
   the ten-field set.

2. Inline marker: Python's load_schema preserved $id on the inlined
   base entity; TypeScript stripped $id and attached x-upjack-base-entity.
   Tool output schemas published over MCP therefore differed in shape
   across SDKs. Not user-visible today but would trip any downstream
   AJV consumer validating a Python-produced outputSchema (the same
   duplicate-$id trap TS already sidesteps).

   Fix: Python now strips $id and attaches BASE_ENTITY_MARKER matching
   TypeScript. Both SDKs produce byte-identical tool schemas for the
   same entity.

3. {name}_id strip edge case: If an app declares a top-level property
   literally named `{entity_name}_id` (e.g. `user_id` on a `user`
   entity), it becomes unreachable via update_* because the handler
   strips that key before merging.

   Fix: comment added to both update handlers noting the convention.
   (Runtime detection at server-build time deferred — not worth the
   noise for a naming convention that's easy to avoid.)

Python: 411 tests. TypeScript: 282 tests. Both clean under
ruff/biome/tsc/ty.
@mgoldsborough mgoldsborough added the qa-reviewed QA review completed with no critical issues label Apr 16, 2026
@mgoldsborough mgoldsborough merged commit 34aae48 into main Apr 16, 2026
4 checks passed
@mgoldsborough mgoldsborough deleted the pr1-flatten-entity-crud branch April 16, 2026 22:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

qa-reviewed QA review completed with no critical issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant