Schema migration framework — renames, transforms, version tracking #9

@mgoldsborough

Context

PR #8 added schema evolution guardrails (validate_schema_change, add_field tool, hydrate-on-read). That covers the safe additive path — adding fields with defaults, catching unsafe changes before they land.

What it doesn't cover is breaking changes — the cases where you need to restructure data, not just extend it.

Gaps identified

| Scenario | Current state |
| --- | --- |
| Rename a field (e.g. company_size → team_size) | No mechanism; indistinguishable from add + remove |
| Transform a field (e.g. string → object) | validate_schema_change blocks it, but there is no way to actually do it |
| Remove a field and clean up old data | Warned, but old JSON files retain the field forever |
| Schema version tracking | No version in manifest or schema file, so there is no way to reason about "which version was this entity written under" |
| Migrate-on-read | Entity version field tracks revision count, not schema version; no transform pipeline |
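
To make the rename gap concrete, here is a minimal sketch of a name-based schema diff. It assumes schemas are plain dicts mapping field name to type, which may not match the real schema format, and `diff_fields` is a hypothetical helper, not something from PR #8. A rename shows up only as one removed name plus one added name, so the diff alone cannot tell that the old values should be carried over.

```python
def diff_fields(old_schema: dict, new_schema: dict) -> dict:
    """Return the field names added to and removed from a schema (hypothetical helper)."""
    old_names, new_names = set(old_schema), set(new_schema)
    return {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
    }


# Assumed schema shape: {field_name: type_name}.
old = {"name": "string", "company_size": "integer"}

# Case 1: company_size was renamed to team_size.
renamed = {"name": "string", "team_size": "integer"}

# Case 2: company_size was dropped and an unrelated team_size field was added.
dropped_and_added = {"name": "string", "team_size": "integer"}

rename_diff = diff_fields(old, renamed)
unrelated_diff = diff_fields(old, dropped_and_added)

# Both cases produce the same diff, so intent has to be recorded somewhere else.
same_diff = rename_diff == unrelated_diff
```

This is why an explicit rename operation (or a recorded migration rule) would be needed: the intent is not recoverable from the before and after schemas.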

Possible approach

  • Schema version in the manifest (schema_version per entity type or global)
  • Migration functions registered per version bump (Python callables, or declarative rename/transform rules)
  • Migrate-on-read: when entity's schema version < current, run transforms, persist updated entity
  • migrate_all tool for bulk backfill when needed
  • rename_field and transform_field MCP tools as companions to add_field
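
One way the pieces above could fit together, as a rough sketch rather than a proposed implementation. Everything here is an assumption for illustration: the `MIGRATIONS` registry, the `schema_version` key stored on each entity, treating unversioned entities as version 1, and the `{"label": ...}` object shape. Persisting the migrated entity is left to the caller.

```python
from typing import Callable, Dict, List, Tuple

Migration = Callable[[dict], dict]

# Hypothetical registry: MIGRATIONS[n] upgrades an entity from schema version n to n + 1.
MIGRATIONS: Dict[int, Migration] = {}
CURRENT_SCHEMA_VERSION = 3  # assumed current version for this example


def rename_field(old: str, new: str) -> Migration:
    """Declarative rename rule: move the value from `old` to `new` if present."""
    def apply(entity: dict) -> dict:
        out = dict(entity)  # copy so the caller's entity is not mutated
        if old in out:
            out[new] = out.pop(old)
        return out
    return apply


def wrap_team_size(entity: dict) -> dict:
    """Callable transform: string -> object (the object shape is an assumption)."""
    out = dict(entity)
    if isinstance(out.get("team_size"), str):
        out["team_size"] = {"label": out["team_size"]}
    return out


MIGRATIONS[1] = rename_field("company_size", "team_size")  # v1 -> v2
MIGRATIONS[2] = wrap_team_size                              # v2 -> v3


def migrate_on_read(entity: dict) -> Tuple[dict, bool]:
    """Run every pending migration in order; return (entity, changed)."""
    version = entity.get("schema_version", 1)  # assumption: unversioned means v1
    changed = False
    while version < CURRENT_SCHEMA_VERSION:
        entity = MIGRATIONS[version](entity)
        version += 1
        changed = True
    if changed:
        entity = {**entity, "schema_version": version}
    return entity, changed


def migrate_all(entities: List[dict]) -> List[dict]:
    """Bulk backfill: apply migrate_on_read to every entity."""
    return [migrate_on_read(e)[0] for e in entities]


stored = {"id": "lead-1", "company_size": "11-50"}
migrated, changed = migrate_on_read(stored)
```

With these assumptions, `migrated` carries `team_size` as an object plus `schema_version: 3`, and `changed` tells the caller whether the entity needs to be written back. Keying migrations by the version they upgrade from keeps the chain linear, so an entity written under any older version passes through each step exactly once.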

Priority

Low for now — additive evolution covers the current use cases (leadgen, CRM examples). This becomes urgent when a product needs a breaking schema change on existing data.
