Self-Serve MDR: Phase 2 — Schema Isolation per Tenant #883

@bjagg

Overview

Isolate each user/group's MDR data using PostgreSQL schemas so users don't interfere with each other. Auto-provision schemas on registration and seed with the base LIF model.

Epic: #554
Proposal: docs/proposals/mdr-self-serve-registration.md
T-shirt size: L (~28-32h / 1 week)
Dependencies: Blocked by Phase 1 (#882)

Resolved Decisions

  • PostgreSQL schemas for isolation (not row-level security) — true isolation, simple to clone, no query changes needed
  • Auto-provision on registration — post-confirmation Lambda creates schema and seeds it, fully self-service
  • Always seed with base LIF model — an empty MDR is useless for evaluation
  • Schema naming: tenant_{group_name} (e.g., tenant_lif_team, tenant_eval_jsmith)
  • Routing: Auth middleware extracts the Cognito group from the JWT; each DB session then runs SET search_path TO tenant_{group}
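The naming decision implies a mapping step, since the example group lif-team becomes tenant_lif_team. A minimal sketch of that mapping follows. The helper name schema_for_group, the hyphen-to-underscore rule, and the allowed character set are assumptions for illustration, not part of this issue. The 63-byte limit is PostgreSQL's default maximum identifier length.

```python
import re

# PostgreSQL truncates identifiers longer than 63 bytes by default, so reject them up front.
MAX_IDENTIFIER_LENGTH = 63
# Assumption: after normalization, only lowercase letters, digits and underscores are allowed.
_VALID_GROUP = re.compile(r"[a-z][a-z0-9_]*")


def schema_for_group(group_name: str) -> str:
    """Map a Cognito group name to its tenant schema name (hypothetical helper)."""
    # Assumption: names are lowercased and hyphens become underscores.
    normalized = group_name.strip().lower().replace("-", "_")
    if not _VALID_GROUP.fullmatch(normalized):
        raise ValueError(f"unsupported group name: {group_name!r}")
    schema = f"tenant_{normalized}"
    if len(schema) > MAX_IDENTIFIER_LENGTH:
        raise ValueError(f"schema name too long: {schema!r}")
    return schema
```

Rejecting anything outside a narrow character set matters here because the schema name ends up interpolated into a SET search_path statement.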

Tasks

| Task | Effort |
| --- | --- |
| Create schema provisioning script (scripts/provision-mdr-tenant.sh): clones base schema, seeds with base LIF data model | 8h |
| Update auth middleware to extract tenant from JWT cognito:groups claim, set request.state.tenant_schema | 4h |
| Update database_setup.py: SET search_path TO tenant_{group} per request session | 4h |
| Create lif-team Cognito group and PG schema for existing demo data | 4h |
| Migrate existing public schema data to tenant_lif_team (one-time) | 4h |
| (Optional) Personalize seed data with registrant's org name: post-seed UPDATE pass renames OrgLIF model and ContributorOrganization fields | 4h |
| Test: verify data isolation between two tenants (integration test) | 4h |
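The provisioning task can be pictured as a short, ordered list of statements that run before the base schema SQL. The sketch below is written in Python rather than shell purely for illustration. The helper name provisioning_statements is hypothetical, and it assumes the base schema SQL uses unqualified object names, which this issue does not confirm.

```python
import re

# Assumption: tenant schemas always match this conservative shape.
_SCHEMA_NAME = re.compile(r"tenant_[a-z0-9_]+")


def provisioning_statements(schema: str) -> list[str]:
    """Return the SQL to run before the base schema DDL and seed data (hypothetical helper)."""
    # The name is interpolated into SQL, so accept only the expected identifier shape.
    if not _SCHEMA_NAME.fullmatch(schema):
        raise ValueError(f"unexpected schema name: {schema!r}")
    return [
        # Idempotent, so the script can be re-run for the same tenant.
        f'CREATE SCHEMA IF NOT EXISTS "{schema}"',
        # With search_path set, unqualified CREATE TABLE / INSERT statements
        # in the base SQL would land in the new schema.
        f'SET search_path TO "{schema}"',
    ]
```

If the base SQL turns out to qualify its objects with a schema name, this approach would not apply as written and the SQL would need templating instead.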

Key Technical Detail

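# database_setup.py: per-request schema routing
from collections.abc import AsyncIterator
from fastapi import Request  # assumption: FastAPI/Starlette request object
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

async def get_session(request: Request) -> AsyncIterator[AsyncSession]:
    tenant_schema = request.state.tenant_schema  # Set by auth middleware; must be validated before interpolation
    async with async_session() as session:  # async_session: the module's existing session factory
        await session.execute(text(f"SET search_path TO {tenant_schema}"))
        yield session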

Existing queries work unchanged as long as they use unqualified table names: they run against whichever schema comes first in the search_path.
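Because SET does not accept bind parameters, the schema name in the snippet above is interpolated into the SQL string. One possible alternative, sketched here, is PostgreSQL's set_config function, which takes the value as an ordinary parameter. The helper name search_path_statement and the name pattern are assumptions for illustration.

```python
import re

# Assumption: tenant schemas always match this conservative shape.
_SCHEMA_NAME = re.compile(r"tenant_[a-z0-9_]+")


def search_path_statement(tenant_schema: str) -> tuple[str, dict[str, str]]:
    """Build a parameterized statement that sets search_path (hypothetical helper)."""
    # Validate anyway: the value is parsed as a schema list, so a comma would add schemas.
    if not _SCHEMA_NAME.fullmatch(tenant_schema):
        raise ValueError(f"unexpected schema name: {tenant_schema!r}")
    # set_config(name, value, is_local): is_local = true limits the setting
    # to the current transaction instead of the whole connection.
    sql = "SELECT set_config('search_path', :schema, true)"
    return sql, {"schema": tenant_schema}
```

Inside get_session this could be used as `sql, params = search_path_statement(tenant_schema)` followed by `await session.execute(text(sql), params)`. With is_local set to true the setting ends with the transaction, so it would need to be reapplied after each commit. The trade-off is that a pooled connection does not carry one tenant's search_path into a later request.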

Dual Auth Support

| Auth Method | Tenant Resolution |
| --- | --- |
| Cognito JWT (Bearer) | cognito:groups[0] → tenant_{group} |
| API Key (X-API-Key) | Always → tenant_lif_team |
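The two resolution rules can be sketched as a single pure function that the auth middleware might call after verifying credentials. The function name resolve_tenant_schema, the precedence of JWT over API key, the error type, and the hyphen-to-underscore mapping are assumptions, not decisions recorded in this issue.

```python
from typing import Any, Mapping, Optional

# From the table above: API key calls always resolve to this schema.
SERVICE_SCHEMA = "tenant_lif_team"


def resolve_tenant_schema(
    claims: Optional[Mapping[str, Any]], api_key_valid: bool
) -> str:
    """Pick the tenant schema for a request (hypothetical helper).

    `claims` is the already-verified JWT payload, or None when no Bearer
    token was sent. `api_key_valid` is True when X-API-Key was accepted.
    """
    if claims is not None:
        groups = claims.get("cognito:groups") or []
        if not groups:
            raise PermissionError("token has no Cognito group")
        # Assumption: hyphens map to underscores (lif-team -> tenant_lif_team).
        return "tenant_" + str(groups[0]).replace("-", "_")
    if api_key_valid:
        return SERVICE_SCHEMA
    raise PermissionError("no valid credentials")
```

The result would then be stored on request.state.tenant_schema. Failing closed when a token carries no group avoids silently falling back to another tenant's schema.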

Key Files

  • components/lif/mdr_auth/core.py — auth middleware
  • components/lif/mdr_utils/database_setup.py — DB session factory
  • sam/mdr-database/flyway/flyway-files/flyway/sql/mdr/V1.1__metadata_repository_init.sql — base schema SQL
  • New: scripts/provision-mdr-tenant.sh
  • New: post-confirmation Lambda DB provisioning logic

Acceptance Criteria

  • New user registration auto-creates a dedicated PG schema seeded with the base LIF model
  • Authenticated users see only their schema's data — no cross-tenant data leakage
  • Existing demo data is accessible under the lif-team group (schema tenant_lif_team)
  • Service-to-service API key calls route to the tenant_lif_team schema
  • (Optional) New tenant's OrgLIF model and contributor fields show their organization name
  • Schema provisioning script can also be run manually for ad-hoc tenant creation

Metadata

Assignees: No one assigned
Type: No type
Status: Backlog
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests