feat: VSAC value sets integration — CMS-versioned code lists, measure strata, evaluator seam#1

Merged
sudoshi merged 14 commits into main from feature/cds-vsac-value-sets
Jun 13, 2026


@sudoshi sudoshi commented Jun 12, 2026


Summary

  • Imports the NLM VSAC data asset from Parthenon: 1,545 value sets / 225,261 codes / 72 CMS measures into new phm_edw.vsac_* reference tables (migration 050), with a measure_value_set bridge auto-mapping 44 of 45 local CMS measures to value-set OIDs by base CMS number (only CMS249 has no VSAC entry; version drift v12↔v14 recorded explicitly)
  • One-shot load script packages/db/scripts/load-vsac.sh (psql \copy pipes, --reload guard, built-in count verification) — already run against the dev DB; row counts verified exact against source
  • Adds vsacService (resolveMeasureCodes is the workhorse for the Phase 2 population finder) + read-only transparency endpoints: GET /value-sets, /value-sets/measure/:code, /value-sets/:oid/codes
  • Hardens the measure calculator (migration 051): single-pass GROUPING SETS stratification (phm_star.fact_measure_strata: headline + age band + gender, rebuilt atomically with fact_measure_result), Wilson 95% CIs on summary and new GET /measures/:id/strata, eCQM exclusion accounting regression-gated (excluded ∉ denominator ∧ ∉ numerator — verified 0 violations live)
  • Installs the MeasureEvaluator seam (MEASURE_EVALUATOR=sql|cql) so a future CQL/cqf-ruler engine drops in with no caller change; worker + admin refresh paths now go through it

Notes for merge

  • Migrations are numbered 050/051 deliberately — 039/040 were claimed by the concurrent Phase 5 session same-day; the runner sorts lexicographically and tracks full filenames, so the gap is harmless
  • Migrations are additive-only. The VSAC data load is a one-time script run per environment; prod needs a reachable source DB or a portable pg_dump --data-only of the four phm_edw.vsac_* tables from a loaded environment
  • EDW code-system reality (verified): condition/procedure are SNOMED CT, medication RxNorm, observation LOINC — the upstream handoff's ICD-10/CPT routing was corrected in EDW_CODE_SYSTEM
  • Spec: docs/superpowers/specs/2026-06-12-parthenon-ecqm-handoff.md; plan: docs/superpowers/plans/2026-06-12-vsac-value-sets-integration.md

Test plan

  • 121/121 vitest tests green (14 files), tsc --noEmit clean
  • Live load verified: 1545/225261/72/1597 rows; bridge 1,015 rows / 44 measures; CMS122 diabetes OID code count matches source exactly (774)
  • EDW joinability: 89 SNOMED condition codes, 228 RxNorm medication codes, LOINC observations confirmed
  • Exclusion regression gate: 0 rows with exclusion_flag AND (denominator_flag OR numerator_flag)
  • Strata reconcile with facts for every measure (0 mismatches); refresh steady-state 231 ms (no nightly-runtime regression)
  • Live endpoint proof: GET /measures/136/strata returns all three dimensions with rates + CIs
  • After merge: run ./packages/db/scripts/load-vsac.sh in any environment that hasn't loaded VSAC yet

🤖 Generated with Claude Code

sudoshi and others added 14 commits June 12, 2026 18:44
…dge seed)

Copies 1,545 value sets / 225,261 codes / 72 measures / 1,597 measure-value-sets
from parthenon app.vsac_* to medgnosis phm_edw.vsac_* via \copy TO STDOUT | FROM STDIN.
Seeds the measure_value_set bridge (44 of 45 CMS measures; CMS249v6 has no VSAC entry).

EDW joinability confirmed:
  conditions  (SNOMEDCT): 89 distinct codes
  medications (RXNORM):  228 distinct codes
  observations (LOINC):   43 distinct codes in sampled 5k rows (full scan
                          blocked by 1B-row table; no index on observation_code)

Source-consistency: OID 2.16.840.1.113883.3.464.1003.103.12.1001 (Diabetes)
  medgnosis count = 774, parthenon count = 774. Exact match.

Pure Wilson score interval (wilsonCI) for binomial proportions,
preferred over normal approximation for the small panels Medgnosis serves;
bounds are always clamped to [0, 1]. 5 Vitest tests, all passing; tsc clean.
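
For readers unfamiliar with the Wilson score interval, here is a minimal sketch of the calculation described above. The signature is an assumption (the PR names `wilsonCI` but does not show it): it takes a success count and a trial count, defaults to z = 1.96 for a 95% interval, and returns `null` when there are no eligible patients.

```typescript
// Hypothetical signature: the PR names wilsonCI but does not show its shape.
interface Interval {
  lower: number; // proportion in [0, 1]
  upper: number; // proportion in [0, 1]
}

function wilsonCI(successes: number, n: number, z: number = 1.96): Interval | null {
  if (n <= 0) return null; // no eligible patients, so no interval

  const p = successes / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;

  // The Wilson centre is pulled toward 0.5, which matters for small panels.
  const centre = (p + z2 / (2 * n)) / denom;
  const halfWidth = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;

  // Clamp to guard against floating-point drift just outside [0, 1].
  return {
    lower: Math.max(0, centre - halfWidth),
    upper: Math.min(1, centre + halfWidth),
  };
}

// Example: 5 of 10 patients meet the numerator; the interval is roughly 0.237 to 0.763.
const exampleInterval = wilsonCI(5, 10);
console.log(exampleInterval);
```

Unlike the normal approximation, the interval stays inside [0, 1] and keeps a non-zero width even when the observed rate is exactly 0% or 100%.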

Implements vsacService.ts with listValueSets, getValueSetCodes,
getMeasureValueSets, and resolveMeasureCodes over phm_edw.vsac_* tables
and the measure_value_set bridge. Uses inline NULL-coalescing for optional
filters to avoid nested sql`` fragments. EDW_CODE_SYSTEM maps domains to
verified VSAC code systems (condition/procedure = SNOMEDCT, not ICD-10/CPT).

TDD: 7/7 tests pass. Live DB: CMS122v12 resolves 2,704 SNOMEDCT codes
across 26 value sets.
… /:oid/codes)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n 051)

- Migration 051: creates phm_star.fact_measure_strata (SERIAL PK, measure_key,
  date_key_period, dimension, stratum, denominator, numerator, excluded, created_at)
  with idx_fms_measure index. Rebuilt atomically with fact_measure_result.

- measureCalculatorV2: single-pass GROUPING SETS produces 'all' + 'age_band' +
  'gender' strata in one subquery scan (no extra round-trip). TRUNCATE of both
  tables inside the same transaction — strata and results never diverge.
  Statement timeout raised 30s → 60s to accommodate the added strata INSERT.

- getMeasureSummary: adds ci_lower/ci_upper fields (Wilson 95% CI, percent ×1 decimal)
  via wilsonCI(). Returns null for measures with zero eligible patients.

Refresh timings (26967 fact rows, live DB):
  First run:       { rowCount: 26967, durationMs: 1596 }
  Steady-state:    { rowCount: 26967, durationMs: 231 }

Verification: dimensions=age_band,all,gender | violations=0 | mismatches=0
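
The two verification counters above (violations and mismatches) can be illustrated in memory. The sketch below is not the project's actual check, which runs against the database; the row shapes and function names are hypothetical. It assumes every dimension's strata should sum to the same per-measure totals as the fact rows.

```typescript
// Hypothetical row shapes, loosely modelled on the columns listed for migration 051.
interface FactRow {
  measureKey: number;
  denominator: boolean;
  numerator: boolean;
  excluded: boolean;
}
interface StratumRow {
  measureKey: number;
  dimension: string; // e.g. 'all', 'age_band', 'gender'
  stratum: string;
  denominator: number;
  numerator: number;
  excluded: number;
}
type Counts = { denominator: number; numerator: number; excluded: number };

// Exclusion gate: an excluded patient must not count in the denominator or numerator.
function countExclusionViolations(facts: FactRow[]): number {
  return facts.filter((f) => f.excluded && (f.denominator || f.numerator)).length;
}

// Reconciliation: for each (measure, dimension), summed strata must equal the fact totals.
function countStrataMismatches(facts: FactRow[], strata: StratumRow[]): number {
  const expected = new Map<number, Counts>();
  facts.forEach((f) => {
    const c = expected.get(f.measureKey) ?? { denominator: 0, numerator: 0, excluded: 0 };
    c.denominator += f.denominator ? 1 : 0;
    c.numerator += f.numerator ? 1 : 0;
    c.excluded += f.excluded ? 1 : 0;
    expected.set(f.measureKey, c);
  });

  const actual = new Map<string, Counts>();
  strata.forEach((s) => {
    const key = `${s.measureKey}|${s.dimension}`;
    const c = actual.get(key) ?? { denominator: 0, numerator: 0, excluded: 0 };
    c.denominator += s.denominator;
    c.numerator += s.numerator;
    c.excluded += s.excluded;
    actual.set(key, c);
  });

  let mismatches = 0;
  actual.forEach((got, key) => {
    const want = expected.get(Number(key.split("|")[0]));
    if (
      !want ||
      want.denominator !== got.denominator ||
      want.numerator !== got.numerator ||
      want.excluded !== got.excluded
    ) {
      mismatches++;
    }
  });
  return mismatches;
}
```

A result of 0 from both functions corresponds to the `violations=0 | mismatches=0` line reported above.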
…terface

Introduces MeasureEvaluator interface with sql (current) and cql (stub)
implementations. Call sites in the BullMQ worker and admin routes now
resolve the engine via getMeasureEvaluator() instead of calling
refreshMeasureResults() directly. Engine selected by MEASURE_EVALUATOR
env var (default: sql). 5 new unit tests, full suite 121/121 green,
tsc clean.
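
A rough sketch of how such a seam can look is given below. Only the names `MeasureEvaluator`, `getMeasureEvaluator`, and `MEASURE_EVALUATOR` come from this PR; the method name, the result shape, and the choice to pass the environment and the SQL refresh function as parameters are assumptions made so the example is self-contained.

```typescript
// Hypothetical shapes: the PR does not show the interface or factory signature.
interface RefreshResult {
  rowCount: number;
  durationMs: number;
}
type EngineName = "sql" | "cql";
interface MeasureEvaluator {
  readonly engine: EngineName;
  refresh(): Promise<RefreshResult>;
}

function getMeasureEvaluator(
  env: Record<string, string | undefined>,
  sqlRefresh: () => Promise<RefreshResult>,
): MeasureEvaluator {
  // Default to the SQL engine when the variable is unset.
  const choice = (env["MEASURE_EVALUATOR"] ?? "sql").trim().toLowerCase();

  if (choice === "sql") {
    return { engine: "sql", refresh: sqlRefresh };
  }
  if (choice === "cql") {
    // Stub: rejects until a CQL engine is wired in.
    return {
      engine: "cql",
      refresh: () => Promise.reject(new Error("CQL evaluator is a stub")),
    };
  }
  // Fail loudly on typos rather than silently falling back.
  throw new Error(`Unknown MEASURE_EVALUATOR: ${choice}`);
}

// Usage: callers depend only on the interface, never on a concrete engine.
const fakeSqlRefresh = async (): Promise<RefreshResult> => ({ rowCount: 0, durationMs: 0 });
const evaluator = getMeasureEvaluator({ MEASURE_EVALUATOR: "sql" }, fakeSqlRefresh);
console.log(evaluator.engine);
```

Because the worker and admin routes only see `MeasureEvaluator`, swapping engines becomes a configuration change rather than a code change at each call site.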

Matches the referential policy of sibling star fact tables (ON DELETE RESTRICT).
Applied to the live dev DB via equivalent ALTER TABLE since 051 was already
recorded in _migrations there; fresh environments get it from the CREATE TABLE.
… review)

- resolveMeasureCodes header now warns it unions ALL population roles
  (~82% of CMS122 SNOMEDCT codes are exclusion-family) — must not drive
  population finding until the bridge carries population_role
- EDW_CODE_SYSTEM comment clarifies values are VSAC labels, not EDW
  code_system column values ('SNOMED'/'ICD-10') — translate before joining
- load-vsac.sh verification now asserts source==destination counts and
  exits non-zero on mismatch (was echo-only, could ship wrong data green)
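
To make the second point concrete, the sketch below shows one way to translate a VSAC label into an EDW `code_system` column value before joining. The domain-to-label map follows the PR description; the helper name and the idea of passing the column values in from the caller are assumptions, since the PR only hints at the EDW values ('SNOMED', 'ICD-10').

```typescript
type EdwDomain = "condition" | "procedure" | "medication" | "observation";
type VsacCodeSystem = "SNOMEDCT" | "RXNORM" | "LOINC";

// Values are VSAC labels, as stated in the PR, not EDW code_system column values.
const EDW_CODE_SYSTEM: Record<EdwDomain, VsacCodeSystem> = {
  condition: "SNOMEDCT",
  procedure: "SNOMEDCT",
  medication: "RXNORM",
  observation: "LOINC",
};

// Hypothetical helper: resolves the EDW column value for a domain.
// The caller supplies the label-to-column mapping, because the exact EDW
// values are not listed in this PR.
function edwCodeSystemFor(
  domain: EdwDomain,
  columnValues: Partial<Record<VsacCodeSystem, string>>,
): string {
  const label = EDW_CODE_SYSTEM[domain];
  const value = columnValues[label];
  if (value === undefined) {
    // Refuse to join on an untranslated label; it would silently match nothing.
    throw new Error(`No EDW code_system value configured for VSAC label ${label}`);
  }
  return value;
}

// Usage: 'SNOMED' is the EDW value hinted at in the comment above.
console.log(edwCodeSystemFor("condition", { SNOMEDCT: "SNOMED" }));
```

Throwing on a missing translation is a deliberate choice here: joining on the raw VSAC label would return zero rows without any error.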
sudoshi added a commit that referenced this pull request Jun 13, 2026
…ration

Lands PRs #1 (VSAC asset + measure hardening) and #2 (CI restoration) along with the clinical-fidelity work. See DEVLOG Session 20.
@sudoshi sudoshi merged commit e6a54dd into main Jun 13, 2026
3 of 6 checks passed
@sudoshi sudoshi deleted the feature/cds-vsac-value-sets branch June 13, 2026 14:18