Merged
4 changes: 2 additions & 2 deletions .github/copilot-instructions.md
@@ -122,7 +122,7 @@ The SDK auto-discovers native binaries by checking `sdk/bin/<target-triple>/` (n
### Schema system

- **Stable schemas**: released, immutable schemas live in [`schemas/stable/`](../schemas/stable) (one file per released version, plus a `-strict` view); never edit them after release.
- **Dev schema**: the in-progress schema lives in [`schemas/dev/`](../schemas/dev).
- **Dev schema**: the in-progress schema lives in [`schemas/dev/`](../schemas/dev). It is **generated** from the Rust wire model (`src/core/wxc_common/src/wire.rs`) by the `mxc_schema_gen` tool; **do not hand-edit it**. To change the dev schema, edit the wire model and regenerate with `cargo run --manifest-path src/Cargo.toml -p mxc_schema_gen -- schemas/dev/mxc-config.schema.<dev>.json`. `scripts/versioning/check-schema-codegen.js` is a CI gate that regenerates and fails if the committed schema drifts. See [`docs/schema-codegen.md`](../docs/schema-codegen.md).
- **Canonical schema-version source**: `schemas/schema-version.json`, the single source of truth for the schema-version constants (min/maxSupported/state-aware/stable/dev). `scripts/versioning/check-schema-versions.js` enforces that the Rust parser, SDK, and schema filenames all agree with it; do not hand-edit a schema-version constant without updating the canonical file. See [`docs/versioning.md`](../docs/versioning.md) for the full design.
- Config files can reference schemas via `"$schema"` for editor validation. `scripts/versioning/validate-configs.js` validates the `tests/examples` + `tests/configs` corpus against the dev schema in CI.

@@ -162,7 +162,7 @@ State-aware lifecycle:

New features go under the `experimental` JSON section and are only active when `--experimental` is passed. See `docs/authoring-a-new-feature.md` for the full checklist. The pattern:

1. Add the property schema to `schemas/dev/` under `experimental`
1. Add the field to the Rust wire model (`src/core/wxc_common/src/wire.rs`) under the `Experimental` section, then regenerate the dev schema (`cargo run --manifest-path src/Cargo.toml -p mxc_schema_gen -- schemas/dev/mxc-config.schema.<dev>.json`); do not hand-edit the generated schema
2. Add Rust structs to `models.rs` (`ExperimentalConfig`) and `config_parser.rs` (`RawExperimental`)
3. Guard execution behind `if request.experimental_enabled` in the runner
4. Never modify files in `schemas/stable/`; those are immutable release artifacts
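
Step 3 of the pattern can be sketched roughly as follows. Only `experimental_enabled` and `ExperimentalConfig` are named in the text; the `RunRequest` type, its `experimental` field, and `feature_flag` are hypothetical stand-ins for illustration.

```rust
/// Hypothetical experimental section; the real `ExperimentalConfig` lives in `models.rs`.
struct ExperimentalConfig {
    feature_flag: Option<String>, // hypothetical field
}

/// Hypothetical runner request; only `experimental_enabled` is named in the text.
struct RunRequest {
    experimental_enabled: bool,
    experimental: Option<ExperimentalConfig>,
}

/// Return the experimental section only when `--experimental` was passed,
/// so experimental settings stay inert in a normal run.
fn active_experimental(request: &RunRequest) -> Option<&ExperimentalConfig> {
    if request.experimental_enabled {
        request.experimental.as_ref()
    } else {
        None
    }
}
```

Funnelling every read of the experimental section through one accessor like this keeps the guard in a single place, which is one possible way to apply the rule rather than the runner's confirmed structure.
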
6 changes: 6 additions & 0 deletions .github/workflows/Versioning.Checks.Job.yml
@@ -14,6 +14,9 @@ jobs:
        with:
          node-version: 20

      - name: Setup Rust toolchain
        run: rustup update stable

      - name: Check product version sync (Cargo == npm)
        run: node scripts/check-version-sync.js

@@ -27,5 +30,8 @@ jobs:
      - name: Check Rust toolchain sync (rust-toolchain.toml == ADO template)
        run: node scripts/versioning/check-rust-toolchain-sync.js

      - name: Check schema is in sync with the Rust wire model (codegen)
        run: node scripts/versioning/check-schema-codegen.js

      - name: Validate config corpus against dev schema
        run: node scripts/versioning/validate-configs.js
114 changes: 114 additions & 0 deletions docs/schema-codegen.md
@@ -0,0 +1,114 @@
# Schema codegen (config schema is generated from Rust)

The MXC config JSON Schema is **generated from the Rust wire model**, not
hand-authored: one source of truth (Rust types), with the schema and (later) the
SDK types as generated by-products that cannot drift from it.

## Source of truth

`src/core/wxc_common/src/wire.rs` defines the wire model (`MxcConfig` and its
nested types). It is precise by construction:

- real `enum`s for closed value sets (`Containment`, `NetworkPolicy`, `Phase`, …),
- `#[serde(rename_all = "camelCase")]` for wire names,
- `#[serde(deny_unknown_fields)]` on the **stable** surface, so the generated
schema is closed (`additionalProperties: false`),
- the `experimental` block is intentionally **permissive** (no
`deny_unknown_fields`): experimental features are in flux, so the schema
documents the known shapes for editors without rejecting in-progress fields.
Strict, closed contract = the stable (top-level) surface only.
- `///` doc-comments become schema `description`s.

## Generating

```
cargo run --manifest-path src/Cargo.toml -p mxc_schema_gen -- schemas/dev/mxc-config.schema.0.8.0-dev.json
```

`mxc_schema_gen` calls `wxc_common::wire::generate_config_schema_json()`, which
runs schemars and post-processes the result to replace schemars' Rust-specific
integer `format`s (`uint32`, …), which are undefined in JSON Schema draft-07, with
standard constraints (`minimum: 0` for unsigned). The `schema-gen` feature on
`wxc_common` gates the schemars dependency so production builds don't carry it.
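
The integer-format rewrite can be pictured with a minimal, dependency-free sketch. The `Json` enum and `normalize_integer_formats` are hypothetical stand-ins (the real step runs on the schemars output inside `generate_config_schema_json`), and the list of format names is an assumption beyond the `uint32` named above.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for a JSON value (hypothetical; not the type the real code uses).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Assumed set of Rust-specific unsigned formats; only `uint32` is named in the doc.
const UNSIGNED_FORMATS: [&str; 4] = ["uint8", "uint16", "uint32", "uint64"];

/// Recursively drop Rust-specific unsigned `format`s and add `minimum: 0` instead.
fn normalize_integer_formats(value: &mut Json) {
    match value {
        Json::Obj(map) => {
            let is_unsigned = matches!(
                map.get("format"),
                Some(Json::Str(f)) if UNSIGNED_FORMATS.contains(&f.as_str())
            );
            if is_unsigned {
                map.remove("format");
                // Keep an existing `minimum` if one is already present.
                map.entry("minimum".to_string()).or_insert(Json::Num(0.0));
            }
            // Descend into nested schemas (properties, definitions, ...).
            for child in map.values_mut() {
                normalize_integer_formats(child);
            }
        }
        Json::Arr(items) => {
            for item in items.iter_mut() {
                normalize_integer_formats(item);
            }
        }
        _ => {}
    }
}
```

The check only fires when `format` holds a string, so a config property that happens to be named `format` and holds an object is left alone.
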

## CI gates (`Versioning Checks` job)

- **`check-schema-codegen.js`**: regenerates the schema and fails if the
committed `schemas/dev/...json` differs (the schema can never go stale).
- **`validate-configs.js`**: validates the `tests/examples` + `tests/configs`
corpus against the committed schema.
- **`check-schema-versions.js`** / **`check-version-sync.js`**: version-constant
and product-version sync.
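
The core of the codegen gate is a comparison between the committed schema text and a freshly generated one. The gate itself is a Node script; the sketch below only illustrates that comparison in Rust, and `first_drift_line` is a hypothetical helper rather than anything in the repository.

```rust
/// Compare the committed schema text with freshly generated text.
/// Returns the 1-based number of the first differing line, or `None` if they match.
fn first_drift_line(committed: &str, regenerated: &str) -> Option<usize> {
    let mut a = committed.lines();
    let mut b = regenerated.lines();
    let mut line_no = 0;
    loop {
        line_no += 1;
        match (a.next(), b.next()) {
            // Both inputs ended at the same time: no drift.
            (None, None) => return None,
            // Same line on both sides: keep going.
            (x, y) if x == y => continue,
            // Different content, or one side ended early.
            _ => return Some(line_no),
        }
    }
}
```

Comparing by `lines()` ignores a CRLF versus LF difference, which is a design choice of this sketch; whether the real gate normalizes line endings is not stated here.
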

## What the generated schema does NOT contain

Cross-field constraints (the single-backend-section rule and phase-scoping that
the hand-written schema expressed with top-level `allOf`) are **not** in the
generated schema. They are enforced by the parser (`wxc_common::config_parser`),
which is the trust boundary. The schema is an editor/CI convenience, never the
gate; the parser rejects a backend/containment mismatch regardless of what the
schema says.
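
A parser-side cross-field check of this kind might look like the sketch below. It is one reading of the single-backend-section rule, not the actual `config_parser` code: the `Containment` variant names, the `BackendSections` struct, and `check_backend_sections` are all assumptions made for illustration.

```rust
/// Hypothetical stand-in for the wire model's `Containment` enum; variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Containment {
    Lxc,
    ProcessContainer,
    Seatbelt,
}

/// Which backend sections appeared in the config (assumed shape, presence only).
struct BackendSections {
    lxc: bool,
    process_container: bool,
    seatbelt: bool,
}

/// Reject configs with more than one backend section, or with a section
/// that does not match the selected containment.
fn check_backend_sections(containment: Containment, s: &BackendSections) -> Result<(), String> {
    let present: Vec<Containment> = [
        (s.lxc, Containment::Lxc),
        (s.process_container, Containment::ProcessContainer),
        (s.seatbelt, Containment::Seatbelt),
    ]
    .iter()
    .filter(|(is_present, _)| *is_present)
    .map(|(_, backend)| *backend)
    .collect();

    match present.as_slice() {
        [] => Ok(()),
        [only] if *only == containment => Ok(()),
        [only] => Err(format!(
            "section {:?} does not match containment {:?}",
            only, containment
        )),
        many => Err(format!(
            "expected at most one backend section, found {}",
            many.len()
        )),
    }
}
```

Because the check runs after deserialization, it applies no matter which schema version, or no schema at all, was used to edit the config.
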

## Equivalence to the previous hand-written schema

The generated schema replaced a hand-maintained one. Because the schema is a
convenience and not the trust boundary, equivalence is judged **behaviorally**,
not by diffing the JSON line-by-line (the encodings differ: the hand schema
inlined every object, while schemars emits a `definitions` block with `$ref`
indirection and wraps optionals as `anyOf: [{ $ref }, { "type": "null" }]`, so
the file roughly doubled in size with no change in meaning). Three lenses:

1. **Accept side**: every config in the `tests/examples` + `tests/configs`
corpus must still validate. The `validate-configs.js` gate enforces this.
2. **Reject side**: the *effective* per-property constraints (allowed keys,
enum value sets, `additionalProperties` open/closed, `required`) after
resolving `$ref`s.
3. **Delegation**: constraints a JSON Schema expresses awkwardly are
deliberately moved to the parser.
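
Comparing effective constraints means first seeing through the optional wrapper described above. The sketch below recognizes the `anyOf: [{ $ref }, { "type": "null" }]` shape and returns the `$ref` target; the `Json` enum and `optional_ref_target` are hypothetical helpers, and the branch order is assumed to be exactly the one quoted above.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for a JSON value (hypothetical, trimmed to what the sketch needs).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// If `schema` is `{"anyOf": [{"$ref": R}, {"type": "null"}]}`, return `R`.
fn optional_ref_target(schema: &Json) -> Option<&str> {
    let Json::Obj(map) = schema else { return None };
    let Some(Json::Arr(branches)) = map.get("anyOf") else { return None };
    // Exactly two object branches: the reference first, the null type second.
    let [Json::Obj(first), Json::Obj(second)] = branches.as_slice() else {
        return None;
    };
    let is_null = matches!(second.get("type"), Some(Json::Str(t)) if t == "null");
    match first.get("$ref") {
        Some(Json::Str(target)) if is_null => Some(target.as_str()),
        _ => None,
    }
}
```

With a helper like this, an inlined optional object in the hand schema and a wrapped `$ref` in the generated one can be reduced to the same definition before their keys, enums, and `required` lists are compared.
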

Comparing the generated schema against the prior hand-written one on lens (2):

- **Enums are identical** on every canonical path (`containment`,
`network.defaultPolicy`, `network.enforcementMode`, `ui.clipboard`,
`processContainer.ui.isolation`, `seatbelt.launchMethod`,
`isolation_session.configurationId`, port `protocol`).
- **The generated schema is stricter:** it closes the stable nested objects
(`process`, `network`, `filesystem`, `lifecycle`, `ui`, `lxc`, `fallback`,
`processContainer`/`.ui`, `seatbelt`, the isolation `user` bundle) with
`additionalProperties: false`, matching the wire model's `deny_unknown_fields`.
The hand schema left several of these open, so the generated one catches
nested typos the old one silently accepted.
- **The generated schema is more complete:** it documents surface the hand
schema omitted: `processContainer.learningMode`,
`experimental.windows_sandbox.idleTimeout` (legacy alias),
`experimental.seatbelt` (pre-promotion alias), and the per-phase
`isolation_session.{provision,start,stop,deprovision}` nesting.

Two reductions are intentional, each compensated by the parser:

| Dropped from the schema | Why it's safe |
| --- | --- |
| Top-level `allOf` cross-field rules (single-backend-section; `appContainer` alias note) | Semantic rules the parser enforces at runtime; the editor no longer pre-flags them, but a backend/containment mismatch is still rejected. |
| `appContainer` alias path is undocumented; `network.proxy.builtinTestServer` widened from `const: true` to `boolean` | The serde alias still parses, and `convert_wire_proxy` still rejects `builtinTestServer: false`. |
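
The second row can be illustrated with a small sketch of a parser-side check that makes up for the widened schema type. `WireProxy` and `builtin_test_server_enabled` are hypothetical; the real logic lives in `convert_wire_proxy`, and treating an absent field as "not enabled" is an assumption.

```rust
/// Hypothetical slice of the wire-level proxy section; only this field comes from the doc.
struct WireProxy {
    builtin_test_server: Option<bool>,
}

/// The schema now accepts any boolean, so the `false` case is rejected here instead.
/// Treating an absent field as "not enabled" is an assumption of this sketch.
fn builtin_test_server_enabled(proxy: &WireProxy) -> Result<bool, String> {
    match proxy.builtin_test_server {
        Some(true) => Ok(true),
        None => Ok(false),
        Some(false) => {
            Err("network.proxy.builtinTestServer must be true when present".to_string())
        }
    }
}
```
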

Root metadata (`$id`, `title`, `description`) is preserved: `title` comes from a
`#[schemars(title = …)]` attribute on `MxcConfig`, `description` from its doc
comment, and `$id` is injected in the post-process step of
`generate_config_schema_json` (schemars does not emit one).

Net: the generated schema is **equivalent-or-stricter** on values and structure,
**more complete** in coverage, and **less expressive only** on the cross-field
rules, gaps consciously owned by the parser. The equivalence is not a
one-time review: the codegen gate regenerates the schema from the types on every
CI run, and the corpus gate pins the accept-side behavior.

## Roadmap

- The wire model generates the committed dev schema, guarded by the codegen and
corpus CI gates. The parser still deserializes via a separate set of
permissive `Raw*` structs.
- Next: rewire the parser to deserialize directly into the wire model and delete
the `Raw*` structs, so the schema source and the trust boundary share one
definition of the wire shape and cannot drift.
- After that: generate the SDK TypeScript types from the same schema and retire
the hand-maintained `*-strict.json` stable view.