chore(libs): inherit upstream MLX — bump mlx-swift / mlx-swift-lm pins to combined inherited+instrumented heads by Gajesh2007 · Pull Request #459 · Layr-Labs/d-inference

Gajesh2007 · 2026-06-24T00:54:14Z

Summary

Capstone of the "inherit upstream MLX" effort. Bumps the two d-inference submodule pins to the combined heads of the fork inheritance PRs, which now also carry the measurement-only instrumentation that landed on master via d-inference#451:

submodule	old pin (`master`)	new pin (this PR)	fork PR
`libs/mlx-swift`	`3c50ad69`	`e20ea3dd`	Layr-Labs/mlx-swift#7
`libs/mlx-swift-lm`	`2b4b0d8d`	`48313a08`	Layr-Labs/mlx-swift-lm#50

Why this isn't a naive pin bump

master already advanced both pins past the bases the inheritance branches were cut from (via #451's EvalProbe / EngineCore instrumentation), so the two histories diverged:

libs/mlx-swift: master = ac67822 + 3 EvalProbe commits (3c50ad69); branch = ac67822 + 2 inheritance commits.
libs/mlx-swift-lm: master = 8a9bc7c + 1 EngineCore idle-clear marker (2b4b0d8d); branch = 8a9bc7c + 29 inheritance commits.
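
The divergence listed above can be inspected directly in each submodule. This is a minimal sketch, assuming both revisions are fetched locally; `divergence` is a hypothetical helper name, and `OLD_BRANCH_HEAD` is a placeholder for the pre-merge branch head, which this description does not list.

```shell
# Sketch only: "divergence" is a hypothetical helper, not part of this PR.
# Prints "<commits only in LEFT> <commits only in RIGHT>", counted from
# the merge base of the two revisions.
divergence() {
  # usage: divergence REPO LEFT RIGHT
  git -C "$1" rev-list --left-right --count "$2...$3" | tr '\t' ' '
}

# Example (OLD_BRANCH_HEAD is a placeholder, see the note above):
#   divergence libs/mlx-swift 3c50ad69 "$OLD_BRANCH_HEAD"
```

If the counts above hold, the mlx-swift call should report 3 commits on the master side and 2 on the branch side.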

A straight bump to the old branch heads would have reverted the instrumentation. Instead, each inheritance branch merged its fork's main (which carries the instrumentation), producing a head that is a strict superset of the master pin:

git -C libs/mlx-swift     merge-base --is-ancestor 3c50ad69 e20ea3dd   # exit 0
git -C libs/mlx-swift-lm  merge-base --is-ancestor 2b4b0d8d 48313a08   # exit 0

Both checks pass → no instrumentation is reverted. This is a clean forward move that layers the inherited upstream fixes on top of everything already on master.
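
The two ancestor checks could be wrapped so that a failing check is loud rather than a silent non-zero exit. A minimal sketch, assuming it runs from the superproject root with both submodules checked out; `check_forward_move` is a hypothetical helper, not something this PR adds.

```shell
# Sketch only: "check_forward_move" is a hypothetical helper.
# Succeeds when OLD is an ancestor of NEW, i.e. moving the pin from
# OLD to NEW drops no commit that OLD already contained.
check_forward_move() {
  # usage: check_forward_move REPO OLD NEW
  if git -C "$1" merge-base --is-ancestor "$2" "$3"; then
    echo "ok: $2 is contained in $3"
  else
    echo "REVERT RISK: $2 is not an ancestor of $3" >&2
    return 1
  fi
}

# With the pins from the table above:
#   check_forward_move libs/mlx-swift    3c50ad69 e20ea3dd
#   check_forward_move libs/mlx-swift-lm 2b4b0d8d 48313a08
```

Note that `git merge-base --is-ancestor` also exits non-zero when a SHA is unknown locally, so a failure here can mean "not fetched" as well as "not an ancestor".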

The merges were conflict-free:

mlx-swift: MLXArray.swift auto-merged. EvalProbe brackets eval() (theirs), while the inheritance fix wraps description/tostring in evalLock (ours); the two changes sit in disjoint regions of the file.
mlx-swift-lm: only EngineCore.swift changed on main (theirs), and none of the 29 inheritance commits touch it. All crown jewels are preserved (continuous batching, DAR-325 KV fix, KV-quant, MTP, batched Gemma4, fast-follow fp32 gated-delta dedupe).
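
The "disjoint" claim can be spot-checked by listing files that both sides changed since their merge base. A minimal sketch, assuming both heads are available locally; `touched_by_both` is a hypothetical helper name.

```shell
# Sketch only: "touched_by_both" is a hypothetical helper.
# Lists files changed on BOTH sides since their merge base.
touched_by_both() {
  # usage: touched_by_both REPO OURS THEIRS
  tb_base=$(git -C "$1" merge-base "$2" "$3") || return 2
  tb_ours=$(mktemp) && tb_theirs=$(mktemp) || return 2
  git -C "$1" diff --name-only "$tb_base" "$2" | sort > "$tb_ours"
  git -C "$1" diff --name-only "$tb_base" "$3" | sort > "$tb_theirs"
  comm -12 "$tb_ours" "$tb_theirs"   # keep only paths present in both lists
  rm -f "$tb_ours" "$tb_theirs"
}
```

Empty output means no file was edited on both sides, which is what the description implies for mlx-swift-lm. A listed file (MLXArray.swift would be expected for mlx-swift) only means git had to merge within that file, not that it conflicted.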

Re-validation (against the combined tree)

swift build clean for mlx-swift, mlx-swift-lm, and provider-swift.
provider-swift swift test: 1064 tests / 74 suites passed, 0 failures (9 live-MLX tests are env-gated and self-skip).
Live inference (M4 Max, weights cached):
- GPT-OSS 20B (mlx-community/gpt-oss-20b-MXFP4-Q8, compile() path): coherent — Average speed = 60 mi ÷ 1.5 h = **40 mph**, reasoning_tokens=78 / completion=114.
- Gemma 4 26B 8bit (mlx-community/gemma-4-26b-a4b-it-8bit): batched B=2 vs single-stream parity, arithmetic 7*8 = 56, and a clean multi-turn tool call run_terminal(command: cat hello.txt).
- Gemma 4 26B qat-4bit VLM (mlx-community/gemma-4-26B-A4B-it-qat-4bit): mixed-length batched decode with no degenerate repetition — coherent.
- No crash / NaN on any path.

Before / After

flowchart TB
  subgraph Before["BEFORE - master @ 80ce2574"]
    direction TB
    M0["d-inference master"] -->|gitlink| S0a["libs/mlx-swift @ 3c50ad69<br/>(ac67822 + EvalProbe x3, #451)"]
    M0 -->|gitlink| S0b["libs/mlx-swift-lm @ 2b4b0d8d<br/>(8a9bc7c + EngineCore marker, #451)"]
    S0a --> B0["provider-swift builds/serves<br/>instrumentation ONLY<br/>(no inherited upstream fixes)"]
    S0b --> B0
  end

  subgraph After["AFTER - this PR @ 772ff499"]
    direction TB
    M1["d-inference master + 1 commit"] -->|gitlink| S1a["libs/mlx-swift @ e20ea3dd<br/>= merge(branch #7 + main)<br/>EvalProbe AND evalLock/compile fixes"]
    M1 -->|gitlink| S1b["libs/mlx-swift-lm @ 48313a08<br/>= merge(branch #50 + main)<br/>EngineCore marker AND 29 inherited fixes"]
    S1a --> B1["provider-swift builds/serves<br/>instrumentation AND inherited fixes<br/>1064/74 green - live GPT-OSS + Gemma4 coherent"]
    S1b --> B1
  end

  Before -.->|"2 gitlink moves only<br/>strict superset - nothing reverted"| After

What this PR changes

Exactly two gitlink updates (160000 mode) — no source changes:

 libs/mlx-swift    | 2 +-
 libs/mlx-swift-lm | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
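
A reviewer could confirm the "gitlinks only" claim from the raw diff, where each entry carries its old and new file mode. A minimal sketch, assuming both commits are present locally; `non_gitlink_changes` is a hypothetical helper name.

```shell
# Sketch only: "non_gitlink_changes" is a hypothetical helper.
# Prints every changed entry between two commits that is NOT a
# gitlink-to-gitlink update (old mode and new mode both 160000).
non_gitlink_changes() {
  # usage: non_gitlink_changes REPO BASE HEAD
  git -C "$1" diff --raw "$2" "$3" |
    awk '$1 != ":160000" || $2 != "160000"'
}

# Example, from the superproject root:
#   non_gitlink_changes . master HEAD
```

No output indicates that every changed path is a submodule pin move; any printed line is a source change or an added or removed submodule.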

Merge ordering

This superproject pin depends on the two fork PRs. Land mlx-swift#7 and mlx-swift-lm#50 first (in a way that keeps e20ea3dd / 48313a08 reachable on each fork's main). If either fork PR squash-merges to a new SHA, re-point the corresponding gitlink here before merging this PR.
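
The reachability condition can be checked before merging by reading each recorded pin from the superproject and testing it against the fork's main. A minimal sketch, assuming the submodules are checked out and freshly fetched, and that the fork remote is named `origin`; `pin_on_ref` is a hypothetical helper name.

```shell
# Sketch only: "pin_on_ref" is a hypothetical helper.
# Reads the gitlink SHA recorded in the superproject's HEAD for a
# submodule path and checks that it is reachable from REF there.
pin_on_ref() {
  # usage: pin_on_ref SUPERPROJECT SUBMODULE_PATH REF
  por_sha=$(git -C "$1" rev-parse "HEAD:$2") || return 2
  if git -C "$1/$2" merge-base --is-ancestor "$por_sha" "$3"; then
    echo "ok: $2 pin $por_sha is reachable from $3"
  else
    echo "re-point needed: $2 pin $por_sha is not on $3" >&2
    return 1
  fi
}

# After fetching in each submodule (remote name "origin" is an assumption):
#   pin_on_ref . libs/mlx-swift    origin/main
#   pin_on_ref . libs/mlx-swift-lm origin/main
```

A failure after a squash-merge is the signal to re-point the gitlink at the new SHA, as described above.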

Three-PR set: this PR + mlx-swift#7 + mlx-swift-lm#50.


…nted heads

Adopts the combined heads of the two fork inheritance PRs, which now also carry the d-inference#451 measurement-only instrumentation (each branch merged its fork's `main`):

libs/mlx-swift 3c50ad69 -> e20ea3dd (Layr-Labs/mlx-swift#7)
libs/mlx-swift-lm 2b4b0d8d -> 48313a08 (Layr-Labs/mlx-swift-lm#50)

Both moves are a clean forward: the old master pin is a strict ancestor of the new head (`git merge-base --is-ancestor` passes), so no EvalProbe / EngineCore instrumentation is reverted.

Re-validated: provider-swift builds clean, 1064 tests / 74 suites green, live GPT-OSS-20B + Gemma-4-26B (batched, VLM, tool-calling) produce coherent output.

Co-authored-by: Cursor <cursoragent@cursor.com>

vercel · 2026-06-24T00:54:15Z

The latest updates on your projects.

Project	Deployment	Actions	Updated (UTC)
d-inference	Ready	Preview	Jun 24, 2026 12:54am
d-inference-console-ui-dev	Ready	Preview	Jun 24, 2026 12:54am
d-inference-landing	Ready	Preview	Jun 24, 2026 12:54am

ethenotethan

Automated Code Review — Layr-Labs/d-inference#

Verdict: COMMENT

Security — ✅ No issues found

Performance — ✅ No issues found

Type_diligence — ✅ No issues found

Additive_complexity — ✅ No issues found

✅ All four passes clean. No issues found.

🤖 Automated review by Centaur · DAR-186

github-actions · 2026-06-24T00:54:52Z

No threat-model-covered files were changed; however, the updated submodules touch the innermost trust boundary (TB-007) and warrant a brief inspection note.

Trust boundaries touched

TB-007 (Provider Inference Engine) — libs/mlx-swift and libs/mlx-swift-lm are the Metal/MLX compute layer that BatchScheduler and LocalMLXModelFoundation call directly. Neither submodule path appears in any affected_files glob in the current threat model.

Threat relevance

Neither file is listed in the threat model, so no T-xxx finding changes state. That said:

Threat	Relevance
T-028 (Residual inference data in GPU Metal buffers)	Any change to buffer allocation, KV-cache layout, or weight tensor lifecycle in `mlx-swift`/`mlx-swift-lm` directly affects whether prompt residue persists in GPU memory between tenants. The threat model notes this is already open with no Metal-level memset in place.
T-027 / T-007 (Weight hash / model output tampering)	Changes to `mlx-swift-lm` model-loading or tokeniser code affect what actually runs at inference time, downstream of WeightHasher's startup-time check.
T-041 (Cross-tenant prefix-cache TTFT oracle)	KV-cache shape or reuse changes in `mlx-swift-lm` could alter the timing signal that makes the TTFT oracle exploitable.

New attack surface not covered by an existing threat

The submodule diff itself is not included here, so the following flags are conditional on what the bump actually changes:

New Metal kernels or buffer-reuse strategies — if mlx-swift introduces new allocation pools or explicit buffer reuse across forward passes, this widens the open finding for T-028 (GPU residue) and should be noted in the threat model under TB-007 affected_files.
Tokeniser or sampling code changes in mlx-swift-lm — speculative-decode helpers, draft models, or vocabulary expansion can change the memory layout of in-flight tensors. If they introduce shared state across concurrent batch slots, cross-request data leakage risk increases beyond what the current BatchScheduler actor isolation assumes.
Native C/C++/Metal extensions — any new FFI surface in these libraries bypasses Swift ARC and the secureZero coverage that SecurityHardening.swift provides. The threat model currently has no coverage for native extensions in the MLX stack.

Recommendation

Add libs/mlx-swift and libs/mlx-swift-lm (or the resolved submodule paths under provider-swift/) to the affected_files lists for T-007, T-027, T-028, and T-041 in the threat model. The submodule bump should include a brief changelog note or pointer to the upstream diff so reviewers can verify whether buffer-lifecycle or tokeniser behaviour changed.

🔐 Threat model: docs/threat-model.yaml · Updates on each push to this PR

vercel Bot deployed to Preview – d-inference June 24, 2026 00:54 View deployment

Gajesh2007 requested a deployment to benchmarks June 24, 2026 00:54 — with GitHub Actions Waiting

This was referenced Jun 24, 2026

feat: inherit upstream evalLock + compile/closure return-value fixes Layr-Labs/mlx-swift#7

Open

feat: inherit upstream SSM/VLM/KV-quant/tools/MoE/Gemma4 fixes (selective port) Layr-Labs/mlx-swift-lm#50

Open

ethenotethan reviewed Jun 24, 2026

Gajesh2007 wants to merge 1 commit into master from feat/inherit-upstream-mlx-2026-06
