LoRA: runtime toggle and PEFT adapter loader by smdesai · Pull Request #316 · ml-explore/mlx-swift-lm

smdesai · 2026-05-27T18:08:39Z

Proposed changes

This PR adds two related, backward-compatible improvements to the LoRA infrastructure in MLXLMCommon.

Runtime loraEnabled toggle on LoRALayer
A new loraEnabled: Bool property on the LoRALayer protocol lets callers enable or disable the LoRA term at runtime without unloading the adapter. When false, the layer behaves as the underlying base layer (no LoRA term added).

This is needed for inference patterns that interleave LoRA-on and LoRA-off forward passes against the same model — for example, speculative decoding schemes where a LoRA-tuned drafter feeds an un-tuned verifier with a shared KV cache. Today the only way to "disable" a loaded adapter is to unload and reload it, which is too expensive to do per inference step.

Backward compatibility: Strictly additive. The default value is true (LoRA always applied, matching pre-PR behavior). External LoRALayer conformers compile unchanged because the protocol-extension default satisfies the new requirement; their toggle is silently a no-op until they opt in by adding their own stored property.
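The backward-compatibility argument above can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in: `ToggleableLoRALayer`, `LegacyLayer`, and `OptedInLayer` are not the MLXLMCommon declarations, and plain `Float` arrays replace `MLXArray` so the snippet runs without MLX.

```swift
// Hypothetical stand-in for the LoRALayer protocol; not the MLXLMCommon source.
protocol ToggleableLoRALayer: AnyObject {
    var loraEnabled: Bool { get set }
    func forward(_ x: [Float]) -> [Float]
}

extension ToggleableLoRALayer {
    // Default that satisfies the new requirement for existing conformers:
    // it always reads as true and silently ignores writes.
    var loraEnabled: Bool {
        get { true }
        nonmutating set {}
    }
}

// An existing conformer that predates the toggle. It compiles unchanged
// and keeps applying its adapter term (here a toy "+ 1").
final class LegacyLayer: ToggleableLoRALayer {
    func forward(_ x: [Float]) -> [Float] { x.map { $0 * 2 + 1 } }
}

// A conformer that opts in by declaring its own stored property,
// which takes precedence over the extension default.
final class OptedInLayer: ToggleableLoRALayer {
    var loraEnabled = true

    func forward(_ x: [Float]) -> [Float] {
        let base = x.map { $0 * 2 }
        guard loraEnabled else { return base }  // base layer only
        return base.map { $0 + 1 }              // base plus toy adapter term
    }
}
```

Setting `loraEnabled = false` on a `LegacyLayer` has no effect, which is the "silently a no-op" case described above; only `OptedInLayer` changes its output.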

LoRAContainer.fromPEFT(directory:) — load HuggingFace PEFT adapters
A new static loader on LoRAContainer reads adapter directories in the standard HuggingFace peft format:

adapter_config.json (PEFT schema: r, lora_alpha, target_modules, peft_type, …)
adapter_model.safetensors (with base_model.model..lora_A.weight / lora_B.weight keys)

These changes are extracted from PR #310 where the model's speculative-decoding mode requires per-phase LoRA toggling and the canonical adapter ships in PEFT format.

Checklist

Put an x in the boxes that apply.

I have read the CONTRIBUTING document
I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
I have added tests that prove my fix is effective or that my feature works
I have updated the necessary documentation (if needed)

davidkoski · 2026-05-27T20:01:09Z

+            if let bias { return y + bias }
+            return y


I wonder if this should call super? Linear is simple enough and unlikely to change, but since this is a subclass it might be better to go through super. It looks like LoRALinear does it that way -- I think that is the pattern to follow.

Yes, you're right, it can. Updated accordingly.
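A minimal sketch of the pattern discussed in this thread, assuming a scalar toy model: the subclass calls `super` for the base projection instead of re-implementing the weight and bias arithmetic, then adds the adapter term only when the toggle is on. `ToyLinear` and `ToyLoRALinear` are hypothetical and are not the MLX `Linear` or `LoRALinear` types.

```swift
// Hypothetical scalar stand-in for a linear layer: y = x * weight (+ bias).
class ToyLinear {
    let weight: Float
    let bias: Float?

    init(weight: Float, bias: Float? = nil) {
        self.weight = weight
        self.bias = bias
    }

    func callAsFunction(_ x: Float) -> Float {
        let y = x * weight
        if let bias { return y + bias }
        return y
    }
}

// The subclass delegates the base computation to super, so any future
// change to the base layer is picked up automatically.
final class ToyLoRALinear: ToyLinear {
    var loraEnabled = true
    let loraA: Float
    let loraB: Float
    let scale: Float

    init(weight: Float, bias: Float?, loraA: Float, loraB: Float, scale: Float) {
        self.loraA = loraA
        self.loraB = loraB
        self.scale = scale
        super.init(weight: weight, bias: bias)
    }

    override func callAsFunction(_ x: Float) -> Float {
        let y = super.callAsFunction(x)
        guard loraEnabled else { return y }     // behave as the base layer
        return y + scale * (x * loraA * loraB)  // add the low-rank term
    }
}
```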

davidkoski · 2026-05-27T20:04:28Z

+// Conversion:
+//   - Strip the leading `base_model.model.` prefix.
+//   - Rename `.lora_A.weight` -> `.lora_a`,  `.lora_B.weight` -> `.lora_b`.
+//   - Transpose both tensors to match MLX's [in, r] / [r, out] convention.


I wonder if this block comment should be on fromPEFT? As it is you can only see it in the code, not in the built docs.

Good catch. It's moved to fromPEFT.
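The three conversion steps quoted in the comment block can be sketched in plain Swift. This is an illustration under stated assumptions, not the code in the PR: `convertPEFTKey` and `transposed` are hypothetical helpers, and nested `Float` arrays stand in for tensors. PEFT stores `lora_A.weight` as `[r, in]` and `lora_B.weight` as `[out, r]`, so transposing both yields the `[in, r]` / `[r, out]` layout the comment mentions.

```swift
// Hypothetical helper: maps a PEFT tensor key to the renamed form described
// in the comment block. Returns nil for keys that are not LoRA A/B weights.
func convertPEFTKey(_ key: String) -> String? {
    let prefix = "base_model.model."
    var k = key
    if k.hasPrefix(prefix) { k.removeFirst(prefix.count) }

    let renames = [(".lora_A.weight", ".lora_a"), (".lora_B.weight", ".lora_b")]
    for (suffix, replacement) in renames where k.hasSuffix(suffix) {
        return String(k.dropLast(suffix.count)) + replacement
    }
    return nil
}

// Hypothetical helper: transposes a rectangular row-major matrix.
func transposed(_ m: [[Float]]) -> [[Float]] {
    guard let first = m.first else { return [] }
    return first.indices.map { column in m.map { $0[column] } }
}
```

For example, a `lora_A.weight` with `r = 2` and `in = 3` is a 2x3 matrix in PEFT and becomes 3x2 after `transposed`.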

davidkoski · 2026-05-27T20:08:42Z

+        // Match "<encoder|model>.layers.<n>." then return the rest. This
+        // matches the two common backbone layouts the project uses.
+        let parts = path.split(separator: ".", omittingEmptySubsequences: false)
+        for i in 0 ..< (parts.count - 2) {


If parts.count is 0 or 1 this will trap -- might want to guard against that and return nil.

Good catch. Yes, a guard needs to be in place; it has been added.
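The trap happens because `0 ..< (parts.count - 2)` has a negative upper bound when `parts.count` is below 2, and Swift traps when forming a range whose upper bound is less than its lower bound. A minimal sketch of a guarded version follows; the function name `layerRelativePath` and the matching details are assumptions for illustration, since the thread does not show the final code.

```swift
// Hypothetical helper: finds "<encoder|model>.layers.<n>." in a dotted path
// and returns whatever follows it, or nil when there is no match.
func layerRelativePath(_ path: String) -> String? {
    let parts = path.split(separator: ".", omittingEmptySubsequences: false)

    // Without this guard, parts.count - 2 can be negative and the
    // half-open range below would trap at runtime.
    guard parts.count >= 3 else { return nil }

    for i in 0 ..< (parts.count - 2) {
        let isBackbone = parts[i] == "encoder" || parts[i] == "model"
        if isBackbone, parts[i + 1] == "layers", Int(parts[i + 2]) != nil {
            let rest = parts[(i + 3)...].joined(separator: ".")
            return rest.isEmpty ? nil : rest
        }
    }
    return nil
}
```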

davidkoski

Looks good, thank you!

smdesai and others added 8 commits March 11, 2026 17:03

support for glm-ocr model

20b78bf

Merge branch 'ml-explore:main' into main

9865921

Merge branch 'ml-explore:main' into main

dc31f83

Merge remote-tracking branch 'origin/main'

3613b1d

Merge remote-tracking branch 'upstream/main'

0c3ff27

Merge branch 'ml-explore:main' into main

26c266d

Merge branch 'ml-explore:main' into main

46a0b00

Add runtime LoRA enable/disable toggle and PEFT adapter loader

1b6a567

smdesai mentioned this pull request May 27, 2026

Add Nemotron Labs Diffusion #310

Open

4 tasks

davidkoski reviewed May 27, 2026


LoRA: minor cleanups in DoRA forward and PEFT adapter helpers

ffc15f7

davidkoski reviewed May 27, 2026


davidkoski approved these changes May 27, 2026


davidkoski merged commit 5626257 into ml-explore:main May 27, 2026
2 checks passed

davidkoski merged 9 commits into ml-explore:main from smdesai:lora-toggle-and-peft
