perf: +8-9% on Qwen3.5-122B-A10B SSD-stream via stacked-buffer + Gate+Up fusion in mlx-swift-lm #298
| Job | Run time |
|---|---|
| | 11m 2s |
| | 11m 7s |
| | 11m 57s |
| | 8m 44s |
| | 1m 13s |
| | 37s |
| | 1m 15s |
| | 1m 58s |
| | 1m 3s |
| | 4m 12s |
| | 3m 14s |
| Total | 56m 22s |