
fix(moe): prevent crash when persistent buffer slots are exhausted #37

Merged: solderzzc merged 2 commits into main from fix/slot-exhaustion-crash on Apr 27, 2026


@solderzzc

Member

Problem

SwitchGLU.callAsFunction crashes with _assertionFailure (force-unwrap of nil) in the warm-path hit/miss slot resolution when all persistent buffer slots are consumed by speculative hits:

Thread 5 Crashed:: com.apple.root.user-initiated-qos.cooperative
0  libswiftCore.dylib  _assertionFailure
1  SwiftBuddy          SwitchGLU.callAsFunction(_:_:) + 12024

Root cause: Line 726 uses a force-unwrap:

let freeSlot = (0..<maxBuffers).first { !usedSlots.contains($0) }!

When ranges.count == maxBuffers (typically 8 for top_k=8) and every expert in ranges gets a unique slot from the speculative-prefetch hit map, there are zero free slots remaining for any miss — .first returns nil and the ! crashes.
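
A minimal sketch of the failure condition described above, assuming `usedSlots` is a `Set<Int>` and using illustrative values (the real types in `SwitchGLU` are not shown in this PR text). Once every slot index is in `usedSlots`, the search has nothing to return:

```swift
// Hypothetical values mirroring the scenario above: 8 slots, all claimed.
let maxBuffers = 8
let usedSlots = Set(0..<maxBuffers)  // every slot taken by a speculative hit

// `first(where:)` yields an Optional<Int>. With no free slot it is nil,
// so appending `!` to this expression would trap at runtime.
let freeSlot = (0..<maxBuffers).first(where: { !usedSlots.contains($0) })

if freeSlot == nil {
    print("no free slot: force-unwrapping here would crash")
}
```

Leaving the result as an optional is what lets the caller decide how to recover instead of trapping.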

Fix

Replace the force-unwrap with a guard let that sets a slotExhausted flag and breaks out of the loop. When detected, the partial buffers are cleared and we fall through to the existing full-pread fallback path (same code as the no-predictions branch). Same correctness, no crash.
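
The shape of the fix can be sketched roughly as follows. This is not the code from the PR: `resolveSlots`, `hitSlots`, and the return convention are hypothetical names chosen for illustration, and only `maxBuffers`, `usedSlots`, and `slotExhausted` come from the description above. Returning `nil` stands in for "clear the partial buffers and take the full-pread fallback".

```swift
/// Hypothetical helper: maps each expert to a buffer slot.
/// Returns nil when slots run out, signalling the caller to fall back.
func resolveSlots(experts: [Int], hitSlots: [Int: Int], maxBuffers: Int) -> [Int]? {
    // Slots already claimed by speculative hits.
    var usedSlots = Set(experts.compactMap { hitSlots[$0] })
    var assignment: [Int] = []
    var slotExhausted = false

    for expert in experts {
        // Hit: reuse the slot that the prefetch already filled.
        if let slot = hitSlots[expert] {
            assignment.append(slot)
            continue
        }
        // Miss: look for a free slot without force-unwrapping.
        guard let freeSlot = (0..<maxBuffers).first(where: { !usedSlots.contains($0) }) else {
            slotExhausted = true
            break
        }
        usedSlots.insert(freeSlot)
        assignment.append(freeSlot)
    }

    // On exhaustion, discard the partial assignment so the caller
    // can take the fallback path instead of using half-filled state.
    return slotExhausted ? nil : assignment
}

// Two hits consume both slots, so the miss (expert 3) has nowhere to go.
let exhausted = resolveSlots(experts: [1, 2, 3], hitSlots: [1: 0, 2: 1], maxBuffers: 2)
// One hit, two misses, three slots: every expert gets a slot.
let resolved = resolveSlots(experts: [1, 2, 3], hitSlots: [1: 0], maxBuffers: 3)
```

Discarding the partial result on exhaustion, rather than keeping the slots resolved so far, matches the description's choice to reuse the existing fallback unchanged.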

Fixes SharpAI/SwiftLM#87

When all buffer slots are claimed by speculative-hit routing (ranges.count
== maxBuffers and all experts get different slot assignments), the
expression '.first { !usedSlots.contains($0) }' returns nil, and the
force-unwrap ('!') crashes with _assertionFailure.

Replace the force-unwrap with a guard that sets a slotExhausted flag and
breaks out. When detected, the hit/miss arrays are cleared and we fall
through to the existing full-pread fallback path — same correctness,
no crash.

Fixes SharpAI/SwiftLM#87
Copilot AI review requested due to automatic review settings April 27, 2026 20:10

Copilot AI left a comment


Pull request overview

This PR fixes a crash in the SwitchGLU.callAsFunction warm-path SSD streaming logic when resolving expert-to-persistent-buffer slots, by removing a force-unwrap that could trigger an _assertionFailure and cleanly falling back to the existing full-pread path.

Changes:

  • Replace force-unwrapped free-slot selection with a guard let that detects slot exhaustion and exits the warm-path loop safely.
  • Skip miss-only preads when slot exhaustion is detected, and instead trigger the existing full-pread fallback.
  • Consolidate “no predictions” and “slot exhausted” handling into a single fallback condition based on usedGate.count != ranges.count.


…cenario

6 unit tests exercising the pure-CPU slot resolution algorithm:
- testOldAlgorithmCrashesOnSlotExhaustion: documents the crash path
- testFixedAlgorithmHandlesSlotExhaustion: validates graceful detection
- testNormalHitMissResolution: regression guard for normal operation
- testAllHits: 100% speculation accuracy edge case
- testAllMisses: 0% speculation accuracy edge case
- testDuplicateExpertInRangesExhaustsSlots: sorted-idx duplicate expert
@solderzzc solderzzc merged commit 2b3f92d into main Apr 27, 2026
6 checks passed
@solderzzc solderzzc deleted the fix/slot-exhaustion-crash branch April 27, 2026 21:25

Development

Successfully merging this pull request may close these issues.

EXC_BREAKPOINT (SIGTRAP)
