Commit 5fc3e13

fix: correctly handle Gemma 4 and Llama 3 custom completion turn tokens natively to prevent padding loops
1 parent a7eb25f commit 5fc3e13

2 files changed

Lines changed: 2 additions & 2 deletions

Sources/SwiftLM/Server.swift

Lines changed: 1 addition & 1 deletion
@@ -1022,7 +1022,7 @@ func handleChatCompletion(
     let temperature = chatReq.temperature.map(Float.init) ?? config.temp
     let topP = chatReq.topP.map(Float.init) ?? config.topP
     let repeatPenalty = chatReq.repetitionPenalty.map(Float.init) ?? config.repeatPenalty
-    let stopSequences = chatReq.stop ?? []
+    let stopSequences = (chatReq.stop ?? []) + ["<end_of_turn>", "<|im_end|>", "<|eot_id|>", "<turn|>", "<|tool_response|>"]
     let includeUsage = chatReq.streamOptions?.includeUsage ?? false
 
     // Log extra sampling params if provided (accepted for API compat, not all are used)
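
The change above always appends a fixed set of turn-terminator tokens to whatever stop sequences the client sent. The following is a minimal, standalone sketch of that idea, not the code in Server.swift: the names `defaultTurnStops`, `mergedStopSequences`, and `truncate(_:atFirstOf:)` are hypothetical, and the de-duplication and truncation steps are assumptions added for illustration (the commit itself only concatenates the two arrays).

```swift
import Foundation

// Hypothetical constant: the turn-terminator tokens appended by the commit.
let defaultTurnStops = ["<end_of_turn>", "<|im_end|>", "<|eot_id|>", "<turn|>", "<|tool_response|>"]

/// Merges caller-supplied stop sequences with the defaults.
/// Assumption (not in the commit): empty strings and duplicates are dropped,
/// and first-seen order is preserved.
func mergedStopSequences(userStops: [String]?, defaults: [String] = defaultTurnStops) -> [String] {
    var seen = Set<String>()
    var result: [String] = []
    for stop in (userStops ?? []) + defaults where !stop.isEmpty {
        // insert(_:) reports whether the element was newly added.
        if seen.insert(stop).inserted {
            result.append(stop)
        }
    }
    return result
}

/// Cuts `text` at the earliest occurrence of any stop sequence.
/// Illustrative only; how the server applies stop sequences is not shown in the diff.
func truncate(_ text: String, atFirstOf stops: [String]) -> String {
    var cut = text.endIndex
    for stop in stops {
        if let range = text.range(of: stop), range.lowerBound < cut {
            cut = range.lowerBound
        }
    }
    return String(text[..<cut])
}

let stops = mergedStopSequences(userStops: ["<|eot_id|>", "END"])
print(truncate("Hello there<|eot_id|><|eot_id|>", atFirstOf: stops))
```

De-duplicating keeps the list short when a client already passes one of the built-in tokens; whether that matters in practice depends on how the generation loop matches stop sequences.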

mlx-swift-lm
