chore(provider): default-on MLX resource-count telemetry in reports (mlx-swift-lm#42)#373
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
ethenotethan
left a comment
There was a problem hiding this comment.
Automated Code Review — Layr-Labs/d-inference#
Verdict: REQUEST_CHANGES
Security — ✅ No issues found
Performance — 4 finding(s) (3 blocking)
- 🔵 [INFO]
provider-swift/Sources/ProviderCore/Models/ModelCatalog.swift:493— remainingBytesToFetch helper may allocate unbounded arrays for large model catalogs- Suggestion: Consider streaming calculation or chunked processing for very large model catalogs to avoid memory spikes
- 🟡 [MEDIUM]
provider-swift/Sources/ProviderCore/Models/ModelCatalog.swift:659— fileSize() called multiple times on same paths in hot download path- Suggestion: Cache fileSize results in local variables to avoid repeated filesystem calls
- 🟡 [MEDIUM]
provider-swift/Sources/ProviderCore/Models/ModelDownloadProgress.swift:65-85— Linear search through order array on every progress update- Suggestion: Use Dictionary for O(1) lookup or maintain index mapping to avoid O(n) search per update
- 🟡 [MEDIUM]
provider-swift/Sources/ProviderCore/Models/ModelDownloadProgress.swift:98-103— compactMap rebuilds entire progress array on every render frame- Suggestion: Cache the ordered progress array and only rebuild when order changes
Type_diligence — ✅ No issues found
Additive_complexity — 5 finding(s) (1 blocking)
- 🔵 [INFO]
provider-swift/Sources/ProviderCore/Models/ModelCatalog.swift:15-18— Comment references moved code but leaves no breadcrumb to new location- Suggestion: Add specific file reference: 'Progress state lives in ModelDownloadProgress.swift'
- 🔵 [INFO]
provider-swift/Sources/ProviderCore/Models/ModelCatalog.swift:493-500— Added onChunk parameter with default nil creates optional complexity- Suggestion: Consider making onChunk required or using a separate method variant
- 🔵 [INFO]
provider-swift/Sources/ProviderCore/Models/ModelCatalog.swift:813-840— localStagingDirName duplicates path sanitization logic from removed code- Suggestion: Extract path sanitization to shared utility function
- 🔵 [INFO]
provider-swift/Sources/darkbloom/StartCommand.swift:578-588— PickerEntry struct adds resumable field that could be computed property- Suggestion: Make resumable a computed property based on downloadedIDs and resumableIDs
- 🟡 [MEDIUM]
provider-swift/Sources/darkbloom/StartCommand.swift:613-642— buildPickerEntries static method on Start command mixes UI logic with data transformation- Suggestion: Move to dedicated PickerEntry or ModelCatalog utility class
9 finding(s) total, 4 blocking. Verdict: REQUEST_CHANGES.
🤖 Automated review by Centaur · DAR-186
| } | ||
|
|
||
| /// Download a single manifest file into its staging destination, resuming | ||
| /// from a `.part` file when present, verifying size + SHA-256 before |
There was a problem hiding this comment.
🔵 [INFO] ⚡ remainingBytesToFetch helper may allocate unbounded arrays for large model catalogs
💡 Suggestion: Consider streaming calculation or chunked processing for very large model catalogs to avoid memory spikes
📊 Score: 2×3 = 6 · Category: unbounded_allocations
| file: file, | ||
| destination: stagingDir.appendingPathComponent(relativePath, isDirectory: false), | ||
| url: "\(r2CDNURL)/\(Self.escapeR2Path(manifest.r2Prefix))/\(Self.escapeR2Path(relativePath))" | ||
| ) |
There was a problem hiding this comment.
🟡 [MEDIUM] ⚡ fileSize() called multiple times on same paths in hot download path
💡 Suggestion: Cache fileSize results in local variables to avoid repeated filesystem calls
📊 Score: 3×4 = 12 · Category: repeated_work
| /// Register a file to track. `initialBytes` seeds a resumed `.part` prefix | ||
| /// so the bar starts where the previous run left off (and is excluded from | ||
| /// the speed math via `baselineBytes`). | ||
| func register(label: String, expectedBytes: Int64, initialBytes: Int64 = 0) { | ||
| lock.lock() | ||
| defer { lock.unlock() } | ||
| if progress[label] == nil { order.append(label) } | ||
| var p = FileProgress(label: label, expectedBytes: expectedBytes) | ||
| p.downloadedBytes = max(0, initialBytes) | ||
| p.baselineBytes = max(0, initialBytes) | ||
| progress[label] = p | ||
| } | ||
|
|
||
| /// Update cumulative bytes-on-disk for a file. | ||
| func update(label: String, downloadedBytes: Int64) { | ||
| lock.lock() | ||
| defer { lock.unlock() } | ||
| guard var p = progress[label] else { return } | ||
| p.downloadedBytes = downloadedBytes | ||
| progress[label] = p | ||
| } |
There was a problem hiding this comment.
🟡 [MEDIUM] ⚡ Linear search through order array on every progress update
💡 Suggestion: Use Dictionary for O(1) lookup or maintain index mapping to avoid O(n) search per update
📊 Score: 2×4 = 8 · Category: inefficient_data_structures
| /// Thread-safe snapshot of all tracked file progress, in registration order. | ||
| var allProgress: [FileProgress] { | ||
| lock.lock() | ||
| defer { lock.unlock() } | ||
| return order.compactMap { progress[$0] } | ||
| } |
There was a problem hiding this comment.
🟡 [MEDIUM] ⚡ compactMap rebuilds entire progress array on every render frame
💡 Suggestion: Cache the ordered progress array and only rebuild when order changes
📊 Score: 2×4 = 8 · Category: repeated_work
|
|
||
| // MARK: - Download progress tracking & rendering | ||
|
|
||
| /// Per-file progress state used by `DownloadProgressTracker`. | ||
| private struct FileProgress: Sendable { | ||
| let label: String | ||
| let expectedBytes: Int64 | ||
| var downloadedBytes: Int64 = 0 | ||
| var startTime: Date = Date() | ||
| var completed: Bool = false | ||
| var completionTime: Date? | ||
| var destinationURL: URL? | ||
|
|
||
| /// Bytes/second using elapsed wall time. | ||
| var speed: Double { | ||
| let elapsed = (completionTime ?? Date()).timeIntervalSince(startTime) | ||
| guard elapsed > 0.1 else { return 0 } | ||
| return Double(downloadedBytes) / elapsed | ||
| } | ||
|
|
||
| /// Estimated seconds remaining. | ||
| var eta: Double? { | ||
| guard speed > 0, expectedBytes > 0 else { return nil } | ||
| let remaining = Double(expectedBytes - downloadedBytes) | ||
| guard remaining > 0 else { return nil } | ||
| return remaining / speed | ||
| } | ||
|
|
||
| var fraction: Double { | ||
| guard expectedBytes > 0 else { return 0 } | ||
| return min(1.0, Double(downloadedBytes) / Double(expectedBytes)) | ||
| } | ||
| } | ||
|
|
||
| /// Delegate-based download tracker that provides incremental progress. | ||
| /// | ||
| /// Each download task is registered with `register(taskID:label:expectedBytes:)`. | ||
| /// The delegate callbacks update shared state that `ProgressRenderer` reads. | ||
| /// Completed downloads are signalled via per-task continuations. | ||
| private final class DownloadProgressTracker: NSObject, URLSessionDownloadDelegate, @unchecked Sendable { | ||
|
|
||
| /// Result of a single file download: the temporary location where | ||
| /// URLSession wrote the file (must be moved before the delegate returns). | ||
| struct DownloadResult { | ||
| let location: URL | ||
| let response: URLResponse | ||
| } | ||
|
|
||
| private let lock = NSLock() | ||
| private var progressMap: [Int: FileProgress] = [:] // taskIdentifier -> progress | ||
| private var continuations: [Int: CheckedContinuation<DownloadResult, Error>] = [:] | ||
| private var _allProgress: [FileProgress] = [] | ||
|
|
||
| /// Register a task so we can track its progress. | ||
| func register(taskID: Int, label: String, expectedBytes: Int64) { | ||
| lock.lock() | ||
| progressMap[taskID] = FileProgress(label: label, expectedBytes: expectedBytes) | ||
| rebuildSnapshot() | ||
| lock.unlock() | ||
| } | ||
|
|
||
| /// Store the continuation that will be resumed when the download finishes. | ||
| func setContinuation(_ cont: CheckedContinuation<DownloadResult, Error>, forTaskID taskID: Int) { | ||
| lock.lock() | ||
| continuations[taskID] = cont | ||
| lock.unlock() | ||
| } | ||
|
|
||
| /// Thread-safe snapshot of all tracked file progress, ordered by | ||
| /// registration time. | ||
| var allProgress: [FileProgress] { | ||
| lock.lock() | ||
| defer { lock.unlock() } | ||
| return _allProgress | ||
| } | ||
|
|
||
| /// Whether all registered downloads have completed (or errored). | ||
| var isComplete: Bool { | ||
| lock.lock() | ||
| defer { lock.unlock() } | ||
| return !progressMap.isEmpty && progressMap.values.allSatisfy(\.completed) | ||
| } | ||
|
|
||
| private func rebuildSnapshot() { | ||
| _allProgress = progressMap.keys.sorted().map { progressMap[$0]! } | ||
| } | ||
|
|
||
| // MARK: URLSessionDownloadDelegate | ||
|
|
||
| func urlSession( | ||
| _ session: URLSession, | ||
| downloadTask: URLSessionDownloadTask, | ||
| didWriteData bytesWritten: Int64, | ||
| totalBytesWritten: Int64, | ||
| totalBytesExpectedToWrite: Int64 | ||
| ) { | ||
| lock.lock() | ||
| let id = downloadTask.taskIdentifier | ||
| if var p = progressMap[id] { | ||
| p.downloadedBytes = totalBytesWritten | ||
| if totalBytesExpectedToWrite > 0 { | ||
| // Update expected if the server tells us (e.g. after resume | ||
| // partial content). Keep original manifest value if server | ||
| // returns -1. | ||
| } | ||
| progressMap[id] = p | ||
| rebuildSnapshot() | ||
| } | ||
| lock.unlock() | ||
| } | ||
|
|
||
| func urlSession( | ||
| _ session: URLSession, | ||
| downloadTask: URLSessionDownloadTask, | ||
| didFinishDownloadingTo location: URL | ||
| ) { | ||
| // URLSession deletes the temp file when this callback returns. | ||
| // Move it to a stable location so the continuation consumer can | ||
| // process it. | ||
| let stableLocation = FileManager.default.temporaryDirectory | ||
| .appendingPathComponent("darkbloom-dl-\(downloadTask.taskIdentifier)-\(UUID().uuidString)") | ||
| try? FileManager.default.moveItem(at: location, to: stableLocation) | ||
|
|
||
| lock.lock() | ||
| let id = downloadTask.taskIdentifier | ||
| if var p = progressMap[id] { | ||
| p.downloadedBytes = p.expectedBytes > 0 ? p.expectedBytes : p.downloadedBytes | ||
| p.completed = true | ||
| p.completionTime = Date() | ||
| p.destinationURL = stableLocation | ||
| progressMap[id] = p | ||
| rebuildSnapshot() | ||
| } | ||
| let cont = continuations.removeValue(forKey: id) | ||
| lock.unlock() | ||
|
|
||
| let response = downloadTask.response ?? HTTPURLResponse( | ||
| url: downloadTask.originalRequest?.url ?? URL(string: "about:blank")!, | ||
| statusCode: 200, | ||
| httpVersion: nil, | ||
| headerFields: nil | ||
| )! | ||
| cont?.resume(returning: DownloadResult(location: stableLocation, response: response)) | ||
| } | ||
|
|
||
| func urlSession( | ||
| _ session: URLSession, | ||
| task: URLSessionTask, | ||
| didCompleteWithError error: (any Error)? | ||
| ) { | ||
| guard let error else { return } | ||
| lock.lock() | ||
| let id = task.taskIdentifier | ||
| if var p = progressMap[id] { | ||
| p.completed = true | ||
| p.completionTime = Date() | ||
| progressMap[id] = p | ||
| rebuildSnapshot() | ||
| } | ||
| let cont = continuations.removeValue(forKey: id) | ||
| lock.unlock() | ||
| cont?.resume(throwing: error) | ||
| } | ||
| } | ||
|
|
||
| /// Renders a multi-line progress display to the terminal using ANSI escape | ||
| /// codes. Falls back to simple per-file messages when stdout is not a TTY. | ||
| private final class ProgressRenderer: @unchecked Sendable { | ||
|
|
||
| private let isTTY: Bool | ||
| private var linesPrinted: Int = 0 | ||
| private let lock = NSLock() | ||
| /// Set of labels already printed in non-TTY mode. | ||
| private var printedLabels: Set<String> = [] | ||
|
|
||
| init() { | ||
| self.isTTY = isatty(STDOUT_FILENO) != 0 | ||
| } | ||
|
|
||
| /// Render a frame given the current file progress snapshot. | ||
| func render(_ files: [FileProgress]) { | ||
| lock.lock() | ||
| defer { lock.unlock() } | ||
|
|
||
| if !isTTY { | ||
| renderPlain(files) | ||
| return | ||
| } | ||
| renderANSI(files) | ||
| } | ||
|
|
||
| /// Final render: clear the progress area and print completion summary. | ||
| func finish(_ files: [FileProgress]) { | ||
| lock.lock() | ||
| defer { lock.unlock() } | ||
|
|
||
| if isTTY { | ||
| // Move up and clear all lines. | ||
| if linesPrinted > 0 { | ||
| print("\u{1B}[\(linesPrinted)A", terminator: "") | ||
| for _ in 0..<linesPrinted { | ||
| print("\u{1B}[2K") | ||
| } | ||
| print("\u{1B}[\(linesPrinted)A", terminator: "") | ||
| linesPrinted = 0 | ||
| } | ||
| } | ||
|
|
||
| // Print final summary lines. | ||
| for f in files { | ||
| let totalStr = Self.formatBytes(f.expectedBytes > 0 ? f.expectedBytes : f.downloadedBytes) | ||
| let elapsed = (f.completionTime ?? Date()).timeIntervalSince(f.startTime) | ||
| let avgSpeed = elapsed > 0.1 ? Double(f.downloadedBytes) / elapsed : 0 | ||
| let speedStr = Self.formatSpeed(avgSpeed) | ||
| let timeStr = Self.formatDuration(elapsed) | ||
| print(" \u{2713} \(f.label) \(totalStr) \(speedStr) \(timeStr)") | ||
| } | ||
| } | ||
|
|
||
| // MARK: - ANSI rendering | ||
|
|
||
| private func renderANSI(_ files: [FileProgress]) { | ||
| // Move cursor up to overwrite previous render. | ||
| if linesPrinted > 0 { | ||
| print("\u{1B}[\(linesPrinted)A", terminator: "") | ||
| } | ||
|
|
||
| let termWidth = Self.terminalWidth() | ||
| var lines = 0 | ||
| for f in files { | ||
| print("\u{1B}[2K", terminator: "") // Clear the line | ||
| let line = Self.formatLine(f, termWidth: termWidth) | ||
| print(line) | ||
| lines += 1 | ||
| } | ||
| linesPrinted = lines | ||
| fflush(stdout) | ||
| } | ||
|
|
||
| private func renderPlain(_ files: [FileProgress]) { | ||
| for f in files where f.completed && !printedLabels.contains(f.label) { | ||
| printedLabels.insert(f.label) | ||
| let totalStr = Self.formatBytes(f.expectedBytes > 0 ? f.expectedBytes : f.downloadedBytes) | ||
| print(" \u{2713} \(f.label) \(totalStr)") | ||
| } | ||
| } | ||
|
|
||
| // MARK: - Line formatting | ||
|
|
||
| private static func formatLine(_ f: FileProgress, termWidth: Int) -> String { | ||
| if f.completed { | ||
| let totalStr = formatBytes(f.expectedBytes > 0 ? f.expectedBytes : f.downloadedBytes) | ||
| let elapsed = (f.completionTime ?? Date()).timeIntervalSince(f.startTime) | ||
| let avgSpeed = elapsed > 0.1 ? Double(f.downloadedBytes) / elapsed : 0 | ||
| return " \u{2713} \(f.label) \(totalStr) \(formatSpeed(avgSpeed)) done" | ||
| } | ||
|
|
||
| let pct = Int(f.fraction * 100) | ||
| let dlStr = formatBytes(f.downloadedBytes) | ||
| let totStr = formatBytes(f.expectedBytes) | ||
| let speedStr = formatSpeed(f.speed) | ||
| let etaStr: String | ||
| if let eta = f.eta { | ||
| etaStr = "ETA \(formatDuration(eta))" | ||
| } else { | ||
| etaStr = "---" | ||
| } | ||
|
|
||
| // Assemble the suffix: " 62% 2.1/4.8 GB 113 MB/s ETA 24s" | ||
| let suffix = " \(String(format: "%3d", pct))% \(dlStr)/\(totStr) \(speedStr) \(etaStr)" | ||
|
|
||
| // Calculate bar width: total - label - prefix - suffix - brackets - spaces | ||
| let labelMaxWidth = min(f.label.count, 45) | ||
| let label = f.label.count > labelMaxWidth | ||
| ? String(f.label.suffix(labelMaxWidth - 1)).padding(toLength: labelMaxWidth, withPad: " ", startingAt: 0) | ||
| : f.label | ||
| let prefix = " \(label) [" | ||
| let postfix = "]\(suffix)" | ||
| let barWidth = max(10, termWidth - prefix.count - postfix.count) | ||
|
|
||
| let filled = Int(f.fraction * Double(barWidth)) | ||
| let empty = barWidth - filled | ||
| let bar = String(repeating: "\u{2588}", count: filled) + String(repeating: "\u{2591}", count: empty) | ||
|
|
||
| return "\(prefix)\(bar)\(postfix)" | ||
| } | ||
|
|
||
| // MARK: - Formatting helpers | ||
|
|
||
| static func formatBytes(_ bytes: Int64) -> String { | ||
| let b = Double(bytes) | ||
| if b < 1024 { return "\(bytes) B" } | ||
| if b < 1_048_576 { return String(format: "%.1f KB", b / 1024) } | ||
| if b < 1_073_741_824 { return String(format: "%.1f MB", b / 1_048_576) } | ||
| return String(format: "%.1f GB", b / 1_073_741_824) | ||
| } | ||
|
|
||
| static func formatSpeed(_ bytesPerSec: Double) -> String { | ||
| if bytesPerSec < 1024 { return String(format: "%.0f B/s", bytesPerSec) } | ||
| if bytesPerSec < 1_048_576 { return String(format: "%.0f KB/s", bytesPerSec / 1024) } | ||
| if bytesPerSec < 1_073_741_824 { return String(format: "%.0f MB/s", bytesPerSec / 1_048_576) } | ||
| return String(format: "%.1f GB/s", bytesPerSec / 1_073_741_824) | ||
| } | ||
|
|
||
| static func formatDuration(_ seconds: Double) -> String { | ||
| let s = Int(seconds) | ||
| if s < 60 { return "\(s)s" } | ||
| if s < 3600 { return "\(s / 60)m \(s % 60)s" } | ||
| return "\(s / 3600)h \(s / 60 % 60)m" | ||
| } | ||
|
|
||
| static func terminalWidth() -> Int { | ||
| #if canImport(Darwin) | ||
| var w = winsize() | ||
| if ioctl(STDOUT_FILENO, TIOCGWINSZ, &w) == 0, w.ws_col > 0 { | ||
| return Int(w.ws_col) | ||
| } | ||
| #endif | ||
| return 80 | ||
| } | ||
| } | ||
| // | ||
| // Progress state (`FileProgress`, `ManifestDownloadProgress`) and terminal |
There was a problem hiding this comment.
🔵 [INFO] 🧩 Comment references moved code but leaves no breadcrumb to new location
💡 Suggestion: Add specific file reference: 'Progress state lives in ModelDownloadProgress.swift'
📊 Score: 2×3 = 6 · Category: dead code
| /// from a `.part` file when present, verifying size + SHA-256 before | ||
| /// promoting to the final staged path. Reuses the resume-capable | ||
| /// `downloadFile` helper (Range requests, Content-Range validation, retries). | ||
| /// | ||
| /// `onChunk(bytesOnDisk)` reports cumulative bytes-on-disk for this file as | ||
| /// it streams, so the foreground path can render a live per-shard bar; the | ||
| /// background prefetch passes nil and accounts progress per whole file. | ||
| private func downloadManifestFileWithResume( |
There was a problem hiding this comment.
🔵 [INFO] 🧩 Added onChunk parameter with default nil creates optional complexity
💡 Suggestion: Consider making onChunk required or using a separate method variant
📊 Score: 1×2 = 2 · Category: over-abstraction
| .appendingPathComponent("local", isDirectory: true) | ||
| } | ||
|
|
||
| /// Stable foreground-download staging dir name, keyed by the manifest's | ||
| /// `r2Prefix` so an interrupted download resumes into the SAME dir instead of | ||
| /// a throwaway UUID. `r2Prefix` is path-like (e.g. "v2/org__name/version"); | ||
| /// flatten it to a single safe component. | ||
| static func localStagingDirName(r2Prefix: String) -> String { | ||
| ".local-staging-" + r2Prefix | ||
| .replacingOccurrences(of: "/", with: "__") | ||
| .replacingOccurrences(of: "\\", with: "__") | ||
| } | ||
|
|
||
| /// Whether an interrupted foreground download left resumable content staged on | ||
| /// disk for this model build (keyed by `r2Prefix`): a completed shard or a | ||
| /// `.part` prefix in the stable `.local-staging-…` dir. Lets the picker show | ||
| /// "resuming" instead of "not downloaded" for a partially-downloaded model so | ||
| /// re-selecting it FINISHES the download rather than appearing to start over. | ||
| public static func hasResumableStaging(modelID: String, r2Prefix: String) -> Bool { | ||
| let stagingDir = cacheModelDirectory(for: modelID) | ||
| .appendingPathComponent("snapshots", isDirectory: true) | ||
| .appendingPathComponent(localStagingDirName(r2Prefix: r2Prefix), isDirectory: true) | ||
| guard let entries = try? FileManager.default.contentsOfDirectory(atPath: stagingDir.path) else { | ||
| return false | ||
| } | ||
| // Any non-hidden staged entry (a finished file, a `.part`, or a nested | ||
| // subdir like `adapters/`) is resumable content worth finishing. | ||
| return entries.contains { !$0.hasPrefix(".") } |
There was a problem hiding this comment.
🔵 [INFO] 🧩 localStagingDirName duplicates path sanitization logic from removed code
💡 Suggestion: Extract path sanitization to shared utility function
📊 Score: 2×3 = 6 · Category: duplicate logic
| // MARK: - Interactive Catalog Picker | ||
|
|
||
| /// Entry shown in the interactive TUI model picker. | ||
| private struct PickerEntry { | ||
| /// | ||
| /// `downloaded` is computed from an UNFILTERED on-disk check (not the | ||
| /// available-memory-filtered scan) so a fully-downloaded model that exceeds | ||
| /// available RAM still reads "downloaded (won't fit)" rather than "not | ||
| /// downloaded". `resumable` flags a build whose foreground download was | ||
| /// interrupted (staging on disk) so the picker can show "resuming". | ||
| struct PickerEntry: Equatable { | ||
| let id: String |
There was a problem hiding this comment.
🔵 [INFO] 🧩 PickerEntry struct adds resumable field that could be computed property
💡 Suggestion: Make resumable a computed property based on downloadedIDs and resumableIDs
📊 Score: 1×2 = 2 · Category: over-abstraction
| static func buildPickerEntries( | ||
| rows: [PickerCatalogRow], | ||
| downloadedIDs: Set<String>, | ||
| localMemoryByID: [String: Double], | ||
| resumableIDs: Set<String>, | ||
| memoryGb: Double | ||
| ) -> [PickerEntry] { | ||
| var entries: [PickerEntry] = rows.compactMap { row in | ||
| let model = row.model | ||
| let isDownloaded = downloadedIDs.contains(model.id) | ||
| if !isDownloaded, let minRam = model.minRamGb, Double(minRam) > memoryGb { | ||
| return nil | ||
| } | ||
| let size = isDownloaded ? (localMemoryByID[model.id] ?? model.sizeGb) : model.sizeGb | ||
| return PickerEntry( | ||
| id: model.id, | ||
| catalogModel: model, | ||
| displayName: row.displayName, | ||
| sizeGb: size, | ||
| minRamGb: model.minRamGb, | ||
| downloaded: isDownloaded, | ||
| resumable: !isDownloaded && resumableIDs.contains(model.id) | ||
| ) | ||
| } | ||
| // Downloaded first, then larger first. | ||
| entries.sort { a, b in | ||
| if a.downloaded != b.downloaded { return a.downloaded } | ||
| return a.sizeGb > b.sizeGb | ||
| } | ||
| return entries |
There was a problem hiding this comment.
🟡 [MEDIUM] 🧩 buildPickerEntries static method on Start command mixes UI logic with data transformation
💡 Suggestion: Move to dedicated PickerEntry or ModelCatalog utility class
📊 Score: 3×4 = 12 · Category: misplaced responsibility
|
No directly mapped threat-model coverage for this PR — but two files warrant threat-model expansion and one warrants a targeted security note. Trust boundaries touchedNone of the four changed files match an Files outside current coverage
New attack surface not covered by an existing threat
A
Recommended threat-model updates
🔐 Threat model: |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0cd519081b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if !isDownloaded, let minRam = model.minRamGb, Double(minRam) > memoryGb { | ||
| return nil |
There was a problem hiding this comment.
Block non-TTY picks of oversized local models
When a downloaded model exceeds this machine's RAM, this branch now keeps it in entries, but the non-TTY fallbackPicker still accepts all or the row number without the canFitIndividually guard used by the TTY picker. In the piped-input path that oversized ID can be written into the launch agent, and the foreground child later builds from the memory-filtered snapshot.models, so an oversized-only selection starts the service only to fail with No models selected; either filter/disable those rows in fallback or keep them display-only there.
Useful? React with 👍 / 👎.
| try await self.downloadManifestFileWithProgress( | ||
| job, tracker: tracker, session: delegateSession | ||
| ) | ||
| try await self.downloadManifestFileWithResume(job, onChunk: { bytes in |
There was a problem hiding this comment.
Avoid byte-wise foreground shard downloads
For manifest models selected through darkbloom models download or the interactive picker, this new foreground path now routes every multi-GB shard through downloadManifestFileWithResume, whose streaming helper consumes URLSession.AsyncBytes one UInt8 at a time (for try await byte in byteStream). That replaces the previous URLSessionDownloadTask foreground path with per-byte Swift async iteration, making large model downloads CPU-bound and much slower; use a chunked/delegate-based resume path instead of iterating each byte individually.
Useful? React with 👍 / 👎.
| try await self.downloadManifestFileWithProgress( | ||
| job, tracker: tracker, session: delegateSession | ||
| ) | ||
| try await self.downloadManifestFileWithResume(job, onChunk: { bytes in |
There was a problem hiding this comment.
Promote complete .part files before resuming
If a foreground download is killed after a shard is fully written to <dest>.part but before SHA verification/move (for example during hashing a multi-GB shard), restarting now enters this new resume path and sends Range: bytes=<full-size>-; a normal CDN responds 416, and streamDownload deletes the complete .part before retrying from zero. This defeats finish-on-restart and can fail on tight disks because the capacity check credited the full .part; check for a full-size/hash-valid .part and promote it before issuing a Range request.
Useful? React with 👍 / 👎.
0cd5190 to
d2b2eb7
Compare
ethenotethan
left a comment
There was a problem hiding this comment.
Automated Code Review — Layr-Labs/d-inference#
Verdict: COMMENT
Security — ✅ No issues found
Performance — ✅ No issues found
Type_diligence — ✅ No issues found
Additive_complexity — ✅ No issues found
✅ All four passes clean. No issues found.
🤖 Automated review by Centaur · DAR-186
d2b2eb7 to
6953e40
Compare
ethenotethan
left a comment
There was a problem hiding this comment.
Automated Code Review — Layr-Labs/d-inference#
Verdict: COMMENT
Security — ✅ No issues found
Performance — ✅ No issues found
Type_diligence — ✅ No issues found
Additive_complexity — 1 finding(s)
- 🔵 [INFO]
provider-swift/Tests/ProviderCoreTests/BatchKVCacheTests.swift:329-350— Test creates unnecessary MLXArray zeros when simple shapes would suffice- Suggestion: Use MLXArray.zeros([2, 1, 4, 1]) pattern is fine, but consider if the test logic could be simplified to focus on the core assertion about negative padding without the full cache update cycle
1 finding(s) total, 0 blocking. Verdict: COMMENT.
🤖 Automated review by Centaur · DAR-186
| @Test("rotating decode stays unmasked after trimming past the window (negative padding)") | ||
| func rotatingUnmaskedAfterWindowTrim() { | ||
| // A rotating cache that generates past its window trims slots and | ||
| // subtracts the trim from leftPadding, so unpadded rows go NEGATIVE. | ||
| // The decode-mask fast path must treat negative padding as unpadded | ||
| // (<= 0), or every long (> window) Gemma 4 decode falls back to the | ||
| // explicit-mask path on each sliding layer (the mlx#3384 slow/divergent | ||
| // path the repetition fix avoids). | ||
| let cache = BatchRotatingKVCache(maxSize: 3, leftPadding: [0, 0]) | ||
| _ = cache.update( // prefill 4 > window 3 -> trims, leftPadding goes negative | ||
| keys: MLXArray.zeros([2, 1, 4, 1]), values: MLXArray.zeros([2, 1, 4, 1])) | ||
| _ = cache.update( // one decode step | ||
| keys: MLXArray.zeros([2, 1, 1, 1]), values: MLXArray.zeros([2, 1, 1, 1])) | ||
| // Precondition that makes this a real regression test: rows are now | ||
| // negatively padded, so a `== 0` guard would (wrongly) mask. | ||
| #expect(cache.leftPadding.max().item(Int32.self) < 0) | ||
| let mask = cache.makeMask(n: 1, windowSize: 3, returnArray: true) | ||
| guard case .none = mask else { | ||
| Issue.record("expected .none for unpadded rotating decode past window, got \(mask)") | ||
| return | ||
| } | ||
| } |
There was a problem hiding this comment.
🔵 [INFO] 🧩 Test creates unnecessary MLXArray zeros when simple shapes would suffice
💡 Suggestion: Use MLXArray.zeros([2, 1, 4, 1]) pattern is fine, but consider if the test logic could be simplified to focus on the core assertion about negative padding without the full cache update cycle
📊 Score: 2×3 = 6 · Category: over-abstraction
There was a problem hiding this comment.
💡 Codex Review
https://github.com/Layr-Labs/d-inference/blob/6953e407485553f3f56a96054c84a6a07e46af78/libs/mlx-swift-lm/Libraries/MLXLMCommon/ContinuousBatching/EngineCore.swift#L112-L117
Persist the resource-debug opt-out for launchd
When the provider is started through the normal background path (darkbloom start), this check runs in launchd's start --foreground child, but LaunchAgent.makeServicePlist only persists DARKBLOOM_PREFIX_CACHE into EnvironmentVariables. As a result, DARKBLOOM_MLX_RESOURCE_DEBUG=0 darkbloom start is dropped before the child launches, so the new default-on stdout/os_log resource trace cannot be disabled for production daemons unless the plist is hand-edited or this key is added to the launchd passthrough allowlist.
https://github.com/Layr-Labs/d-inference/blob/6953e407485553f3f56a96054c84a6a07e46af78/libs/mlx-swift-lm/Libraries/MLXLMCommon/ContinuousBatching/EngineCore.swift#L371-L372
Avoid unbounded stdout resource traces
With this default-on path, every 50 busy scheduler steps appends an [rsrc] line to stdout before the coarser os_log gating runs. In the normal launchd service, stdout/stderr are redirected to ~/.darkbloom/provider.log via StandardOutPath/StandardErrorPath, and that file has no rotation in the repo, so high-throughput providers will grow a local log indefinitely even though the unified-log report path is capped; make the stdout trace coarse/opt-in or rotate/truncate the legacy log.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…k fix Bumps mlx-swift-lm to 1e7b5ac (Layr-Labs/mlx-swift-lm#42): (1) default-on [rsrc] resource-count telemetry via os_log so reports carry the trajectory; (2) decode-mask fast path now treats negative leftPadding as unpadded (<= 0), fixing a regression where long (> sliding-window) Gemma 4 generations fell back to the explicit-mask path on every sliding layer. Adds a provider regression test exercising the negative-padding rotating-cache case. Target 0.6.12. Co-authored-by: anupsv <6407789+anupsv@users.noreply.github.com>
6953e40 to
6fea8b9
Compare
ethenotethan
left a comment
There was a problem hiding this comment.
Automated Code Review — Layr-Labs/d-inference#
Verdict: COMMENT
Security — ✅ No issues found
Performance — ✅ No issues found
Type_diligence — ✅ No issues found
Additive_complexity — 1 finding(s)
- 🔵 [INFO]
provider-swift/Tests/ProviderCoreTests/BatchKVCacheTests.swift:329-350— Test method has verbose name and complex setup for a simple negative padding check- Suggestion: Simplify test name to 'testNegativePaddingUnmasked' and reduce setup complexity - the core assertion is just checking leftPadding < 0 and mask == .none
1 finding(s) total, 0 blocking. Verdict: COMMENT.
🤖 Automated review by Centaur · DAR-186
ethenotethan
left a comment
There was a problem hiding this comment.
Automated Code Review — Layr-Labs/d-inference#
Verdict: COMMENT
Security — ✅ No issues found
Performance — ✅ No issues found
Type_diligence — ✅ No issues found
Additive_complexity — ✅ No issues found
✅ All four passes clean. No issues found.
🤖 Automated review by Centaur · DAR-186
|
Addressed both Codex P2s on the default-on telemetry (
|
|
Found 1 test failure on Blacksmith runners: Failure
|
Bump libs/mlx-swift-lm to 288bcba (rebased onto main; drops the duplicate #41 commits and adds the post-update window-bound fix for the decode mask fast path). Add a regression test: when windowSize < maxCacheSize, makeMask must keep the windowed mask at the boundary (.none would let the new token attend one position past the window), while staying .none below the boundary. The existing Gemma (window == maxCacheSize) negative-padding test still passes unchanged. Co-authored-by: anupsv <6407789+anupsv@users.noreply.github.com>
ethenotethan
left a comment
There was a problem hiding this comment.
Automated Code Review — Layr-Labs/d-inference#
Verdict: COMMENT
Security — ✅ No issues found
Performance — ✅ No issues found
Type_diligence — ✅ No issues found
Additive_complexity — ✅ No issues found
✅ All four passes clean. No issues found.
🤖 Automated review by Centaur · DAR-186
#42 (default-on resource telemetry + Gemma 4 decode-mask fixes) is merged to mlx-swift-lm main. Move the submodule pointer from the feature-branch commit to the squashed main commit e7af9df (identical content). The parent-side BatchKVCache regression tests added on this branch stay.
ethenotethan
left a comment
There was a problem hiding this comment.
Automated Code Review — Layr-Labs/d-inference#
Verdict: COMMENT
Security — ✅ No issues found
Performance — ✅ No issues found
Type_diligence — ✅ No issues found
Additive_complexity — ✅ No issues found
✅ All four passes clean. No issues found.
🤖 Automated review by Centaur · DAR-186
ethenotethan
left a comment
There was a problem hiding this comment.
Automated Code Review — Layr-Labs/d-inference#
Verdict: COMMENT
Security — ✅ No issues found
Performance — ✅ No issues found
Type_diligence — ✅ No issues found
Additive_complexity — ✅ No issues found
✅ All four passes clean. No issues found.
🤖 Automated review by Centaur · DAR-186
There was a problem hiding this comment.
💡 Codex Review
https://github.com/Layr-Labs/d-inference/blob/4cc4973c940c0f79a9626b24d960019f19cdc0b1/libs/mlx-swift-lm/Libraries/MLXLMCommon/BatchKVCache.swift#L408-L410
Preserve window masks in BatchKVCache fast path
When windowSize is non-nil and smaller than the accumulated BatchKVCache length, this new single-token fast path drops the mask entirely for unpadded rows. A caller that asks for a sliding window after the cache already holds more than that window will now let the query attend to every retained key instead of only the active window; the rotating cache added a post-update window guard, but this regular batch cache needs the same check or should keep returning an array mask when windowSize is active.
https://github.com/Layr-Labs/d-inference/blob/4cc4973c940c0f79a9626b24d960019f19cdc0b1/libs/mlx-swift-lm/Libraries/MLXLMCommon/ContinuousBatching/EngineCore.swift#L404-L405
Throttle high-pressure os_log samples
When a provider stays above 70% resource pressure without immediately crashing, this branch logs [rsrc] to unified logging every 10 busy scheduler steps. Those are exactly the logs collected by darkbloom report, whose manual path rejects reports over 10 MB and whose auto-report path truncates to 10 MB, so a sustained high-pressure run can make the diagnostic report fail or lose the newest crash ramp; keep the dense sampling bounded to a short transition window or sample it at the report cadence after the first few high-pressure points.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Memory & reliability hardening: - 90% unified-memory cap + KV purge on unload + serve-while-load reservation (#363) - bounded checkpoint-capture pipeline, stops the Metal 499000 resource leak (#374) - resumable 4-bit model downloads + restart-safe detection (#372) - default-on [rsrc] resource telemetry in reports (#373 / mlx-swift-lm#42) - don't log raw request-parse errors — prompt-fragment privacy (#376) Coordinator LatestProviderVersion bump is intentionally deferred until the signed 0.6.12 bundle is published (prod uses the in-memory store, so the hardcoded fallback is authoritative after a redeploy).

Bumps the submodule to Layr-Labs/mlx-swift-lm#42 so the
[rsrc]resource-count diagnostic is default-on and emitted via os_log underdev.darkbloom.provider.Why
The
[metal::malloc] Resource limit (499000)crash is a live Metal buffer-count climb; diagnosing it needs the resource-count trajectory. Verified against real prod log-reports (Gumbii's M3 Ultra, ids 518/520 via/v1/admin/log-reports): they contain zero[rsrc]lines because the diagnostic was off by default andprint()-only (the report tool captureslog show --predicate subsystem == dev.darkbloom.provider). After this, every uploaded report carries the live resource-count trace.Coarse os_log cadence keeps reports under the 10 MB cap, but always logs at ≥70% pressure so the ramp toward 499000 is captured. Opt out:
DARKBLOOM_MLX_RESOURCE_DEBUG=0. Target: 0.6.12. Merge mlx-swift-lm#42 first, then re-point if its merge SHA differs.Need help on this PR? Tag
/codesmithwith what you need. Autofix is disabled.Credits
Builds directly on @anupsv's work, credited via
Co-authored-byon the bump commit:\n- Bumps the submodule onto the default-on[rsrc]telemetry he introduced in Layr-Labs/mlx-swift-lm#39, plus the negative-padding decode-mask fix that mirrors his Layr-Labs/mlx-swift-lm#40.