
Feat: downstream perf improvements - bump .25 #34

Merged
S1ro1 merged 1 commit into main from fix/routed-experts-uint8-json on May 15, 2026


@S1ro1 S1ro1 commented May 15, 2026


Note

Medium Risk
Medium risk because it changes the on-the-wire JSON schema for routed_experts (from base64-encoded .npy strings to {data, shape} objects), which can break compatibility with existing clients/servers if not coordinated. The merge logic now relies on shape-provided sizing and row offsets, so malformed shapes could surface as new validation errors.

Overview
Updates routed_experts prefill/decode merging to use a raw uint8 payload encoded as JSON { "data": <base64 bytes>, "shape": [seq_len, layers, topk] } instead of a base64-encoded NumPy .npy blob, removing dtype/header parsing and related validation.

Tightens merge validation by checking row_start <= seq_len, verifying byte length matches seq_len * layers * topk with overflow-safe math, and updates tests accordingly. Also bumps crate/Python package version to 0.1.25.
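The checks described above can be sketched roughly as follows. This is a minimal illustration, not the code from routed_experts_merge.rs: `MergeError` and `validate_payload` are hypothetical names, and the sketch assumes the payload has already been base64-decoded into bytes and the shape parsed into three `usize` values.

```rust
/// Hypothetical error type for this sketch; not the crate's actual error enum.
#[derive(Debug, PartialEq)]
enum MergeError {
    /// seq_len * layers * topk does not fit in usize.
    ShapeOverflow,
    /// Decoded byte length disagrees with the declared shape.
    LengthMismatch { expected: usize, actual: usize },
    /// row_start points past the end of the sequence.
    RowStartOutOfRange { row_start: usize, seq_len: usize },
}

/// Validates a decoded uint8 payload against its declared shape.
/// Returns the number of bytes per row (layers * topk, 1 byte per cell).
fn validate_payload(
    data: &[u8],
    shape: [usize; 3],
    row_start: usize,
) -> Result<usize, MergeError> {
    let [seq_len, layers, topk] = shape;

    // checked_mul returns None on overflow instead of wrapping or panicking.
    let row_size = layers.checked_mul(topk).ok_or(MergeError::ShapeOverflow)?;
    let expected = seq_len
        .checked_mul(row_size)
        .ok_or(MergeError::ShapeOverflow)?;

    if data.len() != expected {
        return Err(MergeError::LengthMismatch {
            expected,
            actual: data.len(),
        });
    }
    // row_start == seq_len is allowed: it selects zero rows.
    if row_start > seq_len {
        return Err(MergeError::RowStartOutOfRange { row_start, seq_len });
    }
    Ok(row_size)
}
```

With this shape of check, a payload whose declared shape is `[3, 2, 2]` must carry exactly 12 bytes, and a shape such as `[usize::MAX, 2, 2]` is rejected as an overflow rather than wrapping around.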

Reviewed by Cursor Bugbot for commit 510092f.

Note

Replace NumPy .npy encoding with raw uint8 JSON objects for routed experts payloads

  • Replaces base64-encoded .npy strings with a JSON object {data: <base64>, shape: [seq, layers, topk]} for all routed experts payloads in routed_experts_merge.rs.
  • Removes all NumPy header/dtype parsing logic (parse_npy_payload, parse_descr, dtype_item_size); row size is now computed as layers * topk assuming 1 byte per cell.
  • Shape validation now parses a JSON array of three non-negative integers instead of a NumPy header string.
  • Risk: Breaking change — callers providing .npy-encoded routed_experts strings will receive errors; payloads must be migrated to the new object format.
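For callers migrating off .npy strings, producing the new object might look roughly like the sketch below. It is an illustration under assumptions, not the project's client code: `base64_encode` and `encode_routed_experts` are hypothetical helpers, the standard base64 alphabet with `=` padding is assumed (the PR does not say which variant is expected), and the JSON is assembled by hand only to keep the example free of external crates.

```rust
const B64: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard-alphabet base64 with '=' padding (assumed variant).
fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        out.push(B64[((n >> 18) & 63) as usize] as char);
        out.push(B64[((n >> 12) & 63) as usize] as char);
        if chunk.len() > 1 {
            out.push(B64[((n >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(B64[(n & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Builds {"data": <base64>, "shape": [seq_len, layers, topk]} from raw
/// uint8 cells. Returns None if the byte count does not match the shape
/// or the shape product overflows.
fn encode_routed_experts(cells: &[u8], shape: [usize; 3]) -> Option<String> {
    let [seq_len, layers, topk] = shape;
    let expected = seq_len.checked_mul(layers)?.checked_mul(topk)?;
    if cells.len() != expected {
        return None;
    }
    Some(format!(
        "{{\"data\":\"{}\",\"shape\":[{},{},{}]}}",
        base64_encode(cells),
        seq_len,
        layers,
        topk
    ))
}
```

For example, four cells `[0, 1, 2, 3]` with shape `[2, 1, 2]` would serialize to `{"data":"AAECAw==","shape":[2,1,2]}`. In real code a JSON library and a maintained base64 implementation would be the more robust choice.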

Macroscope summarized 510092f.

@S1ro1 S1ro1 changed the title from "Use raw uint8 routed experts payloads" to "Feat: downstream perf improvements - bump .25" on May 15, 2026

macroscopeapp Bot commented May 15, 2026


Approvability

Verdict: Needs human review

This PR changes the wire format for routed_experts from base64-encoded NumPy .npy files to JSON objects with data/shape fields. This protocol change affects runtime behavior and compatibility with other components, warranting human review.


@S1ro1 S1ro1 merged commit 154a2c2 into main May 15, 2026
9 checks passed