docs(host/quickstart): add MiniMax M2.7 and reference repo node-configs#1143

Merged: tcharchian merged 1 commit into main from docs/host-quickstart-add-minimax-m2.7 on May 27, 2026

Conversation

@tcharchian (Collaborator)

Summary

Follow-up to gonka-ai/gonka#1260 (MLNode 3.0.14 & MiniMax M2.7). Adds MiniMax M2.7 as the 3rd governance-approved model in docs/host/quickstart.md and surfaces the ready-made node-config-*.json files that ship in gonka/deploy/join/.

Changes

  • Supported models table — add MiniMaxAI/MiniMax-M2.7 as the 3rd approved model.
  • Proposed Hardware Configuration table — add MiniMax row (4×A100 / 4×H100 / 2×H200 / 2×B200, ~320 GB VRAM per ML Node).
  • New "Reference deploy configs in the repo" tip — lists every deploy/join/node-config-*.json shipped for Qwen, Kimi, and MiniMax, so hosts can copy a ready file instead of writing one from scratch.
  • Edit Inference Node Description section — new tabs, each mirroring the corresponding JSON in gonka/deploy/join/:
    • Qwen — 8×B200
    • Kimi — 8×H200 (FLASHMLA, tp=8)
    • MiniMax — 4×A100 (marlin MoE + note about VLLM_USE_FLASHINFER_MOE_FP8=0)
    • MiniMax — 4×H100 (FLASHINFER + fp8 kv-cache)
    • MiniMax — 2×H200 (FLASHINFER + fp8 kv-cache; matches the configuration used to record MiniMax PoC golden vectors)
    • MiniMax — 2×B200 (FLASHINFER_TRTLLM MoE + fp8 kv-cache)
  • Pre-download Model Weights — new MiniMax M2.7 tab with huggingface-cli download and a note that MiniMax requires MLNode 3.0.14+ (image ghcr.io/gonka-ai/mlnode:3.0.14-cu129) plus the A100 env-var caveat.
  • Optional: PoC delegation and refusal — added a copy-paste delegation example for MiniMax.

All vLLM argument sets are taken verbatim from the JSON files added in gonka-ai/gonka#1260 (deploy/join/node-config-minimax-{A100,H100,H200,B200}.json), so docs and repo stay in sync.
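The brace pattern above expands to four files, one per GPU type. A minimal sketch of that mapping follows, paired with the GPU counts from the Proposed Hardware Configuration row. The names `MINIMAX_GPU_COUNTS` and `minimax_config_path` are illustrative only and do not exist in the repo.

```python
# GPU count per ML Node for MiniMax M2.7, as listed in this PR's hardware row.
MINIMAX_GPU_COUNTS = {"A100": 4, "H100": 4, "H200": 2, "B200": 2}


def minimax_config_path(gpu: str) -> str:
    """Return the repo-relative reference config for a GPU type.

    Hypothetical helper: it only expands the filename pattern
    deploy/join/node-config-minimax-{A100,H100,H200,B200}.json.
    """
    if gpu not in MINIMAX_GPU_COUNTS:
        raise ValueError(f"no MiniMax reference config for {gpu!r}")
    return f"deploy/join/node-config-minimax-{gpu}.json"


if __name__ == "__main__":
    for gpu, count in MINIMAX_GPU_COUNTS.items():
        print(f"{count}x{gpu}: {minimax_config_path(gpu)}")
```

A host with 2×H200 would therefore start from `deploy/join/node-config-minimax-H200.json`.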

Test plan

  • Render the page locally (mkdocs serve) and verify all 6 new tabs render correctly under "Edit Inference Node Description for the Server".
  • Verify the new "MiniMax M2.7" tab appears under "Pre-download Model Weights to Hugging Face Cache (HF_HOME)".
  • Verify the new MiniMax row appears in both the Supported models and Proposed Hardware Configuration tables.
  • Verify the new "Reference deploy configs in the repo" tip box renders.
  • Spot-check that the inline node-config.json blocks match the JSON files in gonka-ai/gonka:deploy/join/node-config-minimax-*.json exactly, apart from whitespace and quoting style.
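The last spot-check could be automated by comparing parsed JSON instead of raw text, which ignores whitespace differences. This is a minimal sketch, assuming the inline block has already been copied out of the docs page into a string and that both inputs are valid JSON; `same_json` is a hypothetical helper, not part of either repo.

```python
import json


def same_json(doc_block: str, repo_file_text: str) -> bool:
    """Return True when both texts parse to the same JSON value.

    Whitespace and object key order are ignored; array order and all
    values must match exactly.
    """
    return json.loads(doc_block) == json.loads(repo_file_text)
```

Because object key order is ignored, a block whose keys were reordered in the docs still passes; a stricter check would compare the key order as well.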

Made with Cursor

Follow-up to gonka-ai/gonka#1260 (MLNode 3.0.14 & MiniMax M2.7).

- Add MiniMaxAI/MiniMax-M2.7 to the Supported models table and the
  Proposed Hardware Configuration table (4xA100 / 4xH100 / 2xH200 /
  2xB200, ~320 GB VRAM).
- Add a "Reference deploy configs in the repo" tip pointing to every
  node-config-*.json in deploy/join/ for Qwen, Kimi, and MiniMax, so
  hosts can copy a ready file instead of writing one from scratch.
- Add new tabs in the "Edit Inference Node Description" section,
  mirroring the JSON files shipped in gonka/deploy/join/:
    * Qwen — 8xB200
    * Kimi — 8xH200 (FLASHMLA, tp=8)
    * MiniMax — 4xA100 (marlin MoE + VLLM_USE_FLASHINFER_MOE_FP8=0)
    * MiniMax — 4xH100 (FLASHINFER + fp8 kv-cache)
    * MiniMax — 2xH200 (FLASHINFER + fp8 kv-cache; matches the
      configuration used to record MiniMax PoC golden vectors)
    * MiniMax — 2xB200 (FLASHINFER_TRTLLM MoE + fp8 kv-cache)
- Add a MiniMax M2.7 tab in the Pre-download Model Weights section
  with the huggingface-cli command and the MLNode 3.0.14 / A100 env
  var note.
- Add a MiniMax delegation example in the "Optional: PoC delegation
  and refusal" section.

All vLLM argument sets are taken verbatim from the JSON files in
gonka/deploy/join/ so docs and repo stay in sync.

Co-authored-by: Cursor <cursoragent@cursor.com>
tcharchian merged commit c7b5143 into main on May 27, 2026