
Add experimental Blockscale BZM2 support and diagnostics #37

Closed
recklessnode wants to merge 16 commits into 256foundation:main from recklessnode:codex/bzm2-upstream-port


recklessnode commented Mar 7, 2026

Summary

This PR adds experimental BZM2 support to mujina-miner, including the runtime mining path, board bring-up, telemetry, tuning, diagnostics, and generic reference documentation.

What this adds

  • a dedicated asic/bzm2 stack for UART/TDM protocol handling, work dispatch, share reconstruction, PLL/DLL control, DTS/VS telemetry, and silicon-validation tooling
  • a dedicated board/bzm2 integration for startup enumeration, bring-up/shutdown sequencing, rail control, domain-voltage application, saved operating-point replay, and runtime engine discovery
  • board/API support for BZM2-specific diagnostics, on-demand DTS/VS queries, chain summary, engine discovery, and clock reports
  • generic Blockscale reference docs under docs/bzm2

Scope boundaries

  • this targets the Blockscale/BZM2 Gen2 path; Gen1-specific runtime support is intentionally not included
  • board-specific reference-platform glue is kept out of the core path in favor of generic bring-up abstractions
  • local process notes are not part of this branch; only upstream-relevant code and documentation are included

Suggested review order

  1. feat(bzm2): add initial BZM2 board integration
  2. feat(bzm2): add telemetry, control, and protocol coverage
  3. feat(bzm2): add UART clock diagnostics
  4. feat(bzm2): add debug CLI and DLL diagnostics
  5. feat(bzm2): add tuning planner and startup calibration
  6. feat(bzm2): expose DTS/VS telemetry through the API
  7. feat(bzm2): add chain enumeration and roadmap
  8. feat(bzm2): add startup auto-enumeration
  9. feat(bzm2): wire board bring-up into the lifecycle
  10. feat(bzm2): apply domain voltages and rail telemetry
  11. feat(bzm2): add engine discovery and runtime layouts
  12. feat(bzm2): make runtime tuning engine-capacity aware
  13. feat(bzm2): add runtime measurements and retune state
  14. feat(bzm2): add live board diagnostics and API parity
  15. docs(bzm2): organize reference docs under docs/bzm2
  16. fix(bzm2): resolve upstream port drift

Validation

  • cargo test -p mujina-miner --message-format=human
  • result: 344 passed, 0 failed, 5 ignored
  • BZM2 debug binary tests: 3 passed, 0 failed
  • doctests: 8 passed, 0 failed, 2 ignored

Add the first upstream BZM2 integration slice:
- introduce the new ASIC module with protocol and mining thread support
- add a virtual BZM2 board and transport wiring for serial-backed devices
- register the board with the backplane and daemon startup path

This commit intentionally lands the core integration surface first. Telemetry, tuning, diagnostics, and broader API support follow in later commits.

Extend the initial BZM2 integration with:
- board telemetry and safety-state handling
- reusable control-plane abstractions for reset and power sequencing
- expanded UART opcode coverage and parser behavior
- end-to-end actor tests for dispatch and share reconstruction

This keeps the transport and board core from the first commit, then layers in the reusable hardware-control and validation surface.

Add the reusable UART-side PLL diagnostic path for BZM2.

This introduces the first clock-control surface needed for bring-up and debugging:
- PLL divider calculation and programming
- enable/disable control
- lock-state polling
- structured clock status reporting

The port note is added here so the remaining BZM2 docs can evolve in place with later functionality.
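
As an illustration of the divider-calculation step listed above, here is a minimal sketch of a brute-force divider search. It assumes a generic integer PLL where the output is `ref * feedback / post`; the struct, the function names, and the divider ranges are all hypothetical and are not taken from the BZM2 register map or from this branch.

```rust
/// Hypothetical divider pair; field names and ranges are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PllDividers {
    pub feedback: u32,
    pub post: u32,
}

// Assumed divider limits, chosen only so the example has a bounded search space.
const MAX_FEEDBACK: u32 = 255;
const MAX_POST: u32 = 8;

/// Output frequency in kHz for a reference clock and divider pair
/// (integer division, so the result is truncated).
pub fn pll_output_khz(ref_khz: u32, d: PllDividers) -> u64 {
    u64::from(ref_khz) * u64::from(d.feedback) / u64::from(d.post)
}

/// Exhaustively search for the divider pair whose output is closest to the target.
/// Ties keep the first candidate found (lowest post divider, then lowest feedback).
pub fn closest_dividers(ref_khz: u32, target_khz: u32) -> Option<PllDividers> {
    let mut best: Option<(u64, PllDividers)> = None;
    for post in 1..=MAX_POST {
        for feedback in 1..=MAX_FEEDBACK {
            let d = PllDividers { feedback, post };
            let err = pll_output_khz(ref_khz, d).abs_diff(u64::from(target_khz));
            if best.map_or(true, |(e, _)| err < e) {
                best = Some((err, d));
            }
        }
    }
    best.map(|(_, d)| d)
}
```

With an assumed 25 MHz reference, `closest_dividers(25_000, 612_500)` finds a pair that hits the target exactly. A real implementation would also have to respect VCO range limits and poll the lock state after programming, which this sketch leaves out.
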
Add the standalone BZM2 debug binary and extend the clock-control path beyond PLL status.

This commit adds:
- the UART-focused debug CLI for live serial interaction
- DLL configuration and diagnostics alongside the existing PLL flow
- protocol tests covering the extra wire-format and parser behavior needed by the tooling

The port note and UART guide are updated in the same slice because they document the new bring-up and debug surface.

Build out the BZM2 board runtime beyond basic mining support.

This commit adds:
- the tuning planner and saved operating-point model
- startup calibration and replay of saved operating points
- broadcast-oriented bring-up helpers in the debug CLI
- the naming cleanup for BZM2 tuning concepts so the Rust surface is less coupled to legacy internal terminology

It also keeps the supporting operator docs in sync with the new tuning and bring-up behavior.

Surface BZM2 sensor telemetry through the existing API and board runtime.

This commit adds:
- passive DTS/VS telemetry publication into board state
- explicit on-demand DTS/VS query support
- the supporting API and debug-tooling integration for ASIC voltage and temperature reads

The accompanying docs stay with this slice because they explain the new telemetry endpoints and sensor naming.

Add the first generic chain-enumeration tooling for BZM2 and document the remaining reference-implementation gaps.

This commit adds:
- default-ID chain walk helpers
- the debug CLI command for serial chain enumeration
- the initial Blockscale/BZM2 roadmap document
- README links for the growing BZM2 documentation set

The conversation log from the local integration repo is intentionally excluded from the upstream branch.

Teach the BZM2 board runtime to enumerate chains at startup instead of relying entirely on static ASIC-count configuration.

This commit adds:
- startup bus enumeration from the default ASIC ID
- fallback handling when the runtime must use configured counts instead
- the related operator documentation updates

The local conversation log is carried temporarily so later local commits apply cleanly; it will be removed from the final PR branch before completion.
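
The enumerate-then-fall-back behavior described in this commit can be sketched roughly as follows. This is a hedged illustration, not the code in the branch: `resolve_chain_len`, `ChainSource`, and the `probe` callback (which stands in for "an ASIC answered at the default ID and accepted the next chain ID") are hypothetical names.

```rust
/// Where the final ASIC count came from (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChainSource {
    Enumerated,
    ConfiguredFallback,
}

/// Walk the chain by probing one position at a time. `probe(n)` is assumed to
/// return true when an ASIC responded and was assigned position `n`.
/// If nothing responds, fall back to the statically configured count.
pub fn resolve_chain_len<F>(mut probe: F, configured: usize, max_asics: usize) -> (usize, ChainSource)
where
    F: FnMut(usize) -> bool,
{
    let mut found = 0;
    while found < max_asics && probe(found) {
        found += 1;
    }
    if found > 0 {
        (found, ChainSource::Enumerated)
    } else {
        (configured, ChainSource::ConfiguredFallback)
    }
}
```

Returning the source alongside the count lets the caller log or publish whether the runtime is operating on discovered or configured topology. Whether the real runtime treats a partial walk as success is not stated in the commit message, so that choice here is an assumption.
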
Apply the generic BZM2 rail and reset sequencing plan as part of board startup and shutdown.

This moves the runtime closer to a reusable hardware reference implementation by:
- running the configured bring-up plan before discovery and calibration
- reversing the same plan during shutdown
- documenting the boundary between generic sequencing and board-specific glue
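
One way to picture "reversing the same plan during shutdown" is to derive the shutdown sequence from the bring-up sequence. The sketch below assumes each step has an inverse and that shutdown applies those inverses in reverse order; the `Step` variants and `shutdown_plan` are hypothetical and do not reflect the actual plan types in the branch.

```rust
/// Hypothetical sequencing steps; rail indices are opaque identifiers here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    RailOn(u8),
    RailOff(u8),
    AssertReset,
    ReleaseReset,
}

impl Step {
    /// The step that undoes this one (an assumption of this sketch).
    fn inverse(self) -> Step {
        match self {
            Step::RailOn(r) => Step::RailOff(r),
            Step::RailOff(r) => Step::RailOn(r),
            Step::AssertReset => Step::ReleaseReset,
            Step::ReleaseReset => Step::AssertReset,
        }
    }
}

/// Build the shutdown sequence: walk the bring-up plan backwards and undo each step.
pub fn shutdown_plan(bring_up: &[Step]) -> Vec<Step> {
    bring_up.iter().rev().map(|s| s.inverse()).collect()
}
```

Deriving shutdown from bring-up keeps the two sequences from drifting apart when a board definition adds or reorders rails.
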
Extend the board runtime from sequencing alone to active rail management.

This commit adds:
- board-state rail telemetry publication
- application of planner-generated per-domain voltages onto the configured rail control path
- persistence and replay of the applied domain-voltage state alongside the saved operating point

Add the physical engine-discovery path and use the discovered topology in the live runtime.

This commit adds:
- per-ASIC engine probing helpers and CLI support
- board/API publication of discovered engine maps
- live dispatch and share reconstruction against the discovered layout instead of a fixed default hole map
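
To illustrate dispatching against a discovered layout rather than a fixed hole map, here is a small sketch of a logical-to-physical engine mapping. `EngineLayout` and its methods are hypothetical, and the probe callback stands in for whatever per-engine check the real discovery path performs.

```rust
/// Discovered engine layout: sorted physical engine IDs that responded to probing.
/// (Hypothetical type; the branch's real layout type may differ.)
pub struct EngineLayout {
    present: Vec<u16>,
}

impl EngineLayout {
    /// Probe physical IDs `0..total` and keep the ones that are present.
    pub fn from_probe(total: u16, mut is_present: impl FnMut(u16) -> bool) -> Self {
        let present = (0..total).filter(|&id| is_present(id)).collect();
        Self { present }
    }

    /// Number of usable engines.
    pub fn engine_count(&self) -> usize {
        self.present.len()
    }

    /// Map a dense logical slot (used for dispatch) to a physical engine ID.
    pub fn physical(&self, logical: usize) -> Option<u16> {
        self.present.get(logical).copied()
    }

    /// Map a physical engine ID (as seen in a result) back to its logical slot.
    pub fn logical(&self, physical: u16) -> Option<usize> {
        self.present.binary_search(&physical).ok()
    }
}
```

Dispatch would iterate logical slots and translate to physical IDs, while share reconstruction would translate in the other direction and reject results from engines that were never discovered.
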
Adjust the BZM2 tuning path to account for discovered missing engines so throughput and operating-point decisions scale to imperfect silicon and mixed topologies.

Add the live runtime feedback path for BZM2 tuning.

This commit adds:
- per-thread, per-ASIC, and per-PLL runtime measurements
- feeding live throughput data back into the tuning planner
- persistent retune triggers and saved-operating-point validation state

The board runtime can now reason about tuning quality from live operation instead of startup-only assumptions.
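
A retune trigger driven by live throughput could look something like the sketch below. It assumes a simple rule (retune after several consecutive measurement windows fall below a tolerance band around the expected throughput); the `RetuneMonitor` type, its fields, and the rule itself are illustrative assumptions, not the planner logic in this branch.

```rust
/// Tracks consecutive under-performing measurement windows (hypothetical type).
pub struct RetuneMonitor {
    expected_ghs: f64,
    tolerance: f64,
    windows_needed: u32,
    low_streak: u32,
}

impl RetuneMonitor {
    /// `tolerance` is a fraction of expected throughput, e.g. 0.1 for 10%.
    pub fn new(expected_ghs: f64, tolerance: f64, windows_needed: u32) -> Self {
        Self { expected_ghs, tolerance, windows_needed, low_streak: 0 }
    }

    /// Feed one measured window; returns true when a retune should be requested.
    pub fn observe(&mut self, measured_ghs: f64) -> bool {
        if measured_ghs < self.expected_ghs * (1.0 - self.tolerance) {
            self.low_streak += 1;
        } else {
            // A healthy window resets the streak so brief dips do not trigger a retune.
            self.low_streak = 0;
        }
        self.low_streak >= self.windows_needed
    }
}
```

Requiring a streak rather than a single low sample is a design choice to avoid retuning on transient noise; the expected value would presumably come from the planner and the discovered engine count.
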
Expose the highest-value BZM2 diagnostics and runtime status through the board-owned API surface.

This commit adds:
- live UART diagnostics routed through the active BZM2 thread
- chain summary and clock-report endpoints
- the supporting API contract types and regression coverage

The diagnostics follow the existing UART ownership rules by requiring the thread to be idle when they run.
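
The idle requirement can be sketched as a guard in front of any diagnostic that needs the UART. The types and the `run_diagnostic` helper below are hypothetical; the real implementation routes requests through the active BZM2 thread and may represent its state differently.

```rust
/// Simplified view of the mining thread's state (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThreadState {
    Idle,
    Mining,
}

/// Error returned when a diagnostic cannot take the UART (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum DiagError {
    ThreadBusy,
}

/// Run a diagnostic only when the thread is idle, so it never competes
/// with mining traffic for the serial link.
pub fn run_diagnostic<T>(state: ThreadState, diag: impl FnOnce() -> T) -> Result<T, DiagError> {
    match state {
        ThreadState::Idle => Ok(diag()),
        ThreadState::Mining => Err(DiagError::ThreadBusy),
    }
}
```

Returning an explicit busy error, rather than queueing the request, lets an API caller decide whether to retry or to pause mining first.
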
Move the BZM2 and Blockscale-specific documentation into a dedicated docs/bzm2 subtree, add the missing integration/reference guides, and drop the local porting conversation log from the upstream branch.

This keeps the upstream-facing PR focused on reusable implementation and operator documentation rather than local development history.

Restore the missing BZM2 engine layout type and the clock helper visibility that the later diagnostics and runtime layout work expect.

This keeps the upstream PR branch narrow: only the incomplete protocol and clock pieces are patched, and the full mujina-miner regression suite is green in WSL after the fix.

recklessnode marked this pull request as draft March 7, 2026 06:31

recklessnode (Author) commented:

Superseded by #38. This earlier branch contained non-buildable intermediate commits; the replacement PR was rebuilt from upstream/main with a full cargo test -p mujina-miner --message-format=human gate after each commit.
