Add experimental Blockscale BZM2 support and diagnostics #38

Draft
recklessnode wants to merge 21 commits into 256foundation:main from recklessnode:codex/bzm2-upstream-port-r2
@recklessnode recklessnode commented Mar 7, 2026

Summary

  • add experimental Blockscale BZM2 support for generic Gen2 hardware integrations
  • add the UART/TDM mining path, PLL/DLL control, DTS/VS telemetry, startup calibration, runtime tuning, chain/engine discovery, and board/API diagnostics
  • add BZM2 reference documentation under docs/bzm2

Scope

  • focuses on reusable ASIC, board, and API behavior rather than vendor-specific carrier implementations
  • excludes the internal porting conversation log from this upstream branch
  • keeps Gen1-specific work out of scope
  • removes upstream-only debug and virtual transport layers from this PR branch

Cleanup Since Initial Submission

  • scrubbed references to non-public/private source documents from the public BZM2 docs
  • moved the Blockscale tuning planner out of asic/bzm2 into a general tuning module
  • moved rail/reset sequencing out of asic/bzm2 into board-level power code
  • removed the bzm2-debug binary from the upstream-scoped branch
  • removed the synthetic virtual_device transport layer from the upstream-scoped branch
  • removed the superseded asic/bzm2/pnp.rs and asic/bzm2/control.rs files
  • rebased the branch onto the current upstream main

Review Notes

This branch is now rebased onto the current upstream main tip and has been rechecked after the rebase.

Suggested review order:

  1. initial board/protocol integration through startup calibration
  2. telemetry, chain discovery, and runtime engine layouts
  3. docs and module-placement cleanup for upstream scope

Validation

Final branch validation after rebase onto upstream main:

  • cargo test -p mujina-miner --message-format=human
  • result: 339 passed, 0 failed, 5 ignored
  • doctests: 8 passed, 0 failed, 2 ignored

Draft Status

This PR is intentionally kept as a draft so it can be reviewed before requesting final upstream merge consideration.

@penguin359

This PR is in support of the feature discussed in #28. While the basic structure is in place, there is some clean-up that can be done with the Git history. I propose marking this as a draft for the time being while that is being worked on. Once that is done, I will work on a more formal code review of the results.

@recklessnode Can you try marking this as a draft? It seems I don't have that privilege.

#[derive(Debug, Default)]
pub struct Bzm2CalibrationPlanner;

impl Bzm2CalibrationPlanner {

@johnny9 johnny9 Mar 7, 2026

This likely should exist as a more general calibration module. I think we can learn a lot from this, but I don't believe it makes sense to have an ASIC-specific calibration tool. I'm imagining a general calibration algorithm with interfaces/configuration for each ASIC module to make it work well across ASIC types.
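
One way to picture the split suggested here is a small trait that each ASIC module implements, plus a search routine that knows nothing about the chip. The following is only a sketch under assumptions: `CalibrationTarget`, `calibrate`, and `FakeAsic` are hypothetical names that do not appear in the PR, and the linear frequency walk stands in for whatever algorithm a real calibration module would use.

```rust
/// Hypothetical interface an ASIC module could implement so that a shared
/// calibration routine can drive it. None of these names exist in the PR.
pub trait CalibrationTarget {
    /// Lowest and highest frequency (MHz) the search may try.
    fn frequency_range_mhz(&self) -> (u32, u32);
    /// Step between candidate frequencies (MHz).
    fn step_mhz(&self) -> u32;
    /// Apply a frequency and report whether the chip ran cleanly at it.
    fn try_frequency(&mut self, mhz: u32) -> bool;
}

/// ASIC-agnostic search: walk upward and keep the last frequency that passed.
pub fn calibrate<T: CalibrationTarget>(target: &mut T) -> Option<u32> {
    let (lo, hi) = target.frequency_range_mhz();
    let step = target.step_mhz().max(1); // guard against a zero step
    let mut best = None;
    let mut freq = lo;
    while freq <= hi {
        if !target.try_frequency(freq) {
            break;
        }
        best = Some(freq);
        freq = match freq.checked_add(step) {
            Some(next) => next,
            None => break,
        };
    }
    best
}

/// Stand-in target used only to exercise the sketch.
pub struct FakeAsic {
    pub max_stable_mhz: u32,
}

impl CalibrationTarget for FakeAsic {
    fn frequency_range_mhz(&self) -> (u32, u32) {
        (400, 800) // assumed range, not a BZM2 figure
    }
    fn step_mhz(&self) -> u32 {
        25
    }
    fn try_frequency(&mut self, mhz: u32) -> bool {
        mhz <= self.max_stable_mhz
    }
}
```

With this shape, the per-ASIC code only supplies limits and a pass/fail probe, while the search policy lives in one place and can be swapped without touching any ASIC module.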

}

match args[1].as_str() {
"uart-read" => cmd_uart_read(&args[2..]).await,

I'm not sure there is much value in bringing this over.

}

#[async_trait]
impl<I2C: I2c> VoltageRegulator for Tps546PowerRail<I2C> {

I don't believe power management belongs in the asic module

@johnny9 johnny9 left a comment

I think you should rebase this.

@@ -0,0 +1,23 @@
//! Generic virtual device transport.

I don't think we need this

@rkuester rkuester marked this pull request as draft March 13, 2026 03:04
@rkuester
Collaborator

Hey guys, thanks for sharing this code! Since this is a work-in-progress shared for informational purposes or early feedback and not yet ready for review and merge, I've marked it as a draft.

@recklessnode
Author

Sorry for the delay, folks. I'm reviewing the suggestions today after several side conversations on the state of the PR, and will see if I can take some of them into the plan today.

Add the first upstream BZM2 integration slice:
- introduce the new ASIC module with protocol and mining thread support
- add a virtual BZM2 board and transport wiring for serial-backed devices
- register the board with the backplane and daemon startup path

This commit intentionally lands the core integration surface first. Telemetry, tuning, diagnostics, and broader API support follow in later commits.

Extend the initial BZM2 integration with:
- board telemetry and safety-state handling
- reusable control-plane abstractions for reset and power sequencing
- expanded UART opcode coverage and parser behavior
- end-to-end actor tests for dispatch and share reconstruction

This keeps the transport and board core from the first commit, then layers in the reusable hardware-control and validation surface.

Add the reusable UART-side PLL diagnostic path for BZM2.

This introduces the first clock-control surface needed for bring-up and debugging:
- PLL divider calculation and programming
- enable/disable control
- lock-state polling
- structured clock status reporting
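
As a rough illustration of what divider calculation and lock-state polling involve, here is a minimal sketch for a generic integer PLL where `f_out = f_ref * fb_div / post_div`. Everything in it is an assumption: the divider ranges, the `PllDividers` type, and the function names are hypothetical and do not reflect the BZM2 register layout or the code in this PR.

```rust
/// Hypothetical divider pair for a simple integer PLL:
/// f_out = f_ref * fb_div / post_div. Not the BZM2 register layout.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PllDividers {
    pub fb_div: u32,
    pub post_div: u32,
}

/// Output frequency in kHz for a given reference and divider pair.
pub fn output_khz(ref_khz: u32, d: PllDividers) -> u64 {
    u64::from(ref_khz) * u64::from(d.fb_div) / u64::from(d.post_div)
}

/// Brute-force the pair closest to `target_khz`; returns `None` if the best
/// candidate misses by more than `max_error_khz`. Divider ranges are assumed.
pub fn pick_dividers(ref_khz: u32, target_khz: u32, max_error_khz: u64) -> Option<PllDividers> {
    let mut best: Option<(u64, PllDividers)> = None;
    for post_div in 1..=8u32 {
        for fb_div in 16..=255u32 {
            let d = PllDividers { fb_div, post_div };
            let err = output_khz(ref_khz, d).abs_diff(u64::from(target_khz));
            // Strict comparison keeps the first pair found at a given error.
            if best.map_or(true, |(e, _)| err < e) {
                best = Some((err, d));
            }
        }
    }
    best.filter(|(e, _)| *e <= max_error_khz).map(|(_, d)| d)
}

/// Poll a lock bit up to `max_polls` times; a real driver would sleep
/// between reads. Returns true as soon as the PLL reports lock.
pub fn wait_for_lock(mut read_locked: impl FnMut() -> bool, max_polls: u32) -> bool {
    (0..max_polls).any(|_| read_locked())
}
```

Bounding the poll count rather than looping forever lets the caller turn a PLL that never locks into a reportable status instead of a hung bring-up.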

The port note is added here so the remaining BZM2 docs can evolve in place with later functionality.

Add the standalone BZM2 debug binary and extend the clock-control path beyond PLL status.

This commit adds:
- the UART-focused debug CLI for live serial interaction
- DLL configuration and diagnostics alongside the existing PLL flow
- protocol tests covering the extra wire-format and parser behavior needed by the tooling

The port note and UART guide are updated in the same slice because they document the new bring-up and debug surface.

Build out the BZM2 board runtime beyond basic mining support.

This commit adds:
- the tuning planner and saved operating-point model
- startup calibration and replay of saved operating points
- broadcast-oriented bring-up helpers in the debug CLI
- the naming cleanup for BZM2 tuning concepts so the Rust surface is less coupled to legacy internal terminology

It also keeps the supporting operator docs in sync with the new tuning and bring-up behavior.

Surface BZM2 sensor telemetry through the existing API and board runtime.

This commit adds:
- passive DTS/VS telemetry publication into board state
- explicit on-demand DTS/VS query support
- the supporting API and debug-tooling integration for ASIC voltage and temperature reads

The accompanying docs stay with this slice because they explain the new telemetry endpoints and sensor naming.

Add the first generic chain-enumeration tooling for BZM2 and document the remaining reference-implementation gaps.

This commit adds:
- default-ID chain walk helpers
- the debug CLI command for serial chain enumeration
- the initial Blockscale/BZM2 roadmap document
- README links for the growing BZM2 documentation set

The conversation log from the local integration repo is intentionally excluded from the upstream branch.

Teach the BZM2 board runtime to enumerate chains at startup instead of relying entirely on static ASIC-count configuration.

This commit adds:
- startup bus enumeration from the default ASIC ID
- fallback handling when the runtime must use configured counts instead
- the related operator documentation updates
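
The enumerate-then-fall-back behavior described above can be sketched as follows. This assumes a common chained-ASIC pattern in which every unconfigured chip answers at a shared default ID until it is given its own; whether BZM2 works exactly this way is not stated in this thread. `ChainBus`, `enumerate_chain`, `asic_count`, and `FakeBus` are hypothetical names, and the default ID is passed in rather than guessed.

```rust
/// Hypothetical bus abstraction; not the transport trait used in the PR.
pub trait ChainBus {
    /// True if some chip answers at `id`.
    fn probe(&mut self, id: u8) -> bool;
    /// Move the chip currently answering at `from` to `to`.
    fn assign_id(&mut self, from: u8, to: u8) -> bool;
}

/// Walk the chain: while a chip still answers at the default ID, give it the
/// next sequential ID. Returns the IDs that were handed out.
pub fn enumerate_chain<B: ChainBus>(bus: &mut B, default_id: u8, max_chips: u8) -> Vec<u8> {
    let mut assigned = Vec::new();
    for next_id in 0..max_chips {
        // Stop before handing out the default ID itself, when nothing answers
        // at the default ID any more, or when reassignment fails.
        if next_id == default_id || !bus.probe(default_id) || !bus.assign_id(default_id, next_id) {
            break;
        }
        assigned.push(next_id);
    }
    assigned
}

/// Prefer the discovered count; fall back to the configured one if the walk
/// finds nothing.
pub fn asic_count<B: ChainBus>(bus: &mut B, default_id: u8, max_chips: u8, configured: usize) -> usize {
    let found = enumerate_chain(bus, default_id, max_chips);
    if found.is_empty() { configured } else { found.len() }
}

/// In-memory stand-in used only to exercise the sketch.
pub struct FakeBus {
    pub default_id: u8,
    pub unassigned: u8,
    pub assigned: Vec<u8>,
}

impl ChainBus for FakeBus {
    fn probe(&mut self, id: u8) -> bool {
        if id == self.default_id {
            self.unassigned > 0
        } else {
            self.assigned.contains(&id)
        }
    }
    fn assign_id(&mut self, from: u8, to: u8) -> bool {
        if from != self.default_id || self.unassigned == 0 {
            return false;
        }
        self.unassigned -= 1;
        self.assigned.push(to);
        true
    }
}
```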

The local conversation log is carried temporarily so later local commits apply cleanly; it will be removed from the final PR branch before completion.

Apply the generic BZM2 rail and reset sequencing plan as part of board startup and shutdown.

This moves the runtime closer to a reusable hardware reference implementation by:
- running the configured bring-up plan before discovery and calibration
- reversing the same plan during shutdown
- documenting the boundary between generic sequencing and board-specific glue
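
To make "reversing the same plan during shutdown" concrete, here is a minimal sketch in which one declarative plan yields both sequences. The `Step` and `Action` types and the rail names in the usage note are hypothetical; the PR's actual plan representation is not shown in this thread.

```rust
/// One entry in a hypothetical bring-up plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    EnableRail(&'static str),
    WaitMs(u32),
    ReleaseReset,
}

/// Concrete operation the board glue would carry out.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    EnableRail(&'static str),
    DisableRail(&'static str),
    WaitMs(u32),
    AssertReset,
    ReleaseReset,
}

/// Bring-up runs the plan in order.
pub fn bring_up(plan: &[Step]) -> Vec<Action> {
    plan.iter()
        .map(|step| match *step {
            Step::EnableRail(rail) => Action::EnableRail(rail),
            Step::WaitMs(ms) => Action::WaitMs(ms),
            Step::ReleaseReset => Action::ReleaseReset,
        })
        .collect()
}

/// Shutdown walks the same plan backwards and inverts each step, so the
/// last rail enabled is the first one disabled.
pub fn shut_down(plan: &[Step]) -> Vec<Action> {
    plan.iter()
        .rev()
        .map(|step| match *step {
            Step::EnableRail(rail) => Action::DisableRail(rail),
            Step::WaitMs(ms) => Action::WaitMs(ms),
            Step::ReleaseReset => Action::AssertReset,
        })
        .collect()
}
```

For a plan such as `[EnableRail("vdd_io"), WaitMs(10), EnableRail("vdd_core"), WaitMs(5), ReleaseReset]`, `shut_down` asserts reset first and drops `vdd_io` last. Deriving both directions from one plan keeps board-specific glue limited to executing individual actions.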

Extend the board runtime from sequencing alone to active rail management.

This commit adds:
- board-state rail telemetry publication
- application of planner-generated per-domain voltages onto the configured rail control path
- persistence and replay of the applied domain-voltage state alongside the saved operating point

Add the physical engine-discovery path and use the discovered topology in the live runtime.

This commit adds:
- per-ASIC engine probing helpers and CLI support
- board/API publication of discovered engine maps
- live dispatch and share reconstruction against the discovered layout instead of a fixed default hole map

Adjust the BZM2 tuning path to account for discovered missing engines so throughput and operating-point decisions scale to imperfect silicon and mixed topologies.

Add the live runtime feedback path for BZM2 tuning.

This commit adds:
- per-thread, per-ASIC, and per-PLL runtime measurements
- feeding live throughput data back into the tuning planner
- persistent retune triggers and saved-operating-point validation state
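
One possible reading of the feedback loop above, combined with the earlier adjustment for missing engines, is to scale the expected throughput by the number of engines actually discovered and flag a retune when the measurement falls too far below it. The sketch below is an assumption throughout: `AsicSample`, `expected_ghs`, `needs_retune`, the hashes-per-engine-clock factor, and the ratio threshold are all hypothetical, and it does not model how the PR persists trigger state.

```rust
/// Hypothetical per-ASIC measurement window.
#[derive(Debug, Clone, Copy)]
pub struct AsicSample {
    /// Engines that responded during discovery (missing engines excluded).
    pub active_engines: u32,
    pub clock_mhz: f64,
    pub measured_ghs: f64,
}

/// Expected throughput in GH/s, scaled by the engines actually present.
/// `hashes_per_engine_clock` is an assumed chip constant supplied by the caller.
pub fn expected_ghs(active_engines: u32, clock_mhz: f64, hashes_per_engine_clock: f64) -> f64 {
    f64::from(active_engines) * clock_mhz * 1e6 * hashes_per_engine_clock / 1e9
}

/// Flag a retune when measured throughput drops below `min_ratio` of the
/// expectation for this ASIC's discovered topology.
pub fn needs_retune(sample: &AsicSample, hashes_per_engine_clock: f64, min_ratio: f64) -> bool {
    let expected = expected_ghs(sample.active_engines, sample.clock_mhz, hashes_per_engine_clock);
    sample.measured_ghs < expected * min_ratio
}
```

Because the expectation is derived from `active_engines`, a chip with fewer working engines is judged against its own ceiling rather than against a full-population figure.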

The board runtime can now reason about tuning quality from live operation instead of startup-only assumptions.

Expose the highest-value BZM2 diagnostics and runtime status through the board-owned API surface.

This commit adds:
- live UART diagnostics routed through the active BZM2 thread
- chain summary and clock-report endpoints
- the supporting API contract types and regression coverage

The diagnostics follow the existing UART ownership rules by requiring the thread to be idle when they run.

Move the BZM2 and Blockscale-specific documentation into a dedicated docs/bzm2 subtree, add the missing integration/reference guides, and drop the local porting conversation log from the upstream branch.

This keeps the upstream-facing PR focused on reusable implementation and operator documentation rather than local development history.

@recklessnode recklessnode force-pushed the codex/bzm2-upstream-port-r2 branch from d476ce7 to 44e0940 March 19, 2026 03:12
@recklessnode
Author

@johnny9 @rkuester PR #38 has now been rebased onto the current upstream main, revalidated, and kept in draft for your review. Core branch validation: 339 passed, 0 failed, 5 ignored. The parity follow-up branch is also rebuilt and green if you want to compare the remaining delta later.

4 participants