
Thermal management #2

Open
jayrmotta wants to merge 2 commits into main from feature/thermal-mgmt


@jayrmotta
Owner

Summary

Currently, thermal management in Mujina sets the fan speed to 100% when the system initializes and leaves it there for the entire firmware lifecycle. Issue 256foundation#9 describes this behavior and points to esp-miner's implementation of a PID controller for fan speed control.

Problem

Even with the fan speed set to 100%, the Bitaxe I've been using for testing shoots up to an ASIC temperature of 100 °C shortly after it starts running:

12:41:53 INFO  scheduler: Mining status.
               uptime=2m 30s, hashrate=879.62 GH/s, shares=13
12:42:05 INFO  board::bitaxe: Board status.
               board=Bitaxe Gamma, serial=e5966ee3, asic_temp=100.0 degC, fan_speed=100%, fan_rpm=7510 RPM, vr_temp=79 degC, power=19.5W, current=16.94A, vin=5.05V, vout=1.148V

Solution

This PR closes 256foundation#9 by introducing a thermal management module that monitors temperature and reacts by controlling fan speed and operating frequency, with the aim of keeping the device within a safe operating temperature range.

12:54:21 INFO  scheduler: Mining status.
               uptime=2m 30s, hashrate=703.70 GH/s, shares=12
12:54:23 INFO  board::bitaxe: Board status.
               board=Bitaxe Gamma, serial=e5966ee3, asic_temp=80.6 degC, fan_speed=85%, fan_rpm=7725 RPM, vr_temp=58 degC, power=11.7W, current=10.19A, vin=5.22V, vout=1.148V

This solution starts stabilizing the temperature soon after boot and, in my experience, trends towards 75 °C while delivering 600 to 800 GH/s.

There is definitely room to explore tuning: I used to get 1 TH/s with a somewhat quiet fan using AxeOS, where I played not only with fan speed and frequency but also with voltage.

Changes

  • Introduced a thermal state machine with built-in hysteresis, so different heuristics are used to handle the temperature depending on the current and prior state
  • Introduced a PID controller to determine fan speed (although I'm currently not using the derivative term)
  • Created tokio channels (producers/consumers) to adjust operating frequency and fan speed; the idea is to make this new module less coupled and easy to use or port to other boards
  • Introduced a thermal controller to coordinate the various components of the solution
  • Wired the thermal controller to the Bitaxe board and thread
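
To illustrate what hysteresis in the state machine means, here is a minimal sketch, not the code in this PR. The state names follow the ones listed under Testing; the thresholds, the 5 °C band, and the `state_for` and `next_state` helpers are hypothetical. Escalation happens as soon as an entry threshold is crossed, while de-escalation waits until the temperature has dropped below that threshold by the hysteresis band.

```rust
/// Thermal states, ordered from least to most severe.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ThermalState {
    Normal,
    Cooling,
    Throttling,
    Critical,
}

/// Hypothetical entry thresholds in degrees Celsius, most severe first.
const ENTER_C: [(ThermalState, f32); 3] = [
    (ThermalState::Critical, 95.0),
    (ThermalState::Throttling, 85.0),
    (ThermalState::Cooling, 70.0),
];

/// Hypothetical hysteresis band: a state is left only once the temperature
/// is this far below the threshold that was used to enter it.
const HYSTERESIS_C: f32 = 5.0;

/// State implied by the temperature alone, with every entry threshold
/// lowered by `margin`.
fn state_for(temp_c: f32, margin: f32) -> ThermalState {
    for (state, threshold) in ENTER_C {
        if temp_c >= threshold - margin {
            return state;
        }
    }
    ThermalState::Normal
}

/// Computes the next state from the current state and a temperature sample.
pub fn next_state(current: ThermalState, temp_c: f32) -> ThermalState {
    // Escalate immediately when an entry threshold is crossed.
    let rising = state_for(temp_c, 0.0);
    if rising > current {
        return rising;
    }
    // De-escalate only once the temperature has cleared the hysteresis band.
    state_for(temp_c, HYSTERESIS_C).min(current)
}
```

With these example numbers, a board in `Cooling` stays there at 68 °C and only returns to `Normal` below 65 °C, so a reading that hovers around 70 °C does not flip the state on every sample.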

Testing

  • Thermal state transitions: Verify that temperature thresholds map to the correct state (NORMAL, COOLING, THROTTLING, CRITICAL).
  • Fan PID controller math: Validate P+I output calculation, integral accumulation/freezing, and integral clamping at min/max bounds.
  • Closed-loop thermal controller: End-to-end async tests covering fan speed commands per state, frequency bump-up/bump-down on state transitions, PID integral reset/freeze behavior across states, and no-op when temperature data is absent or state is unchanged.
  • ASIC frequency range validation: Ensure that bump-up/bump-down from the default operating frequency stays within valid frequency bounds.
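
For readers unfamiliar with the terms in the PID tests, the following is a rough sketch of a P+I controller with integral clamping and freezing. It is not the `FanPIDController` from this PR: the `FanPi` type, its fields, and the gains used in the example are assumptions made for illustration only.

```rust
/// Proportional-integral fan controller sketch (no derivative term).
/// Field names and gains are hypothetical.
pub struct FanPi {
    kp: f32,
    ki: f32,
    target_c: f32,
    /// Symmetric bound on the integral term; assumed to be positive.
    integral_limit: f32,
    integral: f32,
}

impl FanPi {
    pub fn new(kp: f32, ki: f32, target_c: f32, integral_limit: f32) -> Self {
        Self { kp, ki, target_c, integral_limit, integral: 0.0 }
    }

    /// Returns the fan duty cycle in percent (0 to 100).
    /// When `freeze_integral` is true the integral keeps its current value.
    pub fn update(&mut self, temp_c: f32, dt_s: f32, freeze_integral: bool) -> u8 {
        // Positive error means the ASIC is hotter than the target.
        let error = temp_c - self.target_c;
        if !freeze_integral {
            // Accumulate, then clamp so the integral cannot wind up.
            self.integral = (self.integral + error * dt_s)
                .clamp(-self.integral_limit, self.integral_limit);
        }
        let duty = self.kp * error + self.ki * self.integral;
        duty.clamp(0.0, 100.0).round() as u8
    }

    /// Clears the accumulated integral, for example on a state change.
    pub fn reset(&mut self) {
        self.integral = 0.0;
    }

    pub fn integral(&self) -> f32 {
        self.integral
    }
}
```

With `FanPi::new(2.0, 0.5, 60.0, 40.0)` and a steady 70 °C reading sampled once per second, the first update yields 25% and the second 30%. The integral then stops growing once it reaches the limit of 40, which caps the output at 40% for that error.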

Implement thermal monitoring and control with a state machine and
PID-based fan speed regulation. The module is self-contained and
has no dependency on board or ASIC code. It provides ThermalConfig
for thresholds and operating frequency, ThermalState with hysteresis
to avoid oscillation on noisy readings, ThermalController for
temperature and fan or frequency commands, FanPIDController for
fan speed control, and TemperatureFilter for smoothing sensor readings.
Tests cover state transitions, PID behavior, and filter semantics.
Integration with the Bitaxe board and BM13xx thread follows in the
next commit.

Connect the thermal controller to the Bitaxe board lifecycle and
the BM13xx thread so that temperature drives fan speed and
frequency throttling at runtime. Serializing EMC2101 access avoids
I2C bus contention between temperature reads and fan writes.
Accepting frequency commands in the BM13xx thread allows PLL
adjustment for thermal throttling. Sustained-overshoot frequency
reduction with cooldown avoids over-correction on transient
temperature spikes. Filtered temperature and hysteresis-based
state transitions from the thermal module keep behavior stable.
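
The sustained-overshoot idea from the second commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code in the commit: the `OvershootThrottle` type, the sample-count based timing, and all numeric values are hypothetical.

```rust
/// Steps the frequency down only after the temperature has stayed above
/// the limit for several consecutive samples, then waits out a cooldown
/// before it may step again. All names and numbers are hypothetical.
pub struct OvershootThrottle {
    limit_c: f32,
    sustain_samples: u32,
    cooldown_samples: u32,
    step_mhz: f32,
    min_mhz: f32,
    over_count: u32,
    cooldown_left: u32,
}

impl OvershootThrottle {
    pub fn new(
        limit_c: f32,
        sustain_samples: u32,
        cooldown_samples: u32,
        step_mhz: f32,
        min_mhz: f32,
    ) -> Self {
        Self {
            limit_c,
            sustain_samples,
            cooldown_samples,
            step_mhz,
            min_mhz,
            over_count: 0,
            cooldown_left: 0,
        }
    }

    /// Feeds one (already filtered) temperature sample.
    /// Returns the new frequency when a reduction is due, otherwise `None`.
    pub fn on_sample(&mut self, temp_c: f32, current_mhz: f32) -> Option<f32> {
        if self.cooldown_left > 0 {
            // Give the previous reduction time to take effect.
            self.cooldown_left -= 1;
            return None;
        }
        if temp_c <= self.limit_c {
            // The overshoot ended, so it was only a transient spike.
            self.over_count = 0;
            return None;
        }
        self.over_count += 1;
        if self.over_count < self.sustain_samples {
            return None;
        }
        self.over_count = 0;
        self.cooldown_left = self.cooldown_samples;
        let next = (current_mhz - self.step_mhz).max(self.min_mhz);
        if next < current_mhz { Some(next) } else { None }
    }
}
```

A single hot sample followed by a cool one resets the counter, so a transient spike does not cost any hashrate. Only an overshoot that persists for `sustain_samples` readings triggers a step down.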
@jayrmotta
Owner Author

Would you please review again, @johnnyasantoss @csgui?

@rkuester

rkuester commented Feb 21, 2026

Hey @jayrmotta—Mujina maintainer here. It's exciting you picked Mujina for your project, and it is impressive what you've got working! We're eager to help you upstream parts of this to Mujina, if you're willing.

For upstreaming, I would want the fan controller and frequency controller split into independent components, and submitted in different PRs.

The first version of the fan controller could simply control fan duty cycle for a hard-coded ASIC temperature target.[1] Structurally, I'd like to see it start local to Bitaxe (so, in bitaxe.rs, or better yet, we move bitaxe.rs to a module directory and have the fan controller in a submodule) and later migrate to a more global location if we reuse it. Future versions (subsequent PRs) could add API override of the fan duty cycle or temperature target. See some of the API changes we've merged lately for ideas.

Frequency control gets a little more complicated, as it will eventually need to fit into a larger picture where the scheduler sets different modes or controls for different targets (max rate, max efficiency, a particular power target, etc.). Perhaps controlling to a temperature fits in there somewhere; I hadn't thought about it.

What does make sense initially is a frequency controller that throttles the ASIC back if it goes over an unsafe temperature. That's internal to the board/thread, and would remain as a defensive mechanism no matter what the scheduler was requesting. I could see that defensive frequency controller as its own PR. (Maybe it could be sensitive to other temperatures too, like the voltage regulator temperature.) But again, I'd like to see the frequency controller and the fan controller as completely independent parts, with their own control logic, even though they coincidentally use the same temperature as an input.
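
As a rough sketch of what such a defensive guard might look like (purely illustrative: the `Temps` struct, the `guard_frequency` function, and every limit below are made up, and none of this is existing Mujina code), the guard only ever lowers whatever frequency was requested and knows nothing about the fan:

```rust
/// Hypothetical safety limits; real values would come from the board.
const ASIC_MAX_C: f32 = 90.0;
const VR_MAX_C: f32 = 100.0;
/// Hypothetical frequency considered safe while any sensor is over its limit.
const SAFE_MHZ: f32 = 400.0;

/// Latest readings; `None` means the sensor has not reported yet.
pub struct Temps {
    pub asic_c: Option<f32>,
    pub vr_c: Option<f32>,
}

/// Caps the requested frequency when any available sensor exceeds its limit.
/// The request passes through unchanged otherwise, so this guard never
/// raises the frequency and is independent of fan control.
pub fn guard_frequency(requested_mhz: f32, temps: &Temps) -> f32 {
    // A missing reading is treated as "not over" in this sketch; a real
    // guard might prefer to treat it as unsafe.
    let over = |reading: Option<f32>, max: f32| reading.map_or(false, |t| t > max);
    if over(temps.asic_c, ASIC_MAX_C) || over(temps.vr_c, VR_MAX_C) {
        requested_mhz.min(SAFE_MHZ)
    } else {
        requested_mhz
    }
}
```

Because the guard is a pure function of the request and the readings, it can sit at the very end of the frequency path in the board thread, after whatever the scheduler or any other controller has decided.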

Footnotes

  1. I realize on your board, which seems to have an especially hot-running ASIC, the fan might simply saturate at 100%. Perhaps you could hardcode your ASIC frequency at something mid-range for testing purposes. On a typical Bitaxe at default clock speeds, the fan doesn't need to run at 100%. Maybe your board needs a tighter heat sink, and/or some new thermal paste between the chip and the heat sink? Maybe a better fan like a Noctua NF-A4x20? I'm curious what your temperature (and fan speed) is with the stock Bitaxe firmware.


Development

Successfully merging this pull request may close these issues.

feat(bitaxe): implement closed-loop fan control based on temperature

2 participants