
Fix critical architecture bugs and refactor ChaosGrad and Trainer#8

Merged
theomgdev merged 9 commits into main from experimental
Mar 18, 2026

Conversation

@theomgdev
Owner

This pull request introduces several enhancements and refinements to the ChaosGrad optimizer and the RealNetTrainer class to improve training stability, adaptability, and configurability. The most notable changes include the addition of new hyperparameters for adaptive learning rate smoothing, plateau noise control, and input gradient health detection, as well as improvements to parameter classification and gradient clipping logic. These updates make the optimizer more robust and flexible for diverse network architectures and training scenarios.

ChaosGrad Optimizer Enhancements:

  • Added new hyperparameters for improved adaptive learning rate and plateau escape: adaptive_ema (smoothing factor for adaptive LR variance), plateau_noise_intensity (multiplier for plateau noise), loss_history_min (minimum steps for loss history), and sentinel_threshold (threshold for input health detection). These are now configurable through the optimizer and preset methods.
  • Improved adaptive learning rate smoothing by replacing the fixed EMA factor with a configurable value (adaptive_ema).
  • Enhanced plateau noise injection logic for core parameters, making noise intensity configurable and more robust.
  • Refined input gradient health calculation by introducing a configurable threshold for vanishing input detection.
  • Updated parameter classification logic to better handle naming conventions and projections, improving optimizer targeting.
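
The adaptive-LR smoothing described above can be sketched as follows. This is a minimal illustration, not the actual ChaosGrad implementation: only the `adaptive_ema` name comes from this PR, while the function name, the `grad_sq_history` input, and the RMS-style scaling rule are assumptions.

```python
import math

def adaptive_lr(base_lr, grad_sq_history, adaptive_ema=0.9, eps=1e-8):
    """Scale the learning rate by the inverse RMS of an EMA of squared gradients.

    Hypothetical sketch: `adaptive_ema` replaces a previously fixed smoothing
    constant, as described in the PR; the scaling rule itself is illustrative.
    """
    ema = 0.0
    for g2 in grad_sq_history:
        # Configurable smoothing factor instead of a hard-coded EMA constant.
        ema = adaptive_ema * ema + (1.0 - adaptive_ema) * g2
    # Epsilon guard keeps the step finite when gradients vanish.
    return base_lr / (math.sqrt(ema) + eps)
```

With steady unit-magnitude gradients the EMA converges to 1 and the effective rate stays near `base_lr`; with vanishing gradients the epsilon guard bounds the blow-up.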

Training Logic Improvements:

  • Added max_grad_norm as a configurable argument to RealNetTrainer, allowing flexible gradient clipping during training.
  • Modified gradient persistence logic to synchronize persistent gradients with the active AMP scaling factor, improving stability in mixed precision training.
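
Global-norm clipping as a configurable max_grad_norm would enable it can be sketched like this. In a PyTorch trainer this is typically done with torch.nn.utils.clip_grad_norm_; the pure-Python version below is only an illustration of the same rule, and the function name is hypothetical.

```python
import math

def clip_grad_norm(grads, max_grad_norm):
    """Scale a flat list of gradients so their global L2 norm is at most max_grad_norm.

    Illustrative stand-in for torch.nn.utils.clip_grad_norm_ with a
    trainer-level, configurable max_grad_norm.
    """
    total = math.sqrt(sum(g * g for g in grads))
    if total > max_grad_norm:
        # Uniform rescale preserves gradient direction; small eps avoids
        # overshooting when total is close to zero.
        scale = max_grad_norm / (total + 1e-6)
        grads[:] = [g * scale for g in grads]
    return grads
```

Because every component is scaled by the same factor, the update direction is unchanged; only its magnitude is capped.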

Stability and Robustness:

  • Updated spectral clipping logic to maintain momentum buffer alignment, ensuring optimizer stability after parameter scaling.
  • Improved loss history tracking by making its minimum length configurable, supporting more reliable plateau detection.
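
A plateau check gated by a configurable minimum history length might look like the sketch below. Only `loss_history_min` is named in this PR; the relative-tolerance rule and the function name are assumptions for illustration.

```python
def is_plateau(loss_history, loss_history_min=10, rel_tol=1e-3):
    """Flag a plateau once enough losses are recorded and recent improvement is tiny.

    Hypothetical sketch: `loss_history_min` (from the PR) gates the decision so
    short histories never trigger a spurious plateau; the tolerance test is assumed.
    """
    if len(loss_history) < loss_history_min:
        return False  # not enough history yet for a reliable decision
    recent = loss_history[-loss_history_min:]
    improvement = recent[0] - min(recent)
    # Plateau if the best recent loss barely improves on the window's start.
    return improvement < rel_tol * max(abs(recent[0]), 1e-8)
```

Raising `loss_history_min` makes plateau detection (and hence noise injection) more conservative; lowering it makes escape attempts trigger sooner.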

These changes collectively make the optimizer and trainer more configurable, adaptive, and robust for a wide range of use cases.

…MP persistence, spectral momentum sync, and plateau escape)
…iner

- Fix 4 critical bugs: AMP/Persistence scaling, Spectral Clip momentum sync,
  Plateau Noise magnitude, and Adaptive LR epsilon guards.
- Restore learning performance for micro-architectures (MNIST Record PoC).
- Generalize architecture: Parameterize all magic numbers (EMA coefficients,
  noise intensity, sentinel thresholds) in ChaosGrad and Trainer.
- Standardize all ChaosGradConfig presets for scale-invariant performance.
- Support torch.compile naming conventions in parameter classification.
- Refactor comments for objective and professional documentation.
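
The torch.compile naming support mentioned above likely amounts to normalizing parameter names before classifying them: torch.compile wraps a module so that named_parameters() yields names prefixed with "_orig_mod.". The helper name below is hypothetical; the prefix itself is real torch.compile behavior.

```python
def canonical_param_name(name):
    """Strip torch.compile's "_orig_mod." wrapper prefix before classification.

    torch.compile wraps modules in an OptimizedModule, so parameter names gain
    an "_orig_mod." prefix; stripping it lets name-based parameter grouping
    work identically for compiled and uncompiled models.
    """
    prefix = "_orig_mod."
    while name.startswith(prefix):  # handle nested wrapping defensively
        name = name[len(prefix):]
    return name
```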
Copilot AI review requested due to automatic review settings March 18, 2026 00:10
Contributor

Copilot AI left a comment


Pull request overview

This PR updates RealNet’s training stack by expanding ChaosGrad configurability (adaptive LR smoothing, plateau behavior, and input-gradient health detection) and making RealNetTrainer gradient clipping configurable, with additional tweaks to stability-related optimizer/training logic.

Changes:

  • Added new ChaosGrad hyperparameters (adaptive_ema, plateau_noise_intensity, loss_history_min, sentinel_threshold) and wired them through presets/defaults.
  • Refined ChaosGrad behavior for parameter grouping, plateau noise injection, spectral clipping buffer handling, and input gradient health computation.
  • Added max_grad_norm to RealNetTrainer and adjusted gradient persistence to account for AMP scaling.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
realnet/training/trainer.py Adds configurable grad clipping (max_grad_norm) and adjusts AMP-related gradient persistence behavior.
realnet/training/chaos_optimizer.py Introduces new optimizer hyperparameters and modifies plateau/adaptive/spectral/sentinel logic for stability and configurability.


theomgdev and others added 5 commits March 18, 2026 03:18
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@theomgdev
Owner Author

@copilot any more?

Contributor

Copilot AI commented Mar 18, 2026

@theomgdev I've opened a new pull request, #9, to work on those changes. Once the pull request is ready, I'll request review from you.

Stabilize ChaosGrad defaults and preserve trainer/optimizer API compatibility
@theomgdev theomgdev merged commit 71d3891 into main Mar 18, 2026
theomgdev added a commit that referenced this pull request Mar 18, 2026
This reverts commit 71d3891, reversing
changes made to 3617b41.

3 participants