Fix critical architecture bugs and refactor ChaosGrad and Trainer by theomgdev · Pull Request #8 · theomgdev/OdyssNet

theomgdev · 2026-03-18T00:10:00Z

This pull request introduces several enhancements and refinements to the ChaosGrad optimizer and the RealNetTrainer class to improve training stability, adaptability, and configurability. The most notable changes include the addition of new hyperparameters for adaptive learning rate smoothing, plateau noise control, and input gradient health detection, as well as improvements to parameter classification and gradient clipping logic. These updates make the optimizer more robust and flexible for diverse network architectures and training scenarios.

ChaosGrad Optimizer Enhancements:

Added new hyperparameters for improved adaptive learning rate and plateau escape: adaptive_ema (smoothing factor for adaptive LR variance), plateau_noise_intensity (multiplier for plateau noise), loss_history_min (minimum steps for loss history), and sentinel_threshold (threshold for input health detection). These are now configurable through the optimizer and preset methods. [1] [2] [3] [4] [5] [6] [7] [8]
Improved adaptive learning rate smoothing by replacing the fixed EMA factor with a configurable value (adaptive_ema).
Enhanced plateau noise injection logic for core parameters, making noise intensity configurable and more robust.
Refined input gradient health calculation by introducing a configurable threshold for vanishing input detection.
Updated parameter classification logic to better handle naming conventions and projections, improving optimizer targeting.
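
As a rough illustration of how the four new hyperparameters could fit together, the sketch below groups them in a dataclass and shows a configurable EMA update plus a threshold check on the input gradient. The class name `ChaosGradConfigSketch`, the helper functions, and every default value are assumptions made for illustration; the PR description does not state the actual defaults or internals.

```python
from dataclasses import dataclass


@dataclass
class ChaosGradConfigSketch:
    """Hypothetical container for the hyperparameters named in this PR.

    All defaults are illustrative placeholders, not the values used by ChaosGrad.
    """
    adaptive_ema: float = 0.99             # smoothing factor for adaptive LR variance
    plateau_noise_intensity: float = 1.0   # multiplier for plateau noise
    loss_history_min: int = 10             # minimum steps of loss history
    sentinel_threshold: float = 1e-8       # threshold for input health detection


def ema_update(running: float, observation: float, decay: float) -> float:
    """Exponential moving average with a configurable decay instead of a fixed one."""
    if not 0.0 <= decay < 1.0:
        raise ValueError("decay must be in [0, 1)")
    return decay * running + (1.0 - decay) * observation


def input_gradient_vanishing(input_grad_norm: float, cfg: ChaosGradConfigSketch) -> bool:
    """Flag the input gradient as vanishing when its norm is below the threshold."""
    return input_grad_norm < cfg.sentinel_threshold
```

A larger `adaptive_ema` makes the running statistic react more slowly to new observations, which is the usual trade-off when exposing an EMA coefficient as a hyperparameter.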

Training Logic Improvements:

Added max_grad_norm as a configurable argument to RealNetTrainer, allowing flexible gradient clipping during training. [1] [2] [3]
Modified gradient persistence logic to synchronize persistent gradients with the active AMP scaling factor, improving stability in mixed precision training.
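
The two trainer changes can be pictured with a small plain-Python sketch. It assumes gradients are nested lists of floats, mimics the common global-norm clipping rule (the one `torch.nn.utils.clip_grad_norm_` applies), and reads "synchronize persistent gradients with the active AMP scaling factor" as rescaling stored gradients when the loss scale changes. That reading, and the function names, are assumptions rather than the trainer's actual code.

```python
import math
from typing import List, Tuple

Grads = List[List[float]]


def global_norm(grads: Grads) -> float:
    """L2 norm over all gradient entries taken together."""
    return math.sqrt(sum(g * g for tensor in grads for g in tensor))


def clip_by_global_norm(grads: Grads, max_grad_norm: float) -> Tuple[Grads, float]:
    """Scale all gradients by one factor so their global norm is at most max_grad_norm.

    Returns the (possibly scaled) gradients and the norm measured before clipping.
    """
    total = global_norm(grads)
    coef = min(1.0, max_grad_norm / (total + 1e-6))
    return [[g * coef for g in tensor] for tensor in grads], total


def rescale_persistent_grads(grads: Grads, old_scale: float, new_scale: float) -> Grads:
    """Re-express gradients stored under old_scale in units of new_scale.

    Assumption: persisted gradients were produced from a loss multiplied by
    old_scale, and the next backward pass will use new_scale.
    """
    if old_scale <= 0.0:
        raise ValueError("old_scale must be positive")
    ratio = new_scale / old_scale
    return [[g * ratio for g in tensor] for tensor in grads]
```

Without the rescaling step, gradients carried over from a previous step would be mixed with fresh gradients computed under a different loss scale, so the later unscale would divide them by the wrong factor.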

Stability and Robustness:

Updated spectral clipping logic to maintain momentum buffer alignment, ensuring optimizer stability after parameter scaling.
Improved loss history tracking by making its minimum length configurable, supporting more reliable plateau detection.
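
To show why a configurable minimum history length matters for plateau detection, here is a minimal sketch. Only `loss_history_min` comes from the PR; the class name, the `window` and `rel_tol` parameters, and the relative-improvement rule are hypothetical choices for illustration.

```python
from collections import deque
from typing import Deque


class PlateauDetectorSketch:
    """Reports a plateau only after enough loss values have been recorded."""

    def __init__(self, loss_history_min: int = 10, window: int = 50, rel_tol: float = 1e-3):
        if loss_history_min < 2:
            raise ValueError("loss_history_min must be at least 2")
        self.loss_history_min = loss_history_min
        self.rel_tol = rel_tol
        # Keep at least loss_history_min entries so the length check can pass.
        self.history: Deque[float] = deque(maxlen=max(window, loss_history_min))

    def update(self, loss: float) -> bool:
        """Record a loss value and return True if training looks stalled."""
        self.history.append(loss)
        if len(self.history) < self.loss_history_min:
            return False  # too little evidence to call a plateau
        oldest = self.history[0]
        best = min(self.history)
        improvement = (oldest - best) / (abs(oldest) + 1e-12)
        return improvement < self.rel_tol
```

With a very small minimum, the detector can fire on early noise; with a large one, plateau escape is delayed. Exposing the value lets each preset pick its own trade-off.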

These changes collectively make the optimizer and trainer more configurable, adaptive, and robust for a wide range of use cases.


…iner
- Fix 4 critical bugs: AMP/Persistence scaling, Spectral Clip momentum sync, Plateau Noise magnitude, and Adaptive LR epsilon guards.
- Restore learning performance for micro-architectures (MNIST Record PoC).
- Generalize architecture: Parameterize all magic numbers (EMA coefficients, noise intensity, sentinel thresholds) in ChaosGrad and Trainer.
- Standardize all ChaosGradConfig presets for scale-invariant performance.
- Support torch.compile naming conventions in parameter classification.
- Refactor comments for objective and professional documentation.
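
For context on the torch.compile item: a module wrapped by `torch.compile` reports its parameter names with an `_orig_mod.` segment, so name-based classification can miss parameters unless that segment is stripped first. The sketch below shows one way to normalize names before classifying them. The group names (`projection`, `bias`, `core`) and the matching rules are hypothetical and are not taken from ChaosGrad.

```python
COMPILE_SEGMENT = "_orig_mod"


def normalize_param_name(name: str) -> str:
    """Drop every '_orig_mod' path segment added by torch.compile wrappers."""
    return ".".join(part for part in name.split(".") if part != COMPILE_SEGMENT)


def classify_param(name: str) -> str:
    """Assign a parameter to a group based on its normalized name.

    The rules below are illustrative placeholders only.
    """
    base = normalize_param_name(name)
    if "proj" in base:
        return "projection"
    if base.rsplit(".", 1)[-1] == "bias":
        return "bias"
    return "core"
```

Normalizing first means a compiled and an uncompiled copy of the same model end up with identical parameter groups.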

Copilot

Pull request overview

This PR updates RealNet’s training stack by expanding ChaosGrad configurability (adaptive LR smoothing, plateau behavior, and input-gradient health detection) and making RealNetTrainer gradient clipping configurable, with additional tweaks to stability-related optimizer/training logic.

Changes:

Added new ChaosGrad hyperparameters (adaptive_ema, plateau_noise_intensity, loss_history_min, sentinel_threshold) and wired them through presets/defaults.
Refined ChaosGrad behavior for parameter grouping, plateau noise injection, spectral clipping buffer handling, and input gradient health computation.
Added max_grad_norm to RealNetTrainer and adjusted gradient persistence to account for AMP scaling.
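
The spectral clipping buffer handling mentioned above can be sketched as follows. This assumes that spectral clipping rescales a weight matrix so its largest singular value stays under a cap, and that "buffer alignment" means applying the same scale factor to the momentum buffer. Both points are one plausible reading of the summary, not confirmed details, and all names here are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def spectral_norm(w: Matrix, iters: int = 100) -> float:
    """Estimate the largest singular value by power iteration on W^T W.

    Starts from an all-ones vector, so it can underestimate in the rare case
    where that vector is orthogonal to the top singular direction.
    """
    wt = [list(col) for col in zip(*w)]
    v = [1.0] * len(w[0])
    for _ in range(iters):
        v = matvec(wt, matvec(w, v))
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in v]
    return math.sqrt(sum(x * x for x in matvec(w, v)))


def spectral_clip(w: Matrix, momentum: Matrix, max_norm: float) -> Tuple[Matrix, Matrix, float]:
    """Scale W down to the spectral cap and apply the same factor to its momentum."""
    sigma = spectral_norm(w)
    if sigma <= max_norm:
        return w, momentum, 1.0
    scale = max_norm / sigma
    scaled_w = [[x * scale for x in row] for row in w]
    scaled_m = [[x * scale for x in row] for row in momentum]
    return scaled_w, scaled_m, scale
```

If only the weights were scaled, the next momentum step would still carry update magnitudes sized for the larger, pre-clip weights, which is the kind of mismatch the change appears to address.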

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `realnet/training/trainer.py` | Adds configurable grad clipping (`max_grad_norm`) and adjusts AMP-related gradient persistence behavior. |
| `realnet/training/chaos_optimizer.py` | Introduces new optimizer hyperparameters and modifies plateau/adaptive/spectral/sentinel logic for stability and configurability. |



theomgdev · 2026-03-18T00:23:38Z

@copilot any more?

Copilot · 2026-03-18T00:23:45Z

@theomgdev I've opened a new pull request, #9, to work on those changes. Once the pull request is ready, I'll request review from you.


theomgdev added 2 commits March 18, 2026 02:30

fix: resolve 4 critical architecture bugs in ChaosGrad and Trainer (A…

db0e402

…MP persistence, spectral momentum sync, and plateau escape)

Copilot AI review requested due to automatic review settings March 18, 2026 00:10

Copilot started reviewing on behalf of theomgdev March 18, 2026 00:10

Copilot AI reviewed Mar 18, 2026


theomgdev and others added 5 commits March 18, 2026 03:18

Move new max_grad_norm to the end for backward signature support for …

0e91589

…trainer init

Potential fix for pull request finding

94b85b5

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Potential fix for pull request finding

a5ba062

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Potential fix for pull request finding

7c9d1f2

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

typo

53e9b8a

Initial plan

c4c898c

Copilot AI mentioned this pull request Mar 18, 2026

Stabilize ChaosGrad defaults and preserve trainer/optimizer API compatibility #9

Merged

Merge pull request #9 from theomgdev/copilot/sub-pr-8

f7ba0f3

Stabilize ChaosGrad defaults and preserve trainer/optimizer API compatibility

theomgdev merged commit 71d3891 into main Mar 18, 2026

theomgdev added a commit that referenced this pull request Mar 18, 2026

Revert "Merge pull request #8 from theomgdev/experimental"

b065045

This reverts commit 71d3891, reversing changes made to 3617b41.

theomgdev merged 9 commits into main from experimental


3 participants
