[Proposal] Support data whitening in SAE training #647

[Data Whitening Improves Sparse Autoencoder Learning](https://arxiv.org/abs/2511.13981) shows that whitening the input data during training results in better SAEs. This seems like an intuitive, almost obvious thing to do, so it's surprising nobody has done it before. We already support normalizing activations with `normalize_activations="expected_average_only_in"`, so we could add another option, `"covariance_whitening"`, that implements the technique from this paper.
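To make the proposal concrete, here is a rough sketch of what covariance whitening could look like on a batch of activations. It uses plain NumPy and generic ZCA whitening (multiplying centered activations by the inverse square root of their covariance), which may differ in detail from the exact variant used in the paper. The function names `fit_whitening`, `whiten`, and `unwhiten` and the `eps` parameter are hypothetical and not part of the existing API.

```python
import numpy as np


def fit_whitening(acts, eps=1e-5):
    """Estimate ZCA whitening statistics from activations.

    acts: array of shape (n_samples, d_model).
    Returns (mean, W, W_inv), where W approximates cov^{-1/2} and
    W_inv approximates cov^{1/2}. All names here are hypothetical.
    """
    mean = acts.mean(axis=0)
    centered = acts - mean
    # Unbiased sample covariance, shape (d_model, d_model).
    cov = centered.T @ centered / (acts.shape[0] - 1)
    # The covariance is symmetric, so eigh returns real eigenpairs.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Clip tiny negative values from round-off; eps keeps the inverse stable.
    eigvals = np.clip(eigvals, 0.0, None) + eps
    scale = np.sqrt(eigvals)
    W = (eigvecs / scale) @ eigvecs.T      # V diag(1/sqrt(lambda)) V^T
    W_inv = (eigvecs * scale) @ eigvecs.T  # V diag(sqrt(lambda)) V^T
    return mean, W, W_inv


def whiten(acts, mean, W):
    """Map activations into the whitened space (roughly identity covariance)."""
    return (acts - mean) @ W


def unwhiten(z, mean, W_inv):
    """Map whitened vectors back into the original activation space."""
    return z @ W_inv + mean
```

Under this sketch, the statistics would be estimated once from a sample of activations, the SAE would be trained on `whiten(acts, mean, W)`, and reconstructions would be passed through `unwhiten` before being compared with the original activations. Whether the new option should behave exactly this way is an open design question.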
