noanabeshima/matryoshka-saes

The Matryoshka SAE implementation is in sae.py. A vanilla SAE is the special case of a Matryoshka SAE in which the number of prefixes is 1.
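To make the "special case" relationship concrete, here is a minimal NumPy sketch of a reconstruction loss summed over nested prefixes of the latents. It is not the code in sae.py: the function name `matryoshka_loss`, the parameter names, and the choice of a plain ReLU encoder with mean squared error are all assumptions made for illustration, and any sparsity penalty or prefix sampling the real implementation may use is left out.

```python
import numpy as np

def matryoshka_loss(x, W_enc, b_enc, W_dec, b_dec, prefix_sizes):
    """Sum of reconstruction MSEs, one term per nested prefix of the latents.

    Hypothetical helper for illustration; the sparsity penalty is omitted.
    """
    # ReLU latents, shape (batch, n_latents)
    acts = np.maximum(x @ W_enc + b_enc, 0.0)
    total = 0.0
    for k in prefix_sizes:
        # Reconstruct using only the first k latents and their decoder rows
        recon = acts[:, :k] @ W_dec[:k] + b_dec
        total += np.mean((x - recon) ** 2)
    return total

# Small random problem (all sizes are arbitrary assumptions)
rng = np.random.default_rng(0)
d_model, n_latents, batch = 8, 16, 32
x = rng.normal(size=(batch, d_model))
W_enc = rng.normal(size=(d_model, n_latents)) * 0.1
b_enc = np.zeros(n_latents)
W_dec = rng.normal(size=(n_latents, d_model)) * 0.1
b_dec = np.zeros(d_model)

# One prefix covering every latent: the ordinary (vanilla) reconstruction loss
vanilla = matryoshka_loss(x, W_enc, b_enc, W_dec, b_dec, [n_latents])

# Several nested prefixes: each prefix must reconstruct the input on its own
nested = matryoshka_loss(x, W_enc, b_enc, W_dec, b_dec, [4, 8, 16])
```

With a single prefix equal to the full latent width, the loop runs once and the loss reduces to the usual full-dictionary reconstruction error, which is the sense in which vanilla is a special case.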

See train_toy.ipynb for the code that trains both a Matryoshka SAE and a vanilla SAE on the toy model.

The toy model in this repo is very similar to the one in the excellent concurrent work Toy Models of Feature Absorption.

[Image: ground-truth features]
