[Proposal] Add to cfg: training architecture (and key details), possibly l0 and num dead #663

@hijohnnylin

Proposal

Add the training architecture (e.g. BatchTopK Matryoshka) and key details (e.g. Matryoshka inner sizes) to the SAE config.

Motivation

The cfg currently only shows "jumprelu" as the architecture, which doesn't tell us what architecture the SAE was trained with. We would like a quick way to access this, e.g. to tell whether it was Matryoshka, whether it was BatchTopK, etc.
For certain training architectures it's also useful to have key details, like the Matryoshka inner sizes.
It would also be good to have the l0 and the number of dead latents.

Most of this is in the training cfg, but it's buried among a lot of other params.

Pitch

Add one or more of the following to the metadata in sae cfg output:

  • training architecture (to distinguish from existing "architecture")
    • should we have separate config values for batchtopk and matryoshka?
  • training architecture detail(s)? (not as important)
  • l0
  • num dead latents
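
To make the pitch concrete, here is a rough sketch of what the extra metadata could look like. Everything in it is an assumption for illustration: the `TrainingMetadata` class, all of its field names, the `"matryoshka_batchtopk"` value, and the numbers are hypothetical and not part of the current cfg.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class TrainingMetadata:
    """Hypothetical metadata block; these field names do not exist in the cfg today."""

    # Architecture used during training, kept separate from the existing
    # "architecture" field (which only says "jumprelu").
    training_architecture: Optional[str] = None
    # Free-form, architecture-specific details such as Matryoshka inner sizes.
    training_architecture_details: dict = field(default_factory=dict)
    # Summary stats that are otherwise buried in the training cfg.
    l0: Optional[float] = None
    num_dead_latents: Optional[int] = None


# Illustrative values only, not taken from a real SAE.
example_cfg = {
    "architecture": "jumprelu",
    "metadata": asdict(
        TrainingMetadata(
            training_architecture="matryoshka_batchtopk",
            training_architecture_details={"matryoshka_inner_sizes": [1024, 4096]},
            l0=50.0,
            num_dead_latents=12,
        )
    ),
}
```

A single `training_architecture` string plus a details dict is only one option. The open question above (separate config values for batchtopk and matryoshka) would instead split this into two fields.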

Checklist

  • I have checked that there is no similar issue in the repo (required)
