
Is RA-SAE able to process embedding features from feed-forward 3D vision backbones? #6

@Zifeng-Zhang

Description


Hi, thanks for your great work on vision model interpretation! I also love the work and findings in the recent "Into the Rabbit Hull" paper!

I'm wondering whether it's feasible to process embeddings from feed-forward 3D backbones (e.g. the MASt3R family, VGGT, etc.), which are also attention-based models. It would be super interesting to reveal what 3D concepts they have learned. Is there anything to pay attention to when switching to 3D vision models?

In the meantime, I'll try training an RA-SAE on VGGT's embeddings (the attention-layer outputs), and I'd be glad to share the results if anyone finds them interesting.
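For anyone curious what that would look like mechanically, here is a minimal NumPy sketch of the inference-time shape of such a setup: token embeddings from a hypothetical 3D backbone pass through a ReLU encoder with per-token top-k sparsification, and are reconstructed from a dictionary whose atoms are convex combinations of data points (the archetypal constraint RA-SAE is built around). All shapes, the stand-in random data, and the untrained weights are assumptions for illustration; this is not the actual RA-SAE training code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: N patch tokens of dimension d from a 3D backbone
# (e.g. VGGT attention-layer outputs), a dictionary of k concepts.
N, d, k, topk = 1024, 64, 256, 8

X = rng.standard_normal((N, d)).astype(np.float32)  # stand-in embeddings

# Archetypal-style dictionary: each atom is a convex combination of data
# points, enforced by a row-stochastic mixing matrix A (rows sum to 1).
A = rng.random((k, N)).astype(np.float32)
A /= A.sum(axis=1, keepdims=True)
D = A @ X                                   # (k, d) dictionary

W_enc = rng.standard_normal((d, k)).astype(np.float32) * 0.1  # untrained

def encode_topk(x, W, topk):
    """ReLU encoder followed by per-token top-k sparsification."""
    z = np.maximum(x @ W, 0.0)
    # zero out everything except the top-k activations in each row
    drop_idx = np.argpartition(z, -topk, axis=1)[:, :-topk]
    np.put_along_axis(z, drop_idx, 0.0, axis=1)
    return z

Z = encode_topk(X, W_enc, topk)             # sparse codes, (N, k)
X_hat = Z @ D                               # reconstruction, (N, d)

assert X_hat.shape == X.shape
assert (Z > 0).sum(axis=1).max() <= topk    # at most topk active codes/token
```

Training would then minimize the reconstruction error between `X_hat` and `X` while keeping `A` row-stochastic; the point of the sketch is just that nothing here depends on the backbone being 2D, so 3D-model embeddings should slot in with the same interface.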
