Add MSA refinement model #8

stephprince · 2025-03-14T21:00:23Z

Motivation

Draft of an approach inspired by Fadini et al. that runs iterative inference-time optimization by modifying the MSA inputs.

There are a couple of remaining to-do items, but I wanted to get feedback on the current implementation in the meantime.

Questions:

It was unclear to me from the paper which MSA values are modified as inputs to AF: the raw MSA features or the embedded MSA representation? Currently I'm using the raw MSA features.
Does the current setup apply the Lightning module wrapper at the best level of abstraction for training? Currently, each "training step" loops n_iterations=N times through the linear layer + AF model and uses the manual optimization functions.
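
For reviewers who want to reason about the second question without running the full model, here is a minimal, dependency-free sketch of the loop structure described above. It is not the code in this PR: `frozen_model` is a hypothetical stand-in for the AF forward pass, the per-feature scale and shift stands in for the linear layer, and the hand-written gradient update stands in for Lightning's manual optimization calls. All names and default values are assumptions chosen for illustration.

```python
from typing import List, Tuple


def frozen_model(features: List[float]) -> float:
    """Hypothetical stand-in for the frozen predictor: maps features to one scalar."""
    return sum(features) / len(features)


def refine_step(
    msa_features: List[float],
    target: float,
    n_iterations: int = 50,
    lr: float = 0.1,
) -> Tuple[List[float], List[float], List[float]]:
    """One "training step": n_iterations manual updates for a single input."""
    n = len(msa_features)
    # Identity initialization, so the first pass sees the unmodified features.
    weight = [1.0] * n
    bias = [0.0] * n
    losses = []
    for _ in range(n_iterations):
        # Learnable layer applied to the raw features (elementwise here).
        modified = [w * x + b for w, x, b in zip(weight, msa_features, bias)]
        # Frozen model forward pass; its own parameters are never updated.
        err = frozen_model(modified) - target
        losses.append(err ** 2)
        # Manual gradient of the squared error through the mean:
        # d(loss)/d(modified_i) = 2 * err / n
        grad = 2.0 * err / n
        weight = [w - lr * grad * x for w, x in zip(weight, msa_features)]
        bias = [b - lr * grad for b in bias]
    return weight, bias, losses


weight, bias, losses = refine_step([0.2, 0.5, 0.9], target=1.0)
```

In this toy setting the loss shrinks on every iteration, which makes the structure easy to check: the only trainable state is `weight` and `bias`, and everything inside the loop belongs to a single input.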

TODO

~~Update the initialization of the linear layer parameters to come from the config file, similar to AF setup~~ Edit: With the current setup this is challenging because the shape of the raw MSA features modified by the linear layer changes for each data input. I updated the Lightning callbacks to reinitialize the model parameters/optimizer at the start of each batch.
Set up checkpointing for final model / best version per batch?
Add option for multiple phases of training (e.g., the original paper used a two-phase approach, selecting the best model from phase 1 to fine-tune in phase 2)
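
One possible shape for the per-batch reinitialization and the "best version per batch" item is sketched below. This is an assumption about how the bookkeeping could look, not the callback implemented in this PR: `PerBatchState`, `on_batch_start`, and `record` are hypothetical names, and plain lists stand in for tensors.

```python
import copy


class PerBatchState:
    """Hypothetical holder for parameters that are rebuilt for every input."""

    def __init__(self):
        self.params = None
        self.best_loss = float("inf")
        self.best_params = None

    def on_batch_start(self, n_features):
        # The parameter shape depends on this input's features, so nothing
        # is carried over from the previous batch.
        self.params = {"weight": [1.0] * n_features, "bias": [0.0] * n_features}
        self.best_loss = float("inf")
        self.best_params = None

    def record(self, loss):
        # Keep a copy of the parameters from the best iteration seen so far.
        if loss < self.best_loss:
            self.best_loss = loss
            self.best_params = copy.deepcopy(self.params)


state = PerBatchState()
state.on_batch_start(n_features=3)
state.record(0.5)
```

The same snapshot could feed a two-phase schedule, with the phase 1 `best_params` used as the starting point for phase 2, though that is only one way to wire it up.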

How to test the behavior?

cd metfish
python src/metfish/refinement_model/train.py /path/to/data /path/to/output

Checklist

Did you update CHANGELOG.md with your changes?
Have you checked our Contributing document?
Have you ensured the PR clearly describes the problem and the solution?
Is your contribution compliant with our coding style? This can be checked running ruff from the source directory.
Have you checked to ensure that there aren't other open Pull Requests for the same change?
Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.

codecov · 2025-07-07T23:45:56Z

Codecov Report

Attention: Patch coverage is 40.00000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 25.22%. Comparing base (9a2a41c) to head (7caba1d).
Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
src/metfish/utils.py	40.00%	6 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (9a2a41c) and HEAD (7caba1d).

HEAD has 1 upload less than BASE

Flag	BASE (9a2a41c)	HEAD (7caba1d)
	2	1

Additional details and impacted files

@@             Coverage Diff             @@
##             main       #8       +/-   ##
===========================================
- Coverage   44.05%   25.22%   -18.83%     
===========================================
  Files           3        7        +4     
  Lines         227     1118      +891     
===========================================
+ Hits          100      282      +182     
- Misses        127      836      +709


stephprince added 4 commits March 14, 2025 13:04

add msa refinement model

41895cd

clean up older code

4bb3a6b

add comments

838a078

update model initialization

d1c24b1

stephprince requested review from ajtritt, oruebel and smallfishabc March 17, 2025 23:33

stephprince added 3 commits March 25, 2025 14:53

update optimization model

3aee914

add fabric training function

258b6a7

update logging

7caba1d
