How to run parameter sweeps and override experiment settings without modifying source code.
Documentation Map
- Running a design? See Inference Guide
- Tuning YAML configs? See Configuration Guide
- Understanding metrics? See Evaluation Guide
- Search metadata? See Search Metadata
The sweep system has two mechanisms that work together:
- `--sweeper FILE` loads a YAML file defining one or more parameter axes. All combinations are generated as a cartesian product.
- `--override KEY=VAL [KEY=VAL ...]` pins individual parameters to scalar values, applied to every generated config.
Both can be used independently or combined. When a key appears in both the sweeper file and the CLI overrides, the override wins and that axis is collapsed to a single value.
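The combination semantics can be sketched in a few lines of Python, assuming the sweeper YAML has already been parsed into a dict of axes (`expand_sweep` is an illustrative name, not the repo's actual implementation):

```python
import itertools

def expand_sweep(axes: dict, overrides: dict) -> list:
    """Expand sweep axes into one flat dict per config; overrides win.

    Illustrative sketch -- not the actual implementation.
    """
    # Scalars become single-element lists, adding no sweep dimension.
    axes = {k: v if isinstance(v, list) else [v] for k, v in axes.items()}
    # An override pins a key, collapsing any conflicting sweep axis.
    for key, value in overrides.items():
        axes[key] = [value]
    keys = list(axes)
    return [dict(zip(keys, combo)) for combo in itertools.product(*axes.values())]

# 4 beam widths x 2 nsteps = 8 configs; overriding nsteps collapses that axis,
# leaving 4 configs, all with nsteps pinned to 400.
axes = {
    "generation.search.beam_search.beam_width": [1, 2, 4, 8],
    "generation.args.nsteps": [200, 400],
}
configs = expand_sweep(axes, {"generation.args.nsteps": 400})
```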
```bash
# 1. Single experiment with a specific target (no sweep)
./slurm_utils/launch_protein_binder_search_from_local_docker.sh \
    --override generation.task_name=22_DerF21

# 2. Sweep beam widths for a single target
./slurm_utils/launch_protein_binder_search_from_local_docker.sh \
    --sweeper configs/sweeps/beam_width.yaml \
    --override generation.task_name=22_DerF21

# 3. Sweep beam widths with nsteps pinned to 400
./slurm_utils/launch_protein_binder_search_from_local_docker.sh \
    --sweeper configs/sweeps/beam_width.yaml \
    --override generation.task_name=22_DerF21 generation.args.nsteps=400

# 4. Dry run to preview config generation
python script_utils/generate_inference_configs.py \
    --config_name search_binder_pipeline \
    --sweeper configs/sweeps/beam_width.yaml \
    --override generation.task_name=22_DerF21 \
    --dryrun
```

Sweep files live in `configs/sweeps/`. Each key is a dot-notation config path and each value is a list of values to sweep over:
```yaml
# configs/sweeps/beam_width.yaml
generation.search.beam_search.beam_width:
  - 1
  - 2
  - 4
  - 8
```

Multiple axes produce a cartesian product:
```yaml
# 3 beam widths x 2 nsteps = 6 configs
generation.search.beam_search.beam_width:
  - 2
  - 4
  - 8
generation.args.nsteps:
  - 200
  - 400
```

Long values (e.g., checkpoint paths) benefit from YAML block-list style:
```yaml
ckpt_path:
  - /path/to/checkpoints/model_v1/epoch_100.ckpt
  - /path/to/checkpoints/model_v2/epoch_200.ckpt
  - /path/to/checkpoints/model_v3/epoch_50.ckpt
```

A scalar value (not a list) is automatically wrapped in a single-element list, so it pins that parameter without adding a sweep dimension:
```yaml
generation.args.self_cond: true
```

Override arguments are `KEY=VALUE` strings where:
- The key uses dot notation (e.g., `generation.args.nsteps`)
- The value is auto-typed: integers, floats, booleans (`true`/`false`), null (`null`/`none`), or strings

```bash
--override generation.task_name=22_DerF21 generation.args.nsteps=400 generation.args.self_cond=true
```

Multiple overrides are space-separated after a single `--override` flag.
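A minimal sketch of the auto-typing rules above; the helper names mirror the `TestParseScalar`/`TestParseOverride` test classes, but the bodies here are an illustrative reconstruction, not the actual source:

```python
def parse_scalar(text: str):
    """Auto-type an override value: bool, null, int, float, else string."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if lowered in ("null", "none"):
        return None
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return text  # anything else stays a string (e.g., task names)

def parse_override(arg: str):
    """Split a KEY=VALUE override; the key keeps its dot notation."""
    key, sep, value = arg.partition("=")
    if not key or not sep:
        raise ValueError(f"expected KEY=VALUE, got {arg!r}")
    return key, parse_scalar(value)
```

Note that a task name like `22_DerF21` survives as a string because it fails both numeric conversions.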
```
+-----------------+
|  Sweeper YAML   |   (optional)
+-----------------+
         |
         v
+-----------+      +------------------+      +-------------------+
| Pipeline  | ---> | generate_infer-  | ---> | N inf_*.yaml +    |
| Config    |      | ence_configs.py  |      | N eval_*.yaml     |
+-----------+      +------------------+      +-------------------+
         ^
         |
+-----------------+
|  --override     |   (optional)
|  KEY=VAL ...    |
+-----------------+
```
1. The base pipeline config (e.g., `search_binder_pipeline.yaml`) is loaded via Hydra.
2. If a `--sweeper` file is provided, its axes are combined into a cartesian product.
3. For each combination, the sweep values are merged into the base config.
4. `--override` values are applied on top of every config (overriding sweep values if they conflict).
5. Each config gets a unique `root_path` and is saved as a pair: `inf_{idx}_{run}.yaml` and `eval_{idx}_{run}.yaml`.
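The merge steps above rely on turning dot-notation keys into nested dicts and deep-merging them into the base config. A minimal sketch with illustrative helper names (not the actual implementation):

```python
def dot_key_to_nested(key: str, value) -> dict:
    """'generation.args.nsteps' -> {'generation': {'args': {'nsteps': value}}}"""
    out = value
    for part in reversed(key.split(".")):
        out = {part: out}
    return out

def deep_merge(base: dict, update: dict) -> dict:
    """Recursively merge update into a copy of base; update wins on conflicts."""
    merged = dict(base)
    for k, v in update.items():
        if isinstance(v, dict) and isinstance(merged.get(k), dict):
            merged[k] = deep_merge(merged[k], v)
        else:
            merged[k] = v
    return merged

base = {"generation": {"task_name": "placeholder",
                       "args": {"nsteps": 200, "self_cond": False}}}
merged = deep_merge(base, dot_key_to_nested("generation.args.nsteps", 400))
```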
Generated config filenames follow the pattern:
```
inf_{index}_{run_name}.yaml
eval_{index}_{run_name}.yaml
```
The index is sequential (0, 1, 2, ...) and the run name comes from the
--run_name argument. Task name is intentionally NOT in the filename -- it
lives inside the config content. This keeps filenames predictable so the bash
launcher can construct Hydra --config-name paths without parsing YAML.
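Because index and run name are the only variable parts, the contract fits in one line of Python; this helper is illustrative, not the actual code:

```python
def config_filenames(n_configs: int, run_name: str) -> list:
    """Enumerate (inference, eval) config filename pairs for a sweep."""
    return [
        (f"inf_{i}_{run_name}.yaml", f"eval_{i}_{run_name}.yaml")
        for i in range(n_configs)
    ]
```

The bash side can reconstruct the same names from the index and run name alone, which is exactly what makes Hydra `--config-name` construction possible without parsing YAML.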
On the remote cluster (and on-cluster direct launches), configs always live at the canonical paths:
```
configs/
  inference_configs/
    inf_0_my_run.yaml
    inf_1_my_run.yaml
  eval_configs/
    eval_0_my_run.yaml
    eval_1_my_run.yaml
```
When launching from a local machine, configs are first generated into a unique
temporary directory (configs/_gen_XXXXXX/) so that concurrent launches never
collide. Inside the rsync lock, the temp directory contents are staged
(moved) into the canonical configs/inference_configs/ and
configs/eval_configs/ paths, rsync'd to the cluster, and then cleaned up
locally.
```
generate_configs()         → configs/_gen_abc123/{inference_configs,eval_configs}/
stage_configs_for_sync()   → configs/{inference_configs,eval_configs}/  (inside lock)
rsync                      → $RUN_DIR/configs/{inference_configs,eval_configs}/
cleanup_staged_configs()   → local canonical dirs removed               (inside lock)
```
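A sketch of the generate-then-stage idea in Python; the function names match the flow above, but the bodies are illustrative, not the launcher's actual implementation:

```python
import shutil
import tempfile
from pathlib import Path

def generate_configs(configs_root: Path) -> Path:
    """Generate into a unique temp dir so concurrent launches never collide."""
    gen_dir = Path(tempfile.mkdtemp(prefix="_gen_", dir=configs_root))
    for sub in ("inference_configs", "eval_configs"):
        (gen_dir / sub).mkdir()
    return gen_dir  # write inf_*.yaml / eval_*.yaml into these subdirs

def stage_configs_for_sync(gen_dir: Path, configs_root: Path) -> None:
    """Move generated configs to the canonical paths (call while holding the lock)."""
    for sub in ("inference_configs", "eval_configs"):
        dest = configs_root / sub
        dest.mkdir(exist_ok=True)
        for f in (gen_dir / sub).iterdir():
            shutil.move(str(f), str(dest / f.name))
    shutil.rmtree(gen_dir)
```

Generation can run in parallel across launches because each writes only inside its own temp dir; only the move into the shared canonical paths needs serialising.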
When running directly on the cluster, no temp directory is needed. Configs are generated straight into the canonical paths and used in-place by the SLURM array jobs.
All three launcher scripts accept `--sweeper` and `--override`:
```bash
./slurm_utils/launch_protein_binder_search_from_local_docker.sh \
    --sweeper configs/sweeps/beam_width.yaml \
    --override generation.task_name=22_DerF21
```

The on-cluster script works identically (no rsync needed):

```bash
./slurm_utils/launch_protein_binder_search_conda.sh \
    --sweeper configs/sweeps/beam_width.yaml \
    --override generation.task_name=22_DerF21
```

The multi-target loop wrapper also passes these through:

```bash
./slurm_utils/launch_protein_binder_search_target_loop_conda.sh \
    --sweeper configs/sweeps/beam_width.yaml \
    ./my_targets.txt
```

The `TARGET_TASK` positional argument is still supported for backward compatibility. Internally it is converted to `--override generation.task_name=$TARGET_TASK`.
The system is designed for safe concurrent launches:
- Local config isolation: Each launch generates configs in a unique `mktemp -d configs/_gen_XXXXXX` directory. Before rsync (inside the lock), configs are staged to the canonical `configs/inference_configs/` and `configs/eval_configs/` paths, rsync'd, then cleaned up. This means the generation phase is fully parallel, and the staging+rsync phase is serialised by the lock.
- Remote directory uniqueness: The run name on the cluster includes the sweeper file basename (e.g., `my_run-search-beam_width-22_DerF21`), making collisions nearly impossible for different sweeps.
- Lock file for rsync: The `.lock` file prevents simultaneous rsync operations from corrupting the code copy.
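One way a `.lock` file like this can work is a non-blocking `flock(2)` acquisition; this is a sketch of the idea, not necessarily the launchers' actual mechanism (a real launcher would typically block or retry rather than give up):

```python
import fcntl

def try_rsync_lock(lock_path: str):
    """Take the exclusive rsync lock, or return None if another launch holds it.

    The returned file object must stay open for the duration of staging + rsync;
    closing it releases the lock.
    """
    handle = open(lock_path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle
```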
Example: launching two experiments in parallel from different terminals:
```bash
# Terminal 1: sweep beam widths for target A
./slurm_utils/launch_protein_binder_search_from_local_docker.sh \
    --sweeper configs/sweeps/beam_width.yaml \
    --override generation.task_name=22_DerF21

# Terminal 2: sweep replicas for target B (safe to run simultaneously)
./slurm_utils/launch_protein_binder_search_from_local_docker.sh \
    --sweeper configs/sweeps/search_replicas.yaml \
    --override generation.task_name=02_PDL1
```

The core Python modules (generate, filter, evaluate, analyze) can still be run independently with Hydra `++key=value` overrides:
```bash
# Run just evaluation on existing inference outputs
python -m proteinfoundation.evaluate \
    --config-path /path/to/eval_configs \
    --config-name eval_0_22_DerF21_my_run \
    ++eval_njobs=4
```

The sweep system only affects config generation and the bash pipeline orchestration. It does not change how individual modules consume configs.
| Old | New |
|---|---|
| Edit `sweeper = {...}` dict in Python | Create a YAML file in `configs/sweeps/` |
| `--target_task_override 22_DerF21` | `--override generation.task_name=22_DerF21` |
| `--unified` flag | Removed (always unified pipeline mode) |
| `--eval_config_name evaluate` | Removed (eval derived from pipeline config) |
| `--infer_config_name search_binder` | `--config_name search_binder_pipeline` |
| `generate_binder_inference_configs.py` | `generate_inference_configs.py` |
| Hardcoded `ALPHA_PROTEO_TARGETS` list | Removed (use sweep or override) |
| `rm -rf configs/inference_configs` | Automatic: temp dir + staging (concurrent-safe) |
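For migrating an old in-Python `sweeper = {...}` dict, a small helper can emit the equivalent `configs/sweeps/` YAML; this emitter is a hypothetical convenience, not part of the repo:

```python
def sweeper_dict_to_yaml(sweeper: dict) -> str:
    """Render an old-style sweeper dict as a configs/sweeps/ YAML file."""
    lines = []
    for key, values in sweeper.items():
        lines.append(f"{key}:")
        # Scalars are accepted too, mirroring the single-element wrapping rule.
        values = values if isinstance(values, list) else [values]
        for v in values:
            rendered = str(v).lower() if isinstance(v, bool) else v
            lines.append(f"  - {rendered}")
    return "\n".join(lines) + "\n"
```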
The sweep system is covered by `tests/test_sweep.py` (92 tests):

```bash
pytest tests/test_sweep.py -v
```

Test categories:

- `TestParseScalar`/`TestParseOverride`: Value parsing and type inference
- `TestLoadSweeperFile`: YAML loading and validation
- `TestBuildSweeper`: Merge logic and conflict resolution
- `TestDotKeyDictToNestedDict`: Dot-notation conversion
- `TestGetTaskNameFromConfig`/`TestCreateEvalConfig`: Config helpers
- `TestApplySweeperAndSaveConfigs`: End-to-end cartesian product, value propagation, unique paths, filename conventions
- `TestConcurrentSafety`: Parallel generation without collisions
- `TestBuildParser`: CLI argument parser validation
- `TestCreateEvalConfigNjobs`: `eval_njobs` priority regression tests
- `TestCreateEvalConfigPaths`: Path derivation edge cases
- `TestEmptySweepAxis`: Zero-config guard regression tests
- `TestConfigFilenameContract`: Filename contract between Python and bash
- `TestEvalPathEdgeCases`: Eval path derivation with adversarial inputs