Skip to content

Fix configparser API bug and add analysis_summary_file support in sgh_restart_ensemble#6

Draft
Copilot wants to merge 2 commits into
sgh_ensemble_generator_templatefrom
copilot/add-analysis-summary-file-option
Draft

Fix configparser API bug and add analysis_summary_file support in sgh_restart_ensemble#6
Copilot wants to merge 2 commits into
sgh_ensemble_generator_templatefrom
copilot/add-analysis-summary-file-option

Conversation

Copy link
Copy Markdown

Copilot AI commented Mar 31, 2026

configure() in sgh_restart_ensemble/test_case.py had a broken configparser usage pattern and no way to consume an external analysis_summary.json produced by sgh_ensemble_analysis. This PR fixes both.

Changes

  • Fix configparser API bug: config.get('restart_ensemble', {}) was treating the result as a dict — the second arg to MpasConfigParser.get() is an option name, not a default. All config reads now use the correct config.get(section, option) / config.getint(...) / config.getfloat(...) / config.getboolean(...) form with try/except (configparser.NoSectionError, configparser.NoOptionError) fallbacks.

  • Add analysis_summary_file config option: New optional key under [restart_ensemble]. When set, loads restart_needed_runs from the referenced JSON and pre-filters the run scan to only those run numbers — skipping all others before the per-run analysis_results.json logic runs. When absent, behavior is unchanged.

[restart_ensemble]
spinup_work_dir = /work/spinup/sgh_restart_ensemble
analysis_summary_file = /work/analysis/sgh_ensemble_analysis/analysis_summary.json  # optional
max_consecutive_restarts = 3
min_simulation_years_before_restart = 50.0
auto_restart_incomplete = True

Checklist

  • User's Guide has been updated
  • Developer's Guide has been updated
  • API documentation in the Developer's Guide (api.rst) has any new or modified class, method and/or functions listed
  • Documentation has been built locally and changes look as expected
  • The E3SM-Project submodule has been updated with relevant E3SM changes
  • The MALI-Dev submodule has been updated with relevant MALI changes
  • Document (in a comment titled Testing in this PR) any testing that was used to verify the changes
  • New tests have been added to a test suite
Original prompt

Background

sgh_ensemble_analysis writes an analysis_summary.json file into its own work directory. Currently, sgh_restart_ensemble/test_case.py's configure() has no way to read this file — it only checks per-run analysis_results.json files inside each run*/ directory. Users need a way to point the restart ensemble at an external analysis_summary.json.

Changes required

1. compass/landice/tests/ensemble_generator/sgh_restart_ensemble/test_case.py

Add an optional analysis_summary_file config option to [restart_ensemble]. When provided, use it to pre-filter which runs are eligible for restart (only runs listed in restart_needed_runs in the summary JSON). When not provided, fall back to the existing per-run analysis_results.json logic.

Also fix the existing configparser API bug (separate issue already in another PR, but needs to be applied here too): config.get('restart_ensemble', {}) is wrong — the second argument to MpasConfigParser.get() is an option name (string), not a default value. Replace all config reads with the proper config.get(section, option) / config.getint(...) / config.getfloat(...) / config.getboolean(...) pattern, with try/except fallbacks for optional options.

The updated configure() method should:

def configure(self):
    config = self.config

    # Read required spinup_work_dir
    try:
        spinup_work_dir = config.get('restart_ensemble', 'spinup_work_dir')
    except Exception:
        raise ValueError(
            "restart_ensemble config must specify spinup_work_dir\n"
            "Add to config file:\n"
            "[restart_ensemble]\n"
            "spinup_work_dir = /path/to/spinup/ensemble"
        )

    if not spinup_work_dir:
        raise ValueError(
            "restart_ensemble config must specify spinup_work_dir\n"
            "Add to config file:\n"
            "[restart_ensemble]\n"
            "spinup_work_dir = /path/to/spinup/ensemble"
        )

    if not os.path.exists(spinup_work_dir):
        raise ValueError(f"spinup_work_dir not found: {spinup_work_dir}")

    # Read optional analysis_summary_file
    analysis_summary_file = None
    try:
        val = config.get('restart_ensemble', 'analysis_summary_file')
        if val and val.lower() not in ('none', ''):
            analysis_summary_file = val
    except Exception:
        pass

    # If provided, validate it exists
    if analysis_summary_file and not os.path.exists(analysis_summary_file):
        raise ValueError(
            f"analysis_summary_file not found: {analysis_summary_file}")

    # Load the set of run numbers flagged for restart from the summary file
    summary_restart_runs = None
    if analysis_summary_file:
        with open(analysis_summary_file, 'r') as f:
            summary = json.load(f)
        summary_restart_runs = set(summary.get('restart_needed_runs', []))
        print(f"Loaded {len(summary_restart_runs)} restart candidates "
              f"from {analysis_summary_file}")

    # Read optional restart configuration
    try:
        max_consecutive_restarts = config.getint(
            'restart_ensemble', 'max_consecutive_restarts')
    except Exception:
        max_consecutive_restarts = 3

    try:
        min_simulation_years = config.getfloat(
            'restart_ensemble', 'min_simulation_years_before_restart')
    except Exception:
        min_simulation_years = 50.0

    try:
        auto_restart = config.getboolean(
            'restart_ensemble', 'auto_restart_incomplete')
    except Exception:
        auto_restart = True

    # Scan for existing run directories
    run_dirs = sorted(glob.glob(os.path.join(spinup_work_dir, 'run*')))

    restart_runs = []
    skipped_runs = []

    for run_dir in run_dirs:
        run_name = os.path.basename(run_dir)
        try:
            run_num = int(run_name.replace('run', ''))
        except ValueError:
            continue

        # If a summary file was provided, only consider runs listed there
        if summary_restart_runs is not None and run_num not in summary_restart_runs:
            skipped_runs.append((run_num, "Not in analysis_summary restart list"))
            continue

        should_restart, reason = self._should_restart_run(
            run_dir=run_dir,
            run_num=run_num,
            min_years=min_simulation_years,
            max_restarts=max_consecutive_restarts,
            auto_restart=auto_restart
        )

        if should_restart:
            restart_runs.append(run_num)
            print(f"Scheduling restart for {run_name}")

            self.add_step(InPlaceRestartMember(
                test_case=self,
                run_num=run_num,
                spinup_work_dir=spinup_work_dir
            ))
        else:
            if reason:
                skipped_runs.append((run_num, reason))

    if skipped_runs:
        print("\nSkipped runs:")
        for run_num, reason in skipped_runs:
            print(f"  run{run_num:...

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

*This pull request was created from Copilot chat.*
>

…_restart_ensemble

Agent-Logs-Url: https://github.com/alexolinhager/compass/sessions/e99b6809-04fb-4972-afe2-92ebe441f1c9

Co-authored-by: alexolinhager <131483939+alexolinhager@users.noreply.github.com>
Copilot AI changed the title [WIP] Add optional analysis summary file config for restart ensemble Fix configparser API bug and add analysis_summary_file support in sgh_restart_ensemble Mar 31, 2026
Copilot AI requested a review from alexolinhager March 31, 2026 23:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants