139 changes: 80 additions & 59 deletions fre/make/create_checkout_script.py
@@ -1,16 +1,12 @@
'''
FRE Checkout Script Generator

Retrieves information from the resolved YAML configuration to generate a
checkout.sh script that git clones the model source code.

The checkout script will clone component repositories defined in the
compile YAML to build the model.

Note, a bare-metal build defaults to a parallel checkout.
A container build defaults to a non-parallel checkout.

'''
"""
Create_checkout_script provides methods to generate a checkout.sh script from a YAML configuration
file, where checkout.sh git clones all component source repositories listed under the
src key of the compile YAML.

The method checkout_create is the entry point called by fre make checkout-script and
fre make all. Checkout_create then calls baremetal_checkout_write for a bare-metal platform or
container_checkout_write for a container platform to write checkout.sh.
"""
import shutil
from pathlib import Path
from datetime import datetime
@@ -26,23 +22,31 @@
def baremetal_checkout_write(model_yaml: yamlfre.freyaml, src_dir: str, jobs: str,
parallel_cmd: str, execute: bool):
"""
This function baremetal_checkout_write is called by checkout_create in order to

- Extract compilation specifications from the parsed YAML configuration
- Generate a checkout script to the source directory. The source directory is
defined within the 'modelRoot' variable in the "platforms" section of the combined YAML


:param model_yaml: "freyaml" class object containing a parsed and validated yaml dictionary
containing the "compile" specification
Baremetal_checkout_write generates the checkout.sh script and optionally executes the script to
git clone the component repositories in preparation for model compilation.

Called by checkout_create for each bare-metal platform, this method:
- reads the compile section of the resolved YAML to determine source repositories to clone,
- writes checkout.sh into src_dir,
- optionally executes checkout.sh afterwards.

:param model_yaml: is the parsed and validated YAML object containing the compile
specifications (source repositories, experiment name, etc.).
:type model_yaml: yamlfre.freyaml
:param src_dir: Absolute directory path to git clone the source code
:param src_dir: is the absolute path of the directory where checkout.sh will be
written and where the source repositories will be cloned. Typically,
src_dir = [modelRoot]/[experiment]/src where modelRoot is defined in
platforms.yaml.
:type src_dir: str
:param jobs: Number of git submodules to clone simultaneously (TO CLARIFY)
:param jobs: is the number of git submodules to fetch simultaneously, passed to git clone
--jobs (relevant only if the component repository contains submodules).
:type jobs: str
:param parallel_cmd: Set to " &" for parallel checkouts and "" for non-parallel checkouts
:param parallel_cmd: is the shell suffix appended to each git clone command to control
concurrency. Pass " &" to background each clone (parallel
checkout) or "" to clone sequentially.
:type parallel_cmd: str
:param execute: If True, run the generated checkout.sh
:param execute: is a flag where if True, checkout.sh is executed immediately after creation.
:type execute: bool
"""
fre_checkout = checkout.checkout("checkout.sh", src_dir)
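The interaction between jobs and parallel_cmd described in the docstring above can be sketched as follows. This is only an illustration under stated assumptions: the helper render_clone_lines and the repository URLs are hypothetical and are not part of fre-cli, and the trailing wait is an assumption about how a script with backgrounded clones would typically end. Only git clone --recursive --jobs=<n> -b <branch> is a real git invocation.

```python
def render_clone_lines(repos, jobs, parallel_cmd):
    """Return shell lines that clone each (url, branch, dest) triple.

    Hypothetical helper for illustration; not the fre-cli implementation.
    """
    lines = ["#!/bin/sh"]
    for url, branch, dest in repos:
        # parallel_cmd is appended verbatim:
        # " &" backgrounds the clone, "" leaves it running in sequence.
        lines.append(
            f"git clone --recursive --jobs={jobs} -b {branch} {url} {dest}{parallel_cmd}"
        )
    if parallel_cmd.strip() == "&":
        # Assumption: block until every backgrounded clone has finished.
        lines.append("wait")
    return lines


# Placeholder repositories, not real component locations.
repos = [
    ("https://example.org/FMS.git", "main", "FMS"),
    ("https://example.org/atmos_drivers.git", "main", "atmos_drivers"),
]
parallel = render_clone_lines(repos, jobs="4", parallel_cmd=" &")
serial = render_clone_lines(repos, jobs="4", parallel_cmd="")
```

With parallel_cmd=" &" every clone line ends in " &" and the script closes with wait; with parallel_cmd="" the same lines run one after another.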
@@ -60,23 +64,28 @@ def baremetal_checkout_write(model_yaml: yamlfre.freyaml, src_dir: str, jobs: st
def container_checkout_write(model_yaml: yamlfre.freyaml, src_dir: str, tmp_dir: str,
jobs: str, parallel_cmd: str):
"""
This function container_checkout_write is called by checkout_create in order to

- Extract compilation specifications from the parsed YAML configuration
- Generate a checkout script in a local ./tmp directory, where it will later be
copied to the directory of the container image filesystem for execution

:param model_yaml: "freyaml" class object containing a parsed and validated yaml dictionary
containing the "compile" specification
Container_checkout_write generates checkout.sh for a container build.

Called by checkout_create for each container platform, this method
writes checkout.sh into a temporary directory on the host (tmp/[platform-name]/)
where the script eventually will be COPY-ed to the container image filesystem.
The script will be executed during the container build to git clone the component
repositories serially.

:param model_yaml: is the parsed and validated YAML object containing the compile
specifications (source repositories, experiment name, etc.).
:type model_yaml: yamlfre.freyaml
:param src_dir: Internal path for source code in the running container. The source directory is
defined within the 'modelRoot' variable in the "platforms" section of the combined YAML
:param src_dir: is the source-code path inside the running container where repositories will
be cloned. Defined by modelRoot in platforms.yaml.
:type src_dir: str
:param tmp_dir: Temporary directory (outside of container) that hosts the created checkout script
:param tmp_dir: is the local temporary directory on the host (outside the container) where
checkout.sh is staged before being COPY-ed into the image.
Typically tmp/[platform-name].
:type tmp_dir: str
:param jobs: Number of git submodules to clone simultaneously (TO CLARIFY)
:param jobs: is the number of git submodules to fetch simultaneously, passed to git clone
--jobs.
:type jobs: str
:param parallel_cmd: Since container builds are not parallelized, set to ""
:param parallel_cmd: is not used in this method, because container checkouts run serially,
and should be removed.
:type parallel_cmd: str
"""
fre_checkout = checkout.checkoutForContainer("checkout.sh", src_dir, tmp_dir)
@@ -88,31 +97,43 @@ def checkout_create(yamlfile: str, platform: tuple, target: tuple,
no_parallel_checkout: Optional[bool] = None, njobs: int = 4,
execute: Optional[bool] = False, force_checkout: Optional[bool] = False):
"""
Calls baremetal_checkout_write or container_checkout_write to create checkout.sh
for baremetal or container builds, respectively.
Checkout_create is the entry point for fre make checkout-script. The method resolves
the YAML configuration and calls baremetal_checkout_write or container_checkout_write
for each specified platform.

:param yamlfile: Model YAML file path
:param yamlfile: is the path to the model YAML configuration file (e.g. am5.yaml).
The experiment name is derived by stripping the .yaml extension.
:type yamlfile: str
:param platform: FRE platform(s) that are defined in the platforms.yaml
:type platform: tuple
:param target: Predefined FRE target(s)
:type target: tuple
:param no_parallel_checkout: Option to disable parallel checkouts
:type no_parallel_checkout: bool
:param njobs: Used in the recursive clone; number of submodules to fetch simultaneously (default 4) (TO CLARIFY)
:param platform: is one or more FRE platform strings as defined in platforms.yaml.
:type platform: tuple[str]
:param target: is one or more mkmf target strings (e.g. debug, repro, prod).
:type target: tuple[str]
:param no_parallel_checkout: is a flag where if True, component repositories are git cloned sequentially.
Defaults to None, which leaves parallel checkout enabled for bare-metal builds.
Not used for container builds.
:type no_parallel_checkout: bool, optional
:param njobs: is the number of git submodules to fetch simultaneously, passed to
git clone --jobs. Defaults to 4.
:type njobs: int
:param execute: If True, run checkout.sh
:type execute: bool
:param force_checkout: If True, for bare-metal build: add timestamp to source directory and create a new checkout script
If True, for container build: overwrite locally existing checkout script before COPY-ing to the
container image filesystem
:type force_checkout: bool
:param execute: If True, execute checkout.sh immediately after writing it
(bare-metal only). Defaults to False.
:type execute: bool, optional
:param force_checkout: is a flag that controls behavior when checkout.sh already exists.
For bare-metal build, renames the existing src directory with a
YYYYmmdd.HHMMSS timestamp suffix, then writes a fresh
checkout.sh in a new src directory.
For container build, deletes the existing tmp/[platform]/checkout.sh
and writes a new one in its place.
Defaults to False.
:type force_checkout: bool, optional

:raises ValueError:
- If 'njobs' is not an integer
- If 'platform' does not exist in the platforms.yaml configuration
:raises OSError: If executing checkout.sh returns an error

- If njobs is passed as a boolean while --execute is also set (ambiguous
intent — njobs must be an explicit integer).
- If a specified platform name does not exist in platforms.yaml.
:raises OSError: If --execute is set and running an existing checkout.sh
returns a non-zero exit code (e.g. the source directory already
contains conflicting content).
"""
# Standardize inputs
jobs_str = str(njobs)
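The bare-metal force_checkout behavior described in the docstring (rename the existing src directory with a YYYYmmdd.HHMMSS suffix, then start from a fresh src directory) can be sketched as below. The helper rotate_src_dir and the exact naming pattern src.<timestamp> are assumptions made for illustration; the diff does not show how checkout_create builds the new name.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def rotate_src_dir(src_dir: Path) -> Path:
    """Rename an existing src_dir with a timestamp suffix and recreate it empty.

    Hypothetical helper; returns the path of the renamed (archived) directory.
    """
    stamp = datetime.now().strftime("%Y%m%d.%H%M%S")
    # Assumption: the suffix is joined to the directory name with a dot.
    archived = src_dir.with_name(f"{src_dir.name}.{stamp}")
    src_dir.rename(archived)
    # A fresh, empty src directory is where the new checkout.sh would be written.
    src_dir.mkdir(parents=True)
    return archived


with tempfile.TemporaryDirectory() as root:
    src = Path(root) / "am5" / "src"
    src.mkdir(parents=True)
    (src / "checkout.sh").write_text("#!/bin/sh\n")

    archived = rotate_src_dir(src)
    old_script_kept = (archived / "checkout.sh").exists()
    fresh_dir_empty = not any(src.iterdir())
```

The previous checkout.sh survives inside the archived directory, so nothing is deleted on the bare-metal path.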
82 changes: 55 additions & 27 deletions fre/make/create_compile_script.py
@@ -1,20 +1,33 @@
'''
Retrieves information from the resolved YAML configuration to generate the compile.sh
in the ``[modelRoot]/[experiment name]/[platform-target]/exec`` directory, where
"""
Create_compile_script retrieves information from the resolved YAML configuration to generate compile.sh
for bare-metal builds. The method compile_create is the entry point called by fre make all and
fre make compile-script.

- ``modelRoot`` is defined in the `platforms.yaml`
- ``experiment name`` is defined in `compile.yaml`
- ``platform`` and ``target`` are passed via Click options
The generated script is written to::

The compile.sh script
[modelRoot]/[experiment]/[platform]-[target]/exec/compile.sh

1. Sets the ``src_dir``
2. Sets the ``bld_dir``
3. Sets the ``mkmf_template``
4. Loads/unloads modules to set-up the compile environment
5. Calls ``mkmf`` to generate Makefiles for each model component defined in the `compile.yaml`
6. Calls ``make`` to generate the model executable
'''
where

- modelRoot is defined in platforms.yaml
- experiment is the basename of the model YAML file (e.g. am5 from am5.yaml)
- platform and target are passed via the -p / -t CLI options
to fre make compile-script and fre make all

When executed, compile.sh

1. Sets src_dir (where source code was checked out by checkout.sh)
2. Sets bld_dir (the exec/ directory where the executable is placed)
3. Sets the path to the mkmf template (mkTemplate from platforms.yaml)
4. Loads the correct environment modules to set the compile environment
(see envSetup from platforms.yaml)
5. Calls mkmf for each model component listed under src in compile.yaml
to generate per-component Makefiles
6. Calls make (with -j [makejobs]) to build the model executable

Container platforms are silently skipped — compilation inside a container image
is handled by the Dockerfile generated by fre make dockerfile.
"""

import logging
from multiprocessing.dummy import Pool
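The output location given in the module docstring, [modelRoot]/[experiment]/[platform]-[target]/exec/compile.sh, can be made concrete with a short sketch. The function compile_script_path and the example model root are hypothetical; the experiment name is taken as the YAML basename without its extension, as the docstring states.

```python
from pathlib import Path


def compile_script_path(model_root: str, yamlfile: str, platform: str, target: str) -> Path:
    """Build the compile.sh path for one platform/target pair (illustrative only)."""
    # Experiment name: basename of the model YAML without the .yaml extension.
    experiment = Path(yamlfile).stem
    return Path(model_root) / experiment / f"{platform}-{target}" / "exec" / "compile.sh"


# "/work/user/fre" stands in for a modelRoot value from platforms.yaml.
path = compile_script_path("/work/user/fre", "configs/am5.yaml", "ncrc5.intel23", "prod")
```

One such path exists per platform/target pair, so two platforms and two targets give four compile.sh scripts.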
@@ -38,26 +51,41 @@ def compile_create(yamlfile:str, platform:tuple[str], target:tuple[str], makejob
nparallel: int = 1, execute: Optional[bool] = False,
verbose: Optional[bool] = None):
"""
This function compile_create generates the compile script for bare-metal build.
Generates the compile.sh script for each bare-metal platform and target combination and
optionally executes compile.sh to compile a model executable.

For each bare-metal platform in the platform yaml, a compile.sh is written to
[modelRoot]/[experiment]/[platform]-[target]/exec/. Container platforms are
silently skipped here; their compilation is handled by the Dockerfile produced by
``dockerfile_create``.

:param yamlfile: Model compile YAML file
:param yamlfile: is the path to the model YAML file (e.g. am5.yaml). The experiment
name is derived by stripping the `.yaml` extension.
:type yamlfile: str
:param platform: FRE platform; defined in the platforms yaml
:type platform: tuple of strings
:param target: Predefined FRE targets
:type target: tuple of strings
:param makejobs: Number of recipes from the Makefile to run in parallel (default 4);
corresponds to -j option in make
:param platform: is one or more FRE platform strings as defined in the platform yaml
(e.g. ncrc5.intel23). Container platforms in this tuple are
silently ignored.
:type platform: tuple[str]
:param target: is one or more mkmf target strings (e.g. prod, debug,
repro, prod-openmp). One compile.sh is generated per
platform/target pair.
:type target: tuple[str]
:param makejobs: is the number of Makefile recipes to run simultaneously, passed to
make -j. Defaults to 4.
:type makejobs: int
:param nparallel: Number of compile.sh scripts to run in parallel (default 1)
:param nparallel: is the number of compile.sh scripts to run concurrently when
execute=True. Defaults to 1 (sequential execution).
:type nparallel: int
:param execute: If True, execute the created compile.sh script to build a model executable
:param execute: is a flag where if True, run every generated compile.sh after creation.
Defaults to False.
:type execute: bool
:param verbose: If True, increase verbosity output
:param verbose: is a flag where if True, set logger level to "DEBUG" for detailed output.
Defaults to None, which keeps the logger level at "INFO".
:type verbose: bool

:raises ValueError:
- Error if platform does not exist in platforms yaml configuration
- Error if the mkmf template defined in platforms yaml does not exist
- If a specified platform does not exist in the platforms yaml
- If the mkTemplate path defined in the platforms yaml does not exist
"""

# Define variables
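The nparallel parameter, together with the module's import of multiprocessing.dummy.Pool, suggests a thread pool that runs several compile.sh scripts at once. The sketch below shows that pattern only; run_script, run_all, and the injectable runner argument are hypothetical, and the diff does not show how compile_create actually drives the pool.

```python
import subprocess
from multiprocessing.dummy import Pool  # thread-backed Pool with the multiprocessing.Pool API


def run_script(script: str) -> int:
    """Run one script with sh and return its exit code."""
    return subprocess.run(["sh", script], check=False).returncode


def run_all(scripts, nparallel=1, runner=run_script):
    """Run every script, at most nparallel at a time; results keep input order."""
    with Pool(processes=nparallel) as pool:
        return pool.map(runner, scripts)


# Demonstration with a stand-in runner so that no real compile.sh is needed.
codes = run_all(
    ["a-prod/exec/compile.sh", "a-debug/exec/compile.sh"],
    nparallel=2,
    runner=lambda script: 0,
)
```

With nparallel=1 the pool has a single worker, which matches the documented default of sequential execution.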
73 changes: 50 additions & 23 deletions fre/make/create_docker_script.py
@@ -1,13 +1,29 @@
'''
Generates a Dockerfile and an accompanying createContainer.sh script that
builds a Docker image containing the compiled model executable and the library
dependencies from the generated Dockerfile. Unless specified,
createContainer.sh will convert the Docker OCI image to a Singularity image
file format (.sif) that can be launched with Singularity/Apptainer.

Note, once the container image is built, the source code and the compiled
executable cannot be modified.
'''
"""
Create_docker_script provides methods to generate a Dockerfile and
an accompanying createContainer.sh script to build container images.
The method dockerfile_create is the entry point called by
fre make dockerfile and by fre make all.

The Dockerfile uses a two-stage build:

1. Build stage — starts from the base container image (image field in
platforms yaml), copies in the checkout.sh and Makefile that were
staged under tmp/[platform]/ by fre make checkout-script and
fre make makefile, runs mkmf and make to compile the model
executable.
2. Runtime stage — copies the compiled executable and its runtime dependencies
into a leaner second base image (containerBase2 in platforms.yaml).

createContainer.sh builds the container image and, unless --no-format-transfer
is specified, converts it to a Singularity Image File (.sif) that can be
launched with Singularity/Apptainer on HPC systems.

.. note::
Once a container image is built, the source code and compiled executable
inside it cannot be modified. To incorporate source changes, re-run
fre make all (or the individual sub-commands) and rebuild the image.
"""

import logging
import os
@@ -29,24 +45,35 @@
def dockerfile_create(yamlfile: str, platform: tuple[str], target: tuple[str],
execute: bool = False, no_format_transfer: bool = False):
"""
This function dockerfile_create creates a Dockerfile and
an accompanying createContainer.sh script that builds a container image containing
the compiled model executable and the library dependencies

:param yamlfile: model compile YAML file
Dockerfile_create generates a Dockerfile and createContainer.sh for each container platform/target
combination and optionally executes the build script to produce a container image.

fre make checkout-script and fre make makefile should be invoked
beforehand to stage the checkout.sh script and Makefile in tmp/[platform-name]/.

:param yamlfile: is the path to the model YAML file (e.g. am5.yaml). The experiment
name is derived by stripping the .yaml extension.
:type yamlfile: str
:param platform: FRE container-specific platform(s) that are defined in platforms.yaml
:type platform: tuple(str)
:param target: Predefined FRE targets
:type target: tuple(str)
:param execute: If true, execute createContainer.sh to build the container image
:param platform: is one or more FRE platform strings as defined in platforms.yaml.
Only container platforms (container: true) are processed; bare-metal
platforms are skipped.
:type platform: tuple[str]
:param target: is one or more mkmf target strings (e.g. prod, repro, debug).
One Dockerfile is generated per platform/target pair.
:type target: tuple[str]
:param execute: is a flag where if True, run createContainer.sh immediately after generation
to build the container image. Defaults to False.
:type execute: bool
:param no_format_transfer: if True, skip container image format conversion to a .sif file
:param no_format_transfer: is a flag where if True, skip the OCI-to-Singularity (.sif) format
conversion step in createContainer.sh. Defaults to
False.
:type no_format_transfer: bool
:raises ValueError: Error if platform does not exist in platforms.yaml

.. note:: If building the container image on GFDL's RDHPCS GAEA with the Podman container engine,
please submit a GFDL helpdesk ticket to request Podman access
:raises ValueError: If a specified platform does not exist in platforms.yaml.

.. note:: If building the container image on GFDL's RDHPCS GAEA with the Podman
container engine, submit a GFDL helpdesk ticket to request Podman access
before running this command.
"""

## Split and store the platforms and targets in a list
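The selection rule in the docstring (only container platforms are processed, one Dockerfile per platform/target pair, ValueError for an unknown platform) can be sketched as below. The dictionary platforms_cfg is a simplified, hypothetical stand-in for the parsed platforms.yaml, and container_pairs is not a fre-cli function.

```python
from itertools import product

# Hypothetical, simplified stand-in for entries parsed from platforms.yaml.
platforms_cfg = {
    "ncrc5.intel23": {"container": False},
    "container.example": {"container": True},
}


def container_pairs(platforms, targets, cfg):
    """Return the (platform, target) pairs that would each get a Dockerfile."""
    pairs = []
    for name, target in product(platforms, targets):
        if name not in cfg:
            raise ValueError(f"{name} does not exist in platforms.yaml")
        # Bare-metal platforms are skipped; only container platforms are kept.
        if cfg[name].get("container"):
            pairs.append((name, target))
    return pairs


pairs = container_pairs(
    ("ncrc5.intel23", "container.example"), ("prod", "debug"), platforms_cfg
)
```

Here the bare-metal platform contributes nothing, and the container platform yields one pair per target.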