
--ci-generate pipeline fails when using a test fixture #3344

Open
yandthj opened this issue Dec 11, 2024 · 1 comment
yandthj commented Dec 11, 2024

We have the following test:

import os

import reframe as rfm
import reframe.utility as util
import reframe.utility.sanity as sn

ValidateResources = util.import_from_module('..utils', 'ValidateResources')

@rfm.simple_test
class build_ior(rfm.CompileOnlyRegressionTest):
    descr = 'Compile IOR'
    valid_systems = ['*']
    valid_prog_environs = ['*']
    build_system = 'Make'
    ior_dir = variable(str)

    @run_before('compile')
    def prepare_build(self):
        self.ior_dir = 'ior-3.3.0'
        self.prebuild_cmds += [
            'wget https://github.com/hpc/ior/releases/download/3.3.0/ior-3.3.0.tar.gz',
            f'tar -xzf {self.ior_dir}.tar.gz',
            f'cd {self.ior_dir}',
            f'./configure CC={self.current_environ.cc} MPICC={self.current_environ.cc}',
        ]

    @sanity_function
    def validate_build(self):
        # If compilation fails, the test would fail in any case, so nothing to
        # further validate here.
        return True


@rfm.simple_test
class IOR_cpu(rfm.RunOnlyRegressionTest, ValidateResources):
    descr = 'IOR benchmark'
    valid_systems = ['*']
    valid_prog_environs = ['*']
    executable_opts = ['-v -a POSIX -g -w -r -e -C -F -b 4000m -t 1m -s 1']
    n_nodes = variable(int, value=1)
    mem = 0
    filesystem = parameter(['scratch', 'projects'])
    time_limit = '0d0h15m0s'
    ior_build = fixture(build_ior, scope='environment')

    @run_before('compile')
    def set_executable(self):
        self.executable = os.path.join(
            self.ior_build.stagedir,
            self.ior_build.ior_dir,
            'src', 'ior'
        )

    @run_before('run')
    def set_run_dir(self):
        username = os.getlogin()
        if self.filesystem in ('scratch', 'projects'):
            run_dir = f'/{self.filesystem}/{username}'
        else:
            raise ValueError(f'unsupported filesystem: {self.filesystem}')
        self.prerun_cmds += [f'[ ! -d {run_dir} ] && mkdir {run_dir}']
        # If the run directory wasn't created, the mkdir and cd commands fail,
        # but ReFrame would proceed to run the test in the staging directory,
        # which is not what we want.  Instead, we use "exit" to signal an error.
        self.prerun_cmds += [f'[ ! -d {run_dir} ] && exit 1', f'cd {run_dir}', 
                             'mkdir IOR_$SLURM_JOB_ID', 'cd IOR_$SLURM_JOB_ID']
        self.postrun_cmds += ['cd ../', 'rm -r IOR_$SLURM_JOB_ID']

    @sanity_function
    def assert_solution(self):
        return sn.assert_found(r'Finished\s+:.*', self.stdout)

    @performance_function('MiB/s', perf_key='Mean Read')
    def extract_mean_read(self):
        data = sn.extractall(r'read(.*)', self.stdout, 1, str)[-1]
        data_array = str(data).split()
        return float(data_array[2])

    @performance_function('MiB/s', perf_key='Mean Write')
    def extract_mean_write(self):
        data = sn.extractall(r'write(.*)', self.stdout, 1, str)[-1]
        data_array = str(data).split()
        return float(data_array[2]) 
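For context on the two performance functions above: they rely on the column layout of IOR's summary table, where the fields after the operation name are Max(MiB), Min(MiB), Mean(MiB), ..., so index 2 of the split fields is the mean bandwidth. A minimal pure-Python sketch of that parsing logic (the sample output below is illustrative, not real IOR data):

```python
import re

# Illustrative stand-in for self.stdout; the numbers are made up, but the
# column order follows IOR's summary table (Max, Min, Mean, StdDev).
stdout = """\
Operation  Max(MiB)  Min(MiB)  Mean(MiB)  StdDev
write      1100.00   1000.00   1050.00    30.00
read       2100.00   2000.00   2050.00    25.00
"""

def mean_bw(op, text):
    # Same logic as extract_mean_read/extract_mean_write: take the last
    # matching line, split its fields, return the third field as a float.
    fields = re.findall(rf'{op}(.*)', text)[-1].split()
    return float(fields[2])

print(mean_bw('write', stdout))  # 1050.0
print(mean_bw('read', stdout))   # 2050.0
```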

Using --ci-generate=ior-pipeline/pipeline.yml generates the following pipeline:

default:
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
    - rfm-stage/${CI_COMMIT_SHORT_SHA}
stages:
- rfm-stage-0
- rfm-stage-1
build_ior_f18197df:
  stage: rfm-stage-0
  script:
  - reframe --prefix=rfm-stage/${CI_COMMIT_SHORT_SHA} -C system-settings/settings-swift.py -c tests/ior/ior-source.py  --report-file=build_ior_f18197df-report.json  --report-junit=build_ior_f18197df-report.xml  -n /ebc719b6 -r
  artifacts:
    paths:
    - build_ior_f18197df-report.json
  needs: []
IOR_cpu_1:
  stage: rfm-stage-1
  script:
  - reframe --prefix=rfm-stage/${CI_COMMIT_SHORT_SHA} -C system-settings/settings-swift.py -c tests/ior/ior-source.py  --report-file=IOR_cpu_1-report.json --restore-session=build_ior_f18197df-report.json --report-junit=IOR_cpu_1-report.xml  -n /eed258d9 -r
  artifacts:
    paths:
    - IOR_cpu_1-report.json
  needs:
  - build_ior_f18197df
IOR_cpu_0:
  stage: rfm-stage-1
  script:
  - reframe --prefix=rfm-stage/${CI_COMMIT_SHORT_SHA} -C system-settings/settings-swift.py -c tests/ior/ior-source.py  --report-file=IOR_cpu_0-report.json --restore-session=build_ior_f18197df-report.json --report-junit=IOR_cpu_0-report.xml  -n /e057fe9c -r
  artifacts:
    paths:
    - IOR_cpu_0-report.json
  needs:
  - build_ior_f18197df
build_ior:
  stage: rfm-stage-0
  script:
  - reframe --prefix=rfm-stage/${CI_COMMIT_SHORT_SHA} -C system-settings/settings-swift.py -c tests/ior/ior-source.py  --report-file=build_ior-report.json  --report-junit=build_ior-report.xml  -n /c20f7057 -r
  artifacts:
    paths:
    - build_ior-report.json
  needs: []

Jobs IOR_cpu_0 and IOR_cpu_1 fail with this error:
ERROR: run session stopped: reframe error: could not restore testcase ('build_ior_f18197df', 'swift:standard', 'intelmpi'): not found in the report files

No tests are run in the build_ior_f18197df job, so build_ior_f18197df-report.json is empty. We tried both with and without the @rfm.simple_test decorator on build_ior.


vkarak commented Jan 14, 2025

Hi and apologies for the late reply. I cannot reproduce it at the moment since the utils module is missing:

ValidateResources = util.import_from_module('..utils', 'ValidateResources')

Would you mind providing a stand-alone reproducer?
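One way to make the posted file stand-alone is to replace the `..utils` import with an inline no-op mixin. This stand-in is hypothetical, since the real `ValidateResources` is not shown in the issue; the sketch assumes it adds no behavior the reproducer depends on:

```python
# Hypothetical stand-in for the mixin imported from '..utils'.  The real
# ValidateResources is not shown in the issue, so this assumes it adds
# no behavior that the reproducer depends on.
class ValidateResources:
    """No-op placeholder for the site-specific mixin."""


# The test class can then inherit from it unchanged, e.g.:
class IOR_cpu_stub(ValidateResources):
    descr = 'IOR benchmark (stand-alone stub)'


print(issubclass(IOR_cpu_stub, ValidateResources))  # True
```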
