Conversation

@bedroge (Contributor) commented Oct 9, 2025

This also undoes the parallelism limit for CP2K (introduced in #104): I don't think it was required, and it didn't solve the issue in EESSI/software-layer#1220 (comment). Newer QE versions removed the `maxparallel = 1` setting, and it looks like that is what makes the build run out of memory.

To be sure, I'll add an easystack here that builds both QE and CP2K, just to confirm that both build without issues now.
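For context, `maxparallel` is a standard EasyBuild easyconfig parameter that caps the number of parallel build processes. The kind of change under discussion looks roughly like this (an illustrative fragment, not the actual diff from #104):

```python
# Illustrative easyconfig fragment (not the actual QE/CP2K easyconfig).
# Capping the build to a single process avoids out-of-memory failures on
# memory-constrained nodes such as A64FX, at the cost of a much slower build:
maxparallel = 1

# Newer QE easyconfigs dropped this cap, so EasyBuild uses the full core
# count of the build node, which can exhaust memory during compilation.
```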

@bedroge bedroge added the a64fx label Oct 9, 2025
@bedroge (Contributor, Author) commented Oct 9, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx

@eessi-bot-deucalion (bot) commented Oct 9, 2025

New job on instance eessi-bot-deucalion for repository eessi.io-2023.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2025.10/pr_106/581452

| date | job status | comment |
| --- | --- | --- |
| Oct 09 13:55:22 UTC 2025 | submitted | job id 581452 awaits release by job manager |
| Oct 09 13:55:28 UTC 2025 | released | job awaits launch by Slurm scheduler |
| Oct 09 13:56:34 UTC 2025 | running | job 581452 is running |
| Oct 09 14:05:13 UTC 2025 | finished | 😢 FAILURE |
Details
✅ job output file slurm-581452.out
✅ no message matching FATAL:
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-17600183290.tar.gz
size: 0 MiB (21567 bytes)
entries: 1
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/a64fx/software
no software packages in tarball
reprod directories under 2023.06/software/linux/aarch64/a64fx/reprod
no reprod directories in tarball
other under 2023.06/software/linux/aarch64/a64fx
2023.06/init/easybuild/eb_hooks.py
Oct 09 14:05:13 UTC 2025 test result: 😁 SUCCESS
ReFrame Summary
[ SKIP ] ( 1/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 2/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 3/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 4/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ OK ] ( 5/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:aarch64_a64fx+default
P: perf: 582.844 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:aarch64_a64fx+default
P: perf: 583.466 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:aarch64_a64fx+default
P: latency: 1.64 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:aarch64_a64fx+default
P: latency: 1.64 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 8496.74 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:aarch64_a64fx+default
P: bandwidth: 7820.35 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 6/10 test case(s) from 10 check(s) (0 failure(s), 4 skipped, 0 aborted)
Details
✅ job output file slurm-581452.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
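The ✅/❌ lines above are regex matches against the job output file. A minimal sketch of that kind of check (the helper and patterns below are hypothetical, not the bot's actual code; only two of the listed patterns are shown):

```python
import re

# Hypothetical sketch of the bot's pattern checks over a Slurm job log;
# the real bot's patterns and reporting logic may differ.
CHECKS = {
    "error": re.compile(r"ERROR: "),
    "tarball_created": re.compile(r"\.tar\.gz created!"),
}

def scan(log_text: str) -> dict:
    """Return which patterns matched anywhere in the job output."""
    return {name: bool(pat.search(log_text)) for name, pat in CHECKS.items()}

sample = "== building...\nERROR: build failed for QuantumESPRESSO\n"
print(scan(sample))  # → {'error': True, 'tarball_created': False}
```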

@bedroge (Contributor, Author) commented Oct 9, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx

@eessi-bot-deucalion (bot) commented Oct 9, 2025

New job on instance eessi-bot-deucalion for repository eessi.io-2023.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2025.10/pr_106/581603

| date | job status | comment |
| --- | --- | --- |
| Oct 09 15:22:02 UTC 2025 | submitted | job id 581603 awaits release by job manager |
| Oct 09 15:22:17 UTC 2025 | released | job awaits launch by Slurm scheduler |
| Oct 09 15:23:23 UTC 2025 | running | job 581603 is running |

@bedroge (Contributor, Author) commented Oct 9, 2025

The job is still running, but the QE build just completed. The maximum memory usage reported by Slurm is only 1953600K (about 1.9 GiB), so I don't understand why it didn't work with the default settings (which should be 12 cores instead of 6?).
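As a quick sanity check on that figure (assuming the trailing `K` means KiB, Slurm's usual binary unit for `MaxRSS`):

```python
# Convert the Slurm-reported peak memory of the QE build job to GiB,
# assuming the "K" suffix denotes KiB (Slurm's usual binary units).
max_rss_kib = 1953600  # value reported by Slurm for job 581603
max_rss_gib = max_rss_kib / (1024 * 1024)
print(f"{max_rss_gib:.2f} GiB")  # → 1.86 GiB
```

That is far below the roughly 30 GiB per node the ReFrame skip messages mention, which is why the earlier out-of-memory failure is puzzling.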
