netcdf library on pm-gpu and pm-cpu #7076

Open
iulian787 opened this issue Mar 3, 2025 · 3 comments
Labels
pm-cpu Perlmutter at NERSC (CPU-only nodes) pm-gpu Perlmutter machine at NERSC (GPU nodes)

@iulian787
Contributor

This is the environment we use to build MOAB; it is the same environment used to run E3SM:
module list

Currently Loaded Modules:
  1) craype-x86-milan                                5) cpe/24.07          9) PrgEnv-gnu/8.5.0      (prgenv)  13) cray-libsci/23.12.5                17) cray-netcdf-hdf5parallel/4.9.0.9
  2) libfabric/1.20.1                                6) gpu/1.0           10) gcc-native/12.3       (c)       14) craype/2.7.30               (c)    18) cray-parallel-netcdf/1.12.3.9
  3) craype-network-ofi                              7) sqs/2.0           11) cudatoolkit/12.2      (g)       15) cray-mpich/8.1.28           (mpi)  19) cmake/3.24.3
  4) xpmem/2.9.6-1.1_20240510205610__g087dc11fc19d   8) cray-dsmml/0.3.0  12) craype-accel-nvidia80 (cpe)     16) cray-hdf5-parallel/1.12.2.9
  Where:
   g:       built for GPU
   mpi:     MPI Providers
   cpe:     Cray Programming Environment Modules
   prgenv:  Programming Environment Modules
   c:       Compiler

In this folder
/opt/cray/pe/netcdf-hdf5parallel/4.9.0.9/gnu/12.3/lib

iulian@perlmutter:login10:/opt/cray/pe/netcdf-hdf5parallel/4.9.0.9/gnu/12.3/lib> ldd libnetcdff_parallel.so | grep hdf5
	libhdf5_hl_parallel_gnu.so.310 => /opt/cray/pe/lib64/libhdf5_hl_parallel_gnu.so.310 (0x00007f7eba4d9000)
	libhdf5_parallel_gnu.so.310 => /opt/cray/pe/lib64/libhdf5_parallel_gnu.so.310 (0x00007f7eb9b7a000)

That file is a symlink into the hdf5-parallel 1.14.3.1 tree:

iulian@perlmutter:login10:/opt/cray/pe/netcdf-hdf5parallel/4.9.0.9/gnu/12.3/lib> ls -l /opt/cray/pe/lib64/libhdf5_parallel_gnu.so.310
lrwxrwxrwx 1 root root 76 Jun 18  2024 /opt/cray/pe/lib64/libhdf5_parallel_gnu.so.310 -> /opt/cray/pe/hdf5-parallel/1.14.3.1/gnu/12.3/lib/libhdf5_parallel_gnu.so.310

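So libnetcdff_parallel.so is linked against HDF5 1.14.3.1, not 1.12.2.9 (the version of the loaded cray-hdf5-parallel module), even though libnetcdf.settings reports a build against 1.12.2.9: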
CPPFLAGS:		 -DpgiFortran -I/opt/cray/pe/hdf5-parallel/1.12.2.9/include -I/opt/cray/pe/hdf5-parallel/1.12.2.9/gnu/12.3/include 
LDFLAGS:		-L/opt/cray/pe/hdf5-parallel/1.12.2.9/gnu/12.3/lib -Wl,-yMPI_Init -Wl,--disable-new-dtags 

The library libnetcdf_parallel_gnu_123.so in the same folder seems to be OK:

ldd libnetcdf_parallel_gnu_123.so | grep hdf5
	libhdf5_hl_parallel_gnu_123.so.200 => /opt/cray/pe/lib64/libhdf5_hl_parallel_gnu_123.so.200 (0x00007fce27254000)
	libhdf5_parallel_gnu_123.so.200 => /opt/cray/pe/lib64/libhdf5_parallel_gnu_123.so.200 (0x00007fce26de3000)
ls -l  /opt/cray/pe/lib64/libhdf5_parallel_gnu_123.so.200
lrwxrwxrwx 1 root root 80 Nov 29  2023 /opt/cray/pe/lib64/libhdf5_parallel_gnu_123.so.200 -> /opt/cray/pe/hdf5-parallel/1.12.2.9/gnu/12.3/lib/libhdf5_parallel_gnu_123.so.200

There is a problem with the Fortran library libnetcdff.so, too:

iulian@perlmutter:login10:/opt/cray/pe/netcdf-hdf5parallel/4.9.0.9/gnu/12.3/lib> ldd libnetcdff.so | grep hdf5
	libhdf5_hl_parallel_gnu.so.310 => /opt/cray/pe/lib64/libhdf5_hl_parallel_gnu.so.310 (0x00007f41b1a85000)
	libhdf5_parallel_gnu.so.310 => /opt/cray/pe/lib64/libhdf5_parallel_gnu.so.310 (0x00007f41b117a000)
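
To repeat this check across several libraries without reading each symlink by hand, something like the following might help. It is only a minimal sketch in Python: the function names (hdf5_deps, hdf5_version, mismatches, check_library) are hypothetical, and it assumes the Cray PE path layout seen above, where the version is the directory right after hdf5-parallel. It parses ldd output, resolves each HDF5 entry to its real path, and reports any version that differs from the expected one.

```python
import os
import re
import subprocess

# Matches "name => /resolved/path (0xaddr)" lines as printed by ldd.
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")
# Assumption: the version is the directory right after "hdf5-parallel",
# as in /opt/cray/pe/hdf5-parallel/1.14.3.1/gnu/12.3/lib/...
HDF5_VERSION = re.compile(r"/hdf5-parallel/(\d+(?:\.\d+)+)/")


def hdf5_deps(ldd_output):
    """Return {library name: path} for the HDF5 entries in ldd output."""
    deps = {}
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match and "hdf5" in match.group(1):
            deps[match.group(1)] = match.group(2)
    return deps


def hdf5_version(path):
    """Extract the HDF5 version from an install path, or None if absent."""
    match = HDF5_VERSION.search(path)
    return match.group(1) if match else None


def mismatches(resolved, expected):
    """Return {name: version} where the version differs from expected.

    A version of None means the path did not follow the assumed layout.
    """
    found = {name: hdf5_version(path) for name, path in resolved.items()}
    return {name: ver for name, ver in found.items() if ver != expected}


def check_library(lib_path, expected):
    """Run ldd on lib_path and report HDF5 dependencies that mismatch."""
    out = subprocess.run(
        ["ldd", lib_path], capture_output=True, text=True, check=True
    ).stdout
    # realpath follows symlinks such as /opt/cray/pe/lib64/... to the
    # versioned install directory.
    resolved = {n: os.path.realpath(p) for n, p in hdf5_deps(out).items()}
    return mismatches(resolved, expected)
```

On a Perlmutter login node, calling check_library("libnetcdff.so", "1.12.2.9") from the folder above would be expected to list the HDF5 entries that resolve to a different version, assuming the symlinks are still as shown.
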
@rljacob
Member

rljacob commented Mar 3, 2025

@dqwu did you open a ticket with NERSC about this?

@dqwu
Contributor

dqwu commented Mar 3, 2025

> did you open a ticket with NERSC about this?

Not yet. To ensure a more efficient interaction with NERSC, I think it might be better if @ndkeen creates the ticket.

@ndkeen
Contributor

ndkeen commented Mar 3, 2025

I was already documenting this issue in #7049.

Still, it's nice to have this as a separate issue, since it may be the root cause.
