Build issues on Theta (ANL) #11

Open
nphtan opened this issue May 5, 2021 · 6 comments

Comments

@nphtan
Collaborator

nphtan commented May 5, 2021

I'm having issues building the tests on Theta. I have tried both the GNU and Intel compilers. The build fails at the resilience tests. I've included the full build steps I performed, along with the modules I have loaded.

(miniconda-3/latest/base) nphtan@thetalogin5:~/kokkos/build> module list
Currently Loaded Modulefiles:

  1) modules/3.2.11.4
  2) intel/19.1.0.166
  3) craype-network-aries
  4) craype/2.6.5
  5) cray-libsci/20.06.1
  6) udreg/2.3.2-7.0.2.1_2.33__g8175d3d.ari
  7) ugni/6.0.14.0-7.0.2.1_3.60__ge78e5b0.ari
  8) pmi/5.0.16
  9) dmapp/7.1.1-7.0.2.1_2.78__g38cf134.ari
 10) gni-headers/5.0.12.0-7.0.2.1_2.19__g3b1768f.ari
 11) xpmem/2.2.20-7.0.2.1_2.60__g87eb960.ari
 12) job/2.2.4-7.0.2.1_2.72__g36b56f4.ari
 13) dvs/2.12_2.2.172-7.0.2.1_8.1__g7056cbb6
 14) alps/6.6.59-7.0.2.1_3.65__g872a8d62.ari
 15) rca/2.2.20-7.0.2.1_2.78__g8e3fb5b.ari
 16) atp/3.8.1
 17) perftools-base/20.06.0
 18) PrgEnv-intel/6.0.7
 19) craype-mic-knl
 20) cray-mpich/7.7.14
 21) nompirun/nompirun
 22) adaptive-routing-a3
 23) darshan/3.2.1
 24) xalt
 25) miniconda-3/latest
 26) cray-hdf5-parallel/1.10.6.1
 27) boost/intel/1.64.0

cat build.sh
#! /usr/bin/env bash

cmake \
  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DCMAKE_CXX_COMPILER=CC \
  -DCMAKE_CXX_FLAGS="-dynamic" \
  -DCMAKE_INSTALL_PREFIX=/home/nphtan/kokkos/build/install \
  -DKokkos_ENABLE_OPENMP=ON \
  -DKokkos_ENABLE_SERIAL=ON \
  -DKokkos_ARCH_KNL=ON \
  ..

(miniconda-3/latest/base) nphtan@thetalogin5:~/kokkos/build> . build.sh
-- Setting default Kokkos CXX standard to 11
-- The CXX compiler identification is Intel 19.1.0.20191121
-- Cray Programming Environment 2.6.5 CXX
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5/bin/CC
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5/bin/CC -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Setting policy CMP0074 to use _ROOT variables
-- The project name is: Kokkos
-- Using -std=gnu++11 for C++11 extensions as feature
-- Execution Spaces:
-- Device Parallel: NONE
-- Host Parallel: OPENMP
-- Host Serial: SERIAL

-- Architectures:
-- KNL
-- Found TPLLIBDL: /usr/lib64/libdl.so
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nphtan/kokkos/build
(miniconda-3/latest/base) nphtan@thetalogin5:~/kokkos/build> make
Scanning dependencies of target kokkoscore
[ 4%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_CPUDiscovery.cpp.o
[ 8%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Core.cpp.o
[ 13%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Error.cpp.o
[ 17%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_ExecPolicy.cpp.o
[ 21%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_HostBarrier.cpp.o
[ 26%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_HostSpace.cpp.o
[ 30%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_HostSpace_deepcopy.cpp.o
[ 34%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_HostThreadTeam.cpp.o
[ 39%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_MemoryPool.cpp.o
[ 43%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Profiling_Interface.cpp.o
[ 47%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Serial.cpp.o
[ 52%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Serial_Task.cpp.o
[ 56%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_SharedAlloc.cpp.o
[ 60%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Spinwait.cpp.o
[ 65%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Stacktrace.cpp.o
[ 69%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_TrackDuplicates.cpp.o
[ 73%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_ViewHooks.cpp.o
[ 78%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_hwloc.cpp.o
[ 82%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/OpenMP/Kokkos_OpenMP_Exec.cpp.o
[ 86%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/OpenMP/Kokkos_OpenMP_Task.cpp.o
[ 91%] Linking CXX static library libkokkoscore.a
[ 91%] Built target kokkoscore
Scanning dependencies of target kokkoscontainers
[ 95%] Building CXX object containers/src/CMakeFiles/kokkoscontainers.dir/impl/Kokkos_UnorderedMap_impl.cpp.o
[100%] Linking CXX static library libkokkoscontainers.a
[100%] Built target kokkoscontainers

(miniconda-3/latest/base) nphtan@thetalogin5:~/VELOC> python auto-install.py --no-boost --no-deps $HOME/VELOC/build/install
Installing VeloC in /home/nphtan/VELOC/build/install...
CMake Warning:
No source or binary directory provided. Both will be assumed to be the
same as the current working directory, but note that this warning will
become a fatal error in future CMake releases.

-- The C compiler identification is Intel 19.1.0.20191121
-- The CXX compiler identification is Intel 19.1.0.20191121
-- Cray Programming Environment 2.6.5 C
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5/bin/cc
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Cray Programming Environment 2.6.5 CXX
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5/bin/CC
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5/bin/CC -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Boost version: 1.64.0
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - found
-- Found Threads: TRUE
-- Found MPI_C: /opt/cray/pe/craype/2.6.5/bin/cc (found version "3.1")
-- Found MPI_CXX: /opt/cray/pe/craype/2.6.5/bin/CC (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nphtan/VELOC
Scanning dependencies of target veloc-modules
[ 5%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/module_manager.cpp.o
[ 10%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/client_watchdog.cpp.o
[ 15%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/transfer_module.cpp.o
[ 20%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/__/common/config.cpp.o
[ 25%] Linking CXX shared library libveloc-modules.so
[ 25%] Built target veloc-modules
Scanning dependencies of target veloc-backend
[ 30%] Building CXX object src/backend/CMakeFiles/veloc-backend.dir/main.cpp.o
[ 35%] Building CXX object src/backend/CMakeFiles/veloc-backend.dir/__/common/config.cpp.o
[ 40%] Linking CXX executable veloc-backend
[ 40%] Built target veloc-backend
Scanning dependencies of target veloc-client
[ 45%] Building CXX object src/lib/CMakeFiles/veloc-client.dir/veloc.cpp.o
[ 50%] Building CXX object src/lib/CMakeFiles/veloc-client.dir/client.cpp.o
[ 55%] Building CXX object src/lib/CMakeFiles/veloc-client.dir/__/common/config.cpp.o
[ 60%] Linking CXX shared library libveloc-client.so
[ 60%] Built target veloc-client
Scanning dependencies of target heatdis_fault
[ 65%] Building CXX object test/CMakeFiles/heatdis_fault.dir/heatdis_fault.cpp.o
[ 70%] Linking CXX executable heatdis_fault
[ 70%] Built target heatdis_fault
Scanning dependencies of target heatdis_original
[ 75%] Building C object test/CMakeFiles/heatdis_original.dir/heatdis_original.c.o
[ 80%] Linking C executable heatdis_original
[ 80%] Built target heatdis_original
Scanning dependencies of target heatdis_file
[ 85%] Building C object test/CMakeFiles/heatdis_file.dir/heatdis_file.c.o
[ 90%] Linking C executable heatdis_file
[ 90%] Built target heatdis_file
Scanning dependencies of target heatdis_mem
[ 95%] Building C object test/CMakeFiles/heatdis_mem.dir/heatdis_mem.c.o
[100%] Linking C executable heatdis_mem
[100%] Built target heatdis_mem
Install the project...
-- Install configuration: "Release"
-- Installing: /home/nphtan/VELOC/build/install/lib/libveloc-modules.so
-- Installing: /home/nphtan/VELOC/build/install/bin/veloc-backend
-- Set runtime path of "/home/nphtan/VELOC/build/install/bin/veloc-backend" to ""
-- Installing: /home/nphtan/VELOC/build/install/lib/libveloc-client.so
-- Set runtime path of "/home/nphtan/VELOC/build/install/lib/libveloc-client.so" to ""
-- Up-to-date: /home/nphtan/VELOC/build/install/include/veloc.h
running install
running build
running build_py
running install_lib
running install_egg_info
Removing /home/nphtan/.local/miniconda-3/latest/lib/python3.7/site-packages/VELOC_Python-0.1-py3.7.egg-info
Writing /home/nphtan/.local/miniconda-3/latest/lib/python3.7/site-packages/VELOC_Python-0.1-py3.7.egg-info
Installation successful!

(miniconda-3/latest/base) nphtan@thetalogin5:~/kokkos-resilience/build> cat build.sh
#!/usr/bin/env bash

cmake \
  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DCMAKE_C_COMPILER=cc \
  -DCMAKE_C_FLAGS="-dynamic" \
  -DCMAKE_CXX_COMPILER=CC \
  -DCMAKE_CXX_FLAGS="-dynamic" \
  -DCMAKE_INSTALL_PREFIX=/home/nphtan/kokkos-resilience/build/install \
  -DVeloC_ROOT=/home/nphtan/VELOC/build/install \
  -DKokkos_ROOT=/home/nphtan/kokkos/build/install \
  -DKR_ENABLE_TRACING=ON \
  -DKR_ENABLE_STDIO=ON \
  -DKR_ENABLE_HDF5_PARALLEL=ON \
  -DVELOC_BAREBONE=ON \
  ..

(miniconda-3/latest/base) nphtan@thetalogin5:~/kokkos-resilience/build> . build.sh
-- The C compiler identification is Intel 19.1.0.20191121
-- The CXX compiler identification is Intel 19.1.0.20191121
-- Cray Programming Environment 2.6.5 C
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5/bin/cc
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Cray Programming Environment 2.6.5 CXX
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5/bin/CC
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5/bin/CC -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Enabled Kokkos devices: OPENMP;SERIAL
-- Found MPI_C: /opt/cray/pe/craype/2.6.5/bin/cc (found version "3.1")
-- Found MPI_CXX: /opt/cray/pe/craype/2.6.5/bin/CC (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
-- Found VeloC: /home/nphtan/VELOC/build/install
-- Found HDF5: /opt/cray/pe/hdf5-parallel/1.10.6.1/INTEL/19.1
-- Boost version: 1.64.0
-- Found the following Boost libraries:
-- filesystem
-- system
-- cxxopts version 2.2.0
-- Found PythonInterp: /soft/datascience/conda/miniconda3/latest/bin/python (found version "3.7.6")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - found
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nphtan/kokkos-resilience/build
(miniconda-3/latest/base) nphtan@thetalogin5:~/kokkos-resilience/build> make
Scanning dependencies of target resilience
[ 1%] Building CXX object CMakeFiles/resilience.dir/src/resilience/Resilience.cpp.o
[ 3%] Building CXX object CMakeFiles/resilience.dir/src/resilience/AutomaticCheckpoint.cpp.o
[ 5%] Building CXX object CMakeFiles/resilience.dir/src/resilience/Context.cpp.o
[ 7%] Building CXX object CMakeFiles/resilience.dir/src/resilience/Config.cpp.o
[ 9%] Building CXX object CMakeFiles/resilience.dir/src/resilience/Cref.cpp.o
[ 10%] Building CXX object CMakeFiles/resilience.dir/src/resilience/ResilientRef.cpp.o
[ 12%] Building CXX object CMakeFiles/resilience.dir/src/resilience/MPIContext.cpp.o
[ 14%] Building CXX object CMakeFiles/resilience.dir/src/resilience/filesystem/ExternalIOInterface.cpp.o
[ 16%] Building CXX object CMakeFiles/resilience.dir/src/resilience/filesystem/Filesystem.cpp.o
[ 18%] Building CXX object CMakeFiles/resilience.dir/src/resilience/stdio/StdFileSpace.cpp.o
[ 20%] Building CXX object CMakeFiles/resilience.dir/src/resilience/veloc/VelocBackend.cpp.o
[ 21%] Building CXX object CMakeFiles/resilience.dir/src/resilience/StdFileContext.cpp.o
[ 23%] Building CXX object CMakeFiles/resilience.dir/src/resilience/stdfile/StdFileBackend.cpp.o
[ 25%] Building CXX object CMakeFiles/resilience.dir/src/resilience/hdf5/HDF5Space.cpp.o
[ 27%] Linking CXX static library libresilience.a
[ 27%] Built target resilience
Scanning dependencies of target example
[ 29%] Building CXX object _deps/cxxopts-build/src/CMakeFiles/example.dir/example.cpp.o
[ 30%] Linking CXX executable example
[ 30%] Built target example
Scanning dependencies of target link_test
[ 32%] Building CXX object _deps/cxxopts-build/test/CMakeFiles/link_test.dir/link_a.cpp.o
[ 34%] Building CXX object _deps/cxxopts-build/test/CMakeFiles/link_test.dir/link_b.cpp.o
[ 36%] Linking CXX executable link_test
[ 36%] Built target link_test
Scanning dependencies of target options_test
[ 38%] Building CXX object _deps/cxxopts-build/test/CMakeFiles/options_test.dir/main.cpp.o
[ 40%] Building CXX object _deps/cxxopts-build/test/CMakeFiles/options_test.dir/options.cpp.o
[ 41%] Linking CXX executable options_test
[ 41%] Built target options_test
Scanning dependencies of target gtest
[ 43%] Building CXX object _deps/googletest-build/googletest/CMakeFiles/gtest.dir/src/gtest-all.cc.o
[ 45%] Linking CXX static library ../../../lib/libgtest.a
[ 45%] Built target gtest
Scanning dependencies of target resilience_tests
[ 47%] Building CXX object tests/CMakeFiles/resilience_tests.dir/TestMain.cpp.o
[ 49%] Building CXX object tests/CMakeFiles/resilience_tests.dir/TestResilience.cpp.o
[ 50%] Building CXX object tests/CMakeFiles/resilience_tests.dir/TestLambdaCapture.cpp.o
[ 52%] Building CXX object tests/CMakeFiles/resilience_tests.dir/TestVelocMemoryBackend.cpp.o
[ 54%] Building CXX object tests/CMakeFiles/resilience_tests.dir/TestStdFileBackend.cpp.o
[ 56%] Building CXX object tests/CMakeFiles/resilience_tests.dir/TestViewCheckpoint.cpp.o
[ 58%] Building CXX object tests/CMakeFiles/resilience_tests.dir/TestHDF5Configuration.cpp.o
[ 60%] Linking CXX executable resilience_tests
CMake Error at /lus/theta-fs0/software/datascience/conda/miniconda3/latest/share/cmake-3.14/Modules/GoogleTestAddTests.cmake:40 (message):
Error running test executable.

Path: '/home/nphtan/kokkos-resilience/build/tests/resilience_tests'
Result: Illegal instruction
Output:

make[2]: *** [tests/CMakeFiles/resilience_tests.dir/build.make:185: tests/resilience_tests] Error 1
make[2]: *** Deleting file 'tests/resilience_tests'
make[1]: *** [CMakeFiles/Makefile2:475: tests/CMakeFiles/resilience_tests.dir/all] Error 2
make: *** [Makefile:141: all] Error 2

(miniconda-3/latest/base) nphtan@thetalogin5:~/kokkos-resilience/build> make VERBOSE=1
/lus/theta-fs0/software/datascience/conda/miniconda3/latest/bin/cmake -S/home/nphtan/kokkos-resilience -B/home/nphtan/kokkos-resilience/build --check-build-system CMakeFiles/Makefile.cmake 0
/lus/theta-fs0/software/datascience/conda/miniconda3/latest/bin/cmake -E cmake_progress_start /home/nphtan/kokkos-resilience/build/CMakeFiles /home/nphtan/kokkos-resilience/build/CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make -f CMakeFiles/resilience.dir/build.make CMakeFiles/resilience.dir/depend
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
cd /home/nphtan/kokkos-resilience/build && /lus/theta-fs0/software/datascience/conda/miniconda3/latest/bin/cmake -E cmake_depends "Unix Makefiles" /home/nphtan/kokkos-resilience /home/nphtan/kokkos-resilience /home/nphtan/kokkos-resilience/build /home/nphtan/kokkos-resilience/build /home/nphtan/kokkos-resilience/build/CMakeFiles/resilience.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make -f CMakeFiles/resilience.dir/build.make CMakeFiles/resilience.dir/build
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make[2]: Nothing to be done for 'CMakeFiles/resilience.dir/build'.
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
[ 27%] Built target resilience
make -f _deps/cxxopts-build/src/CMakeFiles/example.dir/build.make _deps/cxxopts-build/src/CMakeFiles/example.dir/depend
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
cd /home/nphtan/kokkos-resilience/build && /lus/theta-fs0/software/datascience/conda/miniconda3/latest/bin/cmake -E cmake_depends "Unix Makefiles" /home/nphtan/kokkos-resilience /home/nphtan/kokkos-resilience/build/_deps/cxxopts-src/src /home/nphtan/kokkos-resilience/build /home/nphtan/kokkos-resilience/build/_deps/cxxopts-build/src /home/nphtan/kokkos-resilience/build/_deps/cxxopts-build/src/CMakeFiles/example.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make -f _deps/cxxopts-build/src/CMakeFiles/example.dir/build.make _deps/cxxopts-build/src/CMakeFiles/example.dir/build
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make[2]: Nothing to be done for '_deps/cxxopts-build/src/CMakeFiles/example.dir/build'.
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
[ 30%] Built target example
make -f _deps/cxxopts-build/test/CMakeFiles/link_test.dir/build.make _deps/cxxopts-build/test/CMakeFiles/link_test.dir/depend
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
cd /home/nphtan/kokkos-resilience/build && /lus/theta-fs0/software/datascience/conda/miniconda3/latest/bin/cmake -E cmake_depends "Unix Makefiles" /home/nphtan/kokkos-resilience /home/nphtan/kokkos-resilience/build/_deps/cxxopts-src/test /home/nphtan/kokkos-resilience/build /home/nphtan/kokkos-resilience/build/_deps/cxxopts-build/test /home/nphtan/kokkos-resilience/build/_deps/cxxopts-build/test/CMakeFiles/link_test.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make -f _deps/cxxopts-build/test/CMakeFiles/link_test.dir/build.make _deps/cxxopts-build/test/CMakeFiles/link_test.dir/build
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make[2]: Nothing to be done for '_deps/cxxopts-build/test/CMakeFiles/link_test.dir/build'.
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
[ 36%] Built target link_test
make -f _deps/cxxopts-build/test/CMakeFiles/options_test.dir/build.make _deps/cxxopts-build/test/CMakeFiles/options_test.dir/depend
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
cd /home/nphtan/kokkos-resilience/build && /lus/theta-fs0/software/datascience/conda/miniconda3/latest/bin/cmake -E cmake_depends "Unix Makefiles" /home/nphtan/kokkos-resilience /home/nphtan/kokkos-resilience/build/_deps/cxxopts-src/test /home/nphtan/kokkos-resilience/build /home/nphtan/kokkos-resilience/build/_deps/cxxopts-build/test /home/nphtan/kokkos-resilience/build/_deps/cxxopts-build/test/CMakeFiles/options_test.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make -f _deps/cxxopts-build/test/CMakeFiles/options_test.dir/build.make _deps/cxxopts-build/test/CMakeFiles/options_test.dir/build
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make[2]: Nothing to be done for '_deps/cxxopts-build/test/CMakeFiles/options_test.dir/build'.
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
[ 41%] Built target options_test
make -f _deps/googletest-build/googletest/CMakeFiles/gtest.dir/build.make _deps/googletest-build/googletest/CMakeFiles/gtest.dir/depend
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
cd /home/nphtan/kokkos-resilience/build && /lus/theta-fs0/software/datascience/conda/miniconda3/latest/bin/cmake -E cmake_depends "Unix Makefiles" /home/nphtan/kokkos-resilience /home/nphtan/kokkos-resilience/build/_deps/googletest-src/googletest /home/nphtan/kokkos-resilience/build /home/nphtan/kokkos-resilience/build/_deps/googletest-build/googletest /home/nphtan/kokkos-resilience/build/_deps/googletest-build/googletest/CMakeFiles/gtest.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make -f _deps/googletest-build/googletest/CMakeFiles/gtest.dir/build.make _deps/googletest-build/googletest/CMakeFiles/gtest.dir/build
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make[2]: Nothing to be done for '_deps/googletest-build/googletest/CMakeFiles/gtest.dir/build'.
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
[ 45%] Built target gtest
make -f tests/CMakeFiles/resilience_tests.dir/build.make tests/CMakeFiles/resilience_tests.dir/depend
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
cd /home/nphtan/kokkos-resilience/build && /lus/theta-fs0/software/datascience/conda/miniconda3/latest/bin/cmake -E cmake_depends "Unix Makefiles" /home/nphtan/kokkos-resilience /home/nphtan/kokkos-resilience/tests /home/nphtan/kokkos-resilience/build /home/nphtan/kokkos-resilience/build/tests /home/nphtan/kokkos-resilience/build/tests/CMakeFiles/resilience_tests.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make -f tests/CMakeFiles/resilience_tests.dir/build.make tests/CMakeFiles/resilience_tests.dir/build
make[2]: Entering directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
[ 47%] Linking CXX executable resilience_tests
cd /home/nphtan/kokkos-resilience/build/tests && /lus/theta-fs0/software/datascience/conda/miniconda3/latest/bin/cmake -E cmake_link_script CMakeFiles/resilience_tests.dir/link.txt --verbose=1
/opt/cray/pe/craype/2.6.5/bin/CC -dynamic -O2 -g -DNDEBUG -fopenmp -xMIC-AVX512 CMakeFiles/resilience_tests.dir/TestMain.cpp.o CMakeFiles/resilience_tests.dir/TestResilience.cpp.o CMakeFiles/resilience_tests.dir/TestLambdaCapture.cpp.o CMakeFiles/resilience_tests.dir/TestVelocMemoryBackend.cpp.o CMakeFiles/resilience_tests.dir/TestStdFileBackend.cpp.o CMakeFiles/resilience_tests.dir/TestViewCheckpoint.cpp.o CMakeFiles/resilience_tests.dir/TestHDF5Configuration.cpp.o -o resilience_tests -Wl,-rpath,/home/nphtan/VELOC/build/install/lib ../lib/libgtest.a ../libresilience.a /home/nphtan/kokkos/build/install/lib64/libkokkoscontainers.a /home/nphtan/kokkos/build/install/lib64/libkokkoscore.a /usr/lib64/libdl.so /home/nphtan/VELOC/build/install/lib/libveloc-client.so /home/nphtan/VELOC/build/install/lib/libveloc-modules.so /opt/cray/pe/hdf5-parallel/1.10.6.1/INTEL/19.1/lib/libhdf5.so /soft/libraries/boost/1.64.0/intel/lib/libboost_filesystem-mt.so /soft/libraries/boost/1.64.0/intel/lib/libboost_system-mt.so
cd /home/nphtan/kokkos-resilience/build/tests && /lus/theta-fs0/software/datascience/conda/miniconda3/latest/bin/cmake -D TEST_TARGET=resilience_tests -D TEST_EXECUTABLE=/home/nphtan/kokkos-resilience/build/tests/resilience_tests -D TEST_EXECUTOR= -D TEST_WORKING_DIR=/home/nphtan/kokkos-resilience/build/tests -D TEST_EXTRA_ARGS= -D TEST_PROPERTIES= -D TEST_PREFIX= -D TEST_SUFFIX= -D NO_PRETTY_TYPES=FALSE -D NO_PRETTY_VALUES=FALSE -D TEST_LIST=resilience_tests_TESTS -D CTEST_FILE=/home/nphtan/kokkos-resilience/build/tests/resilience_tests[1]_tests.cmake -D TEST_DISCOVERY_TIMEOUT=5 -P /lus/theta-fs0/software/datascience/conda/miniconda3/latest/share/cmake-3.14/Modules/GoogleTestAddTests.cmake
CMake Error at /lus/theta-fs0/software/datascience/conda/miniconda3/latest/share/cmake-3.14/Modules/GoogleTestAddTests.cmake:40 (message):
Error running test executable.

Path: '/home/nphtan/kokkos-resilience/build/tests/resilience_tests'
Result: Illegal instruction
Output:

make[2]: *** [tests/CMakeFiles/resilience_tests.dir/build.make:185: tests/resilience_tests] Error 1
make[2]: *** Deleting file 'tests/resilience_tests'
make[2]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make[1]: *** [CMakeFiles/Makefile2:475: tests/CMakeFiles/resilience_tests.dir/all] Error 2
make[1]: Leaving directory '/gpfs/mira-home/nphtan/kokkos-resilience/build'
make: *** [Makefile:141: all] Error 2

@nmm0
Contributor

nmm0 commented May 5, 2021

The "Illegal instruction" result makes me think the problem is the build step that discovers the tests. CMake discovers tests by running the test executable and relying on GTest to report its contents. Because the Theta login nodes have a different architecture from the one you are building for, running the executable there will fail. Can you try building in a job on the debug queue and confirm whether it works correctly?

If that's the case we should include a toggle to disable test discovery.
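
A quick way to check this diagnosis on a given node is to compare the CPU flags the host advertises with the instruction set requested on the link line (-xMIC-AVX512). The sketch below is an illustration only, not part of the project: `has_cpu_flags` is a hypothetical helper, it assumes a Linux host with /proc/cpuinfo, and it assumes that avx512f and avx512er are representative of what KNL-targeted code may use.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kokkos-resilience): succeed only if every
# flag named after the first argument appears in the given flag list.
# Usage: has_cpu_flags "<space-separated flags>" flag1 [flag2 ...]
has_cpu_flags() {
  local flags=" $1 "   # pad so each flag can be matched as a whole word
  shift
  local f
  for f in "$@"; do
    case "$flags" in
      *" $f "*) ;;      # flag present, keep checking
      *) return 1 ;;    # flag missing
    esac
  done
  return 0
}

# Flags advertised by the host running this script (empty if not on Linux).
host_flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2 || true)

# Assumption: avx512f and avx512er stand in for the KNL-specific instructions.
if has_cpu_flags "$host_flags" avx512f avx512er; then
  echo "host advertises avx512f and avx512er"
else
  echo "host lacks avx512f and/or avx512er; KNL-targeted binaries may fail here"
fi
```

If the login node reports the flags as missing while a compute node reports them as present, that would be consistent with test discovery failing only when the executable is run during the build on the login node.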

@keitat
Contributor

keitat commented May 5, 2021

Theta's compute nodes have Intel KNL processors. I think you need to set the architecture-specific parameter -DKokkos_ARCH_KNL=ON when running CMake. Otherwise, CMake could choose compiler flags optimized for the login node.

@nphtan
Collaborator Author

nphtan commented May 6, 2021

Building on the compute nodes worked. Still working on getting the tests to run properly.

@nphtan
Collaborator Author

nphtan commented May 10, 2021

The first test passes but the remainder fail with this error
Start 2: Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set.ForbestperformancewithOpenMP3.1setOMP_PROC_BIND=true
2/17 Test #2: Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set.ForbestperformancewithOpenMP3.1setOMP_PROC_BIND=true ............................................Child aborted***Exception: 0.07 sec
Mon May 10 15:14:59 2021: [unset]:_pmi_alps_sync:alps response not OKAY
Mon May 10 15:14:59 2021: [unset]:_pmiu_daemon:_pmi_alps_sync failed
Mon May 10 15:14:59 2021: [PE_0]:_pmi_daemon_barrier:PE pipe read failed from daemon errno = Success
Mon May 10 15:14:59 2021: [PE_0]:_pmi_init:_pmi_daemon_barrier returned -1
[Mon May 10 15:14:59 2021] [c7-1c2s12n3] Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(537):
MPID_Init(246).......: channel initialization failed
MPID_Init(647).......: PMI2 init failed: 1

@nmm0
Contributor

nmm0 commented May 10, 2021

@nphtan make sure you set OMP_PROC_BIND=true to get rid of the warning. However, that should not be causing the problem.

This seems like an MPI issue on Theta: MPI is failing to initialize. Have you had a chance to run other programs on Theta to see whether they have similar issues?
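
A minimal sketch of setting the variable before running the tests, assuming they are launched with ctest from the build directory. The guard around ctest is an addition for illustration so the snippet is harmless when run elsewhere.

```shell
#!/usr/bin/env bash
# Set the binding variable named in the Kokkos warning before launching tests.
export OMP_PROC_BIND=true

# Run the suite only when ctest is available and this is a configured build tree.
if command -v ctest >/dev/null 2>&1 && [ -f CTestTestfile.cmake ]; then
  ctest --output-on-failure
else
  echo "OMP_PROC_BIND=${OMP_PROC_BIND} (ctest not run: no build tree in $(pwd))"
fi
```

This should only silence the warning; it is not expected to change the MPI_Init failure shown above.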

@nphtan
Collaborator Author

nphtan commented May 10, 2021

I've run the veloc-heatdis-test on Theta, so I don't think it's a problem with the MPI installation. I think it is failing because of how the Cobalt scheduler works on Theta. It executes the tests on the compute node, but each test runs its own MPI executable. When I ran the tests individually, some of them passed. The remaining failing tests are related to veloc_mem and hdf5.
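
One way to explore this hypothesis is to run each GoogleTest suite in its own launcher invocation. The sketch below only prints the commands (a dry run). The launcher string "aprun -n 1", the binary path, and the suite names in the demo are assumptions for illustration; `--gtest_list_tests` and `--gtest_filter` are standard GoogleTest options.

```shell
#!/usr/bin/env bash
# Assumptions: launcher and test binary path are illustrative defaults.
LAUNCHER="${LAUNCHER:-aprun -n 1}"
TEST_BIN="${TEST_BIN:-./tests/resilience_tests}"

# Read `--gtest_list_tests` output on stdin and print suite names.
# Suite lines are unindented and end in "." (optionally followed by a comment);
# anything else, such as runtime warnings, is ignored.
list_suites() {
  sed -n -E 's|^([A-Za-z0-9_/]+)\.([[:space:]]*#.*)?$|\1|p'
}

# Read suite names on stdin and print one launcher command per suite.
suite_commands() {
  local suite
  while IFS= read -r suite; do
    printf '%s %s --gtest_filter=%s.*\n' "$LAUNCHER" "$TEST_BIN" "$suite"
  done
}

# Dry run on a made-up listing (SuiteA and SuiteB are not real suite names).
printf 'SuiteA.\n  first\n  second\nSuiteB.\n  third\n' | list_suites | suite_commands
```

On a compute-node allocation, the made-up listing would be replaced by the real output of the test binary run with --gtest_list_tests, and the printed commands could then be executed one at a time to see which suites fail in isolation.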
