OpenMPI 5.0.7 with Intel compilers 2021.4 SegFault error #13309

Open
@MadNazgul

Dear Team,

OS: Linux 3.10.0-1160.el7.x86_64 x86_64

Compilers:

$ icc --version
icc (ICC) 2021.4.0 20210910
Copyright (C) 1985-2021 Intel Corporation.  All rights reserved.

$ icpc --version
icpc (ICC) 2021.4.0 20210910
Copyright (C) 1985-2021 Intel Corporation.  All rights reserved.

$ ifort --version
ifort (IFORT) 2021.4.0 20210910
Copyright (C) 1985-2021 Intel Corporation.  All rights reserved.

OpenMPI: 5.0.7
Configured with the command:

./configure CC=icc CXX=icpc F77=ifort FC=ifort --prefix=/cvmfs/soft/sw/slc7_x86-64/openmpi/v5.0.7_icc2021

Both make and make install completed without errors.

I tested the installation with a Hello World program written in C.

Compile command:

$ mpicc hello.c

Running it directly with the command

$ mpirun -np 2 a.out

and via a Slurm batch script

#!/bin/bash
#SBATCH -p tut
#SBATCH -n 2
#SBATCH -t 5
mpirun a.out

gives the same error in both cases:

[blade09:25354] *** Process received signal ***
[blade09:25354] Signal: Segmentation fault (11)
[blade09:25354] Signal code: Address not mapped (1)
[blade09:25354] Failing at address: 0x440000e0
[blade09:25353] *** Process received signal ***
[blade09:25353] Signal: Segmentation fault (11)
[blade09:25353] Signal code: Address not mapped (1)
[blade09:25353] Failing at address: 0x440000e0
[blade09:25353] [ 0] /lib64/libpthread.so.0(+0xf630)[0x7f634b070630]
[blade09:25353] [ 1] [blade09:25354] [ 0] /lib64/libpthread.so.0(+0xf630)[0x7fa389c68630]
[blade09:25354] [ 1] /cvmfs/soft/sw/slc7_x86-64/openmpi/v5.0.7_icc2021/lib/libmpi.so.40(PMPI_Comm_rank+0x47)[0x7f634b845897]
[blade09:25353] [ 2] a.out[0x401d01]
[blade09:25353] [ 3] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f634acb5555]
/cvmfs/soft/sw/slc7_x86-64/openmpi/v5.0.7_icc2021/lib/libmpi.so.40(PMPI_Comm_rank+0x47)[0x7fa38a43d897]
[blade09:25354] [ 2] a.out[0x401d01]
[blade09:25354] [ 3] [blade09:25353] [ 4] a.out[0x401be9]
[blade09:25353] *** End of error message ***
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fa3898ad555]
[blade09:25354] [ 4] a.out[0x401be9]
[blade09:25354] *** End of error message ***
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 25353 on node blade09 exited on
signal 11 (Segmentation fault).
--------------------------------------------------------------------------

With Intel compilers 19.0.3.199, everything works correctly.

Could you please help me resolve this problem?
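
One thing that may be worth checking (a guess from the backtrace, not a confirmed diagnosis): the failing address 0x440000e0 sits just above 0x44000000, which is the integer value that MPICH-derived headers, including Intel MPI's, use for MPI_COMM_WORLD. That would be consistent with hello.c having been compiled against an mpi.h that does not belong to this Open MPI install while still linking against its libmpi.so.40. The sketch below shows one way to see which mpi.h the wrapper actually picks up. The function name classify_mpi_header is a hypothetical helper, not part of Open MPI, and the second half assumes mpicc and hello.c are present in the current directory.

```shell
#!/bin/bash
# Hypothetical helper (not part of Open MPI): classify an mpi.h by how it
# defines MPI_COMM_WORLD.
classify_mpi_header() {
    local header="$1"
    if grep -Eq 'define[[:space:]]+MPI_COMM_WORLD.*0x44000000' "$header"; then
        echo "mpich-style"      # integer handle, as in MPICH and Intel MPI
    elif grep -q 'ompi_mpi_comm_world' "$header"; then
        echo "openmpi-style"    # pointer to a predefined communicator object
    else
        echo "unknown"
    fi
}

# This part only runs where mpicc and hello.c are available.
if command -v mpicc >/dev/null 2>&1 && [ -f hello.c ]; then
    # Print the underlying compiler command line the wrapper would run.
    mpicc --showme
    # Take the first mpi.h path named in the preprocessor line markers.
    header=$(mpicc -E hello.c 2>/dev/null | grep -m1 -o '"[^"]*/mpi\.h"' | tr -d '"')
    if [ -n "$header" ]; then
        echo "mpi.h in use: $header ($(classify_mpi_header "$header"))"
    fi
fi
```

If the header turns out to be MPICH-style, include-path variables such as CPATH set by the Intel environment script would be the first place to look, and ldd a.out shows which libmpi is loaded at run time.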
