v3.2.6: crash on macOS if forking from non-main thread #1198

@pitrou

I've been trying to run the Apache Arrow CI with mimalloc v3.2.6.

We have a unit test that calls fork from multiple threads. This test is crashing on macOS with v3.2.6, while it is passing on v3.1.5.

Here is the test source for reference, though I don't think it is very informative: the child process isn't doing anything interesting as far as mimalloc is concerned:
https://github.com/apache/arrow/blob/985b16ec27664e6d80fe4b600168412209012103/cpp/src/arrow/util/atfork_test.cc#L200-L256

What happens is that mimalloc crashes with an assertion error at process shutdown in those child processes:

[ RUN      ] TestAtFork.MultipleThreads
mimalloc: assertion failed: at "/Users/runner/work/arrow/arrow/build/cpp/mimalloc_ep-prefix/src/mimalloc_ep/src/init.c":918, mi_process_done
  assertion: "_mi_theap_default() != NULL"
mimalloc: assertion failed: at "/Users/runner/work/arrow/arrow/build/cpp/mimalloc_ep-prefix/src/mimalloc_ep/src/init.c":918, mi_process_done
  assertion: "_mi_theap_default() != NULL"
mimalloc: assertion failed: at "/Users/runner/work/arrow/arrow/build/cpp/mimalloc_ep-prefix/src/mimalloc_ep/src/init.c":918, mi_process_done
  assertion: "_mi_theap_default() != NULL"
mimalloc: assertion failed: at "/Users/runner/work/arrow/arrow/build/cpp/mimalloc_ep-prefix/src/mimalloc_ep/src/init.c":918, mi_process_done
  assertion: "_mi_theap_default() != NULL"
/Users/runner/work/arrow/arrow/cpp/src/arrow/testing/gtest_util.cc:625: Failure
Failed
Child terminated by signal 6

[etc.]

(The error message appears a number of times, probably once for each child process.)

Luckily, our CI is able to give a backtrace of the crashed child process(es):

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Running '/Users/runner/work/arrow/arrow/build/cpp/debug/arrow-utility-test' produced core dump at '/tmp/core.arrow-utility-te.40676', printing backtrace:
(lldb) target create --core "/tmp/core.arrow-utility-te.40676"
Core file '/tmp/core.arrow-utility-te.40676' (arm64) was loaded.
(lldb) thread backtrace all -e true
* thread #1
  * frame #0: 0x00000001863925d0 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x00000001863cabc8 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x00000001862d7a40 libsystem_c.dylib`abort + 180
    frame #3: 0x000000010ca9b070 libarrow.2300.0.0.dylib`_mi_assert_fail at options.c:561:3
    frame #4: 0x000000010ca9894c libarrow.2300.0.0.dylib`mi_process_done at init.c:918:3
    frame #5: 0x000000010ca98b1c libarrow.2300.0.0.dylib`_mi_auto_process_done at init.c:956:3
    frame #6: 0x000000010caafc94 libarrow.2300.0.0.dylib`mi_process_detach at prim.c:45:5
    frame #7: 0x0000000186288350 libsystem_c.dylib`__cxa_finalize_ranges + 476
    frame #8: 0x00000001862880d8 libsystem_c.dylib`exit + 44
    frame #9: 0x0000000100f3e5e0 arrow-utility-test`arrow::internal::TestAtFork::RunInChild(std::__1::function<void ()>) + 852
    frame #10: 0x0000000100f4a900 arrow-utility-test`arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0::operator()(int) const + 412
    frame #11: 0x0000000100f4a708 arrow-utility-test`decltype(std::declval<arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0>()(std::declval<int>())) std::__1::__invoke[abi:ue170006]<arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0, int>(arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0&&, int&&) + 36
    frame #12: 0x0000000100f4a698 arrow-utility-test`void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0, int, 2ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0, int>&, std::__1::__tuple_indices<2ul>) + 48
    frame #13: 0x0000000100f49f94 arrow-utility-test`void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0, int>>(void*) + 84
    frame #14: 0x00000001863caf3c libsystem_pthread.dylib`_pthread_start + 136
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Note that we have another test that calls fork from the main thread, and that one passes. So my guess is that mimalloc assumes _mi_auto_process_done is called from the same thread that performed the initialization.
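For anyone who wants to try this outside of Arrow, here is a minimal sketch of the pattern the test exercises: a worker thread calls fork, the child calls exit (which is what runs the finalizers seen in the backtrace), and the parent reports how the child ended. The function names fork_and_wait and fork_from_worker_thread are made up for this sketch, and it is a reduction of the scenario, not the Arrow test itself. I have not confirmed that it triggers the assertion; it assumes the program is linked against a debug build of mimalloc v3.2.6 on macOS.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Thread body: fork, let the child exit normally, record how it ended.
 * Stores the child's exit code, 128 + signal number if it was killed
 * by a signal, or -1 if fork/waitpid failed. */
static void *fork_and_wait(void *arg) {
  int *result = arg;
  *result = -1;

  pid_t pid = fork();
  if (pid == 0) {
    /* Child: touch the allocator once, then exit normally.
     * exit() runs the process finalizers, which is where the
     * backtrace above shows mi_process_done being reached. */
    void *p = malloc(64);
    free(p);
    exit(0);
  }
  if (pid > 0) {
    /* Parent side (still on the worker thread): wait for the child. */
    int status = 0;
    if (waitpid(pid, &status, 0) == pid) {
      if (WIFEXITED(status)) {
        *result = WEXITSTATUS(status);
      } else if (WIFSIGNALED(status)) {
        *result = 128 + WTERMSIG(status);
      }
    }
  }
  return NULL;
}

/* Hypothetical helper: run the fork on a non-main thread and return
 * the outcome recorded by fork_and_wait. */
int fork_from_worker_thread(void) {
  pthread_t thread;
  int result = -1;
  if (pthread_create(&thread, NULL, fork_and_wait, &result) != 0) {
    return -1;
  }
  pthread_join(thread, NULL);
  return result;
}
```

Calling fork_from_worker_thread() from main should return 0 when the child exits cleanly. If the child aborts as in the log above ("Child terminated by signal 6"), the return value would be 134 under this sketch's 128 + signal convention.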
