Description
I've been trying to run the Apache Arrow CI with mimalloc v3.2.6.
We have a unit test that calls fork from multiple threads. This test is crashing on macOS with v3.2.6, while it is passing on v3.1.5.
The test source is linked here for reference, though I don't think it's very interesting to look at: the child process does nothing notable as far as mimalloc is concerned:
https://github.com/apache/arrow/blob/985b16ec27664e6d80fe4b600168412209012103/cpp/src/arrow/util/atfork_test.cc#L200-L256
What happens is that mimalloc crashes with an assertion error at process shutdown in those child processes:
[ RUN ] TestAtFork.MultipleThreads
mimalloc: assertion failed: at "/Users/runner/work/arrow/arrow/build/cpp/mimalloc_ep-prefix/src/mimalloc_ep/src/init.c":918, mi_process_done
assertion: "_mi_theap_default() != NULL"
mimalloc: assertion failed: at "/Users/runner/work/arrow/arrow/build/cpp/mimalloc_ep-prefix/src/mimalloc_ep/src/init.c":918, mi_process_done
assertion: "_mi_theap_default() != NULL"
mimalloc: assertion failed: at "/Users/runner/work/arrow/arrow/build/cpp/mimalloc_ep-prefix/src/mimalloc_ep/src/init.c":918, mi_process_done
assertion: "_mi_theap_default() != NULL"
mimalloc: assertion failed: at "/Users/runner/work/arrow/arrow/build/cpp/mimalloc_ep-prefix/src/mimalloc_ep/src/init.c":918, mi_process_done
assertion: "_mi_theap_default() != NULL"
/Users/runner/work/arrow/arrow/cpp/src/arrow/testing/gtest_util.cc:625: Failure
Failed
Child terminated by signal 6
[etc.]
(the error message appears a number of times, probably once for each child process)
Luckily, our CI is able to give a backtrace of the crashed child process(es):
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Running '/Users/runner/work/arrow/arrow/build/cpp/debug/arrow-utility-test' produced core dump at '/tmp/core.arrow-utility-te.40676', printing backtrace:
(lldb) target create --core "/tmp/core.arrow-utility-te.40676"
Core file '/tmp/core.arrow-utility-te.40676' (arm64) was loaded.
(lldb) thread backtrace all -e true
* thread #1
* frame #0: 0x00000001863925d0 libsystem_kernel.dylib`__pthread_kill + 8
frame #1: 0x00000001863cabc8 libsystem_pthread.dylib`pthread_kill + 288
frame #2: 0x00000001862d7a40 libsystem_c.dylib`abort + 180
frame #3: 0x000000010ca9b070 libarrow.2300.0.0.dylib`_mi_assert_fail at options.c:561:3
frame #4: 0x000000010ca9894c libarrow.2300.0.0.dylib`mi_process_done at init.c:918:3
frame #5: 0x000000010ca98b1c libarrow.2300.0.0.dylib`_mi_auto_process_done at init.c:956:3
frame #6: 0x000000010caafc94 libarrow.2300.0.0.dylib`mi_process_detach at prim.c:45:5
frame #7: 0x0000000186288350 libsystem_c.dylib`__cxa_finalize_ranges + 476
frame #8: 0x00000001862880d8 libsystem_c.dylib`exit + 44
frame #9: 0x0000000100f3e5e0 arrow-utility-test`arrow::internal::TestAtFork::RunInChild(std::__1::function<void ()>) + 852
frame #10: 0x0000000100f4a900 arrow-utility-test`arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0::operator()(int) const + 412
frame #11: 0x0000000100f4a708 arrow-utility-test`decltype(std::declval<arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0>()(std::declval<int>())) std::__1::__invoke[abi:ue170006]<arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0, int>(arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0&&, int&&) + 36
frame #12: 0x0000000100f4a698 arrow-utility-test`void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0, int, 2ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0, int>&, std::__1::__tuple_indices<2ul>) + 48
frame #13: 0x0000000100f49f94 arrow-utility-test`void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::TestAtFork_MultipleThreads_Test::TestBody()::$_0, int>>(void*) + 84
frame #14: 0x00000001863caf3c libsystem_pthread.dylib`_pthread_start + 136
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Note that we have another test that calls fork from the main thread and that one passes. So my guess is that mimalloc might be assuming that _mi_auto_process_done is called from the same thread that did the init.
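For anyone who wants to try this outside of Arrow, the pattern can be roughly sketched as below. This is not the Arrow test itself, only a minimal approximation of it: `CountCleanChildExits` is a hypothetical helper name, and the sketch assumes that linking the program against mimalloc v3.2.6 (in a debug build, so that assertions are enabled) is enough to exercise the same shutdown path. It has not been confirmed that this reduced version triggers the assertion.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <atomic>
#include <cassert>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical helper (not part of Arrow or mimalloc): forks one child from
// each of `num_threads` non-main threads and returns how many of those
// children exited normally with status 0.
int CountCleanChildExits(int num_threads) {
  assert(num_threads > 0);
  std::atomic<int> clean_exits{0};
  std::vector<std::thread> threads;
  threads.reserve(num_threads);
  for (int i = 0; i < num_threads; ++i) {
    threads.emplace_back([&clean_exits] {
      pid_t pid = fork();
      if (pid == 0) {
        // Child: exit() runs the process finalizers on this thread, which is
        // not the thread that initialized the allocator in the parent.
        std::exit(0);
      }
      if (pid < 0) {
        return;  // fork failed; this child is not counted as a clean exit
      }
      int status = 0;
      if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
          WEXITSTATUS(status) == 0) {
        clean_exits.fetch_add(1);
      }
    });
  }
  for (auto& t : threads) {
    t.join();
  }
  return clean_exits.load();
}
```

Calling `CountCleanChildExits(4)` should return 4 when every child shuts down cleanly. A child that aborts in `mi_process_done` would be terminated by signal 6 and therefore not be counted.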