FreeBSD tests crash with HRESULT: 0x8007FF02 from #106724 #106892
Comments
@Thefrank , can you recommend a guide for building the v9 SDK on FreeBSD? https://github.com/dotnet/dotnet/ fails with |
Building the SDK requires bootstrapping from Linux or using an old SDK to build the new one. If you just want a prebuilt, we have those too. Building: I have not tried to build from the VMR for net9, as the VMR tends to require more work than "bootstrap from tags" or "use old SDK to build new SDK" :( Downloading: The former repo is what I use for my builds; the latter, by @sec, also has ARM64 versions and will work just as well for either AMD64 or ARM64. I cherry-pick missing/needed PRs into my SDKs (e.g., the ones listed above). If you want an SDK with 0 changes, @sec's are better. Please let me know how else I can help with this! |
Not a codegen issue. Removing codegen label. |
FreeBSD dev here. Just to confirm, this is running in native FreeBSD, not via the Linux emulation layer? |
Does this process ever call execve? The "process has called MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED" flag gets cleared when we execve and that would result in MEMBARRIER_CMD_PRIVATE_EXPEDITED returning EPERM later. |
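The scenario described in this comment can be sketched as a toy model. This is not the kernel's implementation: the `Process` class and its method names are hypothetical, and the only behavior modeled is what the comment states (registration sets a per-process flag, execve clears it, and the expedited command fails with EPERM when the flag is not set).

```python
import errno


class Process:
    """Hypothetical stand-in for per-process membarrier state."""

    def __init__(self):
        # Models the "process has called
        # MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED" flag.
        self.registered = False

    def register_private_expedited(self):
        # Models MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED.
        self.registered = True

    def execve(self):
        # Per the comment above, the flag is cleared on execve.
        self.registered = False

    def private_expedited(self):
        # Models MEMBARRIER_CMD_PRIVATE_EXPEDITED: EPERM if not registered.
        if not self.registered:
            raise OSError(errno.EPERM, "process is not registered")
        return 0
```

In this model, a process that registers, calls `execve()`, and then issues the expedited command without registering again gets a `PermissionError`, which is the failure mode being asked about.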
It is running on native FreeBSD (no linux kernel modules, linux service not running) on both host and jail. |
@Thefrank And does it call execve at any point? |
@Thefrank Is that one of the tests which is crashing? I don't see any process exiting with code 137. |
@cperciva All but seven of the tests are failing with this HRESULT. The
I have attached the repro log, which is a bit easier to read than the raw output from the build. edit: A bit of a follow-up: the repro.txt does not show exit 137 either. It does show up in the build log (e.g., https://dev.azure.com/IFailAt/freebsd-dotnet-runtime-nightly/_build/results?buildId=1625&view=logs&j=7a7d9a85-6408-51cd-9969-a6e11cb53059&t=61d9617a-4ab8-5d55-8d34-1c6cdb462292&l=7503) after a "Failed to create CoreCLR, HRESULT: 0x8007FF02". I guess the test runner is seeing that HRESULT as a SIGKILL/OOM? |
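For readers comparing the two error reports, here is a minimal sketch of how each number breaks down. It assumes the usual shell convention that an exit status above 128 means "terminated by signal (status - 128)" and the standard HRESULT bit layout (severity in bit 31, facility in bits 16 to 26, code in the low 16 bits). The function names are made up for illustration, and nothing here says what the runtime means by code 0xFF02.

```python
import signal


def signal_from_exit_code(code):
    """Return the signal name implied by a shell-style exit status, or None."""
    if code <= 128:
        return None
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        # Not a signal number known on this platform.
        return None


def decode_hresult(value):
    """Split a 32-bit HRESULT into its severity, facility, and code fields."""
    return {
        "failure": bool((value >> 31) & 1),
        "facility": (value >> 16) & 0x7FF,
        "code": value & 0xFFFF,
    }
```

Under these assumptions, 137 maps to signal 9 (SIGKILL), and 0x8007FF02 is a failure HRESULT with facility 7 (the Win32 facility) and code 0xFF02. The low byte of 0xFF02 is 0x02, not 137, so the sketch alone does not explain how the two are linked.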
The dotnet process appears to be exiting after this mmap() failure. Can someone run this in gdb and get a call stack? I still cannot build on FreeBSD.
|
Also, are these related to the following known issues? #100558 and dotnet/dnceng#2496 |
It is going to be the call here: runtime/src/coreclr/pal/src/thread/process.cpp Line 2597 in 8d24ba3
The problem is that value returned by A related problem is why
These are unrelated. |
…ialize() A fixup of commit 27ee590 that's broken on platforms which don't support membarrier() syscall: GetVirtualPageSize() is called in the fallback path of InitializeFlushProcessWriteBuffers() and attempts to mmap() zero bytes. Move InitializeFlushProcessWriteBuffers() after VIRTUALInitialize() but before the first thread is created. Fixes dotnet#106892 Fixes dotnet#106722
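The ordering problem in this commit message can be illustrated with a small model. This is a sketch under stated assumptions, not the PAL code: the lower-case function names only mirror the real ones, and the module-level `_page_size` that stays at 0 until `virtual_initialize()` runs is an assumption inferred from the "attempts to mmap() zero bytes" wording.

```python
import mmap

# Assumed stand-in for the cached page size; 0 until virtual_initialize() runs.
_page_size = 0


def virtual_initialize():
    global _page_size
    _page_size = mmap.PAGESIZE


def get_virtual_page_size():
    return _page_size


def initialize_flush_fallback():
    # Fallback path for platforms without membarrier(): map one helper page.
    # With a page size of 0 this asks the OS for a zero-length mapping.
    return mmap.mmap(-1, get_virtual_page_size())


def start(fixed_order):
    """Run the two init steps in the fixed or the broken order."""
    global _page_size
    _page_size = 0
    try:
        if fixed_order:
            virtual_initialize()
            page = initialize_flush_fallback()
        else:
            page = initialize_flush_fallback()
            virtual_initialize()
    except (OSError, ValueError) as exc:
        return "failed: " + type(exc).__name__
    page.close()
    return "ok"
```

Calling `start(False)` reproduces the failure in miniature, because the zero-length anonymous mapping is rejected, while `start(True)` succeeds, matching the move of InitializeFlushProcessWriteBuffers() to after VIRTUALInitialize().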
…ialize() (#107100) A fixup of commit 27ee590 that's broken on platforms which don't support membarrier() syscall: GetVirtualPageSize() is called in the fallback path of InitializeFlushProcessWriteBuffers() and attempts to mmap() zero bytes. Move InitializeFlushProcessWriteBuffers() after VIRTUALInitialize() but before the first thread is created. Fixes #106892 Fixes #106722 Co-authored-by: Haris Okanovic <[email protected]>
@Thefrank The fix is in. Could you please verify that FreeBSD is functional again? Also, it would be useful to find out why |
…ialize() (#107114) A fixup of commit 27ee590 that's broken on platforms which don't support membarrier() syscall: GetVirtualPageSize() is called in the fallback path of InitializeFlushProcessWriteBuffers() and attempts to mmap() zero bytes. Move InitializeFlushProcessWriteBuffers() after VIRTUALInitialize() but before the first thread is created. Fixes #106892 Fixes #106722 Co-authored-by: Haris Okanovic <[email protected]>
@jkotas Looks good on my end! I don't have an ARM64 hardware setup to test on but @sec might have some insight on FreeBSD-ARM64 with .NET. Under FreeBSD-x64, I have not noticed any consistent errors* Not sure what to do about the *I have noticed an inconsistent error when restore is run during the initial build of runtime. I will open an issue about this when I can better reproduce it. |
I will try to find some time and do the checks/tests on arm64. edit: fix looks fine also on arm64. |
Description
From #106724 forward, tests on FreeBSD all crash with HRESULT: 0x8007FF02.
These do not show up in /var/log/messages on either the jail or the host. dmesg is also free of errors from this. Nothing SIGXXXX.
Builds otherwise complete without issue.
Reproduction Steps
As initially discovered:
./build.sh -ci -c Release -subset Clr+Mono+Host.Native+Host.Tools+Host.Pkg+Libs+Libs.Tests+Packs --test
Double checked with (requires #106302):
src/tests/build.sh Release /p:LibrariesConfiguration=Release -rebuild -runtests
Expected behavior
Tests run and either pass or fail.
Actual behavior
"exit code 137 means SIGKILL Killed either due to out of memory/resources (see /var/log/messages) or by explicit kill."
or
"Failed to create CoreCLR, HRESULT: 0x8007FF02"
Regression?
Yes
Known Workarounds
None?
Configuration
FreeBSD-x64 both 13.3 and 14.1
System has about 200G of free RAM during this. Jails run without resource quotas.
Other information
Public log for both FreeBSD 13.3 and 14.1: https://dev.azure.com/IFailAt/freebsd-dotnet-runtime-nightly/_build/results?buildId=1625&view=results BinLog are published to artifacts
commit listed is 6df7807 (after PR listed)
git bisect log