set subsections_via_symbols for ld64 helper sections #139752

Open: 1 commit to merge into base branch master

usamoi
Contributor

@usamoi usamoi commented Apr 13, 2025

closes #139744
cc @madsmtm

@rustbot
Collaborator

rustbot commented Apr 13, 2025

r? @Nadrieril

rustbot has assigned @Nadrieril.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 13, 2025
@rustbot
Collaborator

rustbot commented Apr 13, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

@rustbot rustbot added the A-run-make Area: port run-make Makefiles to rmake.rs label Apr 13, 2025
Contributor

@madsmtm madsmtm left a comment

Thanks for the ping!

@rustbot label O-apple O-linkage

Comment on lines 224 to 231
if binary_format == BinaryFormat::MachO {
    file.set_subsections_via_symbols();
}
Contributor

It's not clear to me that this needs to be in create_object_file? Maybe it would be better to only use it in add_linked_symbol_object?

Also, MH_SUBSECTIONS_VIA_SYMBOLS is vastly under-documented, I'd really like to see a comment here explaining why it's safe for us to use.

Contributor Author

Object::set_subsections_via_symbols says it should be called before add_section or add_subsection, so I think it's better to put it here (less error-prone).

I don't know whether it's safe, either. From what I found online, this is the only way to get GC on Mach-O.

Contributor

My problem with having it in create_object_file is that it may negatively affect the other object files we create (which again is hard to tell for sure, since the docs around it are so limited).

Contributor Author

Ah! I didn't notice that this function has multiple callers.

I will add a parameter to control this behavior.
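
A parameter of that kind could be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in this PR: `MockFormat`, `MockObject`, `create_mock_object_file`, and the `subsections_via_symbols` parameter are hypothetical stand-ins, and the panic on late calls is the mock's own choice. Only the rule that the flag is set before any section is added comes from the discussion above.

```rust
/// Hypothetical stand-in for a binary format enum.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum MockFormat {
    Elf,
    MachO,
}

/// Hypothetical stand-in for an object file under construction.
#[derive(Debug)]
pub struct MockObject {
    pub format: MockFormat,
    pub subsections_via_symbols: bool,
    pub sections: Vec<String>,
}

impl MockObject {
    pub fn new(format: MockFormat) -> Self {
        MockObject { format, subsections_via_symbols: false, sections: Vec::new() }
    }

    /// Sets the flag. The mock panics if a section was already added,
    /// to model the "call it before add_section" ordering rule.
    pub fn set_subsections_via_symbols(&mut self) {
        assert!(self.sections.is_empty(), "flag must be set before adding sections");
        self.subsections_via_symbols = true;
    }

    pub fn add_section(&mut self, name: &str) {
        self.sections.push(name.to_string());
    }
}

/// Shared constructor with an opt-in parameter, so that only the caller
/// that builds the linker helper object enables the flag.
pub fn create_mock_object_file(format: MockFormat, subsections_via_symbols: bool) -> MockObject {
    let mut file = MockObject::new(format);
    // The flag only has a meaning for Mach-O, so other formats ignore the request.
    if subsections_via_symbols && format == MockFormat::MachO {
        file.set_subsections_via_symbols();
    }
    file
}
```

With this shape, a caller building the helper object would pass `true`, while every other caller passes `false` and keeps its current output unchanged.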

Comment on lines 1 to 8
unsafe extern "C" {
    unsafe static UNDEFINED: usize;
}

#[unsafe(no_mangle)]
pub fn used() {
    println!("UNDEFINED = {}", unsafe { UNDEFINED });
}
Contributor

Honestly, I'm kind of surprised that this worked in the past. Could you describe the use-case?

I'd also be interested, does this pattern work on other platforms?

Contributor Author

@usamoi usamoi Apr 13, 2025

It's used by https://github.com/pgcentralfoundation/pgrx, a framework for developing PostgreSQL extensions. The trick lets one library hold hybrid code (dylib code and SQL-generation code): the SQL-generation code can be compiled into an executable because the dylib-related code is GC-ed.

This should be a reasonable use. Please see #95604 and #95363 (comment).

Yes. It also works on Linux, FreeBSD, and Windows.

Contributor

Thanks for the answer. Might make sense to leave a comment in the file about this, and perhaps link to #139744, stating that it is a regression test for that?

Contributor

Also, is the expected behaviour for the symbol to still be present, or do we expect the linker to completely strip it out? E.g. would dlsym(RTLD_DEFAULT, "used") work?

Contributor Author

The symbol is present in the dynamic library. It's GC-ed iff the final artifact is an executable.

Contributor

One idea, then, would be to add another test (or use UI-test revisions) with #![crate_type = "dylib"] and //@ dont-check-compiler-stderr that is expected to fail?

Contributor Author

Contributor

No, I meant having a test that fails when trying to link undefined symbols in a dylib.

Contributor Author

It is added.

@rustbot rustbot added the O-apple Operating system: Apple (macOS, iOS, tvOS, visionOS, watchOS) label Apr 13, 2025
@jieyouxu
Member

cc @bjorn3 @petrochenkov in case this is problematic

#[unsafe(no_mangle)]
pub unsafe fn used() {
    println!("THIS_SYMBOL_SHOULD_BE_UNDEFINED = {}", unsafe { THIS_SYMBOL_SHOULD_BE_UNDEFINED });
}
Contributor

Nit: I know that #[no_mangle] implies it, but could we also add an explicit test for #[used]?

Contributor Author

@usamoi usamoi Apr 14, 2025

On macOS, #[used] emits errors, while #[no_mangle] does not.

I don't know why.

Edit: #[used] also emits errors with PR + lld, nightly + lld, stable + lld, and stable + ld64.

Contributor Author

@usamoi usamoi Apr 14, 2025

I know why now.

On macOS, #[used] emits llvm.used, so #[used(compiler)] is needed here. Is that the expected behavior?

Member

@bjorn3 bjorn3 Apr 14, 2025

TIL that #[used] is an alias for #[used(linker)] on some platforms:

// Unfortunately, unconditionally using `llvm.used` causes
// issues in handling `.init_array` with the gold linker,
// but using `llvm.compiler.used` caused a nontrivial amount
// of unintentional ecosystem breakage -- particularly on
// Mach-O targets.
//
// As a result, we emit `llvm.compiler.used` only on ELF
// targets. This is somewhat ad-hoc, but actually follows
// our pre-LLVM 13 behavior (prior to the ecosystem
// breakage), and seems to match `clang`'s behavior as well
// (both before and after LLVM 13), possibly because they
// have similar compatibility concerns to us. See
// https://github.com/rust-lang/rust/issues/47384#issuecomment-1019080146
// and following comments for some discussion of this, as
// well as the comments in `rustc_codegen_llvm` where these
// flags are handled.
//
// Anyway, to be clear: this is still up in the air
// somewhat, and is subject to change in the future (which
// is a good thing, because this would ideally be a bit
// more firmed up).
let is_like_elf = !(tcx.sess.target.is_like_darwin
    || tcx.sess.target.is_like_windows
    || tcx.sess.target.is_like_wasm);
codegen_fn_attrs.flags |= if is_like_elf {
    CodegenFnAttrFlags::USED
} else {
    CodegenFnAttrFlags::USED_LINKER
};

Eventually #[used] should be changed to #[used(linker)] unconditionally. Gold is deprecated upstream and broken with current rustc versions anyway: #139425
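
The target-dependent choice in the quoted snippet can be reduced to a small, self-contained sketch. `UsedKind`, `TargetInfo`, and `default_used_kind` are hypothetical names introduced for illustration and are not rustc APIs. The mapping itself follows the quoted comment: ELF-like targets get `llvm.compiler.used`, everything else gets `llvm.used`.

```rust
/// Which LLVM list a plain `#[used]` item ends up in (illustrative names).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum UsedKind {
    /// `llvm.compiler.used`: the compiler keeps the symbol, the linker may still drop it.
    Compiler,
    /// `llvm.used`: the linker is also asked to keep the symbol.
    Linker,
}

/// Hypothetical, simplified target description with the three flags from the snippet.
#[derive(Clone, Copy, Default, Debug)]
pub struct TargetInfo {
    pub is_like_darwin: bool,
    pub is_like_windows: bool,
    pub is_like_wasm: bool,
}

/// Mirrors the quoted logic: anything that is not Darwin-, Windows-, or
/// wasm-like is treated as ELF-like.
pub fn default_used_kind(target: &TargetInfo) -> UsedKind {
    let is_like_elf =
        !(target.is_like_darwin || target.is_like_windows || target.is_like_wasm);
    if is_like_elf {
        UsedKind::Compiler
    } else {
        UsedKind::Linker
    }
}
```

Under this model a Darwin-like target resolves to `UsedKind::Linker`, which matches the observation above that `#[used]` emits `llvm.used` on macOS and that `#[used(compiler)]` is needed to let the linker drop the item.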

@usamoi usamoi force-pushed the macos-used branch 2 times, most recently from 6e175fc to 7d10cf7 Compare April 14, 2025 05:34
@jieyouxu jieyouxu added the A-linkage Area: linking into static, shared libraries and binaries label Apr 14, 2025
@jieyouxu jieyouxu removed the A-run-make Area: port run-make Makefiles to rmake.rs label Apr 14, 2025
@Nadrieril
Member

r? codegen

@rustbot rustbot assigned saethlin and unassigned Nadrieril Apr 16, 2025
@usamoi
Contributor Author

usamoi commented Apr 22, 2025

I hope this gets a beta backport to prevent pgrx from being broken on macOS.

@rustbot label +beta-nominated

@rustbot rustbot added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Apr 22, 2025
Labels
A-linkage Area: linking into static, shared libraries and binaries
beta-nominated Nominated for backporting to the compiler in the beta channel.
O-apple Operating system: Apple (macOS, iOS, tvOS, visionOS, watchOS)
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

functions marked with #[no_mangle] cannot be GC-ed on MacOS
7 participants