[syscall] E: Run dlopen ctors/dtors and DT_RUNPATH#27

Closed
esaurez wants to merge 1 commit into dev from feat/dlfcn-init-array-and-runpath

@esaurez esaurez commented Jun 4, 2026

Summary

Add three closely related capabilities to the user-space dynamic loader that close long-standing System V gABI gaps and unblock shared libraries with C/C++ global constructors:

  1. .init_array invocation — constructors run in dependency order (leaves first) after dlopen completes relocation.
  2. .fini_array invocation — destructors run in reverse order on each library about to be unmapped by dlclose.
  3. DT_RUNPATH consultation: DT_NEEDED bare names are probed against the loading library's DT_RUNPATH directories before falling back to the default lib/ search path.
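
The ordering in items 1 and 2 can be sketched as a post-order walk over the DT_NEEDED graph. This is a minimal illustration, not the loader's code: `init_order` and `fini_order` are hypothetical names, libraries are modelled as plain strings, and reversing the constructor order is only one simple way to get "dependents before dependencies" (the PR itself relies on the traversal already present in dlclose).

```rust
use std::collections::{HashMap, HashSet};

/// Returns library names in constructor order: every dependency appears
/// before the libraries that need it (post-order depth-first walk).
/// Hypothetical helper; `needed` maps a library to its DT_NEEDED names.
fn init_order(root: &str, needed: &HashMap<&str, Vec<&str>>) -> Vec<String> {
    fn visit(
        lib: &str,
        needed: &HashMap<&str, Vec<&str>>,
        seen: &mut HashSet<String>,
        order: &mut Vec<String>,
    ) {
        // Skip libraries that were already scheduled (shared dependencies).
        if !seen.insert(lib.to_string()) {
            return;
        }
        if let Some(deps) = needed.get(lib) {
            for dep in deps {
                visit(dep, needed, seen, order);
            }
        }
        // A library is emitted only after all of its dependencies.
        order.push(lib.to_string());
    }

    let mut seen = HashSet::new();
    let mut order = Vec::new();
    visit(root, needed, &mut seen, &mut order);
    order
}

/// Assumption for this sketch: destructors run in the reverse of the
/// constructor order, so dependents are finalized before dependencies.
fn fini_order(init: &[String]) -> Vec<String> {
    init.iter().rev().cloned().collect()
}
```

With a graph where libexslt.so needs libxslt.so and libxml2.so, and libxslt.so needs libxml2.so, `init_order` schedules libxml2.so first and libexslt.so last.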

Why

Today every "production" Nanvix port library (libc, libm, libssl, libcrypto, libxml2, lxml, …) ships as .a because the loader silently swallows constructors. This blocks converting any C/C++ library with init functions (e.g., libxml2, OpenSSL, almost any C++ lib) to .so, which in turn forces every shared module that consumes them to bundle the code (per-.so 3–5 MB duplication observed in the cpython migration, e.g., esaurez/cpython#11).

The lack of DT_RUNPATH is the partner gap: even a manually-constructed multi-.so chain can't be located unless every file sits in the single hard-coded lib/ directory.

What changes

| File | Change |
| --- | --- |
| src/libs/syscall/src/dlfcn/syscall/dynlib.rs | New DynamicLibrary fields init_array, fini_array, runpaths; populate in open(); expose call_init_array(), call_fini_array(), runpaths() |
| src/libs/syscall/src/dlfcn/syscall/dlopen.rs | resolve_all_symbols returns the topologically ordered Arc list; dlopen drops the registry lock and then walks the list invoking constructors (so a constructor may legally dlsym); dependency resolution forwards the parent's runpaths into resolve_library_path |
| src/libs/syscall/src/dlfcn/syscall/dlclose.rs | call_fini_array() invoked on each library before its segments are dropped |
| src/libs/syscall/src/dlfcn/syscall/mod.rs | resolve_library_path accepts optional runpaths and probes them ahead of LIBRARY_SEARCH_PATHS; minimal $ORIGIN → "." substitution |
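
As a rough sketch of the search order described for mod.rs, the function below probes runpath directories first and the defaults second. The signature is an assumption: the real `resolve_library_path` is not shown in this PR page, the `exists` callback is introduced here only so the example needs no filesystem, and the single `lib` entry in `LIBRARY_SEARCH_PATHS` is illustrative.

```rust
use std::path::{Path, PathBuf};

/// Default search directories. The PR names `LIBRARY_SEARCH_PATHS`; the
/// single `lib` entry here is an assumption for illustration.
const LIBRARY_SEARCH_PATHS: &[&str] = &["lib"];

/// Hypothetical resolver for a bare DT_NEEDED name: probes `runpaths`
/// first, then the default directories, and returns the first hit.
fn resolve_library_path(
    name: &str,
    runpaths: Option<&[String]>,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    let runpath_dirs = runpaths
        .unwrap_or(&[])
        .iter()
        // Minimal `$ORIGIN` handling as described in the PR: substitute ".".
        .map(|dir| dir.replace("$ORIGIN", "."));
    let default_dirs = LIBRARY_SEARCH_PATHS.iter().map(|dir| dir.to_string());

    runpath_dirs
        .chain(default_dirs)
        .map(|dir| Path::new(&dir).join(name))
        .find(|candidate| exists(candidate))
}
```

A library whose DT_RUNPATH is `$ORIGIN/deps` would have its dependencies probed under `./deps` before `lib`; passing `None` keeps the pre-existing behaviour of searching only the defaults.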

What is intentionally not implemented

  • DT_INIT / DT_FINI (legacy single-function tags). Modern toolchains emit .init_array / .fini_array; DT_INIT is rarely populated on non-glibc targets and adds little value.
  • DT_RPATH. Deprecated by the System V gABI; modern linkers emit DT_RUNPATH instead.
  • Recursive-lock support for dlopen from within a constructor. A constructor may safely call dlsym (lock is dropped before invocation), but a nested dlopen would still recurse into the same Mutex. Tracked as future work; libxml2/OpenSSL constructors do not nest.
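
The locking constraint in the last bullet can be illustrated with a small model. Everything here is a stand-in: `Registry`, `Library`, `is_loaded`, and `probe_ctor` are hypothetical names, and the constructor is a plain Rust function rather than an `.init_array` entry. The point is only the shape: mutate the registry under the lock, release the guard, then call constructors.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for a loaded library; `ctor` plays the role of `.init_array`.
struct Library {
    name: &'static str,
    ctor: fn(&Registry),
}

/// Stand-in for the loader's registry, guarded by a non-reentrant Mutex.
struct Registry {
    loaded: Mutex<Vec<Arc<Library>>>,
}

impl Registry {
    /// Lookup stand-in: takes the registry lock, as a `dlsym` call would.
    fn is_loaded(&self, name: &str) -> bool {
        self.loaded.lock().unwrap().iter().any(|lib| lib.name == name)
    }

    /// Registers `libs` (assumed to be in dependency order already), then
    /// runs constructors only after the guard has been released.
    /// Returns the number of constructors invoked.
    fn open(&self, libs: Vec<Arc<Library>>) -> usize {
        {
            let mut guard = self.loaded.lock().unwrap();
            guard.extend(libs.iter().cloned());
        } // guard dropped here, before any constructor runs

        let mut invoked = 0;
        for lib in &libs {
            (lib.ctor)(self); // may take the lock again without deadlocking
            invoked += 1;
        }
        invoked
    }
}

/// Example constructor that performs a lookup, which needs the lock.
fn probe_ctor(registry: &Registry) {
    let _ = registry.is_loaded("libxml2.so");
}
```

If `open` invoked `lib.ctor` while still holding `guard`, `probe_ctor` would try to lock the same `Mutex` on the same thread. A constructor that called `open` itself would still be running outside the guard in this model, so the sketch does not reproduce the nested-dlopen limitation described above.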

Validation

  • z build all FEATURES=networking LOG_LEVEL=error RELEASE=yes MEMORY_SIZE=256 PASS
  • z build format-check, rust-lint-check, spellcheck PASS
  • New posix-tests suite dlfcn-init-runpath-c (esaurez/posix-tests companion PR) — three assertions, all PASS on a standalone VM:
    === dlfcn init_array + DT_RUNPATH tests ===
      PASS: init_array fires on dlopen
      PASS: fini_array fires on dlclose
      PASS: DT_RUNPATH dependency search
    
  • Full posix-tests suite PASS (15/15 testable suites including pre-existing dlfcn-c, dlfcn-pie-c).

Companion PR

esaurez/posix-tests feat/dlfcn-init-array-and-runpath — adds the end-to-end test suite that proves the loader behaviour on a real VM.

Downstream consumers

This loader change unblocks the CPython .a → .so migration and its supporting port-libraries:

esaurez pushed a commit to esaurez/libxslt that referenced this pull request Jun 4, 2026
Produce position-independent libxslt.so and libexslt.so alongside
the existing static .a archives, wired as a real DT_NEEDED chain
on top of esaurez/libxml2's libxml2.so:

  libxslt.so   -> NEEDED libxml2.so
  libexslt.so  -> NEEDED libxslt.so, NEEDED libxml2.so

Only each .so's own .a is embedded via --whole-archive; the lower
layers (libxml2, libz) are NOT bundled, so the Nanvix dynamic
loader pulls them in transitively at dlopen time. This eliminates
the multi-megabyte per-module duplication a self-contained build
would cause and exercises the DT_NEEDED chain support shipped in
esaurez/nanvix#27 in a real-world setting.

Concretely:

* `--with-pic`, `-fPIC` in CFLAGS — same .o files reusable for .a
  and .so.
* Keep `--disable-shared` (libtool has no rules for i686-nanvix);
  the .so files are linked manually with `-shared -fPIC -nostdlib`.
* The new SHAREDLIB targets use `-Wl,--whole-archive <own>.a
  -Wl,--no-whole-archive -lxml2 [-lxslt]`, setting
  DT_SONAME=libxslt.so / DT_SONAME=libexslt.so.
* `make test` extended to verify each .so has the expected SONAME
  and exports its public API entry point.
* `.nanvix/z.py` `_BUILD_OUTPUTS` and `release()` ship both static
  and shared variants.

Sizes (stripped, DT_NEEDED chain vs the discarded self-contained
prototype):

  libxslt.so   296 KB (was 1.8 MB)
  libexslt.so   92 KB (was 1.9 MB)

Runtime dependencies:

* esaurez/nanvix#27 — `.init_array` invocation + DT_NEEDED chain
  walking in the user-space loader.
* esaurez/libxml2#1 — the libxml2.so this PR's binaries reference
  must be present in the buildroot. This implies a sequenced
  rollout: merge esaurez/libxml2#1 first, cut a new
  nanvix/libxml2 release, then this PR's CI build can resolve
  `-lxml2` to libxml2.so. Until then, CI continues to satisfy
  `-lxml2` against the existing libxml2.a in the release tarball,
  which produces a libxslt.so without a DT_NEEDED libxml2.so
  entry. The end-state expects libxml2.so to be present.

End-to-end validation (DT_NEEDED chain successfully resolved by
the Nanvix loader at dlopen time) is performed downstream in
esaurez/lxml#1 and the CPython lxml integration.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
esaurez pushed a commit to esaurez/lxml that referenced this pull request Jun 4, 2026
Produce position-independent liblxml_etree.so and
liblxml_elementpath.so alongside the existing static archives,
wired as a real DT_NEEDED chain on top of esaurez/libxml2 +
esaurez/libxslt:

  liblxml_etree.so       -> NEEDED libxslt.so, libexslt.so, libxml2.so
  liblxml_elementpath.so -> (pure-Cython, no native deps)

Only the cython-generated lxml.etree.c is embedded in
liblxml_etree.so; libxslt, libxml2, and libz live in their own
.so files and are pulled in transitively by the Nanvix dynamic
loader at dlopen time. This exercises the DT_NEEDED chain support
shipped in esaurez/nanvix#27 in a real-world setting and
eliminates the multi-megabyte per-module duplication that a
self-contained build would cause.

Concretely:

* `-fPIC` is added to the per-source compile commands, so the
  same .o files are usable for both .a and .so.
* Two new SHAREDLIB targets link via `-shared -fPIC -nostdlib
  -Wl,--whole-archive <own>.a -Wl,--no-whole-archive [-lxslt
  -lexslt -lxml2]`, setting DT_SONAME=liblxml_etree.so /
  DT_SONAME=liblxml_elementpath.so.
* `.nanvix/z.py` `output_files` and the Makefile's `package` /
  `verify-package` targets ship both the static and shared
  variants.

Sizes (stripped, DT_NEEDED chain vs the discarded self-contained
prototype):

  liblxml_etree.so       1.7 MB (was 3.5 MB)
  liblxml_elementpath.so 157 KB (was 153 KB; pure-Cython, no deps)

Runtime dependencies:

* esaurez/nanvix#27 — `.init_array` invocation + DT_NEEDED chain
  walking in the user-space loader.
* esaurez/libxml2#1 + esaurez/libxslt#1 — libxml2.so, libxslt.so,
  and libexslt.so must be present in the buildroot. This implies
  a sequenced rollout: merge libxml2#1 -> release -> bump libxslt's
  pin -> merge libxslt#1 -> release -> bump this PR's pins ->
  merge this PR.

End-to-end validation (DT_NEEDED chain resolved by the Nanvix
loader: liblxml_etree.so -> libxslt.so -> libxml2.so) will land
in a follow-up against esaurez/cpython#11. CPython's Phase 4 will
switch from the MODLIBS-piggyback workaround to a clean dlopen
of liblxml_etree.so, letting python.elf shrink by ~3 MB.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@esaurez esaurez force-pushed the feat/dlfcn-init-array-and-runpath branch 2 times, most recently from df776d2 to fae4251 Compare June 8, 2026 21:31
Add three closely related capabilities to the user-space dynamic
loader that close long-standing System V ABI gaps and unblock
shared libraries with C/C++ global constructors:

1. `.init_array` invocation. Each `DynamicLibrary` now records the
   loaded address and length of its `.init_array` section. After
   `resolve_all_symbols` completes the topological relocation pass,
   `dlopen` drops the registry lock and invokes every function
   pointer in dependency order (leaves first), exactly as required
   for shared-object constructor execution. The registry lock is
   dropped first so a constructor may legally call `dlsym` without
   deadlocking. Sentinel values 0 and -1 are skipped to match the
   glibc loader behaviour.

2. `.fini_array` invocation. `dlclose` invokes destructors in
   reverse order on each library that is about to be unmapped,
   before the underlying memory segments are dropped. The
   BFS-from-root traversal already in place naturally yields the
   "dependents fini before dependencies" order required by the
   ABI.

3. `DT_RUNPATH` consultation. `DynamicLibrary::open` parses the
   library's `DT_RUNPATH` entries (already split out by goblin),
   `:`-splits each one, and stores the resulting directory list.
   `resolve_library_path` accepts an optional runpath slice and
   probes those directories ahead of the default `lib/` search
   path. A bare-bones `$ORIGIN` substitution (-> `.`) is included
   so toolchain-default runpaths do not silently miss. `DT_RPATH`
   is intentionally not consulted: it is deprecated by the System
   V gABI and modern toolchains emit `DT_RUNPATH` instead.

The changes are purely additive — existing dlopen behaviour for
libraries without `.init_array`, `.fini_array`, or `DT_RUNPATH` is
unchanged. The accompanying posix-tests suite
`dlfcn-init-runpath-c` (separate PR) exercises all three features
end-to-end on a standalone Nanvix VM.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
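
For item 1 of the commit message above, the sentinel handling might look roughly like the following. This is a sketch under assumptions: `callable_entries` and `demo_ctor` are hypothetical, the array is passed as a slice of address-sized words rather than read from a mapped section, and the real `call_init_array` signature is not shown on this page.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Filters raw `.init_array` words down to the entries that should be
/// called. 0 and -1 (all bits set) are treated as sentinels and skipped.
fn callable_entries(init_array: &[usize]) -> Vec<usize> {
    init_array
        .iter()
        .copied()
        .filter(|&entry| entry != 0 && entry != usize::MAX)
        .collect()
}

/// Invokes each non-sentinel entry in array order.
///
/// # Safety
/// Every non-sentinel word must be the address of a valid `extern "C" fn()`.
unsafe fn call_init_array(init_array: &[usize]) {
    for entry in callable_entries(init_array) {
        // Reinterpret the stored address as a constructor function pointer.
        let ctor = unsafe { std::mem::transmute::<usize, extern "C" fn()>(entry) };
        ctor();
    }
}

/// Counter bumped by the demo constructor below (illustration only).
static HITS: AtomicUsize = AtomicUsize::new(0);

/// Demo constructor standing in for a real `.init_array` entry.
extern "C" fn demo_ctor() {
    HITS.fetch_add(1, Ordering::SeqCst);
}
```

A `.fini_array` walk would have the same shape, iterating the filtered entries in reverse.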
@esaurez esaurez force-pushed the feat/dlfcn-init-array-and-runpath branch from fae4251 to 9495ff0 Compare June 8, 2026 22:27
esaurez commented Jun 9, 2026

Closing in favor of nanvix/nanvix#2473, which carries the same branch (feat/dlfcn-init-array-and-runpath) and identical content (head SHA 9495ff08d) but targets upstream nanvix/nanvix:dev directly.

This fork-side PR was redundant once the upstream filing landed.

Tracked in nanvix-todo/pending-prs-for-upstream-review.md (Wave 1 section).

#28/#29/#30 continue to stack on this branch (feat/dlfcn-init-array-and-runpath) which remains live on the esaurez fork.

@esaurez esaurez closed this Jun 9, 2026