[build] E: Build liblzma.so alongside liblzma.a #97

Open
esaurez wants to merge 1 commit into nanvix/v5.2.5 from feat/build-shared-library

esaurez commented Jun 17, 2026

Summary

Builds liblzma.so alongside the existing liblzma.a so consumers can dlopen xz at run time instead of statically linking it.

Why

Today the Nanvix xz port ships only liblzma.a. Anything that wants xz (e.g., cpython's _lzma extension) has to statically embed it into the consumer binary. That works for cpython's monolithic python.elf but does not allow:

  • Sharing a single liblzma implementation across multiple .so consumers in the same process.
  • Building a cpython _lzma.so that resolves xz via DT_NEEDED liblzma.so (the model already used by _ssl.so → libssl.so and _ctypes.so → libffi.so on Nanvix).

Producing liblzma.so here unblocks the cpython migration of _lzma from --whole-archive .a in python.elf to DT_NEEDED .so.

What changed

xz uses autotools, and the upstream --enable-shared path goes through libtool, which does not know about i686-nanvix. Rather than teaching libtool a new platform target, this PR keeps --disable-shared --enable-static and links liblzma.so manually from the .a after make install completes. This is the same approach already taken by nanvix/libffi#235.

Concretely, .nanvix/z.py:

  • _configure_env_overrides: appends -fPIC to CFLAGS so the same .o files compile into both .a and .so.
  • build (in-Docker shell script):
    • Strips \r from the vendored autotools scripts before running ./configure. Required on Windows hosts where git's core.autocrlf=true converts shell scripts to CRLF and dash inside the container chokes on the trailing \r (no-op on Unix hosts where the files already have LF endings).
    • Adds a gcc -shared invocation after make install DESTDIR=... that links liblzma.so from src/liblzma/.libs/liblzma.a via -Wl,--whole-archive, with DT_SONAME=liblzma.so and -nostdlib (libc / libm symbols stay UND and bind at dlopen time). Writes the result into <install-stage>/sysroot/lib/liblzma.so.
  • _stage_artefacts: extends the lib_dir / build_dir copy-out to cover liblzma.so in addition to liblzma.a.
  • _stage_release_outputs: copies liblzma.so into lib_out() so ./z release packages it.
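
As an illustration of the manual link step, the following is a minimal sketch of how such a command line could be assembled in Python. The helper name `shared_link_command` and its parameters are hypothetical and not taken from `.nanvix/z.py`; only the flags mirror the ones quoted in this PR.

```python
import shlex


def shared_link_command(cc, archive, output, soname="liblzma.so"):
    """Build a shell command that links a shared object from a PIC static archive.

    Hypothetical helper for illustration; the flags mirror the ones in this PR.
    """
    parts = [
        shlex.quote(cc),
        "-shared", "-fPIC", "-nostdlib",      # leave libc / libm symbols undefined
        f"-Wl,-soname,{soname}",              # record DT_SONAME
        "-Wl,-z,noexecstack",
        "-Wl,--whole-archive",                # pull in every archive member
        shlex.quote(archive),
        "-Wl,--no-whole-archive",
        "-o", shlex.quote(output),
    ]
    return " ".join(parts)
```

Quoting the compiler and both paths with `shlex.quote` keeps the command safe to embed in a generated shell script even when a path contains spaces.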

Architecture

liblzma.so          (PIC-recompiled from liblzma.a)
  └── UND libc / libm symbols (memcpy, memset, malloc, ...)
      → resolved against the host executable's .dynsym at dlopen time

liblzma.a (unchanged shape; same .o files now compiled with -fPIC)

liblzma.so has no transitive .so dependencies (liblzma is self-contained; libc/libm symbols are UND), so the loader does not need to walk any DT_NEEDED chain to load it.
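
For readers unfamiliar with run-time loading, here is a rough sketch of what a dlopen-style consumer looks like, using Python's `ctypes` on a conventional host rather than on Nanvix. The function name `lzma_version_or_none` is hypothetical; `lzma_version_string` is part of the public liblzma API. Whether a library is found depends entirely on the host, so the sketch returns `None` when nothing can be loaded.

```python
import ctypes
import ctypes.util


def lzma_version_or_none(library_name=None):
    """Load liblzma at run time and return its version string, or None.

    Illustrative only: this runs on an ordinary host, not on Nanvix.
    """
    name = library_name or ctypes.util.find_library("lzma")
    if name is None:
        return None                      # no liblzma visible to the loader
    try:
        lib = ctypes.CDLL(name)          # dlopen() under the hood on POSIX
    except OSError:
        return None
    # const char *lzma_version_string(void);
    lib.lzma_version_string.restype = ctypes.c_char_p
    lib.lzma_version_string.argtypes = []
    return lib.lzma_version_string().decode("ascii")
```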

Dependencies

Runtime dep (already merged upstream):

  • nanvix/nanvix#2473: dlfcn init-array + DT_RUNPATH support. Required for any consumer to dlopen("liblzma.so") at run time.

Validation

Build runs end-to-end inside the toolchain-gcc Docker image on a Windows host:

$ ./z build
... (compiles liblzma .c files with -fPIC) ...
make install DESTDIR=/mnt/workspace/build/_install
i686-nanvix-gcc -shared -fPIC -nostdlib \
    -Wl,-soname,liblzma.so -Wl,-z,noexecstack \
    -Wl,--whole-archive src/liblzma/.libs/liblzma.a -Wl,--no-whole-archive \
    -o /mnt/workspace/build/_install/sysroot/lib/liblzma.so
... (upstream tests build) ...
info: Staged artefacts under D:\src\xz-dev\build
info: Staged release outputs under D:\src\xz-dev\.nanvix\out\release
success: Build complete

Structural verification of the produced liblzma.so:

$ i686-nanvix-readelf -d build/liblzma.so | grep -E 'SONAME|NEEDED'
 0x0000000e (SONAME)                     Library soname: [liblzma.so]

$ i686-nanvix-nm -D build/liblzma.so | grep ' T lzma_' | head -5
00007c40 T lzma_alone_decoder
00004ee0 T lzma_alone_encoder
00007fe0 T lzma_auto_decoder
000055f0 T lzma_block_buffer_bound
00008070 T lzma_block_buffer_decode

SONAME correctly set, no spurious DT_NEEDED, full lzma_* public API exported, libc symbols left UND for runtime binding. The existing liblzma.a and the upstream test_*.elf test binaries continue to build unchanged.
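
The structural checks above could also be scripted. The sketch below parses captured `readelf -d` and `nm -D` text; the function names are hypothetical and it assumes the GNU binutils output layout shown in this PR.

```python
import re


def dynamic_entries(readelf_output):
    """Collect SONAME and NEEDED values from `readelf -d` text output."""
    entries = {"SONAME": [], "NEEDED": []}
    for line in readelf_output.splitlines():
        match = re.search(r"\((SONAME|NEEDED)\)\s+.*\[(.+?)\]", line)
        if match:
            entries[match.group(1)].append(match.group(2))
    return entries


def exported_functions(nm_output, prefix="lzma_"):
    """Return names of defined text (T) symbols with the given prefix."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1] == "T" and fields[2].startswith(prefix):
            names.append(fields[2])
    return names
```

A build script could then assert that `dynamic_entries(...)["NEEDED"]` is empty and that the SONAME list equals `["liblzma.so"]`.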

Adds a liblzma.so link step inside the existing build script that
links liblzma.so from the (now PIC) static archive via
-Wl,--whole-archive so every liblzma entry point becomes part of the
.so's .dynsym. Sets DT_SONAME=liblzma.so so consumers that link
against it emit a proper DT_NEEDED entry.

liblzma is self-contained -- no transitive .so deps -- so the .so
records no DT_NEEDED of its own. libc / libm symbols (memcpy, memset,
malloc, ...) are left UND via -nostdlib and bind at dlopen time
against the host executable's .dynsym, matching the model already
used by other Nanvix shared libraries such as libffi.so / libssl.so.

xz's autotools libtool does not know about i686-nanvix, so the
upstream --enable-shared path is not viable. The configure opts keep
--disable-shared --enable-static; the .so is linked manually from the
.a after `make install` completes.

Changes to .nanvix/z.py:

- _configure_env_overrides: append -fPIC to CFLAGS so the same .o
  files compile into both .a and .so.
- build: add a SHAREDLIB link step after `make install DESTDIR=...`
  that writes liblzma.so directly into the install staging tree
  alongside liblzma.a.
- _stage_artefacts: extend the lib_dir / build_dir copy-out to cover
  liblzma.so in addition to liblzma.a.
- _stage_release_outputs: copy liblzma.so into lib_out() so
  `./z release` packages it.
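
A minimal sketch of the CFLAGS change, assuming the overrides are held in a plain dictionary. The helper name `with_pic_cflags` is hypothetical; the real `_configure_env_overrides` may be structured differently.

```python
def with_pic_cflags(env):
    """Return a copy of env whose CFLAGS contains -fPIC exactly once."""
    updated = dict(env)                      # do not mutate the caller's mapping
    flags = updated.get("CFLAGS", "").split()
    if "-fPIC" not in flags:
        flags.append("-fPIC")
    updated["CFLAGS"] = " ".join(flags)
    return updated
```

Making the helper idempotent avoids duplicated flags if the overrides are computed more than once.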

Adds .gitattributes forcing LF line endings on the vendored Unix
shell / autotools scripts (configure, config.*, install-sh,
build-aux/*, *.sh, ...). These are already committed as LF; the
attribute keeps them LF on checkout regardless of a contributor's
core.autocrlf setting, so dash inside the Linux toolchain container
does not choke on CRLF. This replaces an earlier runtime CR-stripping
workaround that was carried in the build script.
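
One way to confirm that a checkout really has LF endings is a small scan like the sketch below. The function names and the default file list are assumptions made for illustration; they are not part of this PR.

```python
from pathlib import Path


def contains_crlf(data):
    """Return True if the byte string contains a CRLF line ending."""
    return b"\r\n" in data


def files_with_crlf(root, names=("configure", "install-sh")):
    """List script files under root that contain CRLF line endings.

    A file is checked if its basename is in `names` or it ends in .sh.
    """
    offenders = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and (path.name in names or path.suffix == ".sh"):
            if contains_crlf(path.read_bytes()):
                offenders.append(path)
    return offenders
```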

Runtime dependency: the shared-library build becomes useful once the
loader changes in nanvix/nanvix#2473 ([syscall] E: Run dlopen
ctors/dtors and DT_RUNPATH) ship.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

This PR updates the Nanvix xz port build to produce a shared liblzma.so alongside the existing static liblzma.a, enabling runtime dlopen/DT_NEEDED consumption (e.g., CPython _lzma.so) instead of requiring static embedding.

Changes:

  • Compile the liblzma objects with -fPIC so they can be reused for a shared library.
  • Manually link and stage liblzma.so from the libtool-produced liblzma.a, and include it in artefact/release staging.
  • Add .gitattributes rules to keep vendored autotools/shell scripts checked out with LF endings.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| .nanvix/z.py | Adds PIC CFLAGS, links liblzma.so from the static archive post-install, and stages/copies the new .so into build and release outputs. |
| .gitattributes | Forces LF line endings for configure/autotools helper scripts to avoid CRLF issues in container builds from Windows checkouts. |

Comment thread .nanvix/z.py
# i686-nanvix (hence --disable-shared in configure opts),
# so we link the .so ourselves and stage it next to .a.
f"{shlex.quote(cc)} -shared -fPIC -nostdlib "
f" -Wl,-soname,liblzma.so -Wl,-z,noexecstack"
Comment thread .nanvix/z.py
Comment on lines 333 to 334
"set -e",
configure_cmd,