Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shlibs functionnality should be arch dependant #2331

Open
bapt opened this issue Nov 7, 2024 · 27 comments
Open

shlibs functionnality should be arch dependant #2331

bapt opened this issue Nov 7, 2024 · 27 comments

Comments

@bapt
Copy link
Member

bapt commented Nov 7, 2024

shlibs_provides vs shlibs_requires should take in account the ABI to avoid issue between lib32 and non lib32

Also when backing up libs, we should have compat & compat32 taken in account.

@ifreund
Copy link
Contributor

ifreund commented Nov 11, 2024

I've just opened #2333 which is a first step towards making pkg's shlib features work in the presence of lib32.

I believe the next step is to append the architecture of the .so file to shlib_provides/requires names. This would allow differentiating between the "native" 64 bit libfoo.so.3-amd64 and the lib32 version libfoo.so.3-i386. For backwards compatibility, we would consider shlib names missing an architecture (e.g. libfoo.so.3) as if the architecture matches that of pkg config ABI.

I think this approach is preferable over modifying the pkgbase bits of the src/ build system to append additional metadata to lib32 packages or otherwise try to solve https://bugs.freebsd.org/bugzilla//show_bug.cgi?id=265061 outside of pkg.

@kevemueller
Copy link

This is very straightforward. Please also consider future-proofing any approach taken by considering kernel modules similar to shared libraries.
In the end, kernel provides a set of modules, and every kernel module has also a set it provides and depends on.
Might or might not be enough to do this with a "-kmod" suffix.

@bapt
Copy link
Member Author

bapt commented Nov 13, 2024

hum I think kernel modules should use the regular provides/requires (which didn't exist at the time we the the shlibs)
but yes amended with the architecture, which makes me think we should probably make the "provides/requires" analyzer kind of pluggable (lua script ?) so the ports tree or the base build system provide their own rather than hardcoding everything inside pkg.

@ifreund
Copy link
Contributor

ifreund commented Nov 14, 2024

Which makes me think we should probably make the "provides/requires" analyzer kind of pluggable (lua script ?) so the ports tree or the base build system provide their own rather than hardcoding everything inside pkg.

Doesn't pkg already allow specifying provides/requires in the manifest? I don't think anything more than that would be necessary to make it possible for the ports tree/base build system to generate their own provides/requires based on the package contents.

@bapt
Copy link
Member Author

bapt commented Nov 14, 2024

yes this is how it should be done, imho, one could argue we should do the same for shlibs_* aka deprecate shlibs_* and use the regular provides/requires, the only reason why not to do it, that the backup library functionality depends on it. (aka we only backup libraries which are marked as "provided" there are also rooms of thinking here on how this should be done)

@bapt
Copy link
Member Author

bapt commented Nov 19, 2024

note: #1048 this should be taken in consideration is someone is working on improving the shlibs provides/requires

@bapt
Copy link
Member Author

bapt commented Nov 22, 2024

I don't know if anyone is working on this right now, if not, I will start working on this.
We have roughly 2 cases on FreeBSD where we need to amend the libs:
32bit compatibility and linux compatibility.

What I propose to do is the following, keep as is anything which matches the ABI of the host aka what we currently have.
Amend with "(32)" anything which is a 32bit binary
Amend with "Linux" anything which is a linux binary same wordsize as the host
Amend with "Linux32" anything which is a linux binary 32bit (aiming at being installed on a 64bit host.

@emaste @evadot @ifreund @kevemueller am I missing anything ?

For example nvidia drivers whill provide:

  • libvdpau_nvidia.so.1
  • libnvidia-tls.so.1
  • libnvidia-ml.so.1
  • libnvidia-glvkspirv.so.1
  • libnvidia-glsi.so.1
  • libnvidia-glcore.so.1
  • libnvidia-eglcore.so.1
  • libnvidia-cfg.so.1
  • libGLX_nvidia.so.0
  • libGLESv2_nvidia.so.2
  • libGLESv1_CM_nvidia.so.1
  • libEGL_nvidia.so.0
  • libvdpau_nvidia.so.1(32)
  • libnvidia-tls.so.1(32)
  • libnvidia-ml.so.1(32)
  • libnvidia-glvkspirv.so.1(32)
  • libnvidia-glsi.so.1(32)
  • libnvidia-glcore.so.1(32)
  • libnvidia-eglcore.so.1(32)
  • libnvidia-cfg.so.1(32)
  • libGLX_nvidia.so.0(32)
  • libGLESv2_nvidia.so.2(32)
  • libGLESv1_CM_nvidia.so.1(32)
  • libEGL_nvidia.so.0(32)

@bapt
Copy link
Member Author

bapt commented Nov 22, 2024

linux_base-rl9 on amd64 will provide:

  • libc.so.6(Linux)
  • libc.so.6(Linux32)
  • libgeocode-glib.so.0.0.0(Linux)
  • libgeocode-glib.so.0.0.0(Linux32)
    ...

@evadot
Copy link
Contributor

evadot commented Nov 22, 2024

Maybe it will be easier to split the tag, something like (Linux)(32) or (Linux,32) etc ...

@ifreund
Copy link
Contributor

ifreund commented Nov 22, 2024

I don't know if anyone is working on this right now, if not, I will start working on this.

I started working on this but paused work to do some higher level design thinking. I see two alternative paths we could take with this feature that I wanted to explore and discuss before starting down one of them. I'll finish up the writeup I intended to post to this issue shortly. (I should have finished it last week, but I got sick)

@bapt
Copy link
Member Author

bapt commented Nov 22, 2024

I'll wait for your proposal

@ifreund
Copy link
Contributor

ifreund commented Nov 22, 2024

I see two alternative options pkg could choose for its shlib handling:

Option 1: Try to duplicate ld-elf.so search order

This is the goal pkg currently seems to be working towards (#1048)

From rtld(1) the search order is:

  1. DT_RPATH of the referencing object unless that object also
    contains a DT_RUNPATH tag
  2. DT_RPATH of the program unless the referencing object contains
    a DT_RUNPATH tag
  3. Path indicated by LD_LIBRARY_PATH environment variable
  4. DT_RUNPATH of the referencing object
  5. Hints file produced by the ldconfig(8) utility
  6. The /lib and /usr/lib directories, unless the referencing
    object was linked using the “-z nodefaultlib” option

In order to get the shlibs_provided/shlibs_required feature closer to the
behavior of ld-elf.so I think the following changes would be needed:

  1. Store the full path of shared objects provided in shlibs_provided
    (e.g. /usr/lib/libfoo.so.1 or /usr/lib32/libfoo.so.1).

  2. Store the following for each entry in shlibs_required:

    • Shared library name
    • DT_RPATH or DT_RUNPATH
    • Whether ld-elf.so or ld-elf32.so search order is used. This can be determined by comparing the architecture in the elf header to pkg->arch. For example, on an amd64 system pkg->arch will be amd64 even for lib32 packages but lib32 packages will, of course, include i386 elf objects.
  3. Duplicate the ld-elf.so or ld-elf32.so search order in the solver to determine if a shlib requirement is satisfied by a given shared object path from shlibs_provided.

    • LD_LIBRARY_PATH should be ignored, so this won't be a perfect reproduction of ld-elf.so logic
    • It doesn't seem feasible to me to check DT_RPATH of the program in addition to the referencing object, so the behavior here will be subtly different compared to ld-elf.so.

If BACKUP_LIBRARIES is enabled, copy any file with a path in shlibs_provided to the compat package. I don't see a need for e.g. a separate compat32 package though I could be missing something.

Option 2: only track shared objects in "standard" search paths

Get rid of shlibs_provided/shlibs_required as a pkg feature.
Instead, use the more general provides/requires mechanism.

This approach intentionally ignores DT_RPATH and DT_RUNPATH as they cannot be tracked without a special shlibs_required/shlibs_provided feature.

Define "standard" search paths based on the target system.

On FreeBSD define "standard" search paths as the output of ldconfig -r, ldconfig -32 -r, or ldconfig-linux -r depending on the elf file in question.

Note that on FreeBSD packages can add their own "standard" paths by installing a file containing a list of paths to /usr/local/libdata/ldconfig/foo.

On pkg create provides/requires entries are generated to express shlib dependencies:

  1. Add a provides entry with the shared library name plus a possible libcompat suffix for any shared libraries in the "standard" search paths. For example, libfoo.so.1 for a non-libcompat shared library or libfoo.so.1-lib32 for a
    lib32 shared library.

  2. Add a requires entry with the same format for all required libraries.

If BACKUP_LIBRARIES is enabled, copy any libraries in the "standard" search paths to the compat package. I don't see a need for e.g. a separate compat32 package though I could be missing something.

Advantages/Disadvantages

Overall the first option is more complex but gets closer to the actual runtime linker's behavior.

I think the second option likely has better UX since it uses the more general provides/requires mechanism and doesn't require the user to know the runtime linker search order to determine whether a shlibs_required is fulfilled by a
given shlibs_provided.

The main downside of the second option is of course that DT_RPATH and DT_RUNPATH would need to be handled manually by the porter (e.g. by adding a explicit dependency) rather than automatically by pkg. I personally think this is a
reasonable tradeoff to make. Many linux distros (e.g. Debian) discourage usage of DT_RPATH and DT_RUNPATH in their packaging. I think it would be reasonable for FreeBSD to do the same in the ports tree, especially since there is a way for packages to add to the "standard" search paths with a /usr/local/libdata/ldconfig/foo file.

Other notes:

  • CheriBSD has other libcompats aside from lib32 with ld-elf64c.so and ld-elf64cb.so for example, we should avoid special casing lib32 as the only libcompat.

  • The job scheduler probably should consider provides/requires and shlibs_provided/shlibs_required for job ordering if these features see widespread use. This is another minor point in favor of dropping shlibs_provided/shlibs_required and using the more general provides/requires as it would reduce complexity there.

@bapt
Copy link
Member Author

bapt commented Nov 22, 2024

I think the second case is also the less complex to undertsand/debug so I am fine with the second case, about the RUNPATH, having a special case with hard dependencies for such case is perfectly fine. right now what we have is closer to the option 2 we read the elf hints to get the known ldconfig path, and lookup there we lookup for RUNPATH as well, and this is imho fine (we could arguably drop entirely the RUNPATH but that is another storry).
As for the -lib32, I think my proposal complement yours, if we can tag the shlibs with: Linux, lib32 (maybe just 32) downstreams should just be able to add their own tags.

Regarding the solver provides/requires and shlibs_provides/shlibs_required are already working the same way, the only difference is they are in 2 separated fields, which in terms of UX is kind of convenient, I don't think eliminating one would simplify anything, but if it does, then so be it.

@ifreund
Copy link
Contributor

ifreund commented Nov 22, 2024

Yeah, "Option 2" is pretty much identical to what you proposed. It just spells out more of the details of how it should actually be implemented.

Regarding the solver provides/requires and shlibs_provides/shlibs_required are already working the same way, the only difference is they are in 2 separated fields, which in terms of UX is kind of convenient, I don't think eliminating one would simplify anything, but if it does, then so be it.

Indeed, I think this is mostly a UX question in the end. What would you think about using a namespace for shared library provides/requires (e.g. so, full name so:libfoo.so.1(32)) in addition to dropping shlibs_provided/shlibs_required?

This could be potentially extended in the future for pkg-config provides/requires files (pc:foo) or other things.

@bapt
Copy link
Member Author

bapt commented Nov 22, 2024

i like namespaces

@kevemueller
Copy link

When looking at pkg_elf I noticed and very much disliked ld.so.hints parsing. This is an absolutely non-portable way of handling things, but I am the 1‰ here. In my use-case the (FreeBSD) packages are created and used completely outside FreeBSD, no way I can have an ld.so.hints lying around.

Namespaces also solve my previous request for kmod dependencies.
If we want to go all-in, we should also add a dyld or ld or loader namespace for the required loader.

Happy to write the Mach-O update after the work was done for ELF (I would like to keep the code close in its logic).

@kevemueller
Copy link

Btw. standard search paths are a happy route with Mach-O, as I could just handle@rpath/@loader_path variables from the otherwise absolute path library references by dropping them.

@ifreund
Copy link
Contributor

ifreund commented Nov 26, 2024

I think the approach of ignoring RPATH/RUNPATH and sticking to standard search paths is great for 99% percent of packages and significantly reduces complexity.

However, we do need to allow the user creating a package to override the automatically generated provides/requires for shlibs if a program relies on rpaths or has other unusual properties. I think that this can be supported with a few additions to the manifest format:

  1. pkg create -M manifest by default appends automatically generated so:libfoo.so.1 formatted items to the provides/requires entries listed in the manifest based on a scan of the files in the package.
  2. If the manifest has the annotation no-scan-shlibs, completely disable the default scanning for shlibs. Any manual provides/requires entries listed in the manifest will still be taken into account of course.
  3. Add a new toplevel key ignore-shlibs which requires an array of strings as its value. Every entry in the array of strings is interpreted as a glob and causes the shlib scan to skip any files in the package matching the glob.

Does this sound sufficient to integrate with the ports tree? Is the a better way to expose this functionality?

@kevemueller
Copy link

After writing the unit tests in #2372 I now understand better what pkg_elf tries to do. I honestly disagree with the approach.
Regardless on how shlibs handling is improved/generalized as discussed above (I do agree to that), I believe that the mere existence of ALLOW_BASE_SHLIBS is evil.

I propose the following paradigm shift in handline requires/provides.

  • Remove ALLOW_BASE_SHLIBS
  • At package creation time, all library dependencies shall be captured and recorded in the package manifest. This simplifies the pkg_elf parser and removes any "magic" in trying to find out if this is a base lib or not. Most packages built, whether base or pkg.FreeBSD.org will e.g. depend on libc.so.7.
  • At package installation time obviously you need to find providers for all these dependencies, the "magic" code removed above, or a derivation of it would run and close the dependencies based on ld cache or file system crawl, or similar.
  • The choices derived by library resolution should be persisted in the DB across runs, e.g. with the introduction of a (hidden, special) SYSTEM package listing them. This could be pre-populated already at pkg installation time with all the libraries found.

Advantages

  • Super simple pkg_elf code at package creation time. Guaranteed consistency across runs on different platforms (no side effects possible).
  • pkg.freebsd.org packages become natively supported in pkgbase installations as they have the right rich dependencies embedded.
  • magic in resolving shlib providers still there for convenience at package installation time.
  • magic runs only once as opposed to at every package creation (assuming db caching).

Disadvantages

  • A magnitude higher number of provides/requires entries.

@bapt @ifreund what do you think?

@ifreund
Copy link
Contributor

ifreund commented Dec 2, 2024

I 100% agree that the current approach of ALLOW_BASE_SHLIBS is bad. It wouldn't exist at all in a post-pkgbase world and makes cross-building packages from a non-FreeBSD host quite problematic. I had more or less written it off as a necessary evil to be tolerated until pkgbase rules all but I like your general idea.

I think there are some simplifications that can be made though, here's the approach I would propose:

  1. Check if the system has any pkgbase packages installed (It's likely sufficient to check if some critical package is installed, FreeBSD-runtime perhaps?)
  2. If not a pkgbase system, do the following before performing an install/upgrade/remove:
    1. Scan the system library directories (exactly /lib, /usr/lib, /lib32, /usr/lib32) and add all shlibs found there to an automatically created/updated system-libraries package (similar to the current compat-libraries package).
    2. Analyze files for the automatic system package to populate shlib provides (maybe ignore requires to be safe?)
    3. After that package is created, the solver may be run as normal. No changes to the solver should be required.
  3. If the system is a pkgbase system, do nothing.

@bapt
Copy link
Member Author

bapt commented Dec 2, 2024

I do like @ifreund proposal to remove ALLOW_BASE_SHLIBS (which yes is a giant hack)
scanning the system would be greate, we can even store the mtime of each directories, so check if we should or not update the cached list of provides. that said I won't go up to the point where we create a packages dynamically, but I would preferrably store this somewhere: /var/db/pkg/system-libraries.ucl. We could even reuse this mecanism to add flexibility for end users, like I am hacking on my own library for gettext for example that I do build locally libbapt-intl.so.X and I use libmap.conf to map it to libintl.so.X,, I now have a mecanism to inject it in the solver, via pkg.conf or something like that as "yet libintl.so.X is provided"

@ifreund
Copy link
Contributor

ifreund commented Dec 2, 2024

I won't go up to the point where we create a packages dynamically, but I would preferrably store this somewhere: /var/db/pkg/system-libraries.ucl. We could even reuse this mecanism to add flexibility for end users, like I am hacking on my own library for gettext for example that I do build locally libbapt-intl.so.X and I use libmap.conf to map it to libintl.so.X,, I now have a mecanism to inject it in the solver, via pkg.conf or something like that as "yet libintl.so.X is provided"

Ah, this seems like a good approach to me. You're right that dynamically creating a package is overkill, supplying a simple list of system-provided shlibs to the solver would hardly increase the complexity.

@bapt
Copy link
Member Author

bapt commented Dec 3, 2024

I have merged all the pending code.

Note that the fact the elf analysis does not work for all OSes is due to the code look for elfhints, to it is very FreeBSD and dragonfly specific so it fails as finding for example libc.so.8 for dragonfly binary running pkg create on freebsd and fails at finding libc.so.6 binary for linux on FreeBSD.

this part of the code needs to be more cross os friendly.

@ifreund
Copy link
Contributor

ifreund commented Dec 4, 2024

Note that the fact the elf analysis does not work for all OSes is due to the code look for elfhints, to it is very FreeBSD and dragonfly specific so it fails as finding for example libc.so.8 for dragonfly binary running pkg create on freebsd and fails at finding libc.so.6 binary for linux on FreeBSD.

this part of the code needs to be more cross os friendly.

I agree, I've been thinking about this more and I think with the new idea for how to get rid of ALLOW_BASE_SHLIBS we can stop using elfhints at all for the generation of shlib provides/requires entries. Instead, we could add a set of options that provide pkg with lists of paths to scan for provided shared libraries. I think the following would be sufficient for now:

  • SHLIB_SCAN_PATHS_NATIVE
  • SHLIB_SCAN_PATHS_COMPAT_32
  • SHLIB_SCAN_PATHS_COMPAT_LINUX
  • SHLIB_SCAN_PATHS_COMPAT_LINUX_32

These could be automatically populated by the FreeBSD ports system based on the existing USE_LDCONFIG or
USE_LDCONFIG32 options, which can add extra search paths if necessary (e.g. llvm19).

If we take this path, the only place we might still want to keep using elfhints is the code to backup shared libraries if BACKUP_LIBRARIES is set. Reading the elfhints file to determine paths that need backup seems straightforward but there are likely other options we could take here. I'll give this case a bit more thought.

@kevemueller
Copy link

I would like to continue along my initial proposal that was confirmed by @bapt and @ifreund to simplify ELF parsing at package creation time. As this is a major change, I suggest to implement it alongside current ELF functionality.
In case of ALLOW_BASE_SHLIBS=yes I propose to use the "new" way of building dependencies, and with ALLOW_BASE_SHLIBS=no to use the legacy ELF hints based code for compatibility.

The proposed new way for ELF is deemed to have less (no?) impact on users with ALLOW_BASE_SHLIBS=yes, which is not the default and allows to gain experience with a limited risk exposure.

@bapt do you agree on going ahead like this?
@ifreund did you start on ELF already?

@bapt
Copy link
Member Author

bapt commented Dec 5, 2024

I think we should entirely nuke the ALLOW_BASE_SHLIBS, no fallback, we should test if a predictable file belong to a package "uname"? if yes then we consider we are using pkgbase and nothing to do, if not we scan /lib and /usr/lib to propulate the list of provided libs (this is for the solver).

Now regarding the provided/required shlibs, I think what we should do now, nuke the rpath and elfhints thing. As stated this is not portable. which means where pkg scans an elf file it should expose all its DT_NEEDED as required if not provided by itself.
and it should expose as provided all the libraries it installs which are in a patch which is either /lib, /usr/lib or provided via a command line and/or a plist option and/or a manifest entry (not sure yet about this one) at the pkg create time!

This will simplify a lot the code inside pkg as well as making it portable.

@ifreund
Copy link
Contributor

ifreund commented Dec 5, 2024

@ifreund did you start on ELF already?

Yes, I started implementing the approach described by @bapt in #2331 (comment) and #2331 (comment)

ifreund added a commit to ifreund/pkg that referenced this issue Dec 9, 2024
All shared objects NEEDED by an ELF file are added to shlibs_required
by pkg-create rather than filtering them based on the host system ELF
hints file and installed libraries.

The ALLOW_BASE_SHLIBS option is deprecated and ignored.

Before running the solver, pkg now scans /lib and /usr/lib (taking
--rootdir into account) and builds a list of system-provided shlibs that
is used as an input to the solver. If pkg detects that it is targeting a
pkgbase system this scan for system shlibs is skipped as the base
packages will provide all necessary shlibs.

The detection of a pkgbase system is implemented by checking if
/usr/bin/uname is tracked in the pkg database.

The main user-facing change is that pkg-create now adds *all* shared
objects listed as NEEDED by an ELF file to shlibs_required. In general
this should not cause issues due to the solver taking the shlibs
provided by the base system into account but there may be exceptions.
An option to filter these entries will be added in the next commit.

References:	freebsd#2331
Sponsored by:	The FreeBSD Foundation
ifreund added a commit to ifreund/pkg that referenced this issue Dec 10, 2024
All shared objects NEEDED by an ELF file are added to shlibs_required
by pkg-create rather than filtering them based on the host system ELF
hints file and installed libraries.

The ALLOW_BASE_SHLIBS option is deprecated and ignored.

Before running the solver, pkg now scans /lib and /usr/lib (taking
--rootdir into account) and builds a list of system-provided shlibs that
is used as an input to the solver. If pkg detects that it is targeting a
pkgbase system this scan for system shlibs is skipped as the base
packages will provide all necessary shlibs.

The detection of a pkgbase system is implemented by checking if
/usr/bin/uname is tracked in the pkg database.

The main user-facing change is that pkg-create now adds *all* shared
objects listed as NEEDED by an ELF file to shlibs_required. In general
this should not cause issues due to the solver taking the shlibs
provided by the base system into account but there may be exceptions.
An option to filter these entries will be added in the next commit.

References:	freebsd#2331
Sponsored by:	The FreeBSD Foundation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants