Statically linked runtime #877

Open
agowa opened this issue Oct 26, 2018 · 135 comments
@agowa

agowa commented Oct 26, 2018

AppImages should run on all Linux platforms, but currently they don't, because they are dynamically linked against glibc.
I tried to run an AppImage on Alpine Linux and it failed because Alpine Linux is built around musl libc instead.
I think AppImages should generally include all necessary dependencies, not just some of them. Adding a libc would not increase the resulting size much either: depending on the libc used, it may only add between 185k and 8M (see the libc Comparison Chart). And if the binary is also stripped, it can be much less.

AppImage should do something like this:

  1. When creating the AppImage, check the symbols the application requires and determine the smallest fitting libc to include.
  2. Always produce statically linked binaries that don't depend on anything on the system.
  3. Recommend using musl libc instead of glibc, for various reasons such as license (MIT vs. LGPL) and binary size (527k vs. 8M).
@TheAssassin TheAssassin changed the title Statically Link binaries Statically linked runtime Dec 10, 2019
@TheAssassin TheAssassin added this to the type-3 milestone Dec 10, 2019
@probonopd
Member

Good feature. Since AppImage has no plan to support it

Actually we are very interested in supporting this, as it would allow us to close #1015 - correct? Let's collaborate 👍

@probonopd
Member

The next AppImage type should fix this issue.

Shall we state "get rid of FUSE" as a goal?

@probonopd
Member

can't run without glibc such as Alpine

Do you think we can change it just enough that it at least works on Alpine when libc6-compat is installed there? Then it would not even have to be fully static. See #1015

@agowa
Author

agowa commented Dec 12, 2019

A completely statically linked app would also allow creating Docker images without a userland, e.g. only the statically linked app without any Linux userland surrounding it.

That's not only smaller, but also decreases the attack surface.

@probonopd
Member

@TheAssassin would that be something that you think would be doable if we would rewrite the runtime in, say, Rust? Wouldn't the runtime be rather large then because it would have to statically link libfuse? (How large would it become?)

Or should we try to get rid of FUSE altogether for the future type 3 AppImages?

@TheAssassin
Member

@TheAssassin would that be something that you think would be doable if we would rewrite the runtime in, say, Rust? Wouldn't the runtime be rather large then because it would have to statically link libfuse? (How large would it become?)

Or should we try to get rid of FUSE altogether for the future type 3 AppImages?

You don't need the entire libfuse, you just need a few bits. I've read a bit into fuse-rs, it doesn't seem that complex to me.

The size is secondary; we can save bloat elsewhere (e.g., by using musl libc properly thinned down to the essential bits, etc.).

Getting rid of FUSE would be awesome, but I have doubts it's all that easy.

@probonopd
Member

probonopd commented Dec 18, 2019

Is there any limitation in runtime size?

No. No hard limitation. (We should try to make it as small and efficient as possible.)

-----------------------
static linked loader (ELF)
-----------------------
squashfs image
-----------------------

should be sufficient since we can calculate the length of an ELF (and we are already doing it).
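
As a rough illustration of "calculating the length of an ELF", here is a minimal Python sketch. It assumes the section header table is the last thing in the ELF, so the ELF ends at e_shoff + e_shentsize * e_shnum and the squashfs image would start at that offset. The function name elf_size is made up for this example and is not taken from the runtime's source.

```python
import struct


def elf_size(header: bytes) -> int:
    """Return the ELF length implied by the section header table.

    Assumption: the section header table sits at the very end of the ELF,
    which is typical for linker output but not guaranteed by the format.
    """
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = header[4] == 2                     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big
    if is_64:
        # ELF64: e_shoff at 0x28 (8 bytes), e_shentsize/e_shnum at 0x3A/0x3C
        (e_shoff,) = struct.unpack_from(endian + "Q", header, 0x28)
        e_shentsize, e_shnum = struct.unpack_from(endian + "HH", header, 0x3A)
    else:
        # ELF32: e_shoff at 0x20 (4 bytes), e_shentsize/e_shnum at 0x2E/0x30
        (e_shoff,) = struct.unpack_from(endian + "I", header, 0x20)
        e_shentsize, e_shnum = struct.unpack_from(endian + "HH", header, 0x2E)
    return e_shoff + e_shentsize * e_shnum
```

A loader could read the first 64 bytes of its own file, pass them to elf_size, and treat everything from that offset onward as the appended image.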

@probonopd
Member

probonopd commented Jan 5, 2020

By the way, here is a bare-bones static AppImage type 2 runtime written in Go:

https://github.com/orivej/static-appimage

This runtime is using zip rather than squashfs. It has the added benefit that any existing unzip tool should be able to extract it. (Maybe such AppImages should be named .AppImage.zip to make this more obvious.)

Don't use it for production yet since it may be lacking more advanced features like update information, embedded digital signatures, and such. But it shows that it is doable to make a static AppImage runtime using FUSE.

@probonopd
Member

probonopd commented Jan 11, 2020

Go? how large is the final static runtime?

Depends on the architecture, around 2 MB:
https://github.com/kost/static-appimage/releases

When you run upx -9 on it, you can bring it to under 1 MB.

@TheAssassin
Member

What that other project is doing is really different from what we do. It's hardly comparable to our runtime. Any size estimate based on it is too imprecise to tell us anything useful. That their runtime is already way larger than ours doesn't really aid your point.

A fully statically linked FUSEless runtime would be great. But I don't see how this can be realized while keeping all the features and characteristics of the existing runtime.

Writing a runtime in Go is also pretty much a bad idea. It adds way too many uncontrollable dependencies. It's a huge mess. Our runtime is embedded in every AppImage. It needs to be absolutely bulletproof license-wise. Ideally, it's licensed as permissively as possible, as legally we cannot even safely assume that the resulting AppImage is not considered a derivative work of the runtime. This question hasn't been fully answered for the existing runtime.
(Generally, any upcoming AppImage type needs to put a way higher effort into licensing questions.)

@NobodyXu

NobodyXu commented Mar 28, 2021

Or maybe a completely different approach can be taken:

Provide modified versions of glibc and musl libc that have the appimageRuntime embedded into them, by modifying the functions _start, dlopen and (optionally) open.

_start is modified so that the embedded appimageRuntime can parse the command-line arguments and set up the environment (unzip the files to a tmpfs, etc.).

dlopen and ld.so are modified so that dynamic libraries are searched for in the unzipped environment first.

open can be modified if the program is closed source, so that reads of resources from absolute paths under /usr are redirected to the unzipped environment.

If the program cannot be compiled to use this libc, or is a shell script, then a more traditional approach can be used:

Add a header to the program that contains a runtime which decompresses the environment, including the modified libc, to a tmpfs and sets the environment variables LD_PRELOAD and LD_LIBRARY_PATH so that the program/shell uses the modified libc and the bundled dynamic libraries.

The libc can then have its open function modified so that resources are loaded from the unzipped environment.

Edit:

I found that the interpreter and rpath of an ELF can be changed with NixOS/patchelf, so there is no need to use LD_PRELOAD and LD_LIBRARY_PATH for closed-source software unless any modification to the binary is forbidden.

@probonopd
Member

unzip the files

Unzip which files? AppImages are mounted, not extracted. This gives them their speed.

@NobodyXu

NobodyXu commented Mar 28, 2021

Unzip which files? AppImages are mounted, not extracted. This gives them their speed.

I was suggesting throwing away FUSE and using a compressed tar instead.

Extracting a compressed tar won't be a lot slower than squashfuse, while FUSE adds overhead to the application.

Every read/mmap of the executable or of a resource bundled with the AppImage needs to go through FUSE, which requires the process to wait for at least two context switches instead of just one.

@NobodyXu

NobodyXu commented Mar 28, 2021

@probonopd I've done a naive benchmark comparing the squashfuse mount used by AppImage with tmpfs, using nvim.appimage:

[nobodyxu@gentoo:/tmp]$ time tar cf squashfuse .mount_nvim.aDB5FO5/

real    0m0.166s
user    0m0.004s
sys     0m0.038s
[nobodyxu@gentoo:/tmp]$
[nobodyxu@gentoo:/tmp]$ time cp -r .mount_nvim.aDB5FO5/ copied_tmp

real    0m0.040s
user    0m0.004s
sys     0m0.029s
[nobodyxu@gentoo:/tmp]$ time tar cf tmp copied_tmp/

real    0m0.023s
user    0m0.004s
sys     0m0.019s
[nobodyxu@gentoo:/tmp]$ time cp -r copied_tmp/ copied_tmp2/

real    0m0.025s
user    0m0.004s
sys     0m0.021s

.mount_nvim.aDB5FO5 is where the nvim.appimage is mounted.

I found that by looking into /proc/<pid>/.

You can see that operations performed on tmpfs are much faster than on squashfuse.

Edit:

The benchmark above tests the cold run.

The warm run is much faster, but still slower than tmpfs:

[nobodyxu@gentoo:/tmp]$ tar cf squashfuse .mount_nvim.ax0xNEd/
[nobodyxu@gentoo:/tmp]$ rm squashfuse
[nobodyxu@gentoo:/tmp]$ time tar cf squashfuse .mount_nvim.ax0xNEd/

real    0m0.035s
user    0m0.005s
sys     0m0.023s
[nobodyxu@gentoo:/tmp]$ time tar cf squashfuse .mount_nvim.ax0xNEd/

real    0m0.034s
user    0m0.012s
sys     0m0.016s

@probonopd
Member

If I understand it right, it looks like https://github.com/eth-cscs/spack-batteries-included is providing a solution for this. Should we backport these changes into the AppImage runtime?

Differences and improvements over AppImage runtime
spack.x uses zstd for faster decompression;
spack.x itself is an entirely static binary;
spack.x does not need to dlopen libfuse.so

Reference:
#1120 (comment)
cc @haampie

@AlexTMjugador

AlexTMjugador commented Mar 8, 2022

For those who, like me, are interested in running AppImages in musl containers (namely, those based on Alpine), a solution that works today is to extract the AppImage into the container filesystem while building it (for example, with a COPY instruction in the Dockerfile, after running ./Whatever.AppImage --appimage-extract).

If the AppImage was generated with a tool like appimage-builder, which bundles every dependency into the AppImage (including the glibc used by the payload), the resulting AppRun should work flawlessly on pretty much anything you throw at it.

The idea stated above is also applicable in any scenario in which it is feasible to extract the AppImage on a glibc system before running it on a possibly musl-based system.

@TheAssassin
Member

@bksubhuti FUSE is the only way to mount things without needing root on Linux. The only other options would be requiring root or fully extracting to a cache directory.

@mgord9518 I think you can do fancy stuff with cgroups. Browsers (including Electron's runtime), tools like podman and buildah and many more examples show you can perform mounts. We may be able to get rid of FUSE in the future. I'm not an expert on cgroups, though, and I'm relatively sure the solution would add some complexity over just using FUSE (where we don't even need to worry about the filesystem, as this software already exists).

I think we should open an issue for that.

@bksubhuti

What is the price to pay for a statically linked library? Originally it was 700KB. But you mentioned pulling in only the necessary items and reducing that.

Some would argue that static will not break whole systems like fuse3 does

@TheAssassin
Member

Please stop posting off-topic messages. This has been answered in detail in the discussion above.

@bksubhuti

You could just have a newer, differently named AppImage maker and not worry. Have it in beta for a year and see what happens. I have a small user base and can test.

@TheAssassin
Member

This might be a language barrier issue...

@mgord9518

@TheAssassin That (using cgroups to mount without root) sounds pretty interesting as a fallback. Assuming the implementation isn't super complex, I wonder if it would be feasible to write a wrapper around it to make the API identical to libFUSE, which could be maintained as a separate project.

I'd be happy to do some research, my current knowledge of cgroups is essentially nil but it's cool to hear there are other options

@bksubhuti

I am a native speaker of Merikan English but I mostly live abroad. I just lost track of the issue and its previous history. My question is: does the information below reflect the idea where "just pieces were pulled and compiled directly", or was the whole fuse2 library statically linked?

Here it is.

Go? how large is the final static runtime?

Depends on the architecture, around 2 MB:
https://github.com/kost/static-appimage/releases

When you run upx -9 on it, you can bring it to under 1 MB.

@bksubhuti

bksubhuti commented Jul 17, 2023

I also looked at the static release folder and noticed there are releases for each Linux OS. Is this for the end user as well, or just for the user who builds?

I'd be happy to try using it, but I don't know what to do. I'm not so familiar and intuitive with today's assumed knowledge. If you give me a command or a series of commands I can run on an Ubuntu 18.04 droplet, I'd be happy to test and also ask a few others to test my app. Our Flutter app is quite heavy and uses sqlite3 as well.

My commands to build are below:
https://github.com/bksubhuti/tipitaka-pali-reader/blob/master/serverbuildcommands.sh
Let me know what to do and I will run it.

@mgord9518

@bksubhuti the repo you listed is not the runtime being discussed here. The static runtime being talked about here is just the original runtime being linked statically instead of dynamically (at least to the best of my knowledge)
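
For anyone who wants to check which kind of binary they have: a dynamically linked ELF normally carries a PT_INTERP program header naming its loader, while a fully static one does not. Below is a minimal Python sketch of that check. The function name is hypothetical, and treating "no PT_INTERP" as "static" is a heuristic rather than a guarantee.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def requests_interpreter(data: bytes) -> bool:
    """Return True if the ELF image in `data` has a PT_INTERP program header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big
    if is_64:
        # ELF64: e_phoff at 0x20, e_phentsize/e_phnum at 0x36/0x38
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    else:
        # ELF32: e_phoff at 0x1C, e_phentsize/e_phnum at 0x2A/0x2C
        (e_phoff,) = struct.unpack_from(endian + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)
    for i in range(e_phnum):
        # p_type is the first 4-byte field of every program header entry
        (p_type,) = struct.unpack_from(endian + "I", data, e_phoff + i * e_phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

Reading a file with open(path, "rb").read() and passing the bytes in is enough for a quick check; a result of False suggests the binary does not depend on a glibc or musl loader being present on the host.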

@xplshn

xplshn commented Apr 15, 2024

Just to clarify: when you statically link something, it only pulls in the symbols relevant to the program, and sometimes the result is even smaller than with dynamically linked binaries. A dynamically linked binary makes the dynamic loader look up where the libraries are, which means:

  • Increased attack surface: I could change $LD_LIBRARY_PATH and put a malicious library of the same name and behavior there. Also, if a library such as xz-libs (which required systemd and glibc to be exploitable...) gets a bad update, all programs based on it are vulnerable until xz-libs is fixed; this is both a con and a pro of dynamic linking.
  • No versioning: when glibc updates, programs compiled against it inevitably break. Or you'd have to have a versioned glibc, which in its vanilla form takes 50MB as is... So, for each version you'd have +50MB of .so ELFs.
  • Load time: if program A depends on /usr/lib/libpipewire-0.3.so.0, and /usr/lib/libpipewire-0.3.so.0 depends on other libraries, which in turn depend on some other libraries, the chain grows until it starts to noticeably slow down startup. It also means that memory is loaded in a kind of asynchronous, and thus irregular, way (not a problem with today's tech, but still).
  • Invisible size: your binary may be 4 megs, but the libraries it's loading, and the libraries that each of those loads, could very well be bigger than 30 megs...

There are LOTS of drawbacks, but those are the most important; read this if interested: https://harmful.cat-v.org/software/dynamic-linking/

I see one possible "fix" for now: AppImages could detect missing libraries on the system prior to running, then pull those libraries from a repo containing them compiled against musl, uClibc or even glibc if need be, as long as they are made appropriately portable. Also note that with patchelf one can modify ELFs and change where they look for dependencies, or replace or remove their hard dependencies so that the dynamic loader lets you run the program even without loading such a dependency.
If a repo like that is ever created, it could be a simple GitHub Actions powered repo compiling libraries against various libcs. Flatpak resolved this issue by having different runtimes, each with glibc included (glibc is the root of all evil; that's why most runtimes weigh more than 100 megs, because they are compiled against glibc).

:)

@s09bQ5

s09bQ5 commented Apr 15, 2024

@xplshn what you are talking about sounds very off-topic. This issue is about the application that provides the filesystem for the real application inside the AppImage. The dependencies of the application inside the AppImage are beyond its scope.

There is no increased attack surface due to dynamic linking. If you can run code to change LD_LIBRARY_PATH, you don't need to use this to persuade another application to run your malicious code. You can run your malicious code directly. When an application gains privileges through setuid, LD_LIBRARY_PATH is ignored.

Glibc is the prime example of backward compatibility. It has stuck to the same soname for the past 27 years and provides multiple versions of a symbol to support applications that need the old behavior of a function.

Your points about load time and invisible size might also be false. If an application loads libpipewire from outside the AppImage, there is most likely already another application (e.g. pipewire itself) that has this library loaded, and the Linux kernel will share pages between these applications as long as they remain unchanged. The dynamic loader just has to resolve the symbols, which is usually done lazily.

@xplshn

xplshn commented Apr 15, 2024

Glibc is the prime example of backward compatibility. It has stuck to the same soname for the past 27 years and provides multiple versions of a symbol to support applications that need the old behavior of a function.

Try running something compiled in Ubuntu 10.4 in Ubuntu 22...

I cited that webpage because it would look bad if I pasted a 100-line explanation and demonstration of the pitfalls of dynamic linking here. If you are interested in learning, read this.

@xplshn

xplshn commented Apr 15, 2024

You could also read this very informative paper: http://www.nth-dimension.org.uk/pub/BTL.pdf - Title: "Breaking the links: Exploiting the linker"

@xplshn

xplshn commented Apr 15, 2024

@s09bQ5

s09bQ5 commented Apr 15, 2024

Try running something compiled in Ubuntu 10.4 in Ubuntu 22...

Tried hdparm, works.

I cited that webpage because it would look bad if I pasted a 100-line explanation and demonstration of the pitfalls of dynamic linking here. If you are interested in learning, read this.

Learning from that page? That's just a collection of anecdotes from ancient UNIX systems.

You could also read this very informative paper: http://www.nth-dimension.org.uk/pub/BTL.pdf - Title: "Breaking the links: Exploiting the linker"

How do the quirks described in there affect AppImages? The only takeaway I see is that when setting LD_LIBRARY_PATH in an AppImage, one should add : only if the variable is not empty.
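
A minimal sketch of that rule in Python, assuming a launcher that wants to put a bundled library directory first. An empty entry in LD_LIBRARY_PATH (which a stray leading or trailing ":" creates) is treated by the dynamic loader as the current working directory, which is why the separator should only be added when there is an existing value. The function name and the example path are made up for illustration.

```python
from typing import Optional


def prepend_library_path(new_dir: str, current: Optional[str]) -> str:
    """Put new_dir in front of an existing LD_LIBRARY_PATH value.

    The ":" separator is only emitted when `current` is non-empty, so the
    result never contains an empty entry.
    """
    if current:
        return f"{new_dir}:{current}"
    return new_dir


# Hypothetical usage inside a launcher script:
#   import os
#   os.environ["LD_LIBRARY_PATH"] = prepend_library_path(
#       "/tmp/.mount_example/usr/lib", os.environ.get("LD_LIBRARY_PATH"))
```

The same check applies to shell-based AppRun scripts; the point is only that an unset or empty variable must not leave a dangling ":" behind.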

https://catonmat.net/ldd-arbitrary-code-execution

This is an attack on people trying to analyze a binary with ldd, not on people who intend to run the binary.

@xplshn

xplshn commented Apr 15, 2024

How does it relate to AppImages? Binaries compiled statically are portable when packaged as an AppImage; dynamically linked ones are not, and on top of that, their pitfalls outweigh their benefits.
