diff --git a/rfcs/0185-redistribute-redistributable.md b/rfcs/0185-redistribute-redistributable.md new file mode 100644 index 000000000..44e70516f --- /dev/null +++ b/rfcs/0185-redistribute-redistributable.md @@ -0,0 +1,225 @@ +--- +feature: redistribute-redistributable +start-date: 2024-12-15 +author: Ekleog +co-authors: (find a buddy later to help out with the RFC) +shepherd-team: @Mic92, @roberth, @Lassulus +shepherd-leader: @Mic92 +related-issues: https://github.com/NixOS/nixpkgs/issues/83884 +--- + +# Summary +[summary]: #summary + +Make Hydra build and provide all redistributable software, while making sure installation methods stay as fully free as today. + +# Motivation +[motivation]: #motivation + +Currently, Hydra builds only free software and unfree redistributable firmware. +This means that unfree redistributable software needs to be rebuilt by all the users. +For example, using MongoDB on a Raspberry Pi 4 (aarch64, which otherwise has access to hydra's cache) takes literally days and huge amounts of swap. + +Hydra could provide builds for unfree redistributable software, at minimal added costs. +This would make life much better for users of such software. +Especially when the software is still source-available even without being free software, like MongoDB. + +# Detailed design +[design]: #detailed-design + +We will add a `runnableOnHydra` field on all licenses, that will be initially set to its `free` field, and set to `true` only for well-known licenses. + +Hydra will build all packages with licenses for which `redistributable && runnableOnHydra`. +It will still fail evaluation if the ISO image build or the Amazon AMIs were to contain any unfree software. + +This will be done by evaluating Nixpkgs twice in `release.nix`. +Once with `allowUnfree = false` like today, plus once with `allowlistedLicenses = builtins.filter (l: l.redistributable && l.runnableOnHydra) lib.licenses`. +Then, most of the jobs will be taken from the allowlisted nixpkgs, while only the builds destined for installation will be taken from the no-unfree nixpkgs. + +The list of jobs destined for installation, that cannot contain unfree software is: +- `amazonImage` +- `amazonImageAutomaticSize` +- `amazonImageZfs` +- `iso_gnome` +- `iso_minimal` +- `iso_minimal_new_kernel` +- `iso_minimal_new_kernel_no_zfs` +- `iso_plasma5` +- `iso_plasma6` +- `sd_image` +- `sd_image_new_kernel` +- `sd_image_new_kernel_no_zfs` + +This RFC offers absolutely no more guarantees than the current statu quo, as to whether proprietary packages will or not build on hydra. +In particular, proprietary packages will not necessarily be part of the Zero Hydra Failures project upon release, +though release managers could, at their own discretion, decide to include some specific proprietary packages in there. + +# Examples and Interactions +[examples-and-interactions]: #examples-and-interactions + +With these changes, here is what could happen as things currently stand, if the licenses were all to be marked `runnableOnHydra`. +This is not meant to be indicative of what should happen or not, but indicative of what could happen. +Each package's individual `license` field setup is left to its maintainers, and nixpkgs governance should conflict arise. +This RFC does not mean to indicate that it is right or wrong, and is not the right place to discuss changes to this field. +Should one have disagreements on any specific package in this list, please bring that up to that package's maintainers. + +It is also suggested in this RFC that people, upon marking licenses as `runnableOnHydra`, check all the derivations that use this license. +They could then have to mark them as either `hydraPlatforms = []`, `preferLocalBuild = true` and/or `allowSubstitutes = false`. +This might be useful for packages like TPTP: +they may not yet be marked as such due to these flags having no impact on unfree packages; +but would take gigabytes on Hydra for basically no local build time improvement + +With this in mind, Hydra could start building, among others: +- CUDA +- DragonflyDB +- MongoDB +- Nomad +- NVIDIA drivers +- Outline +- SurrealDB +- TeamSpeak +- Terraform +- Unrar +- Vagrant +- NixOS tests that involve such software (eg. MongoDB or Nomad) + +And Hydra will keep not building, among others: +- CompCert +- DataBricks +- Elasticsearch +- GeoGebra +- Widevine CDM + +# Drawbacks +[drawbacks]: #drawbacks + +## Technical drawbacks + +The main risk is that NixOS could end up including unfree software in an installation image if: +1. we forgot to add it to the list of no-allowed-unfree jobs, and +2. a maintainer did actually add unfree software to that build. + +This seems exceedingly unlikely, making this change basically risk-free. + +The only remaining drawback is that Hydra would have to evaluate Nixpkgs twice, thus adding to eval times. +However, the second eval (with no-unfree) should be reasonably small and not actually evaluate all packages, as it is only used for installation media. + +## Political drawbacks + +Whether distributing unfree software is a political drawback is left to each reader's opinion. + +Besides that, there are three main political risks. +First is, this RFC could end up completely unused. +Maybe, with proper license investigation, we will notice that none of the packages listed above can actually be redistributed by Hydra. + +The second risk is one of manpower. +We may need the Foundation's input on whether a specific license is ok to redistribute or not. +This could require some manpower from the Foundation's side. + +Finally, the third risk is one of propagation. +With both hydra and some nixos maintainers running with `allowUnfree`, there is a risk that free packages start unnecessarily depending on unfree packages. +This would then break the setup of the people not actually running with `allowUnfree`. + +This being said, all these risks are probably less impactful than the current statu quo. +Indeed, we currently have packages for Mac that are not marked with any license, because they would otherwise have to be marked unfree, +yet we do want to build and test them. +This means that we are already lying on licenses in order to get them through Hydra. +And, in particular, this means they could actually reach the machine of users without `allowUnfree`. +This situation is entirely due to the absence of this RFC, and could only be improved by it. + +# Alternatives +[alternatives]: #alternatives + +### Having Hydra actually only build FOSS derivations, not even unfree redistributable firmware + +This would likely break many installation scenarios, but would bring us to a consistent ethical standpoint, though it's not mine. + +### Keeping the status quo + +This results in very long builds for lots of software, as exhibited by the number of years people have been complaining about it. + +### Having Hydra redistribute redistributable software, without verifying installation media + +This would be slightly simpler to implement, but would not have the benefit of being 100% sure our installation media are free. + +### Having Hydra redistribute redistributable software, with a check for the installation media + +This is the current RFC. + +### Building all software, including unfree non-redistributable software + +This is quite obviously illegal, and thus not an option. + +### Not having the `runnableOnHydra` field on licenses + +This would make it impossible for Hydra to build them as things currently stand: +Hydra would then risk actually running these packages within builds for other derivations (eg. NixOS tests). + +This would thus only be compatible with changes to Hydra, that would allow to tag a package as not allowed to run, but only to redistribute. +Such a change to Hydra would most likely be pretty invasive, and is thus left as future work. + +# Prior art +[prior-art]: #prior-art + +According to [this discussion](https://github.com/NixOS/nixpkgs/issues/83433), the current status quo dates back to the 20.03 release meeting. +More than four years have passed, and it is likely worth rekindling this discussion, especially now that we actually have a Steering Committee. + +Recent exchanges have been happening in [this issue](https://github.com/NixOS/nixpkgs/issues/83884). + +# Resolved questions + +### How large are the packages Hydra would need to additionally store? + +`nix-community`'s Hydra instance can give us approximations. +Its `unfree-redist-full` channel is currently 215G large, including around 200G of NVidia kernel packages and 15G for all the rest of unfree redistributable software. +Its `cuda` channel is currently 482G large. + +Currently, NixOS' hydra pushes around 2TB per month to S3, with rebuilds taken into account. +Noteworthy is the fact that these 2TB are of compressed data. +Hence, the expected increase would not be 700G per rebuild, but something lower than this, which is hard to pre-compute. + +Regardless, Hydra should be able to deal pretty well even with a one-time 700G data dump. +The issues would come only if compression were not good, in addition to rebuilds being frequent enough to significantly increase the amount of data Hydra pushes to S3. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Is the list of installation methods correct? +I took it from my personal history as well as the NixOS website, but there may be others. +Also, I may have the wrong job name, as I tried to guess the correct job name from the various links. + +Do we need a specific `redistributableWhenPatched` field on the license? +It feels like this would be a bit too much, and probably `redistributable` would be enough. +However, we may need to have it still. + +Will we need to redistribute some derivations with `runnableOnHydra = false`? +For example, some firmware might not be legal to run on hydra. +However, Hydra will never actually try to run it, as it cannot be used at runtime to build other packages. +Maybe even `runnableOnHydra` could be better named to encompass this case too? + +# Future work +[future]: #future-work + +- **Actually tagging licenses and packages as `runnableOnHydra`.** + Without this, this RFC would have no impact. + This will be done package-by-package, and should require no RFC, unless there are significant disagreements on whether a license should be runnable on hydra or not. + +- **Monitoring Hydra to confirm it does not push too much data to S3.** + If this change causes Hydra to push an economically non-viable amount of data to S3, then we should revert the addition of `runnableOnHydra` to the relevant packages and reconsider. + +- **Culling NVidia kernels and CUDA derivations.** + We suggest not caring too much about S3 size increases in the first step, considering the numbers from the resolved questions section. + However, if compression is less efficient than could be expected, we could be required to cull old NVidia kernels and/or CUDA derivations. + This would reduce the availability of older or more niche configurations, in exchange with reducing Hydra closure size. + Or we could move them to a set in which Hydra does not recurse. + For now, this is left as future work, that should be handled close to tagging the relevant derivations as `runnableOnHydra`. + +- **Modifying Hydra to allow building and redistributing packages that it is not legally allowed to run.** + This would be a follow-up project that is definitely not covered by this RFC due to its complexity, and would require a new RFC before implementation. + +- **Validating licenses and dependencies.** + We may be interested in figuring out the aggregate license of one derivation. + This could be automatically computed by evaluating the Nix scripts. + In particular, we could have a specific `enforceFree` meta argument that'd enforce that this derivation as well as all dependencies are transitively free. + Implementing this may be doable in pure nix, or could require an additional hydra check. + This is left as future work, because even without validating licenses this RFC probably reduces the risk for FOSS users from installing proprietary software.