Add support for delta encoding to patch PCKs #112011
Open
+863
−91
Resolves godotengine/godot-proposals#13404.
Supersedes #111731.
Overview
This pull request adds the ability to use delta encoding (sometimes referred to as binary patching or binary diffing) when exporting patch PCKs, a feature first introduced in #97118.
This means that only the parts of the files that have actually changed (or some close approximation of that) end up being exported, which often results in much smaller patch exports. That can benefit games that distribute patches by their own means, reducing bandwidth cost and storage usage for all parties involved.
This is especially true when dealing with changes to things like `*.translation` files, `uid_cache.bin` or `global_script_class_cache.cfg`, which often end up being included in patch exports and which together can easily make your patch several megabytes large despite very little having actually changed.
Implementation
This is (unlike #111731) achieved by utilizing Zstandard's `--patch-from` functionality (thanks to @fire for pointing me towards it), which is basically an alias for providing the old file as the dictionary to be used in its dictionary-matching stage. This results in compressed deltas that are very similar to those of other delta encoding tools like bsdiff (when also paired with Zstandard), but with much lower export-time memory usage and (at least on average) lower decompression/runtime overhead as well.
With the help of Zstandard we generate a delta between the old file (as found in the "Base Packs" entries) and the new file, which then gets exported instead of the actual file, along with a new `PACK_FILE_DELTA` PCK flag.
As a result of using compression, exporting a patch PCK with this feature enabled will take slightly longer, especially when leaving the compression level set to its default of 19. This can however mostly be resolved by just lowering the compression level, at the cost of export size. Decompression speed should largely remain the same for Zstandard across all compression levels, but you can also set a negative compression level, which will enable its "fast mode", where it sacrifices even more compression for the sake of (de)compression speed.
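The idea of using the old file as the dictionary can be sketched in a few lines. This is only an illustration, not the code from this pull request: it uses Python's built-in `zlib` and its preset-dictionary support (`zdict`) as a stand-in for Zstandard, and the helper names `make_delta` and `apply_delta` as well as the sample data are made up for this example. Note that zlib only looks back 32 KiB, so unlike Zstandard this toy version only works for small files.

```python
import zlib


def make_delta(old: bytes, new: bytes, level: int = 9) -> bytes:
    """Compress `new` with `old` preloaded as the dictionary (hypothetical helper)."""
    comp = zlib.compressobj(level=level, zdict=old)
    return comp.compress(new) + comp.flush()


def apply_delta(old: bytes, delta: bytes) -> bytes:
    """Rebuild the new file from the old file plus the delta (hypothetical helper)."""
    decomp = zlib.decompressobj(zdict=old)
    return decomp.decompress(delta) + decomp.flush()


# Made-up stand-in for a small uncompressed resource, well under zlib's 32 KiB window.
old_file = b"".join(b"key_%d=%d\n" % (i, i * 7919 % 100003) for i in range(1000))
new_file = old_file.replace(b"key_500=", b"key_500_renamed=")

delta = make_delta(old_file, new_file)
standalone = zlib.compress(new_file, 9)

print(f"new file: {len(new_file)} bytes")
print(f"compressed on its own: {len(standalone)} bytes")
print(f"compressed against the old file: {len(delta)} bytes")
```

The delta can only be decoded with the exact same old file at hand, which is why the patch has to be generated against the "Base Packs" that players actually have.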
We also allow the user to tweak the "minimum size reduction", which is a threshold for how much smaller (10% by default) the patch would need to be compared to the new file in order to be exported as a delta. The reason for this being that applying these patches comes at a slight runtime cost during resource loading, not just from decompressing the patch but also the I/O overhead from loading it in the first place, so it needs to be justified with some amount of space saving.
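As a sketch of that decision (the function name, the percentage parameter and the inclusive comparison at exactly 10% are assumptions for illustration, not taken from the pull request):

```python
def should_export_as_delta(new_size: int, delta_size: int,
                           min_reduction_percent: int = 10) -> bool:
    """Return True when the delta is at least `min_reduction_percent` smaller than the new file."""
    if new_size <= 0:
        return False
    saved = new_size - delta_size
    # Integer math avoids floating-point surprises right at the threshold.
    return saved * 100 >= new_size * min_reduction_percent


# A delta that is 75% smaller is worth the runtime cost of applying it.
print(should_export_as_delta(1_000_000, 250_000))
# A delta that is only 5% smaller is not, so the full file would be exported instead.
print(should_export_as_delta(1_000_000, 950_000))
```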
Once exported, patch PCKs (much like before) look and behave just like any other PCK, where you load them through `ProjectSettings.load_resource_pack`, ideally through some bootstrap script where nothing else has had a chance to be loaded yet.
We then store any delta patches alongside the regular files in `PackedData`, and whenever a file is loaded from `PackedData` we wrap the resulting `FileAccessPack` inside of the newly introduced `FileAccessPatched`, which will lazily apply all the patches as needed, meaning it's completely transparent and should hopefully just work with any existing file/resource loading.
Important
Note that this means you pay the cost of applying these patches every time you load a patched resource that isn't already cached by `ResourceLoader`. The patches are not applied to the original PCKs in any way, nor are the patched resources cached anywhere on disk.
Performance
Export size will mostly depend on whether the original file is compressed or not. Things like textures, or even GDScript files when using the default export mode of "Compressed binary tokens", will diff poorly and often fall below the size reduction threshold. The same mostly goes for compressed localization files as well.
Important
Disabling compression on anything you intend to patch is key to getting good results, as compression scrambles data in a way where the old and the new file will look almost nothing alike.
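The effect is easy to reproduce on a small scale. The sketch below uses Python's built-in `zlib` with a preset dictionary as a stand-in for Zstandard (an assumption made for this illustration; the sample data and the `delta_size` helper are made up as well). It diffs the same pair of files twice, once as raw data and once after each file has been compressed on its own.

```python
import zlib


def delta_size(old: bytes, new: bytes) -> int:
    """Size of `new` when compressed with `old` as the dictionary (hypothetical helper)."""
    comp = zlib.compressobj(level=9, zdict=old)
    return len(comp.compress(new) + comp.flush())


# Made-up "resource": 1000 lines of text, with every 100th line edited in the new version.
old_lines = [b"key_%d=%d" % (i, i * 7919 % 100003) for i in range(1000)]
new_lines = [line + b"_edited" if i % 100 == 0 else line
             for i, line in enumerate(old_lines)]
old_raw = b"\n".join(old_lines)
new_raw = b"\n".join(new_lines)

# The same two files, but each compressed individually before diffing.
old_packed = zlib.compress(old_raw, 9)
new_packed = zlib.compress(new_raw, 9)

raw_delta = delta_size(old_raw, new_raw)
packed_delta = delta_size(old_packed, new_packed)

print(f"delta between uncompressed files: {raw_delta} bytes (new file is {len(new_raw)} bytes)")
print(f"delta between compressed files: {packed_delta} bytes (new file is {len(new_packed)} bytes)")
```

On data like this, the delta between the uncompressed files should come out far smaller than the delta between the individually compressed ones, which typically ends up close to the size of the compressed file itself.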
In the real-world projects I've tested with, you usually see an average size reduction of around 60-90%, depending on the ratio of compressed resources, and an average runtime overhead of about 0.1 milliseconds per individual patch being applied, with patches to larger files sometimes being on the order of a few milliseconds.
DOGWALK
In the interest of showing a real-world example from an actual game, I've generated patches for the two updates of DOGWALK that have been published since its initial release. You can find the Godot project available through its Supporter Pack, with earlier versions being downloadable with the help of the Steam console and SteamDB.
These numbers are from using a compression level of 19. As mentioned already, faster (de)compression speeds can be had by using a negative compression level, at the cost of compression ratio.
(Also note that the files listed here will be slightly different from those in #111731.)
v1.0.1 to v1.0.2
The first patch contains modifications to the following file types:
- `*.tscn` files
- `.json` files
- `*.tres` files
- `*.gd` file
- `global_script_class_cache.cfg`
- `uid_cache.bin`
- `project.godot`
Out of those changes, 7 files (6 textures and 1 glTF file) fell below the reduction threshold and were not exported as a delta.
Without delta encoding the patch PCK ended up being 8.55 MiB, and with delta encoding the patch PCK ended up being 2.11 MiB, which is a reduction in size of roughly 75%.
For a breakdown of the individual files, see the output from the `--verbose` export:
v1.0.2 to v1.0.4
The second patch contains modifications to the following file types:
- `*.gd` files
- `*.tres` files
- `*.tscn` files
- `*.json` file
- `global_script_class_cache.cfg`
- `uid_cache.bin`
- `project.godot`
Out of those changes, 10 files (9 glTF files and 1 texture) fell below the reduction threshold and were not exported as a delta.
Without delta encoding the patch PCK ended up being 14.64 MiB, and with delta encoding the patch PCK ended up being 5.03 MiB, which is a reduction in size of roughly 66%.
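The quoted percentages for both patches follow directly from the PCK sizes. A quick check, with a helper name made up for this example:

```python
def reduction_percent(full_mib: float, delta_mib: float) -> float:
    """Size reduction of the delta-encoded PCK relative to the regular patch PCK."""
    return (1.0 - delta_mib / full_mib) * 100.0


# (size without delta encoding, size with delta encoding) in MiB, as reported above.
patches = {
    "v1.0.1 to v1.0.2": (8.55, 2.11),
    "v1.0.2 to v1.0.4": (14.64, 5.03),
}

for name, (full, delta) in patches.items():
    print(f"{name}: {reduction_percent(full, delta):.1f}% smaller")
```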
For a breakdown of the individual files, see the output from the `--verbose` export:
Runtime overhead
On my machine (AMD 9950X3D) the overhead from applying the above-mentioned patches averages around 66 µs, with roughly 60% of that being spent actually decoding the patches, and the rest being I/O. The median is 13 µs, with a worst case of 1858 µs. In total the two patch PCKs added 3722 µs (3.7 ms) of overhead to resource loading across 56 resources when starting a new game.
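As a sanity check, the average can be recomputed from the total. The split between decoding and I/O below simply applies the "roughly 60%" figure to that average, which is an approximation made for this example rather than a measured breakdown.

```python
total_overhead_us = 3722   # total added overhead reported above, in microseconds
patched_resources = 56     # patched resources loaded when starting a new game
decode_share = 0.60        # "roughly 60%" spent decoding the patches

average_us = total_overhead_us / patched_resources
decode_us = average_us * decode_share
io_us = average_us - decode_us

print(f"average per resource: {average_us:.1f} us")
print(f"of which decoding: ~{decode_us:.0f} us, I/O: ~{io_us:.0f} us")
```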
I've seen very similar numbers on Android (OnePlus 8T from 2020, Snapdragon 865) in another real-world project, with averages hovering around 80 µs, with roughly half of that being spent on I/O.
For a breakdown of the individual files, see the output from a `--verbose` startup of the game:
Potential improvements
- `--verbose` for logging delta encoding details during export.
- `user://` folder.