[Clang][Doc][NFC] Improve -offload-compress documentation and error message #17990

Merged
merged 5 commits into from
Apr 16, 2025
6 changes: 6 additions & 0 deletions buildbot/configure.py
@@ -134,6 +134,9 @@ def do_configure(args, passthrough_args):
    if args.use_lld:
        llvm_enable_lld = "ON"

    if args.use_zstd:
Contributor

Do we want to use this in ci_defaults?

Contributor Author

Sure, but let's do that change in a separate PR. I am not sure if the new Windows machine that we recently added (on VM) has zstd installed.

Contributor

@sarnex, Apr 15, 2025

I should have installed it everywhere, but yeah, a separate PR is fine.

        llvm_enable_zstd = "FORCE_ON"

    # CI Default conditionally appends to options, keep it at the bottom of
    # args handling
    if args.ci_defaults:
@@ -417,6 +420,9 @@ def main():
        "--native-cpu-libclc-targets",
        help="Target triples for libclc, used by the Native CPU backend",
    )
    parser.add_argument(
        "--use-zstd", action="store_true", help="Force zstd linkage while building."
    )
    args, passthrough_args = parser.parse_known_intermixed_args()

    print("args:{}".format(args))
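
The flag handling in this hunk can be sketched in isolation. The following is a minimal approximation, not the actual `configure.py`: the helper name `zstd_cmake_option` and the default value `"ON"` are assumptions made for illustration, while `--use-zstd`, `FORCE_ON`, and `LLVM_ENABLE_ZSTD` come from this PR.

```python
import argparse


def zstd_cmake_option(argv):
    """Return the CMake option that a --use-zstd style flag would produce.

    Hypothetical helper for illustration; the real script sets many more
    variables and builds a full CMake command line.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--use-zstd", action="store_true", help="Force zstd linkage while building."
    )
    # parse_known_args tolerates unrelated flags, which end up in the
    # passthrough list instead of causing an error.
    args, _passthrough = parser.parse_known_args(argv)

    # Assumed default: "ON" lets CMake look for zstd without failing the
    # build when it is missing.
    llvm_enable_zstd = "ON"
    if args.use_zstd:
        llvm_enable_zstd = "FORCE_ON"
    return "-DLLVM_ENABLE_ZSTD={}".format(llvm_enable_zstd)


if __name__ == "__main__":
    print(zstd_cmake_option(["--use-zstd"]))
    print(zstd_cmake_option([]))
```

With `FORCE_ON`, a missing zstd installation is expected to fail at configure time instead of silently disabling compression, which is the point of the new flag.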
10 changes: 7 additions & 3 deletions clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp
@@ -1113,9 +1113,13 @@ class BinaryWrapper {
     if (OffloadCompressDevImgs && !llvm::compression::zstd::isAvailable()) {
       return createStringError(
           inconvertibleErrorCode(),
-          "'--offload-compress' option is specified but zstd "
-          "is not available. The device image will not be "
-          "compressed.");
+          "'--offload-compress' is specified but the compiler is "
+          "built without zstd support.\n"
+          "If you are using a custom DPC++ build, please refer to "
+          "https://github.com/intel/llvm/blob/sycl/sycl/doc/"
+          "GetStartedGuide.md#build-dpc-toolchain-with-device-image-"
+          "compression-support"
+          " for more information on how to build with zstd support.");
     }

     // Don't compress if the user explicitly specifies the binary image
33 changes: 32 additions & 1 deletion sycl/doc/GetStartedGuide.md
@@ -15,6 +15,7 @@ and a wide range of compute accelerators such as GPU and FPGA.
* [Build DPC++ toolchain with support for ARM processors](#build-dpc-toolchain-with-support-for-arm-processors)
* [Build DPC++ toolchain with support for runtime kernel fusion and JIT compilation](#build-dpc-toolchain-with-support-for-runtime-kernel-fusion-and-jit-compilation)
* [Build DPC++ toolchain with a custom Unified Runtime](#build-dpc-toolchain-with-a-custom-unified-runtime)
* [Build DPC++ toolchain with device image compression support](#build-dpc-toolchain-with-device-image-compression-support)
* [Build Doxygen documentation](#build-doxygen-documentation)
* [Deployment](#deployment)
* [Use DPC++ toolchain](#use-dpc-toolchain)
@@ -47,6 +48,7 @@ and a wide range of compute accelerators such as GPU and FPGA.
| [Ninja](https://github.com/ninja-build/ninja/wiki/Pre-built-Ninja-packages) | |
| `hwloc` | >= 2.3 (Linux only, `libhwloc-dev` or `hwloc-devel`) |
| C++ compiler | [See LLVM](https://github.com/intel/llvm/blob/sycl/llvm/docs/GettingStarted.rst#host-c-toolchain-both-compiler-and-standard-library) |
|`zstd` (optional) | >= 1.4.8 (see [ZSTD](#build-dpc-toolchain-with-device-image-compression-support)) |

Alternatively, you can create a Docker image that has everything you need for
building pre-installed using the [Ubuntu 24.04 build Dockerfile](https://github.com/intel/llvm/blob/sycl/devops/containers/ubuntu2404_build.Dockerfile).
@@ -94,7 +96,8 @@ The easiest way to get started is to use the buildbot
 [compile](../../buildbot/compile.py) scripts.

 In case you want to configure CMake manually the up-to-date reference for
-variables is in these files.
+variables is in these files. Note that the CMake variables set by default by the [configure.py](../../buildbot/configure.py) script are the ones commonly used by
+DPC++ developers and might not necessarily suffice for your project-specific needs.

 **Linux**:

@@ -127,6 +130,7 @@ flags can be found by launching the script with `--help`):
* `-t` -> Build type (Debug or Release)
* `-o` -> Path to build directory
* `--cmake-gen` -> Set build system type (e.g. `--cmake-gen "Unix Makefiles"`)
* `--use-zstd` -> Force link zstd while building LLVM (see [ZSTD](#build-dpc-toolchain-with-device-image-compression-support))

You can use the following flags with `compile.py` (full list of available flags
can be found by launching the script with `--help`):
@@ -320,6 +324,33 @@ DPC++ toolchain, but add the `--disable-jit` flag.
Both kernel fusion and JIT compilation of AMD and Nvidia kernels are currently
not yet supported on the Windows platform.

### Build DPC++ toolchain with device image compression support

Device image compression compresses device code (SYCL kernels) during compilation and decompresses it on demand while the corresponding SYCL application runs.
This reduces the size of fat binaries for both Just-in-Time (JIT) and Ahead-of-Time (AOT) compilation. Refer to the [blog post](https://www.intel.com/content/www/us/en/developer/articles/technical/sycl-compilation-device-image-compression.html) for more details on this feature.

To enable device image compression, you need to build the DPC++ toolchain with the
zstd compression library. By default, zstd is optional for DPC++ builds: CMake searches for a zstd installation, and if none is found, the build does not fail
and this feature is simply disabled.

To override this behavior and force the build to use zstd, pass the `--use-zstd` flag to the `configure.py` script or add `-DLLVM_ENABLE_ZSTD=FORCE_ON` to the CMake configuration command.

#### How to obtain zstd?

The minimum zstd version that we have tested with is *1.4.8*.
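
As a quick sanity check before configuring, the installed zstd version can be compared against that minimum. The sketch below is an illustration only and not part of the DPC++ build scripts: the helper names are hypothetical, and it inspects the shared library that `ctypes` can locate, which may differ from the development package or static library that CMake picks up. It relies on zstd's `ZSTD_versionNumber()`, which packs the version as `major*10000 + minor*100 + patch`.

```python
import ctypes
import ctypes.util

MIN_TESTED = (1, 4, 8)  # minimum tested zstd version quoted in the guide


def decode_version(number):
    """Split zstd's packed version number into (major, minor, patch)."""
    return (number // 10000, (number // 100) % 100, number % 100)


def installed_zstd_version():
    """Return the loadable libzstd version as a tuple, or None if not found."""
    path = ctypes.util.find_library("zstd")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    lib.ZSTD_versionNumber.restype = ctypes.c_uint
    return decode_version(lib.ZSTD_versionNumber())


def meets_minimum(version, minimum=MIN_TESTED):
    """True if a (major, minor, patch) tuple is at least the minimum."""
    return version is not None and tuple(version) >= tuple(minimum)


if __name__ == "__main__":
    found = installed_zstd_version()
    status = "ok" if meets_minimum(found) else "missing or too old"
    print("zstd version:", found, status)
```

A passing check here does not guarantee that the CMake configuration will find a usable zstd; it only rules out the most common cause of the feature being silently disabled.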

**Linux**:

You can install zstd using the package manager of your distribution. For example, on Ubuntu, you can run:
```sh
sudo apt-get install libzstd-dev
```
Note that the libzstd-dev package provided on Ubuntu 24.04 has a [bug](https://bugs.launchpad.net/ubuntu/+source/libzstd/+bug/2086543): the zstd static library is not built with the `-fPIC` flag, so linking to it results in a build failure (for example, [Issue#15935](https://github.com/intel/llvm/issues/15935)). As an alternative, zstd can be built from source either manually or by using the [build_zstd_1_5_6_ub24.sh](https://github.com/intel/llvm/blob/sycl/devops/scripts/build_zstd_1_5_6_ub24.sh) script.

**Windows**:

For Windows, prebuilt zstd binaries can be obtained from the [facebook/zstd](https://github.com/facebook/zstd/releases/tag/v1.5.6) release page. After obtaining the zstd binaries, add the path to the zstd installation directory to the `PATH` environment variable.

### Build Doxygen documentation

Building Doxygen documentation is similar to building the product itself. First,
6 changes: 4 additions & 2 deletions sycl/doc/UsersManual.md
@@ -203,13 +203,15 @@ and not recommended to use in production environment.
 **`--offload-compress`**

 Enables device image compression for SYCL offloading. Device images
-are compressed using `zstd` compression algorithm and only if their size
+are compressed using zstd compression algorithm and only if their size
 exceeds 512 bytes.
+To use this option, DPC++ must be built with zstd support. Otherwise,
+the compiler will throw an error during compilation.
 Default value is false.

 **`--offload-compression-level=<int>`**

-`zstd` compression level used to compress device images when `--offload-
+zstd compression level used to compress device images when `--offload-
 compress` is enabled.
 The default value is 10.
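
The documented behavior of the two options can be summarized as a small decision function. This is a sketch of the rules as described in this PR, not the code in `ClangOffloadWrapper.cpp`: the function name, the exception type, and the `zstd_available` parameter are hypothetical, and the sketch ignores any other conditions the wrapper may apply.

```python
COMPRESSION_THRESHOLD_BYTES = 512  # images must exceed this size to be compressed
DEFAULT_COMPRESSION_LEVEL = 10     # documented default of --offload-compression-level


class ZstdUnavailableError(RuntimeError):
    """Hypothetical error type standing in for the compiler diagnostic."""


def plan_compression(image_size, offload_compress=False,
                     level=DEFAULT_COMPRESSION_LEVEL, zstd_available=True):
    """Return the compression level to use, or None to leave the image as-is."""
    if not offload_compress:
        # Compression is off by default.
        return None
    if not zstd_available:
        # Requesting compression without zstd support is an error, not a
        # silent fallback.
        raise ZstdUnavailableError(
            "'--offload-compress' is specified but the compiler is "
            "built without zstd support."
        )
    if image_size <= COMPRESSION_THRESHOLD_BYTES:
        # Small images are left uncompressed.
        return None
    return level
```

For example, `plan_compression(1024, offload_compress=True)` yields level 10, while a 512-byte image is left uncompressed even when the option is set.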

3 changes: 2 additions & 1 deletion sycl/test-e2e/Compression/no_zstd_warning.cpp
@@ -1,4 +1,5 @@
 // using --offload-compress without zstd should throw an error.
 // REQUIRES: !zstd
 // RUN: not %{build} %O0 -g --offload-compress %S/Inputs/single_kernel.cpp -o %t_compress.out 2>&1 | FileCheck %s
-// CHECK: '--offload-compress' option is specified but zstd is not available. The device image will not be compressed.
+// CHECK: error: '--offload-compress' is specified but the compiler is built without zstd support.
+// CHECK-NEXT: If you are using a custom DPC++ build, please refer to https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md#build-dpc-toolchain-with-device-image-compression-support for more information on how to build with zstd support.