[SYCL] Avoid alignment on kernel pointer parameters #11979

premanandrao · 2023-11-22T16:29:06Z

When creating the kernel, we assume alignment of kernel pointer parameters based on the alignment of their pointee types. This can lead to incorrect results.

bader · 2023-11-22T17:14:40Z

clang/test/CodeGenSYCL/kernel-arg-align.cpp

@@ -0,0 +1,52 @@
+// RUN: %clang_cc1 -fsycl-is-device -O0 -internal-isystem %S/Inputs -triple spir64 -emit-llvm -o - %s | FileCheck %s


Suggested change

-// RUN: %clang_cc1 -fsycl-is-device -O0 -internal-isystem %S/Inputs -triple spir64 -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fsycl-is-device -Xclang -disable-llvm-passes -internal-isystem %S/Inputs -triple spir64 -emit-llvm -o - %s | FileCheck %s

bader · 2023-11-22T17:18:23Z

clang/lib/CodeGen/CGCall.cpp

+    //
+    // Don't do this for SYCL, as this assumption does not hold.
+    if (!getLangOpts().SYCLIsDevice && TargetDecl &&
+        TargetDecl->hasAttr<OpenCLKernelAttr>() && ParamType->isPointerType()) {


@premanandrao, do I understand correctly that alignment is not guaranteed for USM allocations only?
Is that true for L0 only, or is OpenCL impacted as well?

I'm surprised to see the deviation from OpenCL properties. It might be hard to justify upstream. If the SYCL compiler doesn't generate a genuine OpenCL kernel, can we continue using OpenCLKernelAttr, or would it be better to have a SYCL-specific attribute?

It is not guaranteed for USM allocations, but since we don't do pointer analysis, we can't deduce the alignment in general.

Good questions about L0 vs OpenCL. I agree with your suggestion that perhaps we use a different SYCL attribute if it applies to OpenCL.

I have added @GarveyJoe and @ajaykumarkannan to the PR; they had identified and requested this change. I would like to have their thoughts on this too.

According to https://registry.khronos.org/OpenCL/extensions/intel/cl_intel_unified_shared_memory.html, USM allocation alignment requirements match those of OpenCL buffers (i.e. the alignment must be a power of two and must be equal to or smaller than the size of the largest data type supported by any OpenCL device in the context), so we can re-use the OpenCL kernel logic as-is for pointers to USM allocations. We can argue about whether the OpenCL logic is correct, but I don't think it should cause a difference between OpenCL and SYCL.

I don't see similar alignment requirements for Level Zero though. The Level Zero spec only requires the alignment value to be a power of two. @bashbaug, do you know if Level Zero memory allocation functions have additional alignment guarantees like OpenCL?

@bader, the wording you're looking at regarding alignment is about the alignment of the pointer returned by the allocation functions such as clSharedMemAllocINTEL. There is no requirement that the kernel argument passed in via clSetKernelArgMemPointerINTEL has that same alignment. The only restriction on the pointer passed to clSetKernelArgMemPointerINTEL is that it is somewhere within an allocation returned by one of the allocation functions:

Otherwise, the pointer value must be NULL or must point into a Unified Shared Memory allocation returned by clHostMemAllocINTEL, clDeviceMemAllocINTEL, or clSharedMemAllocINTEL.

As a result, the following code is legal OpenCL:

// Kernel
kernel void foo(global int *a) {
  global char *b = (char *)(a);
  *b = ...;
}

// Host
char *p = (char *) clHostMemAllocINTEL(context, NULL, 2, 0, NULL);
p = &(p[1]);
clSetKernelArgMemPointerINTEL(kernel, 0, p);

And certainly in this case the kernel argument will not have alignment any higher than that of a char.

@premanandrao, since this same problem can be exposed in OpenCL, as my example demonstrates, I don't think your code should be SYCL-specific.
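The host side of the example above can be checked in plain C++. This is only a sketch: a stack buffer stands in for the USM allocation, and `is_aligned` and `offset_pointer_is_int_aligned` are hypothetical helper names, not part of any OpenCL or SYCL API.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true when `p` is aligned to `alignment` bytes.
// `alignment` is assumed to be a power of two.
inline bool is_aligned(const void *p, std::size_t alignment) {
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Mimics the host code: start from an int-aligned allocation and
// advance the pointer by one byte before handing it to the kernel.
inline bool offset_pointer_is_int_aligned() {
  // Stand-in for the memory returned by the allocation function.
  alignas(int) unsigned char storage[2 * sizeof(int)] = {};
  unsigned char *p = storage;
  p = &p[1];
  return is_aligned(p, alignof(int));
}
```

On any platform where `alignof(int)` is greater than 1, the offset pointer fails the check, so a compiler that assumes `int` alignment for the kernel parameter would be assuming something the host never promised.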

Note, there is a line in the OpenCL C spec saying:

For arguments to a __kernel function declared to be a pointer to a data type, the OpenCL compiler can assume that the pointee is always appropriately aligned as required by the data type.

A similar line also exists in the OpenCL SPIR-V environment spec:

For OpTypePointer arguments to a function, the compiler may assume that the pointer is appropriately aligned as required by the Type that the pointer points to.

The example above still might be OK, but because the int* kernel argument is not aligned to sizeof(int) == 4 bytes things could easily go wrong.

Does SYCL (or C++, generally) make similar guarantees?

do you know if Level Zero memory allocation functions have additional alignment guarantees like OpenCL?

The Level Zero memory allocation functions have a similar alignment parameter as the OpenCL allocation functions. The Level Zero spec doesn't seem to explicitly say what the behavior is when passing zero as the alignment, but I'm 99% sure it behaves the same as OpenCL, by choosing an implementation-defined alignment that is big enough for all basic data types.

After much digging through the spec, I've concluded that this optimization is actually legal in all C++ programs (and thus in SYCL as well). Without ever saying so explicitly, the language standard goes to great lengths to ensure that any pointer that doesn't have at least the alignment of the type it points to has either an undefined value or produces undefined behaviour even if the pointer is never dereferenced. However, it seems that this is not the approach that clang has taken. This LLVM mailing list discussion started by John McCall in 2016 seems to summarize clang's current position: https://groups.google.com/g/llvm-dev/c/eJRto1ipCYQ. In that thread he proposes that clang maintain a more relaxed position than the C++ standard: that it is only UB to dereference an unaligned pointer. I can't find anything formal in the clang docs to indicate that his proposal was accepted, but even present-day clang does not emit the alignment attributes that the stricter definition from the C++ standard would allow. It instead only emits alignment at access sites. At the very least it seems the clang community has tacitly accepted John's proposal.

Now we have to decide whether we want to take advantage of the stronger guarantees of the standard or follow clang's looser direction. I suspect we might get pushback from the community if we try to upstream code that takes advantage of this guarantee.
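Whichever reading applies, code that may receive an under-aligned address can sidestep the question by never forming or dereferencing a typed pointer to it. A minimal sketch, assuming the data is an `int` stored at an arbitrary byte address; `load_int_unaligned` and `store_int_unaligned` are hypothetical helper names:

```cpp
#include <cstring>

// Reads an int from an address that may not be int-aligned.
// std::memcpy places no alignment requirement on its arguments,
// so no misaligned int pointer is ever dereferenced.
inline int load_int_unaligned(const void *src) {
  int value;
  std::memcpy(&value, src, sizeof(value));
  return value;
}

// Writes an int to an address that may not be int-aligned.
inline void store_int_unaligned(void *dst, int value) {
  std::memcpy(dst, &value, sizeof(value));
}
```

This does not settle whether the compiler may annotate the parameter, it only shows an access pattern that is well defined under both the strict and the relaxed interpretation.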

The SYCL 2020 spec, in section 5.5 (Built-in scalar data types), requires scalar fundamental data types to have the same size and alignment for the host and device. The alignment annotations look correct to me as is.

I believe the stronger guarantee exists to allow for an implementation to diagnose the creation of an invalid pointer as opposed to having to wait until the pointer is dereferenced.

Following further offline discussion, I now agree with Joe that this is a good change. Without it, code might behave differently in a kernel than in another device function and that just seems weird and unnecessary. I don't think the optimization opportunity is significant.

One of the things that was helpful for me in reaching this conclusion is that alias annotations in LLVM IR are coalesced; when there are multiple relevant annotations (e.g., on a parameter with a pointer type and on a load/store that uses that pointer), code gen can use the more strict one.

I don't think the optimization opportunity is significant.

One of the things that was helpful for me in reaching this conclusion is that alias annotations in LLVM IR are coalesced; when there are multiple relevant annotations (e.g., on a parameter with a pointer type and on a load/store that uses that pointer), code gen can use the more strict one.

I strongly recommend testing this claim using available means.

@jingwan2, FYI.

Testing is always a good idea :)

The reason I think the optimization opportunity is not significant is because the alias information is (currently) lost as soon as one of these pointers is passed to another function (though subject to inlining considerations I'm sure). At any rate, it would be good to get input from someone with actual optimization experience.


tahonermann · 2023-12-22T18:46:51Z

clang/lib/CodeGen/CGCall.cpp

+    // Don't do this for SYCL, as this assumption does not hold.
+    if (!getLangOpts().SYCLIsDevice && TargetDecl &&
+        TargetDecl->hasAttr<OpenCLKernelAttr>() && ParamType->isPointerType()) {


Assuming that the OpenCLKernelAttr attribute is only used for OpenCL and SYCL, perhaps it makes sense to restrict the conditional to matching OpenCL specifically rather than any language extension other than SYCL.

Suggested change

-    // Don't do this for SYCL, as this assumption does not hold.
-    if (!getLangOpts().SYCLIsDevice && TargetDecl &&
-        TargetDecl->hasAttr<OpenCLKernelAttr>() && ParamType->isPointerType()) {
+    if (getLangOpts().OpenCL && TargetDecl &&
+        TargetDecl->hasAttr<OpenCLKernelAttr>() && ParamType->isPointerType()) {
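The difference between the two conditions can be modelled outside clang. The sketch below is an assumption-laden stand-in: `LangFlags` is a hypothetical struct carrying only the two flags named in the diff, and both predicate names are invented for illustration; none of this is the actual CGCall.cpp code.

```cpp
// Hypothetical stand-in for the two language-option flags discussed here.
struct LangFlags {
  bool OpenCL = false;
  bool SYCLIsDevice = false;
};

// Condition as written in the PR: assume pointee alignment for kernel
// pointer parameters unless compiling SYCL device code.
inline bool assumesAlignmentAsInPR(const LangFlags &L, bool hasKernelAttr,
                                   bool isPointerParam) {
  return !L.SYCLIsDevice && hasKernelAttr && isPointerParam;
}

// Condition suggested in review: assume pointee alignment only when
// compiling OpenCL.
inline bool assumesAlignmentAsSuggested(const LangFlags &L, bool hasKernelAttr,
                                        bool isPointerParam) {
  return L.OpenCL && hasKernelAttr && isPointerParam;
}
```

Under this model the two variants agree for OpenCL and for SYCL device code, and differ only for a language mode that sets neither flag yet still attaches the kernel attribute, which is the case the suggestion is meant to exclude.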

github-actions · 2024-09-13T01:57:21Z

This pull request is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days.

github-actions · 2025-03-13T02:03:06Z

This pull request is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days.

github-actions · 2025-04-12T02:03:42Z

This pull request was closed because it has been stalled for 30 days with no activity.

[SYCL] Avoid alignment on kernel pointer parameters

371a562


premanandrao requested a review from a team as a code owner November 22, 2023 16:29

bader reviewed Nov 22, 2023


premanandrao requested review from GarveyJoe and ajaykumarkannan November 22, 2023 17:24


premanandrao requested review from tiwaria1 and removed request for ajaykumarkannan November 29, 2023 17:16

GarveyJoe reviewed Dec 11, 2023


tahonermann reviewed Dec 22, 2023


github-actions bot added the Stale label Sep 13, 2024

tiwaria1 removed their request for review September 13, 2024 14:58

github-actions bot removed the Stale label Sep 14, 2024

github-actions bot added the Stale label Mar 13, 2025

github-actions bot closed this Apr 12, 2025

