Skip to content

[SYCL][clang-linker-wrapper] fix SYCL offload wrapping in dry-mode in clang-linker-wrapper. #17661

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: sycl
Choose a base branch
from

Conversation

maksimsab
Copy link
Contributor

@maksimsab maksimsab commented Mar 26, 2025

The patch allows to test SYCL Offload Wrapping using clang-linker-wrapper in dry-run mode. Previously, it was tested by running the tool in non dry-run mode.

The patch removes printing of offload-wrapper from clang-linker-wrapper since the tool prints only invocations of other tools.

…g in clang-linker-wrapper

The patch allows to test SYCL Offload Wrapping using
clang-linker-wrapper in dry-run mode. Previously, it was tested by
running the tool in non dry-run mode.

The patch unites 2 debuging prints of "offload wrapper" step in
clang-linker-wrapper in Verbose/DryRun modes.

The patch returns the test case in linker-wrapper-sycl.cpp that has been
removed in intel#17546.
@maksimsab maksimsab requested review from a team as code owners March 26, 2025 14:44
@maksimsab
Copy link
Contributor Author

@bader Please, share you feedback if you have any.

@maksimsab maksimsab changed the title [SYCL][ClangLinkerWrapper] Rework the testing of SYCL Offload Wrapping in clang-linker-wrapper [SYCL][clang-linker-wrapper] Rework the testing of SYCL Offload Wrapping in clang-linker-wrapper Mar 26, 2025
@@ -756,8 +756,12 @@ runSYCLPostLinkTool(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
if (!ImageFileOrErr)
return ImageFileOrErr.takeError();

// Arbitrary values are used for the testing of SYCL Offload Wrapping.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dry-run != testing.

I don't think it's a good idea to add random properties in dry-run mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed that. However, for full test coverage I need to leave a testing of SYCL Offload Wrapping in non-dry-mode.

@maksimsab maksimsab changed the title [SYCL][clang-linker-wrapper] Rework the testing of SYCL Offload Wrapping in clang-linker-wrapper [SYCL][clang-linker-wrapper] Fix SYCL Offload Wrapping in dry-mode in clang-linker-wrapper. Mar 28, 2025
@maksimsab maksimsab changed the title [SYCL][clang-linker-wrapper] Fix SYCL Offload Wrapping in dry-mode in clang-linker-wrapper. [SYCL][clang-linker-wrapper] fix SYCL Offload Wrapping in dry-mode in clang-linker-wrapper. Mar 28, 2025
@maksimsab maksimsab changed the title [SYCL][clang-linker-wrapper] fix SYCL Offload Wrapping in dry-mode in clang-linker-wrapper. [SYCL][clang-linker-wrapper] fix SYCL offload wrapping in dry-mode in clang-linker-wrapper. Mar 28, 2025
@@ -12,11 +12,11 @@
// RUN: touch %t.devicelib.cpp
// RUN: %clang %t.devicelib.cpp -fsycl -fsycl-targets=spir64-unknown-unknown -c --offload-new-driver -o %t.devicelib.o
//
// Run clang-linker-wrapper test
// Check SYCL Offload Wrapper in non-dry-mode since we need to cover every it's function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that offload wrapping functionality must work in dry-run mode as well.
Dry-run mode skips running 3rd party executables, but all logic implemented in the clang-linker-wrapper tool itself should be executed and therefore tested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that offload wrapping functionality must work in dry-run mode as well.

Yes and I added such testing below in this file.

SYCL Offloading contains parts like this:

addPropertySetRegistry(const PropertySetRegistry &PropRegistry) {
// transform all property sets to IR and get the middle column image into
// the PropSetsInits
SmallVector<Constant *> PropSetsInits;
for (const auto &PropSet : PropRegistry) {
// create content in the rightmost column and get begin/end pointers
std::pair<Constant *, Constant *> Props =
addPropertySetToModule(PropSet.second);
// get the next the middle column element
auto *Category = addStringToModule(PropSet.first, "SYCL_PropSetName");
PropSetsInits.push_back(ConstantStruct::get(SyclPropSetTy, Category,
Props.first, Props.second));
}
// now get content for the leftmost column - create the top-level
// PropertySetsBegin/PropertySetsBegin entries and return pointers to them
return addStructArrayToModule(PropSetsInits, SyclPropSetTy);
}

The loop in the function is being covered by test only if properties are non-empty. As you already noticed it is not a good idea to add arbitrary stub values in dry mode like I tried initially with the following code:

    // Arbitrary values are used for the testing of SYCL Offload Wrapping.
    auto Properties = util::PropertySetRegistry();
    Properties.add(util::PropertySetRegistry::SYCL_DEVICE_REQUIREMENTS, "key",
                   util::PropertyValue(uint32_t(0)));
    std::vector Modules = {module_split::SplitModule(
        *ImageFileOrErr, std::move(Properties), "entry1\nentry2")};

In order to cover the whole SYCLOffloadWrapper it leaves me with a choice between:

  1. Do the testing with both dry and non-dry modes.
  2. Do the testing only in dry mode but insert various stub values.

If I do the testing in dry mode with the default value {module_split::SplitModule(*ImageFileOrErr, util::PropertySetRegistry(), "")}; then it would lead to less code coverage.

// CHECK-DRY-NEXT: entry:
// CHECK-DRY-NEXT: call void @__sycl_unregister_lib(ptr @.sycl_offloading.descriptor)
// CHECK-DRY-NEXT: ret void
// CHECK-DRY-NEXT: }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// CHECK-DRY-NEXT: }
// CHECK-DRY-NEXT: }

Comment on lines 34 to 35
// LLVM-SPIRV is not called in dry-run
// CHK-SPLIT-CMDS-NEXT: offload-wrapper: input: [[SPIRVOUT]].spv, output: [[WRAPPEROUT:.*]].bc
// CHK-SPLIT-CMDS-NEXT: offload-wrapper: input: [[SPIRVOUT]].spv, output: [[WRAPPEROUT:.*]].bc, compile-opts: , link-opts:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that clang-linker-wrapper --dry-run prints only external commands.
Why do we print offload-wrapper which not an external command?

On the other hand. LLVM-SPIRV is not call in dry-run. Why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, the previous line checking that llvm-spirv tool is called, but the comment seems to be misleading.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that clang-linker-wrapper --dry-run prints only external commands.

That is right. 'offload-wrapper' printing was done that way to increase debuggability and transparency.
It should be adjusted to the initial approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, we have a complicated option parsing and it is not clear how to test without verbose and dry-run modes.

/// Add any sycl-post-link options that rely on a specific Triple in addition
/// to user supplied options.
/// NOTE: Any changes made here should be reflected in the similarly named
/// function in clang/lib/Driver/ToolChains/Clang.cpp.
static void
getTripleBasedSYCLPostLinkOpts(const ArgList &Args,
SmallVector<StringRef, 8> &PostLinkArgs,
const llvm::Triple Triple) {
const llvm::Triple HostTriple(Args.getLastArgValue(OPT_host_triple_EQ));
bool SYCLNativeCPU = (HostTriple == Triple);
bool SpecConstsSupported = (!Triple.isNVPTX() && !Triple.isAMDGCN() &&
!Triple.isSPIRAOT() && !SYCLNativeCPU);
if (SpecConstsSupported)
PostLinkArgs.push_back("-spec-const=native");
else
PostLinkArgs.push_back("-spec-const=emulation");
// TODO: If we ever pass -ir-output-only based on the triple,
// make sure we don't pass -properties.
PostLinkArgs.push_back("-properties");
// See if device code splitting is already requested. If not requested, then
// set -split=auto for non-FPGA targets.
bool NoSplit = true;
for (auto Arg : PostLinkArgs)
if (Arg.contains("-split=")) {
NoSplit = false;
break;
}
if (NoSplit && (Triple.getSubArch() != llvm::Triple::SPIRSubArch_fpga))
PostLinkArgs.push_back("-split=auto");
// On Intel targets we don't need non-kernel functions as entry points,
// because it only increases amount of code for device compiler to handle,
// without any actual benefits.
// TODO: Try to extend this feature for non-Intel GPUs.
if ((!Args.hasFlag(OPT_no_sycl_remove_unused_external_funcs,
OPT_sycl_remove_unused_external_funcs, false) &&
!SYCLNativeCPU) &&
!Args.hasArg(OPT_sycl_allow_device_image_dependencies) &&
!Triple.isNVPTX() && !Triple.isAMDGPU())
PostLinkArgs.push_back("-emit-only-kernels-as-entry-points");
if (!Triple.isAMDGCN())
PostLinkArgs.push_back("-emit-param-info");
// Enable program metadata
if (Triple.isNVPTX() || Triple.isAMDGCN() || SYCLNativeCPU)
PostLinkArgs.push_back("-emit-program-metadata");
bool SplitEsimdByDefault = Triple.isSPIROrSPIRV();
bool SplitEsimd =
Args.hasFlag(OPT_sycl_device_code_split_esimd,
OPT_no_sycl_device_code_split_esimd, SplitEsimdByDefault);
if (!Args.hasArg(OPT_sycl_thin_lto))
PostLinkArgs.push_back("-symbols");
// Specialization constant info generation is mandatory -
// add options unconditionally
PostLinkArgs.push_back("-emit-exported-symbols");
PostLinkArgs.push_back("-emit-imported-symbols");
if (SplitEsimd)
PostLinkArgs.push_back("-split-esimd");
PostLinkArgs.push_back("-lower-esimd");
bool IsAOT = Triple.isNVPTX() || Triple.isAMDGCN() || Triple.isSPIRAOT();
if (Args.hasFlag(OPT_sycl_add_default_spec_consts_image,
OPT_no_sycl_add_default_spec_consts_image, false) &&
IsAOT)
PostLinkArgs.push_back("-generate-device-image-default-spec-consts");
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The work done by the tool be split into two areas:

  1. Link offload code by calling external linker tool.
  2. Wrap offload code.

Technically the tool calls host linker, but it's tested the same way as (1). (1) is tested by printing external commands in dry-run mode.

(2) is tested by printing the wrapper added by the tool. There is a special option for that.

Community version of the tool does not do the linking itself. It calls an external linker, we the test checks that external linker is invoked correctly. Assuming that external linker it tested, it's good enough check.

The main problem here is that clang-linker-wrapper is not supported call sycl-post-link directly. sycl-post-link should be invoked by clang-sycl-linker.

We can add an option to print the LLVM IR module, which SYCL code produces.

Eventually, we should use dry-run mode to validate that clang-linker-wrapper calls device linker in the SYCL mode.

Copy link
Contributor Author

@maksimsab maksimsab May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sycl-post-link should be invoked by clang-sycl-linker.

We can add an option to print the LLVM IR module, which SYCL code produces.

This could be tested by inspecting the output of clang-sycl-linker.

Removed the printing of offload-wrapper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants