[SYCL] Add support for -foffload-fp32-prec-div/sqrt options. #15836

Open
wants to merge 16 commits into
base: sycl

zahiraam
Contributor

@zahiraam zahiraam commented Oct 23, 2024

Add support for the options -f[no-]offload-fp32-prec-div and -f[no-]offload-fp32-prec-sqrt.
These options let users control whether fdiv and sqrt operations in offload device code are required to return correctly rounded results. To communicate this to the device code, the front end must generate IR that reflects the choice.

When the correctly rounded setting is used, we can just generate the fdiv instruction and llvm.sqrt intrinsic, because these operations are required to be correctly rounded by default in LLVM IR.

When the result is not required to be correctly rounded, the front end should generate a call to the llvm.fpbuiltin.fdiv or llvm.fpbuiltin.sqrt intrinsic with the fpbuiltin-max-error attribute set. For single-precision fdiv, the setting should be 2.5. For single-precision sqrt, the setting should be 3.0.
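
The mapping described above can be sketched as a small standalone function. This is only an illustration, not the code in this PR: `maxErrorFor` and its boolean parameters are hypothetical names, while the values 2.5 and 3.0 are taken from the description.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: returns the value for the "fpbuiltin-max-error"
// attribute of a single-precision operation, or std::nullopt when the plain,
// correctly rounded fdiv instruction or llvm.sqrt intrinsic should be emitted.
std::optional<std::string> maxErrorFor(const std::string &Op, bool PrecDiv,
                                       bool PrecSqrt) {
  if (Op == "fdiv" && !PrecDiv)
    return "2.5"; // -fno-offload-fp32-prec-div
  if (Op == "sqrt" && !PrecSqrt)
    return "3.0"; // -fno-offload-fp32-prec-sqrt
  return std::nullopt; // correctly rounded by default
}
```

With both options left at their defaults the helper returns no value, which corresponds to emitting the ordinary instruction or intrinsic.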

If the -ffp-accuracy option is used, we should issue warnings if the settings conflict with an explicitly set -foffload-fp32-prec-div or -foffload-fp32-prec-sqrt option.

@zahiraam zahiraam changed the title Add support for -ftarget-prec-div/sqrt options. Add support for -foffload-fp32-prec-div/sqrt options. Oct 24, 2024
@zahiraam zahiraam marked this pull request as ready for review October 28, 2024 17:25
@zahiraam zahiraam requested review from a team as code owners October 28, 2024 17:25
@zahiraam zahiraam changed the title Add support for -foffload-fp32-prec-div/sqrt options. [SYCL] Add support for -foffload-fp32-prec-div/sqrt options. Oct 29, 2024
Comment on lines 1736 to 1739
if (!strcmp(A->getValue(), "fast")) {
  CmdArgs.push_back("-fno-offload-fp32-prec-div");
  CmdArgs.push_back("-fno-offload-fp32-prec-sqrt");
}
Contributor

Should we allow users to override with -foffload-fp32-prec-div|sqrt?

Suggested change
if (!strcmp(A->getValue(), "fast")) {
  CmdArgs.push_back("-fno-offload-fp32-prec-div");
  CmdArgs.push_back("-fno-offload-fp32-prec-sqrt");
}
if (!strcmp(A->getValue(), "fast")) {
  if (!Args.hasFlag(options::OPT_foffload_fp32_prec_div,
                    options::OPT_fno_offload_fp32_prec_div, false))
    CmdArgs.push_back("-fno-offload-fp32-prec-div");
  if (!Args.hasFlag(options::OPT_foffload_fp32_prec_sqrt,
                    options::OPT_fno_offload_fp32_prec_sqrt, false))
    CmdArgs.push_back("-fno-offload-fp32-prec-sqrt");
}

Contributor Author

Not sure. I would think that users could choose to compile with:
clang -fsycl -ffp-model=fast -foffload-fp32-prec-sqrt hello.cpp
or:
clang -fsycl -foffload-fp32-prec-sqrt -ffp-model=fast hello.cpp
These shouldn't give the same result. In the first one, the sqrt results are correctly rounded. In the second one, they are not required to be.

I think that just follows the last-option-wins rule, in which case we need a more complicated process here to track the order in which the options interact with one another.
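
The ordering concern can be illustrated with a standalone sketch that walks the arguments left to right so that a later option overrides an earlier one. `OffloadPrec` and `resolveOffloadPrec` are hypothetical names, and the sketch assumes (as in the diff above) that `-ffp-model=fast` turns both settings off; other `-ffp-model` values are not modeled.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the two settings discussed in this thread.
struct OffloadPrec {
  bool Div = true;  // -foffload-fp32-prec-div is the default
  bool Sqrt = true; // -foffload-fp32-prec-sqrt is the default
};

// Process the options in command-line order; the last relevant option wins.
OffloadPrec resolveOffloadPrec(const std::vector<std::string> &Args) {
  OffloadPrec P;
  for (const std::string &A : Args) {
    if (A == "-ffp-model=fast")
      P.Div = P.Sqrt = false; // assumption: fast relaxes both operations
    else if (A == "-foffload-fp32-prec-div")
      P.Div = true;
    else if (A == "-fno-offload-fp32-prec-div")
      P.Div = false;
    else if (A == "-foffload-fp32-prec-sqrt")
      P.Sqrt = true;
    else if (A == "-fno-offload-fp32-prec-sqrt")
      P.Sqrt = false;
  }
  return P;
}
```

Under these assumptions, `-ffp-model=fast -foffload-fp32-prec-sqrt` leaves sqrt correctly rounded, while the reverse order does not.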

Contributor

@mdtoguchi mdtoguchi Oct 29, 2024

Hmm... If that's the case we may want to integrate the logic into where all of the other FP model options are being manipulated in the larger for loop here:

static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
and only add the -cc1 option under the IsDeviceOffloading condition.

Contributor Author

Okay and that would work for OpenMP too!

Comment on lines 1747 to 1750
if (Args.getLastArg(options::OPT_fno_offload_fp32_prec_div))
  CmdArgs.push_back("-fno-offload-fp32-prec-div");
else
  CmdArgs.push_back("-foffload-fp32-prec-div");
Contributor

Suggested change
if (Args.getLastArg(options::OPT_fno_offload_fp32_prec_div))
  CmdArgs.push_back("-fno-offload-fp32-prec-div");
else
  CmdArgs.push_back("-foffload-fp32-prec-div");
if (!Args.hasFlag(options::OPT_foffload_fp32_prec_div,
                  options::OPT_fno_offload_fp32_prec_div, true))
  CmdArgs.push_back("-fno-offload-fp32-prec-div");

Since -foffload-fp32-prec-div is default
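
For readers unfamiliar with `hasFlag`, here is a simplified, self-contained model of the suggestion. The `hasFlag` below is a stand-in that operates on plain strings rather than the real `llvm::opt::ArgList` API, and `renderPrecDiv` is a hypothetical name; the sketch only shows why the `else` branch becomes unnecessary when the positive form is the default.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for ArgList::hasFlag: the last occurrence of Pos or Neg
// decides the result; Default is returned when neither is present.
bool hasFlag(const std::vector<std::string> &Args, const std::string &Pos,
             const std::string &Neg, bool Default) {
  bool Result = Default;
  for (const std::string &A : Args) {
    if (A == Pos)
      Result = true;
    else if (A == Neg)
      Result = false;
  }
  return Result;
}

// Forward only the non-default setting, as in the suggested change.
std::vector<std::string> renderPrecDiv(const std::vector<std::string> &Args) {
  std::vector<std::string> CmdArgs;
  if (!hasFlag(Args, "-foffload-fp32-prec-div", "-fno-offload-fp32-prec-div",
               /*Default=*/true))
    CmdArgs.push_back("-fno-offload-fp32-prec-div");
  return CmdArgs;
}
```

Because the default is `true`, nothing is forwarded unless the negative form is the last of the two on the command line.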

if (Args.getLastArg(options::OPT_fno_offload_fp32_prec_sqrt))
  CmdArgs.push_back("-fno-offload-fp32-prec-sqrt");
else
  CmdArgs.push_back("-foffload-fp32-prec-sqrt");
Contributor

similar comment to above.

Contributor

@elizabethandrews elizabethandrews left a comment

@premanandrao can you review this please?

function instead of adding a JobAction to handle it.
Comment on lines 33 to 34
OPTION(OffloadFp32PrecDiv, bool, 1, ComplexRange)
OPTION(OffloadFp32PrecSqrt, bool, 1, OffloadFp32PrecDiv)
Contributor

Suggested change
OPTION(OffloadFp32PrecDiv, bool, 1, ComplexRange)
OPTION(OffloadFp32PrecSqrt, bool, 1, OffloadFp32PrecDiv)
OPTION(OffloadFP32PrecDiv, bool, 1, ComplexRange)
OPTION(OffloadFP32PrecSqrt, bool, 1, OffloadFP32PrecDiv)

Comment on lines 375 to 376
LANGOPT(OffloadFp32PrecDiv, 1, 1, "Return correctly rounded results of fdiv")
LANGOPT(OffloadFp32PrecSqrt, 1, 1, "Return correctly rounded results of sqrt")
Contributor

Suggested change
LANGOPT(OffloadFp32PrecDiv, 1, 1, "Return correctly rounded results of fdiv")
LANGOPT(OffloadFp32PrecSqrt, 1, 1, "Return correctly rounded results of sqrt")
LANGOPT(OffloadFP32PrecDiv, 1, 1, "Return correctly rounded results of fdiv")
LANGOPT(OffloadFP32PrecSqrt, 1, 1, "Return correctly rounded results of sqrt")

@@ -1157,6 +1157,22 @@ defm cx_fortran_rules: BoolOptionWithoutMarshalling<"f", "cx-fortran-rules",
NegFlag<SetFalse, [], [ClangOption, CC1Option], "Range reduction is disabled "
"for complex arithmetic operations">>;

defm offload_fp32_prec_div: BoolOption<"f", "offload-fp32-prec-div",
LangOpts<"OffloadFp32PrecDiv">, DefaultTrue,
Contributor

Suggested change
LangOpts<"OffloadFp32PrecDiv">, DefaultTrue,
LangOpts<"OffloadFP32PrecDiv">, DefaultTrue,

Group<f_Group>;

defm offload_fp32_prec_sqrt: BoolOption<"f", "offload-fp32-prec-sqrt",
LangOpts<"OffloadFp32PrecSqrt">, DefaultTrue,
Contributor

Suggested change
LangOpts<"OffloadFp32PrecSqrt">, DefaultTrue,
LangOpts<"OffloadFP32PrecSqrt">, DefaultTrue,

Comment on lines 24204 to 24205
!LangOpts.FPAccuracyVal.empty() || !LangOpts.OffloadFp32PrecDiv ||
!LangOpts.OffloadFp32PrecSqrt) {
Contributor

Suggested change
!LangOpts.FPAccuracyVal.empty() || !LangOpts.OffloadFp32PrecDiv ||
!LangOpts.OffloadFp32PrecSqrt) {
!LangOpts.FPAccuracyVal.empty() || !LangOpts.OffloadFP32PrecDiv ||
!LangOpts.OffloadFP32PrecSqrt) {

// DEFINE: -fsycl-is-device -emit-llvm -triple spirv32-unknown-unknown

// DEFINE: %{common_opts_spirv64} = -internal-isystem %S/Inputs \
// DEFINE: -fsycl-is-device -emit-llvm -triple spirv32-unknown-unknown
Contributor

Suggested change
// DEFINE: -fsycl-is-device -emit-llvm -triple spirv32-unknown-unknown
// DEFINE: -fsycl-is-device -emit-llvm -triple spirv64-unknown-unknown

// DEFINE: -fsycl-is-device -emit-llvm -triple spirv32-unknown-unknown

// DEFINE: %{common_opts_spir} = -internal-isystem %S/Inputs \
// DEFINE: -fsycl-is-device -emit-llvm -triple spirv32-unknown-unknown
Contributor

Suggested change
// DEFINE: -fsycl-is-device -emit-llvm -triple spirv32-unknown-unknown
// DEFINE: -fsycl-is-device -emit-llvm -triple spir32-unknown-unknown

// DEFINE: -fsycl-is-device -emit-llvm -triple spirv32-unknown-unknown

// DEFINE: %{common_opts_spir64} = -internal-isystem %S/Inputs \
// DEFINE: -fsycl-is-device -emit-llvm -triple spirv32-unknown-unknown
Contributor

Suggested change
// DEFINE: -fsycl-is-device -emit-llvm -triple spirv32-unknown-unknown
// DEFINE: -fsycl-is-device -emit-llvm -triple spir64-unknown-unknown

if (Name == "fdiv" && !getLangOpts().OffloadFp32PrecDiv)
FPAccuracyVal = "2.5";
if (!FPAccuracyVal.empty())
FuncAttrs.addAttribute("fpbuiltin-max-error", FPAccuracyVal);
Contributor

How is the combination supposed to work? If the condition in 1894 was true, would two fpbuiltin-max-error attributes get added? Once in 1898 and again in 1907?

Contributor Author

If the condition in 1894 is satisfied, then FuncAttrs.size() != 0, so we will not get into this code.

Contributor

Sorry, maybe I wasn't clear before. Let me type out what I am asking:

  if (FuncAttrs.attrs().size() == 0) {
    StringRef FPAccuracyVal;
    if (!getLangOpts().FPAccuracyVal.empty()) {
      ...
      FPAccuracyVal = llvm::fp::getAccuracyForFPBuiltin(...);
      FuncAttrs.addAttribute("fpbuiltin-max-error", FPAccuracyVal);  // #Attr here 1
      ... 
    }
    if (Name == "sqrt" && !getLangOpts().OffloadFp32PrecSqrt)
      FPAccuracyVal = "3.0";
    if (Name == "fdiv" && !getLangOpts().OffloadFp32PrecDiv)
      FPAccuracyVal = "2.5";
    if (!FPAccuracyVal.empty())
      FuncAttrs.addAttribute("fpbuiltin-max-error", FPAccuracyVal);    // #Attr here 2

Couldn't you get into the size == 0 block, set FPAccuracyVal, add the attribute (#1), and if name is one of sqrt or fdiv, set FPAccuracyVal again, and then add the attribute again (#2)?

Is this combination supposed to work?

Contributor Author

Oh! good catch. I think this will fix it.
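
The actual fix is not shown in this thread. As a rough sketch of one way to avoid adding the attribute twice, the final value can be computed first and the attribute added once at the end. `AttrList` and `addMaxErrorOnce` are hypothetical names, `FPAccuracy` stands in for the value derived from -ffp-accuracy, and letting the offload options override that value simply mirrors the order in the snippet above; it is an assumption, not a confirmed design.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the function attribute builder.
using AttrList = std::vector<std::pair<std::string, std::string>>;

// Decide on the final accuracy value first, then add the attribute once.
void addMaxErrorOnce(AttrList &Attrs, const std::string &Name,
                     const std::string &FPAccuracy, bool PrecDiv,
                     bool PrecSqrt) {
  std::string Val = FPAccuracy; // empty when -ffp-accuracy is not used
  if (Name == "sqrt" && !PrecSqrt)
    Val = "3.0";
  if (Name == "fdiv" && !PrecDiv)
    Val = "2.5";
  if (!Val.empty())
    Attrs.emplace_back("fpbuiltin-max-error", Val); // single insertion point
}
```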

@@ -1781,7 +1781,6 @@ void Clang::RenderTargetOptions(const llvm::Triple &EffectiveTriple,
switch (TC.getArch()) {
default:
break;

Contributor

Inadvertent line removal?

@@ -55,6 +55,9 @@ class LLVM_LIBRARY_VISIBILITY Clang : public Tool {
const llvm::opt::ArgList &Args,
llvm::opt::ArgStringList &CmdArgs,
bool KernelOrKext) const;
void AddSPIRTargetArgs(const llvm::opt::ArgList &Args,
llvm::opt::ArgStringList &CmdArgs, const JobAction &JA,
const Driver &D) const;
Contributor

Changes not needed anymore?

Contributor

@MrSidims MrSidims left a comment

Not sure if it should go into this PR (I guess it should). The feature uses the SPV_INTEL_fp_max_error SPIR-V extension, which is currently enabled only for AOT CPU compilation. So we should now also enable it when these options are passed.

@zahiraam
Contributor Author

zahiraam commented Nov 4, 2024

@MrSidims Are you saying these new options should only be enabled with SPV_INTEL_fp_max_error? (I am not familiar with this extension.) I see that it is used only for CPU compilation: https://github.com/intel/llvm/blob/sycl/clang/lib/Driver/ToolChains/Clang.cpp#L10742.
The options implemented in this patch are for device compilation only.

@zahiraam
Contributor Author

zahiraam commented Nov 4, 2024

Thanks @mdtoguchi for the explanation. I will add the extension to this PR.

@MrSidims
Contributor

MrSidims commented Nov 4, 2024

@zahiraam I believe https://github.com/intel/llvm/blob/sycl/clang/lib/Driver/ToolChains/Clang.cpp#L10742 should be changed to something like:
if (IsCPU || non-precise-div-opt-enabled || non-precise-sqrt-opt-enabled)
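
The pseudo-condition above could be spelled out roughly as follows. This is a sketch only: `needsFPMaxError` and `appendFPMaxError` are hypothetical names, and the boolean parameters stand in for whatever the driver uses to query the options.

```cpp
#include <string>

// Hypothetical predicate: enable SPV_INTEL_fp_max_error for AOT CPU
// compilation, or whenever either option relaxes the rounding requirement.
bool needsFPMaxError(bool IsCPU, bool PrecDiv, bool PrecSqrt) {
  return IsCPU || !PrecDiv || !PrecSqrt;
}

// Append the extension to an existing -spirv-ext list when it is needed.
std::string appendFPMaxError(std::string ExtArg, bool IsCPU, bool PrecDiv,
                             bool PrecSqrt) {
  if (needsFPMaxError(IsCPU, PrecDiv, PrecSqrt))
    ExtArg += ",+SPV_INTEL_fp_max_error";
  return ExtArg;
}
```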

// RUN: %clang -target x86_64-unknown-linux-gnu -fsycl --no-offload-new-driver -fsycl-targets=spir64_x86_64-unknown-unknown -fno-offload-fp32-prec-div -fno-offload-fp32-prec-sqrt %s -### 2>&1 \
// RUN: | FileCheck %s -check-prefixes=CHECK-CPU
// RUN: %clang -target x86_64-unknown-linux-gnu -fsycl --no-offload-new-driver -fsycl-targets=spir64_x86_64-unknown-unknown -foffload-fp32-prec-sqrt %s -### 2>&1 \
// RUN: | FileCheck %s -check-prefixes=CHECK-CPU-NFPME
Contributor

To reduce the duplication of all of the extensions, can you do something like -check-prefixes=CHECK-CPU,CHECK-CPU-NFPME?
Here, CHECK-CPU does not check for fp_max_error; its absence is instead checked with CHECK-CPU-NFPME-NOT.

clang/test/Driver/sycl-spirv-ext-old-model.c
@@ -129,3 +186,110 @@
// CHECK-CPU-SAME:,+SPV_KHR_non_semantic_info
// CHECK-CPU-SAME:,+SPV_KHR_cooperative_matrix
// CHECK-CPU-SAME:,+SPV_INTEL_fp_max_error"
Contributor

Suggested change
// CHECK-CPU-SAME:,+SPV_INTEL_fp_max_error"
// CHECK-CPU-FPME:,+SPV_INTEL_fp_max_error"

To match the suggested change with -check-prefixes above.

Comment on lines 190 to 226
// CHECK-CPU-NFPME: llvm-spirv{{.*}}"-spirv-allow-unknown-intrinsics=llvm.genx.,llvm.fpbuiltin"
// CHECK-CPU-NFPME-SAME: {{.*}}"-spirv-ext=-all
// CHECK-CPU-NFPME-SAME:,+SPV_EXT_shader_atomic_float_add
// CHECK-CPU-NFPME-SAME:,+SPV_EXT_shader_atomic_float_min_max
// CHECK-CPU-NFPME-SAME:,+SPV_KHR_no_integer_wrap_decoration,+SPV_KHR_float_controls
// CHECK-CPU-NFPME-SAME:,+SPV_KHR_expect_assume,+SPV_KHR_linkonce_odr
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_subgroups,+SPV_INTEL_media_block_io
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_device_side_avc_motion_estimation
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_fpga_loop_controls
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_unstructured_loop_controls,+SPV_INTEL_fpga_reg
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_blocking_pipes,+SPV_INTEL_function_pointers
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_kernel_attributes,+SPV_INTEL_io_pipes
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_inline_assembly,+SPV_INTEL_arbitrary_precision_integers
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_float_controls2
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_vector_compute,+SPV_INTEL_fast_composite
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_arbitrary_precision_fixed_point
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_arbitrary_precision_floating_point
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_variable_length_array,+SPV_INTEL_fp_fast_math_mode
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_long_constant_composite
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_arithmetic_fence
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_cache_controls
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_fpga_buffer_location
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_fpga_argument_interfaces
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_fpga_invocation_pipelining_attributes
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_fpga_latency_control
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_task_sequence
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_token_type
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_bfloat16_conversion
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_joint_matrix
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_hw_thread_queries
// CHECK-CPU-NFPME-SAME:,+SPV_KHR_uniform_group_instructions
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_masked_gather_scatter
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_tensor_float32_conversion
// CHECK-CPU-NFPME-SAME:,+SPV_INTEL_optnone
// CHECK-CPU-NFPME-SAME:,+SPV_KHR_non_semantic_info
// CHECK-CPU-NFPME-SAME:,+SPV_KHR_cooperative_matrix
// CHECK-CPU-NFPME-NOT:,+SPV_INTEL_fp_max_error"
Contributor

Suggested change
// CHECK-CPU-NFPME-NOT:,+SPV_INTEL_fp_max_error"

To match up with the -check-prefixes suggestion above.

// CHECK-FPGA-HW-FPME-SAME:,+SPV_INTEL_fpga_dsp_control
// CHECK-FPGA-HW-FPME-SAME:,+SPV_INTEL_fpga_memory_accesses
// CHECK-FPGA-HW-FPME-SAME:,+SPV_INTEL_fpga_memory_attributes
// CHECK-FPGA-HW-FPME-SAME:,+SPV_INTEL_fp_max_error"
Contributor

Similar changes here as suggested above to reduce string redundancy.

Comment on lines 10696 to 10697
if (IsCPU && hasNoOffloadFP32PrecOption(TCArgs) ||
!IsCPU && shouldUseOffloadFP32PrecOption(TCArgs)) {
Contributor

Suggested change
if (IsCPU && hasNoOffloadFP32PrecOption(TCArgs) ||
!IsCPU && shouldUseOffloadFP32PrecOption(TCArgs)) {
if ((IsCPU && hasNoOffloadFP32PrecOption(TCArgs)) ||
shouldUseOffloadFP32PrecOption(TCArgs)) {

I believe the option settings should always take effect, regardless of whether we are doing AOT for CPU.

// DEFINE: -fsycl-is-device -emit-llvm -triple spirv-unknown-unknown

// DEFINE: %{common_opts_spir64} = -internal-isystem %S/Inputs \
// DEFINE: -fsycl-is-device -emit-llvm -triple spirv64-unknown-unknown
Contributor

common_opts_spir64 seems identical to common_opts_spirv64.

Contributor Author

Fixed now.

Contributor

@MrSidims MrSidims left a comment

SPV_INTEL_fp_max_error related changes LGTM

}
};

auto ParseFPAccOption = [&](StringRef Val, bool &NoOffloadFlag) {
Contributor

Suggested change
auto ParseFPAccOption = [&](StringRef Val, bool &NoOffloadFlag) {
auto parseFPAccOption = [&](StringRef Val, bool &NoOffloadFlag) {

Function naming should start with lowercase.

5 participants