Skip to content

[mlir][openmp] - Fix crash in OpenMPIRBuilder when converting to LLVMIR #84611

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

bhandarkar-pranav
Copy link
Contributor

@bhandarkar-pranav bhandarkar-pranav commented Mar 9, 2024

This patch fixes an issue in the conversion of the omp MLIR dialect to LLVMIR using the OpenMPIRBuilder seen during offloading.
The finalize method in OpenMPIRBuilder outlines regions previoulsy marked for outlining (For example, when processing omp.task ops). To do so, it uses a datastructure that holds BasicBlocks that need to be outline.
However, if these regions belong to host code in a device module then these regions will have been removed when the omp.declare_target attribute is converted. As a result the finalize method ends up accessing bad data and crashes.

Fixes #84606

This patch fixes an issue in the conversion of the omp MLIR dialect
to LLVMIR using the OpenMPIRBuilder seen during offloading.
The finalize method in OpenMPIRBuilder outlines regions marked for
outlining. To do so, it uses a datastructure that holds blocks that
need to be outline. However, if these regions belong to host code in
a device module then these regions will have been removed when the
"omp.declare_target" attribute is converted. As a result the finalize
method ends up accessing bad data and crashes.
@llvmbot
Copy link
Member

llvmbot commented Mar 9, 2024

@llvm/pr-subscribers-flang-openmp
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-mlir-openmp

Author: Pranav Bhandarkar (bhandarkar-pranav)

Changes

This patch fixes an issue in the conversion of the omp MLIR dialect to LLVMIR using the OpenMPIRBuilder seen during offloading.
The finalize method in OpenMPIRBuilder outlines regions previoulsy marked for outlining (For example, when processing omp.task ops). To do so, it uses a datastructure that holds BasicBlocks that need to be outline.
However, if these regions belong to host code in a device module then these regions will have been removed when the omp.declare_target attribute is converted. As a result the finalize method ends up accessing bad data and crashes.

Fixes #84606


Full diff: https://github.com/llvm/llvm-project/pull/84611.diff

4 Files Affected:

  • (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+7)
  • (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+12)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+2)
  • (added) mlir/test/Target/LLVMIR/openmp-task-target-device.mlir (+27)
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index 5bbaa8c208b8cd..54908d01bc006c 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -465,6 +465,10 @@ class OpenMPIRBuilder {
 
   void setConfig(OpenMPIRBuilderConfig C) { Config = C; }
 
+  /// Remove all references or state about Func that OpenMPIRBuilder may
+  /// be keeping
+  void dropFunction(Function *Func);
+
   /// Finalize the underlying module, e.g., by outlining regions.
   /// \param Fn                    The function to be finalized. If not used,
   ///                              all functions are finalized.
@@ -1518,6 +1522,9 @@ class OpenMPIRBuilder {
   /// Add a new region that will be outlined later.
   void addOutlineInfo(OutlineInfo &&OI) { OutlineInfos.emplace_back(OI); }
 
+  /// Remove outlining information if it refers to a certain function
+  void removeFuncFromOutlineInfo(llvm::Function *Func);
+
   /// An ordered map of auto-generated variables to their unique names.
   /// It stores variables with the following names: 1) ".gomp_critical_user_" +
   /// <critical_section_name> + ".var" for "omp critical" directives; 2)
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index d65ed8c11d86cc..e634e221a96de7 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -488,6 +488,14 @@ void OpenMPIRBuilderConfig::setHasRequiresDynamicAllocators(bool Value) {
 // OpenMPIRBuilder
 //===----------------------------------------------------------------------===//
 
+void OpenMPIRBuilder::removeFuncFromOutlineInfo(llvm::Function *Func) {
+  OutlineInfos.erase(std::remove_if(OutlineInfos.begin(), OutlineInfos.end(),
+                                    [&](const OutlineInfo &OI) {
+                                      return OI.getFunction() == Func;
+                                    }),
+                     OutlineInfos.end());
+}
+
 void OpenMPIRBuilder::getKernelArgsVector(TargetKernelArgs &KernelArgs,
                                           IRBuilderBase &Builder,
                                           SmallVector<Value *> &ArgsVector) {
@@ -794,6 +802,10 @@ void OpenMPIRBuilder::finalize(Function *Fn) {
     createOffloadEntriesAndInfoMetadata(ErrorReportFn);
 }
 
+void OpenMPIRBuilder::dropFunction(Function *Func) {
+  removeFuncFromOutlineInfo(Func);
+}
+
 OpenMPIRBuilder::~OpenMPIRBuilder() {
   assert(OutlineInfos.empty() && "There must be no outstanding outlinings");
 }
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index bef227f2c583ed..ab836251e15eb0 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -2853,6 +2853,8 @@ convertDeclareTargetAttr(Operation *op, mlir::omp::DeclareTargetAttr attribute,
       if (declareType == omp::DeclareTargetDeviceType::host) {
         llvm::Function *llvmFunc =
             moduleTranslation.lookupFunction(funcOp.getName());
+
+        moduleTranslation.getOpenMPBuilder()->dropFunction(llvmFunc);
         llvmFunc->dropAllReferences();
         llvmFunc->eraseFromParent();
       }
diff --git a/mlir/test/Target/LLVMIR/openmp-task-target-device.mlir b/mlir/test/Target/LLVMIR/openmp-task-target-device.mlir
new file mode 100644
index 00000000000000..c167604bcd12c0
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-task-target-device.mlir
@@ -0,0 +1,27 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// This tests the fix for https://github.com/llvm/llvm-project/issues/84606
+// We are only interested in ensuring that the -mlir-to-llmvir pass doesn't crash.
+// CHECK: {{.*}} = add i32 {{.*}}, 5
+module attributes {omp.is_target_device = true } {
+  llvm.func @_QQmain() attributes {fir.bindc_name = "main", omp.declare_target = #omp.declaretarget<device_type = (host), capture_clause = (to)>} {
+    %0 = llvm.mlir.constant(0 : i32) : i32
+    %1 = llvm.mlir.constant(1 : i64) : i64
+    %2 = llvm.alloca %1 x i32 {bindc_name = "a"} : (i64) -> !llvm.ptr<5>
+    %3 = llvm.addrspacecast %2 : !llvm.ptr<5> to !llvm.ptr
+    omp.task {
+      llvm.store %0, %3 : i32, !llvm.ptr
+      omp.terminator
+    }
+    %4 = omp.map_info var_ptr(%3 : !llvm.ptr, i32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = "a"}
+    omp.target map_entries(%4 -> %arg0 : !llvm.ptr) {
+    ^bb0(%arg0: !llvm.ptr):
+      %5 = llvm.mlir.constant(5 : i32) : i32
+      %6 = llvm.load %arg0  : !llvm.ptr -> i32
+      %7 = llvm.add %6, %5  : i32
+      llvm.store %7, %arg0  : i32, !llvm.ptr
+      omp.terminator
+    }
+    llvm.return
+  }
+}

Copy link
Member

@skatrak skatrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this Pranav. It looks related to this downstream patch by @DominikAdamski: ROCm#40.

It's yet another manifestation of the issues we have due to having to keep host code when compiling for the device just so that we're able to lower target regions inside. When we had the early outlining pass we didn't have this problem, because only device code remained by the time we lowered to LLVM IR. Now we're lowering all of the host code that exists in functions with an omp.target inside to then remove it later, which causes problems like what you're fixing here.

I think the long term solution for this is to actually remove host code from device compilation early (re-introducing some sort of early outlining, or at least somehow removing host operations in a pass while keeping target regions untouched). It seems difficult to selectively detect and skip lowering for these operations in advance in all cases, like Dominik's PR tries to do. And fixing the IR after lowering, like this patch does, is very difficult to get right for all situations, since lowering host operations might result in the creation of some OMPIRBuilder state used after the function has been deleted and this will change over time.

So I think this PR looks good as a short term solution to the issue, but we need to figure out a better plan long term. I'd like to hear from others' opinions about this more general issue. I don't oppose merging this PR, but I'll leave the approval to someone else.

@bhandarkar-pranav
Copy link
Contributor Author

Thanks for looking into this Pranav. It looks related to this downstream patch by @DominikAdamski: ROCm#40.

It's yet another manifestation of the issues we have due to having to keep host code when compiling for the device just so that we're able to lower target regions inside. When we had the early outlining pass we didn't have this problem, because only device code remained by the time we lowered to LLVM IR. Now we're lowering all of the host code that exists in functions with an omp.target inside to then remove it later, which causes problems like what you're fixing here.

I think the long term solution for this is to actually remove host code from device compilation early (re-introducing some sort of early outlining, or at least somehow removing host operations in a pass while keeping target regions untouched). It seems difficult to selectively detect and skip lowering for these operations in advance in all cases, like Dominik's PR tries to do. And fixing the IR after lowering, like this patch does, is very difficult to get right for all situations, since lowering host operations might result in the creation of some OMPIRBuilder state used after the function has been deleted and this will change over time.

So I think this PR looks good as a short term solution to the issue, but we need to figure out a better plan long term. I'd like to hear from others' opinions about this more general issue. I don't oppose merging this PR, but I'll leave the approval to someone else.

Thanks for pointing me to ROCm#40 @skatrak
I agree this fix is very awkward and one can never be sure that all the OMPIRBuilder state has been cleaned up properly. The best is to avoid creating unnecessary state in the first place. I am curious about why early outlining was removed in the first place because on the surface early outlining makes sense to me (There must have been a good reason). In the meantime, I'll follow the discussion on @DominikAdamski's PR.

Do you suggest, I close this PR then? I need to fix this problem so I can continue working on lowering the depend clause. The lowering of the depend clause goes through a transformation (#83966) that converts

omp.target depend(...)

to

omp.task depend(...) {
 omp.target {
 }
}

So, we will end up introducing host code anyway even in a function that has only the omp.target in it

@skatrak
Copy link
Member

skatrak commented Mar 12, 2024

My preference would be to still merge it, since it does fix some problems. We just need to be aware of its limitations and start thinking of a better way to deal with the root cause.

I think @agozillon and @jsjodin are more familiar with the reasons for the early outlining to be removed, but I seem to remember that it may have been done too early, resulting in complications when dealing with map clauses. I might be misremembering, though, so they might be able to correct me on that.

@agozillon
Copy link
Contributor

My preference would be to still merge it, since it does fix some problems. We just need to be aware of its limitations and start thinking of a better way to deal with the root cause.

I think @agozillon and @jsjodin are more familiar with the reasons for the early outlining to be removed, but I seem to remember that it may have been done too early, resulting in complications when dealing with map clauses. I might be misremembering, though, so they might be able to correct me on that.

From my recollection, as the early outlining was only being applied for the device, we still had (at least one of) the issues that it was originally there to prevent on the host pass, which was optimisation passes performing "malicious" (to TargetOp, debate-ably) movements of data out of the target region that caused issues in later lowering. This may or may not be addressable by another pass that occurs later after all optimisations have been ran that will re-affirm/canonicalize the map info fixing things that have been moved in and out of the region. The IsolatedFromAbove changes that we moved to addressed this as well though, so perhaps both approaches are not mutually exclusive.

The other issue that was more of a maintainability related issue that we encountered is that when utilised as a pass at HLFIR/FIR level it's very prone to requiring changes based on alterations to the HLFIR/FIR dialect as well as the OpenMP dialect as map_info (and likely other operations dependent on information that can be tied to Operations rather than ArgBlocks, i.e. anything that relies on attributes, or things like FIR/HLFIR address_of/declare which point to things like globals, bounds data etc.) and there associated captures variables would need cloned over into the newly generated outlined function and re-linked to the TargetOp. This might be alleviated to some extent by having the pass as something that occurs at an LLVM dialect level where the operations are simpler (at least from a map type perspective where everything is a pointer), but I imagine it'd still be prone to breakage from changes to either dialect, which is perhaps just something we'd have to live with. My primary recollection is having to change this pass a lot whenever implementing something new related to map clauses, however, this was early stages and might be something that would stop being an issue when we have the appropriate level of coverage!

I could go either way with an re-introduction of the EarlyOutliningPass, I am sure it has its merits as well as it's detriments to other methods we may introduce.

I think @jsjodin had some other concerns as well that I cannot recollect off the top of my head unfortunately.

Copy link
Contributor

@DominikAdamski DominikAdamski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that an MLIR pass, which operates on LLVM/OpenMP dialects and outlines target regions, is a good idea. I opt for a pass for the LLVM dialect because this dialect will be common for both Fortran and future C/C++ MLIR code. Its main advantage will be the removal of redundant LLVM IR code generation.

Currently, we remove this redundant code in the convertDeclareTargetAttr function. We had a discussion about whether this approach is acceptable, and we decided to accept it (see https://reviews.llvm.org/D147641 ). If we remove the original host function, we should also remove the information about planned LLVM IR outlining. Having said that, I tend to accept this patch as a temporary solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you provide a more detailed test that proves no host-related LLVM-IR is present in the generated target LLVM IR?

jsjodin added a commit that referenced this pull request Apr 5, 2024
…#85239)

This patch separates the lowering dispatch for host and target devices.
For the target device, if the current operation is not a top-level
operation (e.g. omp.target) or is inside a target device code region it
will be ignored, since it belongs to the host code.


This is an alternative approach to #84611, the new test in this PR was
taken from there.
@bhandarkar-pranav
Copy link
Contributor Author

Closing this PR because an alternative, #85239, was merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[flang][openmp][mlir] - flang-new crashes when omp.task and omp.target are used in the same function
5 participants