Skip to content

[msan] Add off-by-default flag to fix false negatives from partially undefined constant fixed-length vectors #143837

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

thurstond
Copy link
Contributor

@thurstond thurstond commented Jun 12, 2025

This patch adds an off-by-default flag that, when enabled, fixes a false negative in MSan (partially-undefined constant fixed-length vectors). The fix can be enabled using '-mllvm -msan-poison-undef-vectors=true'. It is currently off by default since, by fixing the false positive, code/tests that previously passed MSan may start failing. The default will be changed in a future patch.

Prior to this patch, MSan computes that partially-undefined constant fixed-length vectors are fully initialized, which leads to false negatives; moreover, benign vector rewriting could theoretically flip MSan's shadow computation from initialized to uninitialized or vice-versa (*). -msan-poison-undef-vectors=true calculates the shadow precisely: for each element of the vector, the corresponding shadow is fully uninitialized if the element is undefined/poisoned, otherwise it is fully initialized.

Updates the test from #143823

(*) For example:

%x = insertelement <2 x i64> <i64 0, i64 poison>, i64 42, i64 0
%y = insertelement <2 x i64> <i64 poison, i64 poison>, i64 42, i64 0

%x and %y are equivalent but, prior to this patch, MSan incorrectly computes the shadow of %x as <0, 0> rather than <0, -1>.

…length vectors

For each element of the vector, the shadow is initialized iff the
element is not undefined/poisoned. Previously, partially undefined
constant fixed-length vectors always had fully initialized shadows,
which leads to false negatives.

Updates the tests from llvm#143823

Note: since this patch corrects a false negative, MSan can now detect more bugs (corollary: code/tests that passed MSan may start failing).
@llvmbot
Copy link
Member

llvmbot commented Jun 12, 2025

@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-llvm-transforms

Author: Thurston Dang (thurstond)

Changes

Previously, partially undefined constant fixed-length vectors always had fully initialized shadows, which leads to false negatives. This patch calculates the shadow precisely: for each element of the vector, the corresponding shadow is fully uninitialized if the element is undefined/poisoned, otherwise it is fully initialized.

Updates the test from #143823

Note: since this patch corrects a false negative, MSan can now detect more bugs (corollary: code/tests that passed MSan may start failing).


Full diff: https://github.com/llvm/llvm-project/pull/143837.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (+21-2)
  • (modified) llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll (+5-6)
diff --git a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
index d3c6a7151ec37..b345d9dc06654 100644
--- a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -1989,6 +1989,8 @@ struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
       }
       return Shadow;
     }
+    // Handle fully undefined values
+    // (partially undefined constant vectors are handled later)
     if (UndefValue *U = dyn_cast<UndefValue>(V)) {
       Value *AllOnes = (PropagateShadow && PoisonUndef) ? getPoisonedShadow(V)
                                                         : getCleanShadow(V);
@@ -2086,8 +2088,25 @@ struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
       return ShadowPtr;
     }
 
-    // TODO: Partially undefined vectors are handled by the fall-through case
-    //       below (see partial-poison.ll); this causes false negatives.
+    // Check for partially undefined constant vectors
+    // TODO: scalable vectors (this is hard because we do not have IRBuilder)
+    if (   isa<FixedVectorType>(V->getType())
+        && isa<Constant>(V)
+        && (cast<Constant>(V))->containsUndefOrPoisonElement()
+        && PropagateShadow
+        && PoisonUndef) {
+      unsigned NumElems = (cast<FixedVectorType>(V->getType()))->getNumElements();
+      SmallVector<Constant *, 32> ShadowVector(NumElems);
+      for (unsigned i = 0; i != NumElems; ++i) {
+        Constant *Elem = (cast<Constant>(V))->getAggregateElement(i);
+        ShadowVector[i] = isa<UndefValue>(Elem) ? getPoisonedShadow(Elem) : getCleanShadow(Elem);
+      }
+
+      Value *ShadowConstant = ConstantVector::get(ShadowVector);
+      LLVM_DEBUG(dbgs() << "Partial undef constant vector: " << *V << " ==> " << *ShadowConstant << "\n");
+
+      return ShadowConstant;
+    }
 
     // For everything else the shadow is zero.
     return getCleanShadow(V);
diff --git a/llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll b/llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll
index 5164441c17e10..e058086b040ad 100644
--- a/llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll
+++ b/llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll
@@ -1,8 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
 ; RUN: opt < %s -S -passes='msan' 2>&1 | FileCheck %s
 ;
-; Test case to show that MSan computes shadows for partially poisoned vectors
-; as fully initialized, resulting in false negatives.
+; Regression test case for computing shadows of partially poisoned vectors
 
 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
@@ -11,7 +10,7 @@ define <2 x i64> @left_poison(ptr %add.ptr) sanitize_memory {
 ; CHECK-LABEL: define <2 x i64> @left_poison(
 ; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0:[0-9]+]] {
 ; CHECK-NEXT:    call void @llvm.donothing()
-; CHECK-NEXT:    store <2 x i64> zeroinitializer, ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    store <2 x i64> <i64 -1, i64 0>, ptr @__msan_retval_tls, align 8
 ; CHECK-NEXT:    ret <2 x i64> <i64 poison, i64 42>
 ;
   ret <2 x i64> <i64 poison, i64 42>
@@ -21,7 +20,7 @@ define <2 x i64> @right_poison(ptr %add.ptr) sanitize_memory {
 ; CHECK-LABEL: define <2 x i64> @right_poison(
 ; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0]] {
 ; CHECK-NEXT:    call void @llvm.donothing()
-; CHECK-NEXT:    store <2 x i64> zeroinitializer, ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    store <2 x i64> <i64 0, i64 -1>, ptr @__msan_retval_tls, align 8
 ; CHECK-NEXT:    ret <2 x i64> <i64 42, i64 poison>
 ;
   ret <2 x i64> <i64 42, i64 poison>
@@ -51,7 +50,7 @@ define <2 x i64> @left_undef(ptr %add.ptr) sanitize_memory {
 ; CHECK-LABEL: define <2 x i64> @left_undef(
 ; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0]] {
 ; CHECK-NEXT:    call void @llvm.donothing()
-; CHECK-NEXT:    store <2 x i64> zeroinitializer, ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    store <2 x i64> <i64 -1, i64 0>, ptr @__msan_retval_tls, align 8
 ; CHECK-NEXT:    ret <2 x i64> <i64 undef, i64 42>
 ;
   ret <2 x i64> <i64 undef, i64 42>
@@ -61,7 +60,7 @@ define <2 x i64> @right_undef(ptr %add.ptr) sanitize_memory {
 ; CHECK-LABEL: define <2 x i64> @right_undef(
 ; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0]] {
 ; CHECK-NEXT:    call void @llvm.donothing()
-; CHECK-NEXT:    store <2 x i64> zeroinitializer, ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    store <2 x i64> <i64 0, i64 -1>, ptr @__msan_retval_tls, align 8
 ; CHECK-NEXT:    ret <2 x i64> <i64 42, i64 undef>
 ;
   ret <2 x i64> <i64 42, i64 undef>

@fmayer
Copy link
Contributor

fmayer commented Jun 12, 2025

Given the corollary, do we want to add a flag to revert back to the old behavior? That would make debugging easier.

@vitalybuka
Copy link
Collaborator

Given the corollary, do we want to add a flag to revert back to the old behavior? That would make debugging easier.

If it affects a limited set of tests, I don't think it deserve the flag, otherwise we will have to add them for each patch.

Copy link
Collaborator

@vitalybuka vitalybuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but please wait until we resolve the current regression

@fmayer
Copy link
Contributor

fmayer commented Jun 12, 2025

otherwise we will have to add them for each patch.

For this patch it is just extremely easy because it's all contained anyway. But no super strong opinion.

(cast<Constant>(V))->containsUndefOrPoisonElement() &&
PropagateShadow && PoisonUndef) {
unsigned NumElems =
(cast<FixedVectorType>(V->getType()))->getNumElements();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stray (..)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

(cast<FixedVectorType>(V->getType()))->getNumElements();
SmallVector<Constant *, 32> ShadowVector(NumElems);
for (unsigned i = 0; i != NumElems; ++i) {
Constant *Elem = (cast<Constant>(V))->getAggregateElement(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@thurstond
Copy link
Contributor Author

Given the corollary, do we want to add a flag to revert back to the old behavior? That would make debugging easier.

If it affects a limited set of tests, I don't think it deserve the flag, otherwise we will have to add them for each patch.

As @fmayer notes, it could be useful for debugging and is fairly easy to implement: 97bfb34

I'm happy to drop it if you still think it's too much bloat.

Copy link

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll

The following files introduce new uses of undef:

  • llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@thurstond thurstond changed the title [msan] Fix shadow computation for partially undefined constant fixed-length vectors [msan] Fix false negatives from partially undefined constant fixed-length vectors Jun 12, 2025
@thurstond
Copy link
Contributor Author

LGTM but please wait until we resolve the current regression

Can I land it earlier if I change the fix to be off by default (controlled by the new flag)? (And then eventually flip the flag on by default in a separate CL.)

The advantages are:

  • once the fix lands (disabled by default) in toolchains, it's easier to let owners of buggy code re-run MSan with the flag enabled, rather than telling them they need to custom patch the compiler
  • the fix will be lightly tested in CI by the new test case

@fmayer
Copy link
Contributor

fmayer commented Jun 13, 2025

LGTM but please wait until we resolve the current regression

Can I land it earlier if I change the fix to be off by default (controlled by the new flag)? (And then eventually flip the flag on by default in a separate CL.)

The advantages are:

  • once the fix lands (disabled by default) in toolchains, it's easier to let owners of buggy code re-run MSan with the flag enabled, rather than telling them they need to custom patch the compiler
  • the fix will be lightly tested in CI by the new test case

IMO, ship it

@thurstond
Copy link
Contributor Author

LGTM but please wait until we resolve the current regression

Can I land it earlier if I change the fix to be off by default (controlled by the new flag)? (And then eventually flip the flag on by default in a separate CL.)
The advantages are:

  • once the fix lands (disabled by default) in toolchains, it's easier to let owners of buggy code re-run MSan with the flag enabled, rather than telling them they need to custom patch the compiler
  • the fix will be lightly tested in CI by the new test case

IMO, ship it

To clarify, ship it off by default?

@fmayer
Copy link
Contributor

fmayer commented Jun 13, 2025

LGTM but please wait until we resolve the current regression

Can I land it earlier if I change the fix to be off by default (controlled by the new flag)? (And then eventually flip the flag on by default in a separate CL.)
The advantages are:

  • once the fix lands (disabled by default) in toolchains, it's easier to let owners of buggy code re-run MSan with the flag enabled, rather than telling them they need to custom patch the compiler
  • the fix will be lightly tested in CI by the new test case

IMO, ship it

To clarify, ship it off by default?

Yes. That should sufficiently address Vitaly's concern and doesn't hurt.

@thurstond thurstond changed the title [msan] Fix false negatives from partially undefined constant fixed-length vectors [NFCI][msan] Add off-by-default flag to fix false negatives from partially undefined constant fixed-length vectors Jun 13, 2025
@thurstond thurstond changed the title [NFCI][msan] Add off-by-default flag to fix false negatives from partially undefined constant fixed-length vectors [msan] Add off-by-default flag to fix false negatives from partially undefined constant fixed-length vectors Jun 13, 2025
@thurstond thurstond requested a review from vitalybuka June 16, 2025 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants