Feature/merge upstream 20230828 #260
Merged
Reduction is heavily used in many DL workloads, especially with softmax/attention layers. Wave/warp shuffle is known to be a fast and efficient way to do these reductions. In this patch we introduce AMD shuffle intrinsic ops to ROCDL, along with the corresponding lowering from gpu.shuffle. This should speed up many DL workloads on the ROCm backend. Currently, we support xor and idx, which are the more common modes. In the future, we plan to add support for Down and Up, and to use ds_swizzle to further improve performance when width and offsets are constant. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D158684
…mpAndBinOp` to work for more binops" (2nd Try) Was missing a nullptr check before dereferencing. Fixed + test case included in the patch. Re-Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D148414
Differential Revision: https://reviews.llvm.org/D157301
Just covering an additional case. Proof: https://alive2.llvm.org/ce/z/MJz9fT Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D157302
The Driver for NaCl already handled the header paths so disable the default fallback path.
structured-block where clause is one of the following: private(list) reduction([reduction-modifier ,] reduction-identifier : list) nowait Differential Revision: https://reviews.llvm.org/D157933
I am trying to clean up GCCInstallationDetector::init and noticed that Myriad.cpp is the only toolchain using `ExtraTripleAliases`. This is a little overhead, but I figured that Myriad.cpp is unused. Its sanitizer runtime part was removed in 2021 by D104279. It seems time to retire it. Reviewed By: waltl Differential Revision: https://reviews.llvm.org/D158706
This adds fields to AsmParserState to track attribute and type alias definitions and uses, and teaches the parser to inform the AsmParserState about them. This will be used to add LSP support for goto definition and find references for aliases. Attribute aliases are tolerant to use before def, because certain location aliases may be deferred. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D158781
/data/llvm-project/mlir/lib/AsmParser/AsmParserState.cpp:293:8: error: unused variable '[it, inserted]' [-Werror,-Wunused-variable] auto [it, inserted] = ^ 1 error generated.
D150312 added a TODO: TODO: consider renaming the field `StartAtCycle` and `Cycles` to `AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the fact that resource allocation is now represented as an interval, relatively to the issue cycle of the instruction. This patch implements that TODO. This naming clarifies how to use these fields in the scheduler. In addition, it was confusing that `StartAtCycle` was singular but `Cycles` was plural. This renaming fixes this inconsistency. This commit was previously reverted since it missed renaming that came down after rebasing. This version of the commit fixes those problems. Differential Revision: https://reviews.llvm.org/D158568
…tterns with floating points." This reverts commit 5ec1353.
Differential Revision: https://reviews.llvm.org/D158809
…`vd == v0` According to `riscv-v-spec-1.0.pdf` page 52: > masked va >= x, vd == v0 > pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t, vt > expansion: vmslt{u}.vx vt, va, x; vmandn.mm vd, vd, vt The resulting `vmslt{u}.vx` is not masked. This patch fixes the logic in `RISCVAsmParser`, to make the behavior consistent with the case "masked va >= x, any vd" in the later part of the code, where no mask op is added. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158392
Certain instrumentations set the !nosanitize metadata on inserted instructions, which are generally not of interest to sanitizers. Skip tsan instrumentation like we do for asan (D126294)/msan/hwasan. -fprofile-arcs instrumentation has a data race unless -fprofile-update=atomic is specified. Let's remove the `__llvm_gcov` special case from commit 0222adb (2016) as the racy instructions have the !nosanitize metadata. (-fprofile-arcs instrumentation does not use `__llvm_gcda` as global variables.)
```
#include <atomic>
#include <thread>

std::atomic<int> c;
// Runs on the second thread; the increment itself is atomic.
void foo() { c++; }
int main() {
  std::thread th(foo);
  c++;
  th.join();
}
```
Tested that `clang++ --coverage -fsanitize=thread a.cc && ./a.out` does not report spurious tsan errors. Also remove the default CC1 option -fprofile-update=atomic for -fsanitize=thread to make options more orthogonal. Reviewed By: Enna1 Differential Revision: https://reviews.llvm.org/D158385
- Update license, display name, repository and keywords - Add homepage, issue page - Fix formatting and typos Differential revision: https://reviews.llvm.org/D158801
  %x = shl i64 %w, n
  %y = add i64 %x, c
  %z = ashr i64 %y, m

The instruction triplet above is seen many times in generated LLVM IR, but the SCEV model is not able to compute the SCEV value of the AShr instruction in this case. This patch models the two cases of the above instruction pattern using the following expression:

  => sext(add(mul(trunc(w), 2^(n-m)), c >> m))

1) When n = m, the expression reduces to sext(add(trunc(w), c >> n)), as n-m=0 and multiplying by 2^0 gives the same result.
2) When n > m, the expression works as given above.

It also adds several unit tests to verify that SCEV is able to compute the value.

  $ opt sext-add-inreg.ll -passes="print<scalar-evolution>"

Comparing the snippets of the result of SCEV analysis:

* SCEV of ashr before change
----------------------------
  %idxprom = ashr exact i64 %sext, 32
  --> %idxprom U: [-2147483648,2147483648) S: [-2147483648,2147483648) Exits: 8 LoopDispositions: { %for.body: Variant }

* SCEV of ashr after change
---------------------------
  %idxprom = ashr exact i64 %sext, 32
  --> {0,+,1}<nuw><nsw><%for.body> U: [0,9) S: [0,9) Exits: 8 LoopDispositions: { %for.body: Computable }

The LoopDisposition of the given SCEV was LoopVariant before. After adding the new way to model the instruction, the LoopDisposition becomes LoopComputable, as SCEV is able to compute the value of the instruction.

Differential Revision: https://reviews.llvm.org/D152278
This wires in attribute and type aliases into the MLIR LSP server. This will allow goto definition and find references on attribute and type references, which should make debugging locations and other metadata easier. Depends on D158781 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D158782
In https://reviews.llvm.org/D157928 elision of printing resources was added. In the refactor, the proper printing of escape characters was mistakenly removed. This patch adds it back in and adds a small unit test. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D158700
sparse_tensor ops cannot be bufferized with One-Shot Bufferize. (They can only be analyzed.) The sparse compiler does the actual lowering to memref. Produce a proper error message instead of crashing. This fixes #61311. Differential Revision: https://reviews.llvm.org/D158728
…n a cast Fixes #64949 Reviewed By: Fznamznon, erichkeane, shafik Differential Revision: https://reviews.llvm.org/D158733
…ainst uninitialized object Before this patch, a compound assignment operator against an uninitialized object, such as uninit += 1, was diagnosed as subexpression not valid. This patch clarifies the reason for the error by saying that uninit is an uninitialized object. Fixes llvm/llvm-project#51536 Reviewed By: shafik, tbaeder Differential Revision: https://reviews.llvm.org/D157855
As a side effect, introduce AtomicExpr::getOpAsString() to dump the AtomicOp string representation. This is a recommit with the target fully specified. Differential Revision: https://reviews.llvm.org/D158558
isLegalToHoistInto() currently returns true for callbr instructions. That means that a callbr with one successor will be considered a proper loop preheader, which may result in instructions that use the callbr return value being hoisted past it. Fix this by adding callbr to isExceptionTerminator (with a rename to isSpecialTerminator), which also fixes similar assumptions in other places. Fixes llvm/llvm-project#64215. Differential Revision: https://reviews.llvm.org/D158609
…SelectICmpAndBinOp` to work for more binops" (2nd Try)" Still appears to be buggy: https://lab.llvm.org/buildbot/#/builders/124/builds/8260 This reverts commit 397a9cc.
There is no vp.fpclass after FCLASS_VL(D151176), try to support vp.fpclass. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D152993
There seems to be something target-specific in the test, but I cannot see why, so reverting. Failing buildbot: https://lab.llvm.org/buildbot/#/builders/216/builds/26256 This reverts commit 01b2554.
This test is simplified from lld/test/MachO/compact-unwind-lsda-folding.s, which tests .uleb128 A-B where A and B are in different fragments (not tested in llvm/). `.uleb128 Lfunc_end0-Ltmp1` requires evaluateKnownAbsolute in MCAssembler::relaxLEB to be foldable.
…strnlen in printf_common The return type of `internal_strlen()` is 'uptr', but in `printf_common()` we store the result of `internal_strlen()` into an 'int' type variable. When the result value of `internal_strlen()` is larger than the largest possible value of 'int' type, the implicit conversion from 'uptr' to 'int' will change the result value to a negative value. Without this change, asan reports a false positive negative-size-param in the added testcase. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D157266
…size Add a check that the DILocalVariable fragment size in dbg.declare does not exceed the size of the alloca. This would have caught the invalid debuginfo regenerated by rustc in llvm/llvm-project#64149. Differential Revision: https://reviews.llvm.org/D158743
Since td_allow_completion_event is a member of the taskdata struct, not all firstprivate/shared variables are stored at the end of the task memory allocation. Simply report the whole allocation instead. Furthermore, the function should always return 0 since in no case there is another block to report. Differential Review: https://reviews.llvm.org/D158080
Following the lead of the Linux code, this patch passes the `ld -z` options as two separate args on Solaris, improving legibility. For lack of a variadic `std::push_back`, `getAsNeededOption` had to be changed to `addAsNeededOption`, matching other `add*Options` functions, changing callers accordingly. The additional args are also used in a WIP revision of the Solaris GNU ld patch D85309 <https://reviews.llvm.org/D85309>, which will allow runtime selection of the linker to use. Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`. Differential Revision: https://reviews.llvm.org/D158955
We were treating enum constants more like regular decls, which results in ignoring type aliases/exports. This patch brings the handling to be closer to member-like decls, with one caveat. When we encounter reference to an enum constant we still report an explicit reference to the particular enum constant, as otherwise we might not see any references to the enum itself. Also drops implicit references from qualified names to containers, as we already have explicit references from the qualifier to relevant container. Differential Revision: https://reviews.llvm.org/D158515
At the moment Archer segfaults due to a null-pointer access, if an application uses taskwait with depend clause as used in the two new tests. This patch cleans up the task_schedule function, moves semantic blocks into functions and replaces the if blocks by a single switch statement. The switch statement will warn, when new enum values are added in OMPT and makes clear what code is executed for the different cases. With free-agent tasks coming up in OpenMP 6.0, we should expect more null-pointer task_data, so additional null-pointer checks were added. We also cannot rely on having an implicit task on the stack, so the BarrierIndex is stored during task creation. Differential Revision: https://reviews.llvm.org/D158072
…ragment size" This reverts commit 183f49c. The lang/cpp/trivial_abi/TestTrivialABI.py lldb test fails on buildbots.
`strtol("0b1", 0, 0)` can be (pre-C23) 0 or (C23) 1. `sscanf("0b10", "%i", &x)` is similar. glibc 2.38 introduced `__isoc23_strtol` and `__isoc23_scanf` family functions for binary compatibility. When `_ISOC2X_SOURCE` is defined (implied by `_GNU_SOURCE`) or `__STDC_VERSION__ > 201710L`, `__GLIBC_USE_ISOC2X` is defined to 1 and these `__isoc23_*` symbols are used. Add `__isoc23_` versions for the following interceptors: * sanitizer_common_interceptors.inc implements strtoimax/strtoumax. Remove incorrect FIXME about google/sanitizers#321 * asan_interceptors.cpp implements just strtol and strtoll. The default `replace_str` mode checks `nptr` is readable and `endptr` is writable. atoi reuses the existing strtol interceptor. * msan_interceptors.cpp implements strtol family functions and their `_l` versions. Tested by lib/msan/tests/msan_test.cpp * sanitizer_common_interceptors.inc implements scanf family functions. The strtol family functions are spread across several files, which is not great, but the patch (intended for release/17.x) does not attempt to address the issue. Add symbols to lib/sanitizer_common/symbolizer/scripts/global_symbols.txt to support both glibc pre-2.38 and 2.38. When build bots migrate to glibc 2.38+, we will lose test coverage for non-isoc23 versions since the existing C++ unittests imply `_GNU_SOURCE`. Add test/sanitizer_common/TestCases/{strtol.c,scanf.c}. They catch msan false positives in the absence of the interceptors. Fix llvm/llvm-project#64388 Fix llvm/llvm-project#64946 Link: https://lists.gnu.org/archive/html/info-gnu/2023-07/msg00010.html ("The GNU C Library version 2.38 is now available") Reviewed By: #sanitizers, vitalybuka, mgorny Differential Revision: https://reviews.llvm.org/D158943
…ve unused alloc-dealloc pairs Deallocation operations where the allocated value is in both the 'memref' and 'retained' lists are currently not supported. This is because when values are in the retained list, they typically have a use-site at a later point, and another deallocation op exists at that later point to free the memref. There already exists a canonicalization pattern in the buffer deallocation simplification pass that removes the allocated value from the earlier dealloc, because it will never actually be deallocated in that case and thus does not have to be considered in this new pattern. Differential Revision: https://reviews.llvm.org/D158740
…s as part of BufferDeallocationSimplification Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D158744
Extends the existing mix-in for VectorizeOp with support for the missing unit attributes. Also fixes the unintuitive implementation where `structured.VectorizeOp(target=target, vectorize_padding=False)` still resulted in the creation of the UnitAttr `vectorize_padding`. Reviewed By: ingomueller-net Differential Revision: https://reviews.llvm.org/D158726
BOLT uses `MCAsmLayout` to calculate the output values of functions and basic blocks. This means output values are calculated based on a pre-linking state and any changes to symbol values during linking will cause incorrect values to be used. This issue can be triggered by enabling linker relaxation on RISC-V. Since linker relaxation can remove instructions, symbol values may change. This causes, among other things, the symbol table created by BOLT in the output executable to be incorrect. This patch solves this issue by using `BOLTLinker` to get symbol values instead of `MCAsmLayout`. This way, output values are calculated based on a post-linking state. To make sure the linker can update all necessary symbols, this patch also makes sure all these symbols are not marked as temporary so that they end up in the object file's symbol table. Note that this patch only deals with symbols of binary functions (`BinaryFunction::updateOutputValues`). The technique described above turned out to be too expensive for basic block symbols, so those are handled differently in D155604. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D154604
This reverts commit 0e63f1a.

clang-format started to crash with contents like:

a.h:
```
```
$ clang-format a.h
```
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: ../llvm/build/bin/clang-format a.h
 #0 0x0000560b689fe177 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x0000560b689fbfbe llvm::sys::RunSignalHandlers() /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Signals.cpp:106:18
 #2 0x0000560b689feaca SignalHandler(int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x00007f030405a540 (/lib/x86_64-linux-gnu/libc.so.6+0x3c540)
 #4 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/include/clang/Lex/Token.h:98:44
 #5 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:562:51
 #6 0x0000560b68a9a980 startsSequenceInternal<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:831:9
 #7 0x0000560b68a9a980 startsSequence<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:600:12
 #8 0x0000560b68a9a980 getFunctionName /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3131:17
 #9 0x0000560b68a9a980 clang::format::TokenAnnotator::annotate(clang::format::AnnotatedLine&) /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3191:17
Segmentation fault
```
Prefer to use .empty() instead of checking for size() > 0.
D158607 switched this code to use CMAKE_INSTALL_LIBDIR, but kept the explicit LLVM_DIR_SUFFIX. However, CMAKE_INSTALL_LIBDIR already contains the suffix, so we end up installing into a path like lib6464.
…waiter is not empty The original patch is incorrect since it marks too many calls as noinline. It shows again that it is bad to do analysis in the frontend. This patch tries to mark only the await_suspend function as noinline. --- Close llvm/llvm-project#56301 Close llvm/llvm-project#64151 Close llvm/llvm-project#65018 See the summary and the discussion of https://reviews.llvm.org/D157070 to get the full context. As @rjmccall pointed out, the key point of the root cause is that we currently don't implement the semantics of '@llvm.coro.save' well ("after the await-ready returns false, the coroutine is considered to be suspended"). The semantics imply that we (the compiler) shouldn't write the spills into the coroutine frame in await_suspend. But this is currently possible due to some combinations of optimizations, so the semantics are broken. Inlining is the root of those optimizations. So in this patch, we add the `noinline` attribute to the await_suspend function. This looks slightly problematic since users are able to call the await_suspend function standalone. This is a limitation of the implementation. On the one hand, we don't want the workaround solution (see the proposed solution later) to be too complex. On the other hand, it is rare to call await_suspend standalone. It is also not semantically incorrect to do so, since inlining is not part of the C++ standard. As an optimization, we don't add the `noinline` attribute to the await_suspend function if the awaiter is an empty class. This should be correct since programmers can't access local variables in await_suspend if the awaiter is empty. I think this is necessary for performance since it is a pretty common case. The long term solution is: call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle, ptr @awaitSuspendFn) Then it is much easier to perform the safety analysis in the middle end. 
If it is safe to inline the call to awaitSuspend, we can replace it in the CoroEarly pass. Otherwise we could replace it in the CoroSplit pass. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D157833
The sync pipeline should always contain the candidate ID. If it doesn't, something's gone awry. Assert on that. Reviewed by: jrbyrnes Differential Revision: https://reviews.llvm.org/D158845
Fixes llvm/llvm-project#48974 Reviewed By: shafik Differential Revision: https://reviews.llvm.org/D158827
This improves some cases where a splat_vector uses a build_pair that can be simplified, e.g: (rotl x:i64, splat_vector (build_pair x1:i32, x2:i32)) rotl only demands the bottom 6 bits, so this patch allows it to simplify it to: (rotl x:i64, splat_vector (build_pair x1:i32, undef:i32)) Which in turn improves some cases where a splat_vector_parts is lowered on RV32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D158839
We can work out the known bits for a given lane by concatenating the known bits of each scalar operand. In the description of ISD::SPLAT_VECTOR_PARTS in ISDOpcodes.h it says that the total size of the scalar operands must cover the output element size, but I've added a stricter assertion here that the total width of the scalar operands must be exactly equal to the element size. It doesn't seem to trigger, and I'm not sure if there are any targets that use SPLAT_VECTOR_PARTS for anything other than v4i32 -> v2i64 splats. We also need to include it in isTargetCanonicalConstantNode, otherwise returning the known bits introduces an infinite combine loop. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158852
…rRange checker The checker assumed that it could safely cast an SVal to Nonloc. This surfaced because, with std::ranges, we can unintentionally match on other APIs as well, thus increasing the likelihood of violating checker assumptions about the context in which it's invoked. https://godbolt.org/z/13vEb3K76 See the discourse post on CallDescriptions and std::ranges here. https://discourse.llvm.org/t/calldescriptions-should-not-skip-the-ranges-part-in-std-names-when-matching/73076 Fixes llvm/llvm-project#65009 Differential Revision: https://reviews.llvm.org/D158968
This patch fixes a compiler crash that would happen during translation to LLVM IR if the optional `map` argument of the `omp.target` operation was not present. A unit test is added to ensure this has been fixed. Differential Revision: https://reviews.llvm.org/D158722
…merge-upstream-20230828
Merge upstream up to 2023/8/28.
This patch has passed the internal regression tests.