Feature/merge upstream 20230828 #260

Merged
merged 1,028 commits into from
Aug 29, 2023
@kaz7 kaz7 commented Aug 29, 2023

Merge upstream up to 2023/8/28.

This patch has passed the internal regression tests.

raikonenfnu and others added 30 commits August 24, 2023 17:35
Reduction is heavily used in many DL workloads, especially with
softmax/Attention layers. Wave/Warp shuffle and reduction is known to be
a speedy/efficient way to do these reductions.

In this patch we introduce AMD shuffle intrinsic Ops to ROCDL, along with its corresponding lowering from gpu.shuffle. This should speed up a lot of DL workloads on the ROCm backend. Currently, we have support for xor and idx, which are the more common ones. In the future, we plan on adding support for Down and Up, as well as using ds_swizzle to further enhance its performance when width and offsets are constant.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D158684
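To illustrate why xor-mode shuffles suit reductions, here is a minimal sketch in plain Python that simulates a butterfly reduction across the lanes of a wave. It does not use the ROCDL ops or `gpu.shuffle`; `shuffle_xor` and `butterfly_reduce` are hypothetical helper names, and the lane count is assumed to be a power of two.

```python
def shuffle_xor(lanes, offset):
    """Simulate an xor-mode shuffle: lane i reads the value held by lane i ^ offset."""
    return [lanes[i ^ offset] for i in range(len(lanes))]


def butterfly_reduce(lanes):
    """Sum-reduce across all lanes in log2(width) shuffle steps.

    Assumes len(lanes) is a nonzero power of two, so i ^ offset is always a valid lane.
    """
    width = len(lanes)
    if width == 0 or width & (width - 1) != 0:
        raise ValueError("lane count must be a nonzero power of two")
    values = list(lanes)
    offset = width // 2
    while offset > 0:
        # Each lane adds the value of its partner lane, then the distance halves.
        partner = shuffle_xor(values, offset)
        values = [a + b for a, b in zip(values, partner)]
        offset //= 2
    return values  # every lane now holds the full sum
```

With eight lanes the sum is available in every lane after three shuffle steps, which is the property that makes this pattern attractive for softmax-style reductions.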
…mpAndBinOp` to work for more binops" (2nd Try)

Was missing a nullptr check before dereferencing. Fixed + test case
included in the patch.

Re-Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148414
The Driver for NaCl already handled the header paths so disable
the default fallback path.
structured-block

where clause is one of the following:

private(list)
reduction([reduction-modifier ,] reduction-identifier : list)
nowait

Differential Revision: https://reviews.llvm.org/D157933
I am trying to clean up GCCInstallationDetector::init and noticed that
Myriad.cpp is the only toolchain using `ExtraTripleAliases`. This is a
little overhead, but I figured that Myriad.cpp is unused.
Its sanitizer runtime part was removed in 2021 by D104279. It seems time
to retire it.

Reviewed By: waltl

Differential Revision: https://reviews.llvm.org/D158706
This adds fields to AsmParserState to track attribute and type alias
definitions and uses and teaches the parser to inform the
AsmParserState about them. This will be used to add LSP support for goto
definition and find references for aliases.

Attribute aliases are tolerant to use before def, because certain
location aliases may be deferred.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D158781
/data/llvm-project/mlir/lib/AsmParser/AsmParserState.cpp:293:8: error: unused variable '[it, inserted]' [-Werror,-Wunused-variable]
  auto [it, inserted] =
       ^
1 error generated.
D150312 added a TODO:

TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the
fact that resource allocation is now represented as an interval,
relatively to the issue cycle of the instruction.

This patch implements that TODO. This naming clarifies how to use these
fields in the scheduler. In addition it was confusing that `StartAtCycle` was
singular but `Cycles` was plural. This renaming fixes this inconsistency.

This commit was previously reverted since it missed renaming that came
down after rebasing. This version of the commit fixes those problems.

Differential Revision: https://reviews.llvm.org/D158568
…tterns with floating points."

This reverts commit 5ec1353.
…`vd == v0`

According to `riscv-v-spec-1.0.pdf` page 52:

> masked va >= x, vd == v0
>   pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t, vt
>   expansion: vmslt{u}.vx vt, va, x; vmandn.mm vd, vd, vt

The resulting `vmslt{u}.vx` is not masked. This patch fixes the logic in `RISCVAsmParser`, to make the behavior consistent with the case "masked va >= x, any vd" in the later part of the code, where no mask op is added.
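The expansion quoted above can be checked with a small model. This is only a sketch in plain Python over lists of mask bits, not assembler code; the helper names are hypothetical, and `vmandn_mm(vs2, vs1)` is modeled as `vs2 AND NOT vs1` per the vector spec.

```python
def vmslt_vx(va, x):
    """Unmasked compare: mask bit is 1 where va[i] < x (signed)."""
    return [1 if a < x else 0 for a in va]


def vmandn_mm(vs2, vs1):
    """Mask and-not: result[i] = vs2[i] & ~vs1[i]."""
    return [b2 & (1 - b1) for b2, b1 in zip(vs2, vs1)]


def masked_vmsge_vd_is_v0(va, x, v0):
    """Model of the expansion for 'masked va >= x, vd == v0'.

    vt is a scratch mask register; the compare itself carries no mask operand.
    """
    vt = vmslt_vx(va, x)
    return vmandn_mm(v0, vt)  # vd (== v0) keeps only active lanes where va >= x
```

For `va = [1, 5, 7, 3]`, `x = 4` and `v0 = [1, 1, 0, 1]`, only lane 1 is both active and satisfies `va >= x`, so the result is `[0, 1, 0, 0]`.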

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D158392
Certain instrumentations set the !nosanitize metadata for inserted
instructions, which are generally not of interest to sanitizers. Skip
tsan instrumentation like we do for asan (D126294)/msan/hwasan.

-fprofile-arcs instrumentation has data races unless
-fprofile-update=atomic is specified. Let's remove the `__llvm_gcov`
special case from commit 0222adb (2016)
as the racy instructions have the !nosanitize metadata.
(-fprofile-arcs instrumentation does not use `__llvm_gcda` as global variables.)

```
#include <atomic>
#include <thread>

std::atomic<int> c;
void foo() { c++; }
int main() {
  std::thread th(foo);
  c++;
  th.join();
}
```
Tested that `clang++ --coverage -fsanitize=thread a.cc && ./a.out` does
not report spurious tsan errors.

Also remove the default CC1 option -fprofile-update=atomic for
-fsanitize=thread to make options more orthogonal.

Reviewed By: Enna1

Differential Revision: https://reviews.llvm.org/D158385
 - Update license, display name, repository and keywords
 - Add homepage, issue page
 - Fix formatting and typos

Differential revision: https://reviews.llvm.org/D158801
%x = shl i64 %w, n
%y = add i64 %x, c
%z = ashr i64 %y, m

The instruction triplet above is seen many times in the generated
LLVM IR, but the SCEV model is not able to compute the SCEV value of the AShr
instruction in this case.

This patch models the two cases of the above instruction pattern using
the following expression:

=> sext(add(mul(trunc(w), 2^(n-m)), c >> m))

1) when n = m the expression reduces to sext(add(trunc(w), c >> n))
as n-m=0, and multiplying with 2^0 gives the same result.

2) when n > m the expression works as given above.

It also adds several unit tests to verify that SCEV is able to compute
the value.

$ opt sext-add-inreg.ll -passes="print<scalar-evolution>"

Comparing the snippets of the result of SCEV analysis:

* SCEV of ashr before change
----------------------------
%idxprom = ashr exact i64 %sext, 32
  -->  %idxprom U: [-2147483648,2147483648) S: [-2147483648,2147483648)
  Exits: 8                LoopDispositions: { %for.body: Variant }

* SCEV of ashr after change
---------------------------
%idxprom = ashr exact i64 %sext, 32
  -->  {0,+,1}<nuw><nsw><%for.body> U: [0,9) S: [0,9)
  Exits: 8                LoopDispositions: { %for.body: Computable }

Before the change, the LoopDisposition of the given SCEV was LoopVariant.
With the new way to model the instruction, the LoopDisposition becomes
LoopComputable, as SCEV is able to compute the expression for the instruction.

Differential Revision: https://reviews.llvm.org/D152278
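The rewrite of the shl/add/ashr triplet can be sanity-checked numerically. The sketch below is plain Python, not SCEV code, and rests on assumptions that the commit message does not spell out: the truncated type is taken to be `64 - m` bits wide, and the low `m` bits of `c` are assumed to be zero so that `c >> m` is exact. All function names are hypothetical.

```python
import random

BITS = 64
MASK = (1 << BITS) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def shl_add_ashr(w, n, c, m):
    """Evaluate %z = ashr (add (shl %w, n), c), m on i64 values."""
    x = (w << n) & MASK
    y = (x + c) & MASK
    return to_signed(y, BITS) >> m  # >> on a negative Python int is arithmetic


def modeled(w, n, c, m):
    """Evaluate sext(add(mul(trunc(w), 2^(n-m)), c >> m)) with a (64 - m)-bit trunc."""
    narrow = BITS - m
    narrow_mask = (1 << narrow) - 1
    truncated = w & narrow_mask
    product = (truncated * (1 << (n - m))) & narrow_mask
    total = (product + (c >> m)) & narrow_mask
    return to_signed(total, narrow)


def check(trials=10000, seed=0):
    """Compare both forms on random inputs with n >= m and c a multiple of 2^m."""
    rng = random.Random(seed)
    for _ in range(trials):
        m = rng.randint(1, 63)
        n = rng.randint(m, 63)
        w = rng.getrandbits(BITS)
        c = (rng.getrandbits(BITS) >> m) << m
        if shl_add_ashr(w, n, c, m) != modeled(w, n, c, m):
            return False
    return True
```

The case `n = m = 32` with `w = 5` and `c = 1 << 32` corresponds to the first case in the list above: both forms evaluate to 6.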
This wires in attribute and type aliases into the MLIR LSP server. This
will allow goto definition and find references on attribute and type
references, which should make debugging locations and other metadata
easier.

Depends on D158781

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D158782
In https://reviews.llvm.org/D157928 elision of printing resources was added.
In the refactor, the proper printing of escape characters was mistakenly removed.
This patch adds it back in and adds a small unit test.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D158700
sparse_tensor ops cannot be bufferized with One-Shot Bufferize. (They can only be analyzed.) The sparse compiler does the actual lowering to memref. Produce a proper error message instead of crashing.

This fixes #61311.

Differential Revision: https://reviews.llvm.org/D158728
…n a cast

Fixes #64949

Reviewed By: Fznamznon, erichkeane, shafik

Differential Revision: https://reviews.llvm.org/D158733
…ainst uninitialized object

Before this patch, a compound assignment operator against an uninitialized object, such as uninit += 1, was diagnosed as subexpression not valid.
This patch clarifies the reason for the error by saying that uninit is an uninitialized object.

Fixes llvm/llvm-project#51536

Reviewed By: shafik, tbaeder
Differential Revision: https://reviews.llvm.org/D157855
As a side effect, introduce AtomicExpr::getOpAsString() to dump the
AtomicOp string representation.

This is a recommit with the target fully specified.

Differential Revision: https://reviews.llvm.org/D158558
isLegalToHoistInto() currently returns true for callbr instructions.
That means that a callbr with one successor will be considered a
proper loop preheader, which may result in instructions that use
the callbr return value being hoisted past it.

Fix this by adding callbr to isExceptionTerminator (with a rename
to isSpecialTerminator), which also fixes similar assumptions in
other places.

Fixes llvm/llvm-project#64215.

Differential Revision: https://reviews.llvm.org/D158609
…SelectICmpAndBinOp` to work for more binops" (2nd Try)"

Still appears to be buggy:
https://lab.llvm.org/buildbot/#/builders/124/builds/8260

This reverts commit 397a9cc.
There is no vp.fpclass after FCLASS_VL(D151176), try to support vp.fpclass.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D152993
There seems to be something target-specific in the test, but I cannot
figure out why, so reverting.

Failing buildbot: https://lab.llvm.org/buildbot/#/builders/216/builds/26256

This reverts commit 01b2554.
khei4 and others added 28 commits August 28, 2023 14:44
This test is simplified from
lld/test/MachO/compact-unwind-lsda-folding.s, which tests .uleb128 A-B
where A and B are in different fragments (not tested in llvm/).

`.uleb128 Lfunc_end0-Ltmp1` requires evaluateKnownAbsolute in
MCAssembler::relaxLEB to be foldable.
…strnlen in printf_common

The return type of `internal_strlen()` is 'uptr', but in `printf_common()` we store the result of `internal_strlen()` into an 'int' type variable.
When the result value of `internal_strlen()` is larger than the largest possible value of 'int' type, the implicit conversion from 'uptr' to 'int' will change the result value to a negative value.

Without this change, asan reports a false positive negative-size-param in the added testcase.
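The conversion problem described above can be illustrated with a short model. This is a sketch in plain Python rather than sanitizer code; it assumes a 32-bit `int` and a 64-bit `uptr`, and `to_int32` is a hypothetical helper.

```python
INT_MAX = 2**31 - 1


def to_int32(value):
    """Model storing an unsigned value into a 32-bit signed int (two's complement wrap)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# A length just above INT_MAX becomes negative once stored in an int.
length_as_uptr = INT_MAX + 1
length_as_int = to_int32(length_as_uptr)
```

Here `length_as_int` is `-2147483648`, which is the kind of value that would later be reported as a negative size parameter.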

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D157266
…size

Add a check that the DILocalVariable fragment size in dbg.declare
does not exceed the size of the alloca.

This would have caught the invalid debuginfo regenerated by rustc
in llvm/llvm-project#64149.

Differential Revision: https://reviews.llvm.org/D158743
Since td_allow_completion_event is a member of the taskdata struct, not all
firstprivate/shared variables are stored at the end of the task memory
allocation. Simply report the whole allocation instead.

Furthermore, the function should always return 0 since there is never
another block to report.

Differential Review: https://reviews.llvm.org/D158080
Following the lead of the Linux code, this patch passes the `ld -z` options
as two separate args on Solaris, improving legibility.  For lack of a
variadic `std::push_back`, `getAsNeededOption` had to be changed to
`addAsNeededOption`, matching other `add*Options` functions, changing
callers accordingly.  The additional args are also used in a WIP revision
of the Solaris GNU ld patch D85309 <https://reviews.llvm.org/D85309>, which
will allow runtime selection of the linker to use.

Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D158955
We were treating enum constants more like regular decls, which results
in ignoring type aliases/exports.
This patch brings the handling closer to that of member-like decls, with
one caveat. When we encounter a reference to an enum constant we still
report an explicit reference to the particular enum constant, as
otherwise we might not see any references to the enum itself.

Also drops implicit references from qualified names to containers, as
we already have explicit references from the qualifier to relevant
container.

Differential Revision: https://reviews.llvm.org/D158515
At the moment Archer segfaults due to a null-pointer access, if an application
uses taskwait with depend clause as used in the two new tests.
This patch cleans up the task_schedule function, moves semantic blocks into
functions and replaces the if blocks by a single switch statement. The switch
statement will warn when new enum values are added in OMPT and makes clear
what code is executed for the different cases.

With free-agent tasks coming up in OpenMP 6.0, we should expect more
null-pointer task_data, so additional null-pointer checks were added.
We also cannot rely on having an implicit task on the stack, so the
BarrierIndex is stored during task creation.

Differential Revision: https://reviews.llvm.org/D158072
…ragment size"

This reverts commit 183f49c.

The lang/cpp/trivial_abi/TestTrivialABI.py lldb test fails on
buildbots.
`strtol("0b1", 0, 0)` can be (pre-C23) 0 or (C23) 1.
`sscanf("0b10", "%i", &x)` is similar. glibc 2.38 introduced
`__isoc23_strtol` and `__isoc23_scanf` family functions for binary
compatibility.

When `_ISOC2X_SOURCE` is defined (implied by `_GNU_SOURCE`) or
`__STDC_VERSION__ > 201710L`, `__GLIBC_USE_ISOC2X` is defined to 1 and
these `__isoc23_*` symbols are used.
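The difference between the two behaviours can be modeled as follows. This is a deliberately simplified sketch in plain Python, not glibc's implementation: it covers only base 0 auto-detection, ignores whitespace, signs and overflow, and `strtol_auto_base` is a hypothetical name. It returns the parsed value together with the number of characters consumed.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def strtol_auto_base(text, c23):
    """Parse text as strtol(text, &end, 0) would, returning (value, chars_consumed).

    c23=True accepts the "0b"/"0B" binary prefix; c23=False models pre-C23 rules.
    """
    lower = text.lower()
    base, pos = 10, 0
    if lower.startswith("0x") and len(lower) > 2 and lower[2] in DIGITS[:16]:
        base, pos = 16, 2
    elif c23 and lower.startswith("0b") and len(lower) > 2 and lower[2] in "01":
        base, pos = 2, 2
    elif lower.startswith("0"):
        base, pos = 8, 0  # a leading 0 selects octal; "0" itself parses as zero
    value = 0
    while pos < len(lower) and lower[pos] in DIGITS[:base]:
        value = value * base + DIGITS.index(lower[pos])
        pos += 1
    return value, pos
```

Under the pre-C23 rules `"0b1"` stops at the `b` and yields 0 after one character, while the C23 rules consume all three characters and yield 1, which is why the two symbol families need separate interceptors.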

Add `__isoc23_` versions for the following interceptors:

* sanitizer_common_interceptors.inc implements strtoimax/strtoumax.
  Remove incorrect FIXME about google/sanitizers#321
* asan_interceptors.cpp implements just strtol and strtoll. The default
  `replace_str` mode checks `nptr` is readable and `endptr` is writable.
  atoi reuses the existing strtol interceptor.
* msan_interceptors.cpp implements strtol family functions and their
  `_l` versions. Tested by lib/msan/tests/msan_test.cpp
* sanitizer_common_interceptors.inc implements scanf family functions.

The strtol family functions are spread out, which is not great, but the
patch (intended for release/17.x) does not attempt to address the issue.

Add symbols to lib/sanitizer_common/symbolizer/scripts/global_symbols.txt to
support both glibc pre-2.38 and 2.38.

When build bots migrate to glibc 2.38+, we will lose test coverage for
non-isoc23 versions since the existing C++ unittests imply `_GNU_SOURCE`.
Add test/sanitizer_common/TestCases/{strtol.c,scanf.c}.
They catch msan false positive in the absence of the interceptors.

Fix llvm/llvm-project#64388
Fix llvm/llvm-project#64946

Link: https://lists.gnu.org/archive/html/info-gnu/2023-07/msg00010.html
("The GNU C Library version 2.38 is now available")

Reviewed By: #sanitizers, vitalybuka, mgorny

Differential Revision: https://reviews.llvm.org/D158943
…ve unused alloc-dealloc pairs

Deallocation operations where the allocated value is the 'memref' and
'retained' list are currently not supported. This is because when values
are in the retained list, they typically have a use-site at a later
point and another deallocation op exists at that later point to free the
memref then. There already exists a canonicalization pattern in the
buffer deallocation simplification pass that removes the allocated value
from the earlier dealloc because it will never be actually deallocated
in that case and thus does not have to be considered in this new
pattern.

Differential Revision: https://reviews.llvm.org/D158740
…s as part of BufferDeallocationSimplification

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D158744
Extends the existing mix-in for VectorizeOp with support for the missing unit attributes.

Also fixes the unintuitive implementation where
`structured.VectorizeOp(target=target, vectorize_padding=False)` still resulted in the creation of the UnitAttr `vectorize_padding`.

Reviewed By: ingomueller-net

Differential Revision: https://reviews.llvm.org/D158726
BOLT uses `MCAsmLayout` to calculate the output values of functions and
basic blocks. This means output values are calculated based on a
pre-linking state and any changes to symbol values during linking will
cause incorrect values to be used.

This issue can be triggered by enabling linker relaxation on RISC-V.
Since linker relaxation can remove instructions, symbol values may
change. This causes, among other things, the symbol table created by
BOLT in the output executable to be incorrect.

This patch solves this issue by using `BOLTLinker` to get symbol values
instead of `MCAsmLayout`. This way, output values are calculated based
on a post-linking state. To make sure the linker can update all
necessary symbols, this patch also makes sure all these symbols are not
marked as temporary so that they end-up in the object file's symbol
table.

Note that this patch only deals with symbols of binary functions
(`BinaryFunction::updateOutputValues`). The technique described above
turned out to be too expensive for basic block symbols so those are
handled differently in D155604.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D154604
This reverts commit 0e63f1a.

clang-format started to crash with contents like:
a.h:
```
```
$ clang-format a.h
```
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ../llvm/build/bin/clang-format a.h
 #0 0x0000560b689fe177 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x0000560b689fbfbe llvm::sys::RunSignalHandlers() /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Signals.cpp:106:18
 #2 0x0000560b689feaca SignalHandler(int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x00007f030405a540 (/lib/x86_64-linux-gnu/libc.so.6+0x3c540)
 #4 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/include/clang/Lex/Token.h:98:44
 #5 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:562:51
 #6 0x0000560b68a9a980 startsSequenceInternal<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:831:9
 #7 0x0000560b68a9a980 startsSequence<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:600:12
 #8 0x0000560b68a9a980 getFunctionName /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3131:17
 #9 0x0000560b68a9a980 clang::format::TokenAnnotator::annotate(clang::format::AnnotatedLine&) /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3191:17
Segmentation fault
```
Prefer to use .empty() instead of checking for size() > 0.
D158607 switched this code to use CMAKE_INSTALL_LIBDIR, but kept
the explicit LLVM_DIR_SUFFIX. However, CMAKE_INSTALL_LIBDIR already
contains the suffix, so we end up installing into a path like
lib6464.
…waiter is not empty

The original patch is incorrect since it marks too many calls to be
noinline. It shows that it is bad to do analysis in the frontend again.
This patch tries to mark the await_suspend function as noinline only.

---

Close llvm/llvm-project#56301
Close llvm/llvm-project#64151
Close llvm/llvm-project#65018

See the summary and the discussion of
https://reviews.llvm.org/D157070
to get the full context.

As @rjmccall pointed out, the key point of the root cause is that
currently we don't implement the semantics of '@llvm.coro.save'
("after the await-ready returns false, the coroutine is considered
to be suspended") well.
The semantics imply that we (the compiler) shouldn't write
the spills into the coroutine frame in await_suspend. But this is currently
possible due to some combinations of optimizations, so the semantics are
broken. Inlining is the root optimization that enables these combinations.
So in this patch, we add the `noinline` attribute to the
await_suspend function.

This looks slightly problematic since the users are able to call the
await_suspend function standalone. This is limited by the
implementation. On the one hand, we don't want the workaround solution
(See the proposed solution later) to be too complex. On the other hand,
it is rare to call await_suspend standalone. Also it is not semantically
incorrect to do so since the inlining is not part of the C++ standard.

Also as an optimization, we don't add the `noinline` attribute to
the await_suspend function if the awaiter is an empty class. This should be
correct since the programmers can't access the local variables in
await_suspend if the awaiter is empty. I think this is necessary for
the performance since it is pretty common.

The long term solution is:

    call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle,
                                  ptr @awaitSuspendFn)

Then it is much easier to perform the safety analysis in the middle
end. If it is safe to inline the call to awaitSuspend, we can replace it
in the CoroEarly pass. Otherwise we could replace it in the CoroSplit
pass.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D157833
The sync pipeline should always contain the candidate ID. If it doesn't,
something's gone awry. Assert on that.

Reviewed by: jrbyrnes

Differential Revision: https://reviews.llvm.org/D158845
This improves some cases where a splat_vector uses a build_pair that can be
simplified, e.g:

(rotl x:i64, splat_vector (build_pair x1:i32, x2:i32))

rotl only demands the bottom 6 bits, so this patch allows it to simplify it to:

(rotl x:i64, splat_vector (build_pair x1:i32, undef:i32))

Which in turn improves some cases where a splat_vector_parts is lowered on
RV32.
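The demanded-bits argument can be seen in a small model. This is a plain Python sketch, not SelectionDAG code; `rotl64` and `build_pair` are hypothetical helpers that mimic a 64-bit rotate-left and the joining of two 32-bit halves.

```python
MASK64 = (1 << 64) - 1


def build_pair(lo, hi):
    """Join two 32-bit halves into one 64-bit value, low half first."""
    return ((hi & 0xFFFFFFFF) << 32) | (lo & 0xFFFFFFFF)


def rotl64(x, amount):
    """Rotate a 64-bit value left; only the bottom 6 bits of amount matter."""
    amount &= 63
    x &= MASK64
    return ((x << amount) | (x >> (64 - amount))) & MASK64


x = 0x0123456789ABCDEF
# The high half of the rotate amount never influences the result.
same = rotl64(x, build_pair(5, 0xDEADBEEF)) == rotl64(x, build_pair(5, 0))
```

Because `same` holds for any high half, that operand of the `build_pair` can be replaced by `undef` without changing the rotate.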

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D158839
We can work out the known bits for a given lane by concatenating the known bits of each scalar operand.

In the description of ISD::SPLAT_VECTOR_PARTS in ISDOpcodes.h it says that the
total size of the scalar operands must cover the output element size, but I've
added a stricter assertion here that the total width of the scalar operands
must be exactly equal to the element size. It doesn't seem to trigger, and I'm
not sure if there are any targets that use SPLAT_VECTOR_PARTS for anything other
than v4i32 -> v2i64 splats.

We also need to include it in isTargetCanonicalConstantNode, otherwise
returning the known bits introduces an infinite combine loop.
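As a rough illustration of concatenating known bits, the sketch below represents known bits as a pair of masks (bits known to be zero, bits known to be one) and joins the per-operand pairs into one element-wide pair. It is plain Python with a hypothetical helper name, not the `KnownBits` API, and it assumes the scalar operands are listed from least significant to most significant.

```python
def concat_known_bits(parts, part_width):
    """Join per-operand known bits into known bits for one output element.

    parts: list of (known_zero, known_one) mask pairs, least significant
    operand first (an assumption of this sketch).
    """
    part_mask = (1 << part_width) - 1
    known_zero = 0
    known_one = 0
    for index, (zero, one) in enumerate(parts):
        if zero & one:
            raise ValueError("a bit cannot be known zero and known one")
        shift = index * part_width
        known_zero |= (zero & part_mask) << shift
        known_one |= (one & part_mask) << shift
    return known_zero, known_one
```

For example, if the low i32 operand is the constant 15 and the high operand is completely unknown, the low 32 bits of every i64 lane are fully known and the high 32 bits remain unknown.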

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D158852
…rRange checker

The checker assumed that it could safely cast an SVal to Nonloc.
This surfaced because, with std::ranges, we can unintentionally match
on other APIs as well, thus increasing the likelihood of violating
checker assumptions about the context in which it's invoked.
https://godbolt.org/z/13vEb3K76

See the discourse post on CallDescriptions and std::ranges here.
https://discourse.llvm.org/t/calldescriptions-should-not-skip-the-ranges-part-in-std-names-when-matching/73076

Fixes llvm/llvm-project#65009

Differential Revision: https://reviews.llvm.org/D158968
This patch fixes a compiler crash that would happen during translation to LLVM
IR if the optional `map` argument of the `omp.target` operation was not
present. A unit test is added to ensure this has been fixed.

Differential Revision: https://reviews.llvm.org/D158722
@kaz7 kaz7 merged commit f1cd67f into develop Aug 29, 2023
@kaz7 kaz7 deleted the feature/merge-upstream-20230828 branch August 29, 2023 00:16