fix live interval empty issue #1

Open
wants to merge 870 commits into
base: main

Conversation

huaatian
Owner

bcardosolopes and others added 30 commits March 5, 2025 15:21
These are very common when using intrinsics (e.g. ARM NEON).

For more context: ClangIR has currently been blocked on such intrinsics
emission because of this lacking capability.
…lvm#124624)

It is known that vectors whose elements fit in i16 will be split and
scalarized in SelectionDAG's type legalizer
(see SIISelLowering::getPreferredVectorAction).

LRO attempts to undo the scalarizing of vectors across basic block
boundaries and shoehorn Values in VGPRs. LRO is beneficial for operations
that natively work on illegal vector types to prevent flip-flopping
between unpacked and packed. If we know that operations on a vector will
be split and scalarized, then we don't want to shoehorn them back to
packed VGPRs.

Operations that we know to work natively on illegal vector types usually
come in the form of intrinsics (MFMA, DOT8), buffer store, shuffle, phi
nodes to name a few.
… prefetches.

ld64 doesn't currently support the PAGEOFF relocations on anything but load/stores
so we need to bail-out here to fix the build failures on greendragon.

rdar://145495288
…lvm#129948)

Consistently use LLDB_INVALID_LINE_NUMBER & LLDB_INVALID_COLUMN_NUMBER
when parsing line and column numbers respectively.
Static analysis flags the final return statement in `ReadExtensionBlock`
as unreachable and indeed it is since there is no way to exit the
`while(true)` loop besides a *return statement*.

So I am converting it into a `llvm_unreachable` to explicitly document
this.
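
The pattern described above can be sketched in plain C++. This is a minimal illustration, not the actual `ReadExtensionBlock` code: `skipSubBlocks` and its block format are hypothetical, and `sketch_unreachable` is an assumed stand-in for `llvm_unreachable` so the example builds without LLVM headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Assumed stand-in for llvm_unreachable; aborts if control ever gets here.
[[noreturn]] inline void sketch_unreachable(const char *) { std::abort(); }

// Hypothetical reader: skips length-prefixed sub-blocks until a zero-length
// terminator. Returns the number of payload bytes skipped, or -1 if the
// input is truncated.
inline int skipSubBlocks(const std::vector<std::uint8_t> &data) {
  std::size_t pos = 0;
  int payload = 0;
  while (true) {
    if (pos >= data.size())
      return -1; // ran out of input before the terminator
    std::uint8_t len = data[pos++];
    if (len == 0)
      return payload; // terminator reached
    if (data.size() - pos < len)
      return -1; // sub-block claims more bytes than remain
    pos += len;
    payload += len;
  }
  // Every exit from the loop is a return, so this point documents that fact
  // instead of returning a made-up value.
  sketch_unreachable("loop exits only via return");
}
```

Marking the tail as unreachable, rather than returning a dummy value, tells both readers and static analysis that falling out of the loop would be a bug.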
…lvm#129737)

Currently, we error on non-variable or non-local variable declarations
in `for` loops such as `for (struct S {}; 0; ) {}`. However, this is
valid in C23, so this patch changes the error to a compatibility warning
and also allows this as an extension in earlier language modes. This
also matches GCC’s behaviour.
…L op (llvm#127137)

Fixes llvm#99205.

- Implements the HLSL intrinsic `AddUint64` used to perform unsigned
64-bit integer addition by using pairs of unsigned 32-bit integers
instead of native 64-bit types
- The LLVM intrinsic `uadd_with_overflow` is used in the implementation
of `AddUint64` in `CGBuiltin.cpp`
- The DXIL op `UAddc` was defined in `DXIL.td`, and a lowering of the
LLVM intrinsic `uadd_with_overflow` to the `UAddc` DXIL op was
implemented in `DXILOpLowering.cpp`

Notes:
- `__builtin_addc` was not able to be used to implement `AddUint64` in
`hlsl_intrinsics.h` because its `CarryOut` argument is a pointer, and
pointers are not supported in HLSL
- A lowering of the LLVM intrinsic `uadd_with_overflow` to SPIR-V
[already
exists](https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/SPIRV/llvm-intrinsics/uadd.with.overflow.ll)
- When lowering the LLVM intrinsic `uadd_with_overflow` to the `UAddc`
DXIL op, the anonymous struct type `{ i32, i1 }` is replaced with a
named struct type `%dx.types.i32c`. This aspect of the implementation
may be changed when issue llvm#113192 gets addressed
- Fixes issues mentioned in the comments on the original PR llvm#125319
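
For readers unfamiliar with the technique, the carry-propagating addition can be sketched in C++ as follows. This is an illustrative model only, not the code in `CGBuiltin.cpp`: the `U32Pair` type, the low/high field order, and the helper `uaddWithOverflow` (a hand-written analogue of the LLVM intrinsic) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical representation of a 64-bit value as two 32-bit halves.
struct U32Pair {
  std::uint32_t lo;
  std::uint32_t hi;
};

// Hand-written analogue of an unsigned add-with-overflow: returns the
// wrapped sum and a flag that is true when the addition carried out.
inline std::pair<std::uint32_t, bool> uaddWithOverflow(std::uint32_t a,
                                                       std::uint32_t b) {
  std::uint32_t sum = a + b; // unsigned arithmetic wraps modulo 2^32
  return {sum, sum < a};     // the sum wrapped iff it is smaller than a
}

// Adds the low halves, then feeds the carry into the high halves.
inline U32Pair addUint64(U32Pair a, U32Pair b) {
  std::pair<std::uint32_t, bool> low = uaddWithOverflow(a.lo, b.lo);
  std::uint32_t hi = a.hi + b.hi + (low.second ? 1u : 0u);
  return {low.first, hi};
}
```

For example, adding `{0xFFFFFFFF, 0}` and `{1, 0}` carries out of the low half and yields `{0, 1}`.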

---------

Co-authored-by: Finn Plummer <[email protected]>
Co-authored-by: Farzon Lotfi <[email protected]>
Co-authored-by: Chris B <[email protected]>
Co-authored-by: Justin Bogner <[email protected]>
Summary:
Somehow these got the `!` dropped and it wasn't tested because the
existing test only used the 32-bit variant.
Instead of hardcoding the decision on what mangling scheme to use based
on targets, use TargetInfo to make the decision.
We should not try to overwrite the pointer of the struct. We also need to
add 1 for the end of line.
…lvm#128034)

This provides a range to decide how to subdivide the vector register
budget on gfx90a+. A single value declares the minimum AGPRs that
should be allocatable. Eventually this should replace amdgpu-no-agpr.

I want this primarily for testing AGPR allocation behavior. We should
have a heuristic that tries to detect a reasonable number of AGPRs to keep
allocatable.
This performs the minimal replacement of amdgpu-no-agpr with
amdgpu-agpr-alloc=0. Most of the test diffs are due to the new
attribute sorting later alphabetically.

We could do better by trying to perform range merging in the attributor,
and trying to pick non-0 values.
According to the commit history, the constructors removed by LWG4140
have never been added to libc++.

The existence of a non-public or deleted default constructor is observable,
so this patch tests that there's no such default constructor at all.
Forked from llvm/test/CodeGen/AArch64/arm64-ld1.ll

Incorrectly handled by handleUnknownInstruction:
- llvm.aarch64.neon.ld1x2, llvm.aarch64.neon.ld1x3,
llvm.aarch64.neon.ld1x4
- llvm.aarch64.neon.ld2, llvm.aarch64.neon.ld3, llvm.aarch64.neon.ld4
- llvm.aarch64.neon.ld2lane, llvm.aarch64.neon.ld3lane,
llvm.aarch64.neon.ld4lane
- llvm.aarch64.neon.ld2r, llvm.aarch64.neon.ld3r, llvm.aarch64.neon.ld4r
…rCreator.

ExecutionSession can provide the Triple, so this argument has been redundant
for a while, and no in-tree clients use it.
In order for the union APFloat::Storage to permit access to the
semantics field when another union member is stored there, all members
of Storage must be standard layout. This is not necessarily the case
for DoubleAPFloat which may be non-standard layout because there is no
requirement that its std::unique_ptr member is standard layout. Fix this
by converting Floats to a raw pointer.
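
A small C++ sketch of the layout requirement follows. All type names here (`Semantics`, `SingleFloat`, `PairFloat`, `Storage`) are hypothetical simplifications rather than the real APFloat classes, and the sketch relies on the common-initial-sequence rule for standard-layout structs in a union, which is assumed to be the rule the commit message refers to.

```cpp
#include <cassert>
#include <type_traits>

struct Semantics {}; // hypothetical stand-in for the semantics descriptor

// Both union members begin with the same pointer, so they share a common
// initial sequence.
struct SingleFloat {
  const Semantics *Sem;
  unsigned long long Bits;
};

struct PairFloat {
  const Semantics *Sem;
  SingleFloat *Floats; // raw pointer: keeps PairFloat standard layout
};

// A raw pointer member guarantees standard layout. A std::unique_ptr member
// carries no such guarantee, so an equivalent assertion could fail with it.
static_assert(std::is_standard_layout<SingleFloat>::value,
              "SingleFloat must be standard layout");
static_assert(std::is_standard_layout<PairFloat>::value,
              "PairFloat must be standard layout");

union Storage {
  SingleFloat Single;
  PairFloat Pair;
};

// Reads the shared leading field through Single regardless of which member
// is active; permitted because both members are standard-layout structs
// with a common initial sequence.
inline const Semantics *semanticsOf(const Storage &S) { return S.Single.Sem; }
```

The `static_assert` lines make the requirement checkable at compile time, so a later change that breaks standard layout would fail the build rather than silently introduce undefined behavior.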

Reviewers: arsenm

Reviewed By: arsenm

Pull Request: llvm#129981
…king..."

This reverts commit f905bf3 while I fix
some compile errors reported on the buildbots (see e.g.
https://lab.llvm.org/buildbot/#/builders/53/builds/13369).
The StringRef overload is often error-prone as users might forget to
register the MCSymbol.

Add comments to MCTargetExpr and MCSymbolRefExpr::VariantKind.
In the distant future the VariantKind parameter might be removed.
…th fixes.

This re-applies f905bf3, which was reverted in
c861c1a due to compiler errors, with a fix for
MLIR.
)

This extension adds thirty-eight bit manipulation instructions.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.6

This patch adds assembler only support.

Co-authored-by: Sudharsan Veeravalli <[email protected]>
llvm#129980)

We used to filter out relocations corresponding to NOP+ADR instruction
pairs that were a result of linker "relaxation" optimization. However,
these relocations will be useful for reversing the linker optimization.
Keep the relocations and ignore them while symbolizing ADR instruction
operands.
kazutakahirata and others added 24 commits March 6, 2025 23:28
This patch fixes:

  llvm/lib/Target/X86/X86ISelLowering.cpp:31886:11: error: unused
  variable 'M' [-Werror,-Wunused-variable]
Deprecate the `match` and `rewrite` functions. They mainly exist for
historic reasons. This PR also updates all remaining uses in the MLIR
codebase.

This is addressing a
[comment](llvm#129861 (review))
on an earlier PR.

Note for LLVM integration: `SplitMatchAndRewrite` will be deleted soon,
update your patterns to use `matchAndRewrite` instead of separate
`match` / `rewrite`.

---------

Co-authored-by: Jakub Kuderski <[email protected]>
Strengthen out-of-bounds guarantees for buffer accesses by disallowing
buffer accesses with alignment lower than natural alignment.

This is needed to specifically address the edge case where an access
starts out-of-bounds and then enters in-bounds, as the hardware would
treat the entire access as being out-of-bounds. This is normally not
needed for most users, but at least one graphics device extension
(VK_EXT_robustness2) has very strict requirements - in-bounds accesses
must return correct value, and out-of-bounds accesses must return zero.

The direct consequence of the patch is that a buffer access at negative
address is not merged by load-store-vectorizer with one at a positive
address, which fixes a CTS test.

Targets that do not care about the new behavior are advised to use the
new target feature relaxed-buffer-oob-mode that maintains the state from
before the patch.
…xpr (llvm#129198)

Track whether a LambdaExpr is an immediate operand of a
CXXOperatorCallExpr using a new flag, isInCXXOperatorCall. This enables
special handling of capture initializations to detect uninitialized
variable uses, such as in `S s = [&]() { return s; }();`.

Fix llvm#128058
Previously, commit 042f07e claimed that
P0767R1 was implemented in LLVM 7.0, but no deprecation warning was
implemented. This patch adds the missing warnings.
Introduce RISCVLoadStoreOptimizer MIR Pass that will do the
optimization. The load/store pairing pass identifies adjacent load/store
instructions operating on consecutive memory locations and merges them
into a single paired instruction.
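
The pairing idea can be illustrated with a toy model in C++. This is a sketch under simplifying assumptions, not the RISCVLoadStoreOptimizer implementation: `MemOp` and `pairAdjacent` are hypothetical, only directly neighbouring operations are considered, and the register-dependency, alignment, and offset-range checks a real MIR pass needs are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, heavily simplified memory operation.
struct MemOp {
  bool isLoad;         // true for a load, false for a store
  int baseReg;         // base address register number
  std::int64_t offset; // byte offset from the base register
  unsigned size;       // access size in bytes
  bool paired;         // true once two accesses have been merged
};

// Merges neighbouring operations of the same kind, base register, and size
// whose addresses are consecutive. Everything else is copied unchanged.
inline std::vector<MemOp> pairAdjacent(const std::vector<MemOp> &ops) {
  std::vector<MemOp> out;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    const MemOp &a = ops[i];
    if (i + 1 < ops.size()) {
      const MemOp &b = ops[i + 1];
      bool sameKind = a.isLoad == b.isLoad && a.baseReg == b.baseReg &&
                      a.size == b.size;
      bool consecutive =
          b.offset == a.offset + static_cast<std::int64_t>(a.size);
      if (!a.paired && !b.paired && sameKind && consecutive) {
        // One paired access covering both original locations.
        out.push_back({a.isLoad, a.baseReg, a.offset, a.size * 2, true});
        ++i; // skip the second half of the pair
        continue;
      }
    }
    out.push_back(a);
  }
  return out;
}
```

In this model, two 8-byte loads from offsets 0 and 8 off the same base register collapse into a single 16-byte paired load, while a following store is left alone.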

This is part of MIPS extensions for the p8700 CPU.

Production of ldp/sdp instructions is OFF by default, since it is
beneficial for -Os only in the case of p8700 CPU.
…29531)

This change means that llvm-strip no longer exits immediately upon
encountering an error when modifying a file and will instead continue
modifying the other inputs. Fixes llvm#129412
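
The behavioral change amounts to collecting failures instead of returning on the first one. A minimal C++ sketch follows, where `processAll` and the `processOne` callback are hypothetical names and not part of llvm-strip:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Runs processOne on every input and returns the inputs that failed,
// rather than stopping at the first failure.
inline std::vector<std::string>
processAll(const std::vector<std::string> &inputs,
           const std::function<bool(const std::string &)> &processOne) {
  std::vector<std::string> failed;
  for (const std::string &input : inputs)
    if (!processOne(input))
      failed.push_back(input); // remember the failure and keep going
  return failed;
}
```

A caller could then report every failed input and pick its exit status from whether the returned list is empty.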
…#128714)

The live intervals of registers newly generated in ModuloScheduleExpander
are empty. This causes a crash in some corner cases. This patch recalculates
the live intervals of these registers.
@huaatian huaatian force-pushed the fix_live_interval_empty_issue branch from 0a088de to a5a28f9 Compare March 7, 2025 11:09
huaatian pushed a commit that referenced this pull request Mar 17, 2025
These are macOS tests only and are currently failing on the x86_64 CI
and on arm64 on recent versions of macOS/Xcode.

The tests are failing because we're stopping in:
```
Process 17458 stopped
* thread #1: tid = 0xbda69a, 0x00000002735bd000
  libsystem_malloc.dylib`purgeable_print_self.cold.1, stop reason = EXC_BREAKPOINT (code=1, subcode=0x2735bd000)
```
instead of the libsanitizers library. This seems to be related to
`-fsanitize-trivial-abi` support.

Skip these for now until we figure out the root cause.