LLVM and SPIRV-LLVM-Translator pulldown (WW33 2024) #15106

Merged: 1,795 commits, Aug 27, 2024
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Jul 31, 2024

  1. Commit 79996cd
  2. [mlir][emitc] Lower arith.divui, remui (#99313)

    This commit lowers `arith.divui` and `arith.remui` to EmitC by wrapping
    those operations with type conversions.
    cferry-AMD authored Jul 31, 2024
    Commit 36b2c22
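
As a rough illustration of what "wrapping those operations with type conversions" can look like in emitted C or C++, the sketch below performs an unsigned division and remainder on values held in signed integers: cast to the unsigned type, operate, cast back. The helper names are hypothetical and are not the code the EmitC lowering generates.

```cpp
#include <cstdint>

// Hypothetical helper: reinterpret both operands as unsigned, divide,
// then convert the quotient back to the signed storage type.
inline int32_t divui_via_casts(int32_t lhs, int32_t rhs) {
  uint32_t quotient = static_cast<uint32_t>(lhs) / static_cast<uint32_t>(rhs);
  return static_cast<int32_t>(quotient);
}

// Hypothetical helper: same wrapping for the unsigned remainder.
inline int32_t remui_via_casts(int32_t lhs, int32_t rhs) {
  uint32_t remainder = static_cast<uint32_t>(lhs) % static_cast<uint32_t>(rhs);
  return static_cast<int32_t>(remainder);
}
```

With these helpers, `-2` divided by `2` is computed on the bit pattern `0xFFFFFFFE`, so the result differs from the signed quotient `-1`.
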
  3. Commit f395d82
  4. [Clang][Interp] Fix the location of uninitialized base warning (#100761)

    Fix the location of `diag::note_constexpr_uninitialized_base` so that it
    matches the current interpreter.
    This PR does not print the type name with the namespace that was used to
    improve the current interpreter's type dump of the base class type.
    
    ---------
    
    Signed-off-by: yronglin <[email protected]>
    yronglin authored Jul 31, 2024
    Commit 6434dce
  5. Commit 38e6453
  6. [MLIR][OpenMP] NFC: Sort clauses alphabetically (1/2) (#101193)

    This patch sorts the clause lists for the following OpenMP operations:
    - omp.parallel
    - omp.teams
    - omp.sections
    - omp.wsloop
    - omp.distribute
    - omp.task
    
    This change results in the reordering of operation arguments, so
    impacted unit tests are updated accordingly.
    skatrak authored Jul 31, 2024
    Commit b3b4696
  7. [MLIR][OpenMP] NFC: Sort clauses alphabetically (2/2) (#101194)

    This patch sorts the clause lists for the following OpenMP operations:
    - omp.taskloop
    - omp.taskgroup
    - omp.target_data
    - omp.target_enter_data
    - omp.target_exit_data
    - omp.target_update
    - omp.target
    
    This change results in the reordering of operation arguments, so
    impacted unit tests are updated accordingly.
    skatrak authored Jul 31, 2024
    Commit a3800a6
  8. Commit 5c406ea
  9. [AMDGPU,test] Add one more while-break case (#101300)

    which suffers from v_mov issue.
    ruiling authored Jul 31, 2024
    Commit dae7fb8
  10. Revert "[DAG][NFC] Use SDPatternMatch for VScale in some instances"

    This reverts commit d230442.
    
    The m_Add and m_Mul matchers are commutative, but the code does not expect
    commutativity.
    michaelmaitland committed Jul 31, 2024
    Commit 22ce333
  11. [libclang/python] Factor out unsaved files processing (#101308)

    Factor out the processing of unsaved files into its own function as
    suggested by @Endilll
    [here](https://github.com/llvm/llvm-project/pull/78114/files#r1697730196)
    DeinAlptraum authored Jul 31, 2024
    Commit 4c670b2
  12. [libclang/python] type-ignore Any returns from library calls (#101310)

    On its own, this change leads to _more_ strict typing errors as the
    functions are mostly not annotated so far, so the `# type: ignore`s are
    reported as Unused. This is part of the work leading up to #78114
    though, and one of the bigger parts factored out from it, so these will
    later lead to less strict typing errors as the functions are annotated
    with return types.
    DeinAlptraum authored Jul 31, 2024
    Commit 5525566
  13. Merge from 'sycl' to 'sycl-web'

    iclsrc committed Jul 31, 2024
    Commit af8011d
  14. Commit d8b985c
  15. [libc++] Refactor tests for shared_mutex and shared_timed_mutex (#100783)
    
    This makes the tests less flaky and also makes a few other refactorings
    like using traits instead of .compile.fail.cpp tests.
    ldionne authored Jul 31, 2024
    Commit 29ef92b
  16. [libc++][docs] Remove misadded entry for P1937R2 from Cxx20Papers.csv (#100741)
    
    P1937R2 only contains core language change and doesn't touch the library
    at all.
    
    Closes #100613.
    frederick-vs-ja authored Jul 31, 2024
    Commit 569d8ce
  17. [lldb] Fixed lldb-server crash (TestLogHandler was not thread safe) (#101326)
    
    Host::LaunchProcess() requires SetMonitorProcessCallback to be set. This
    callback is called from the child process monitor thread, which we cannot
    control. lldb-server may crash if there is any logging around this
    callback because TestLogHandler is not thread safe. I hit this issue
    while debugging 100 simultaneous child processes. Note that
    StreamLogHandler::Emit() in lldb/source/Utility/Log.cpp already contains
    a similar mutex.
    slydiman authored Jul 31, 2024
    Commit 93fecc2
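
The fix described above amounts to guarding a log handler's shared buffer with a mutex so that it can be called from a monitor thread. A minimal sketch of that pattern follows; `BufferingLogHandler` is a hypothetical stand-in, not LLDB's `TestLogHandler`, and the real patch may differ in detail.

```cpp
#include <mutex>
#include <string>
#include <string_view>

// Hypothetical stand-in for a log handler used in tests.
class BufferingLogHandler {
public:
  // May be called from any thread, including a process-monitor thread.
  void Emit(std::string_view message) {
    std::lock_guard<std::mutex> guard(m_mutex);
    m_buffer.append(message);
  }

  // Returns a copy so callers never observe the buffer mid-update.
  std::string Snapshot() const {
    std::lock_guard<std::mutex> guard(m_mutex);
    return m_buffer;
  }

private:
  mutable std::mutex m_mutex; // protects m_buffer
  std::string m_buffer;
};
```
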
  18. [CycleInfo] skip unreachable predecessors (#101316)

    If an unreachable block B branches to a block S inside a cycle, it may
    cause S to be incorrectly treated as an entry to the cycle. We avoid
    that by skipping unreachable predecessors when locating entries.
    ssahasra authored Jul 31, 2024
    Commit 05c3a4b
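
To make the effect of skipping unreachable predecessors concrete, here is a small sketch over a toy control-flow graph. The graph representation and function names are assumptions made for illustration; they are not the CycleInfo implementation.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Toy CFG: Succs[b] lists the successors of block b.
using Graph = std::vector<std::vector<int>>;

// Blocks reachable from Entry by following successor edges.
inline std::set<int> reachableFrom(const Graph &Succs, int Entry) {
  std::set<int> Seen{Entry};
  std::vector<int> Work{Entry};
  while (!Work.empty()) {
    int B = Work.back();
    Work.pop_back();
    for (int S : Succs[B])
      if (Seen.insert(S).second)
        Work.push_back(S);
  }
  return Seen;
}

// A block of Cycle is an entry only if some reachable block outside the
// cycle branches to it; edges from unreachable blocks are ignored.
inline std::set<int> cycleEntries(const Graph &Succs, int Entry,
                                  const std::set<int> &Cycle) {
  std::set<int> Reachable = reachableFrom(Succs, Entry);
  std::set<int> Entries;
  for (std::size_t B = 0; B < Succs.size(); ++B) {
    int Block = static_cast<int>(B);
    if (Cycle.count(Block) || !Reachable.count(Block))
      continue; // skip blocks inside the cycle and unreachable predecessors
    for (int S : Succs[B])
      if (Cycle.count(S))
        Entries.insert(S);
  }
  return Entries;
}
```

For a graph where block 0 enters the cycle {1, 2} at block 1 and an unreachable block 4 branches to block 2, only block 1 is reported as an entry.
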
  19. Commit 9b017db
  20. [cmake] switch to CMake's native check_{compiler,linker}_flag (#96171)

    Broken out from #93429
    
    Somewhat closing the loop opened by 7017e6c.
    
    Co-authored-by: Ryan Prichard <[email protected]>
    h-vetinari and rprichard authored Jul 31, 2024
    Commit 89946bd
  21. [libc++][NFC] Remove two unused implementation details __find_end (#100685)
    
    Those two `__find_end` functions are no longer used after 101d1e9.
    After that commit, `std::find_end` started dispatching to `__find_end_classic`,
    and `ranges::find_end` to `__find_end_impl`, which means that the two `__find_end`
    functions were no longer necessary.
    
    Fixes #100569
    hewillk authored Jul 31, 2024
    Commit 5b6b488
  22. [libcxx][test] Mark sort.pass.cpp as a long test (#100720)

    Picolib testing skips any test requiring this feature, I just didn't
    know the feature existed until now.
    DavidSpickett authored Jul 31, 2024
    Commit f90e51a
  23. [libcxx][test] Require long_tests for eval.PR44847.pass.cp (#100722)

    This takes 1m40s to run when testing picolib on qemu. This isn't the end
    of the world but that's on an AArch64 server. So if someone felt the
    need to mark this unsupported in the first place, it's likely much
    slower on average hardware.
    DavidSpickett authored Jul 31, 2024
    Commit 23d188e
  24. [libclang] Use check_linker_flag instead of llvm_check_linker_flag

    Follow-up to #96171 in an attempt to fix the Solaris bots.
    ldionne committed Jul 31, 2024
    Commit 7ab6433
  25. Commit b5a7d3b
  26. [libc++] Make std::unique_lock available with _LIBCPP_HAS_NO_THREADS (#99562)
    
    This is a follow up to llvm/llvm-project#98717,
    which made lock_guard available under _LIBCPP_HAS_NO_THREADS. We can
    make unique_lock available under similar circumstances. This patch
    follows the example in #98717, by:
    
      - Removing the preprocessor guards for _LIBCPP_HAS_NO_THREADS in the
        unique_lock header.
      - providing a set of custom mutex implementations in a local header.
      - using custom locks in tests that can be made to work under
        `no-threads`.
    ilovepi authored Jul 31, 2024
    Commit e9d5842
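
`std::unique_lock` is a template over the mutex type, which is why it can be useful without any threading support. The sketch below pairs it with a hypothetical single-threaded mutex, `CountingMutex`, in the spirit of the custom mutexes the commit message mentions; it is not the helper header from the patch.

```cpp
#include <mutex>

// Hypothetical lockable type that needs no OS threading primitives.
class CountingMutex {
public:
  void lock() {
    locked_ = true;
    ++lock_calls_;
  }
  bool try_lock() {
    if (locked_)
      return false;
    lock();
    return true;
  }
  void unlock() { locked_ = false; }

  bool locked() const { return locked_; }
  int lock_calls() const { return lock_calls_; }

private:
  bool locked_ = false;
  int lock_calls_ = 0;
};

// Runs Fn while holding Mutex; the lock is released when Guard is destroyed.
template <typename Fn>
void withLock(CountingMutex &Mutex, Fn &&Body) {
  std::unique_lock<CountingMutex> Guard(Mutex);
  Body();
}
```
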
  27. [libc++] Move the benchmarks under libcxx/test (#99371)

    This is an intermediate and fairly mechanical step towards unifying the
    benchmarks with the rest of the test suite. Moving this around requires
    a few changes, notably making sure we don't throw a wrench into the
    discovery process of the normal test suite. This won't be a problem
    anymore once benchmarks are taken into account by the test setup out of
    the box.
    ldionne authored Jul 31, 2024
    Commit 78b4b5c
  28. [clang][CUDA] Add 'noconvergent' function and statement attribute

    - For languages following SPMD/SIMT programming model, functions and
      call sites are marked 'convergent' by default. 'noconvergent' is added
      in this patch to allow developers to remove that 'convergent'
      attribute when it's safe.
    
    Reviewers:
    nhaehnle, Sirraide, yxsamliu, Artem-B, ilovepi, jayfoad, ssahasra, arsenm
    
    Reviewed By: arsenm
    
    Pull Request: llvm/llvm-project#100637
    darkbuck authored Jul 31, 2024
    Commit fa84297
  29. [libc][AArch64] Add an AArch64 setjmp/longjmp. (#101177)

    Previously, building libc for AArch64 in `LLVM_LIBC_FULL_BUILD` mode
    would fail because no implementation of setjmp/longjmp was available.
    This was the only obstacle, so now a full AArch64 build of libc is
    possible.
    
    This implementation automatically supports PAC and BTI if compiled with
    the appropriate options. I would have liked to do the same for MTE stack
    tagging, but as far as I can see there's currently no predefined macro
    that allows detection of `-fsanitize=memtag-stack`, so I've left that
    one as a TODO.
    
    AAPCS64 delegates the x18 register to individual platform ABIs, and
    allows them to choose what it's used for, which may or may not require
    setjmp and longjmp to save and restore it. To accommodate this, I've
    introduced a libc configuration option. The default is on, because the
    only use of x18 I've so far encountered uses it to store information
    specific to the current stack frame (so longjmp does need to restore
    it), and this is also safe behavior in the default situation where the
    platform ABI specifies no use of x18 and it becomes a temporary register
    (restoring it to its previous value is no worse than any _other_ way for
    a function call to clobber it). But if a platform ABI needs to use x18
    in a way that requires longjmp to leave it alone, they can turn the
    option off.
    statham-arm authored Jul 31, 2024
    Commit 2a6268d
  30. [scudo] Separated committed and decommitted entries. (#100818)

    Initially, the LRU list stored all mapped entries with no distinction
    between the committed (non-madvise()'d) entries and decommitted
    (madvise()'d) entries. Now these two types of entries are separated into
    two lists, allowing future cache logic to branch depending on whether or
    not entries are committed or decommitted. Furthermore, the retrieval
    algorithm will prioritize committed entries over decommitted entries.
    Specifically, valid-fit, committed entries (not necessarily optimal-fit)
    are retrieved before optimal-fit, decommitted entries.
    JoshuaMBa authored Jul 31, 2024
    Commit 8b2688b
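
The retrieval priority described above can be sketched with two lists, where any committed entry that fits is preferred over even the best-fitting decommitted entry. `CachedBlock`, `takeBestFit`, and `retrieve` are hypothetical names for this illustration and do not mirror the scudo cache code.

```cpp
#include <cstddef>
#include <list>
#include <optional>

// Hypothetical cache entry; a real allocator tracks much more state.
struct CachedBlock {
  std::size_t Size;
};

// Remove and return the smallest entry that can hold Size bytes, if any.
inline std::optional<CachedBlock> takeBestFit(std::list<CachedBlock> &Entries,
                                              std::size_t Size) {
  auto Best = Entries.end();
  for (auto It = Entries.begin(); It != Entries.end(); ++It)
    if (It->Size >= Size && (Best == Entries.end() || It->Size < Best->Size))
      Best = It;
  if (Best == Entries.end())
    return std::nullopt;
  CachedBlock Found = *Best;
  Entries.erase(Best);
  return Found;
}

// Committed entries win whenever one fits; decommitted entries are a fallback.
inline std::optional<CachedBlock> retrieve(std::list<CachedBlock> &Committed,
                                           std::list<CachedBlock> &Decommitted,
                                           std::size_t Size) {
  if (auto Block = takeBestFit(Committed, Size))
    return Block;
  return takeBestFit(Decommitted, Size);
}
```
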
  31. [InstCombine] Recognize copysign idioms (#101324)

    This patch folds `(bitcast (or (and (bitcast X to int), signmask), nneg
    Y) to fp)` into `copysign((bitcast Y to fp), X)`. I found this pattern
    exists in some graphics applications/math libraries.
    
    Alive2: https://alive2.llvm.org/ce/z/ggQZV2
    dtcxzyw authored Jul 31, 2024
    Commit b455edb
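
In source form, the idiom and its folded equivalent look roughly like the sketch below. It assumes `float` is a 32-bit IEEE-754 type, and the function names are made up for this example.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Bit reinterpretation via memcpy avoids undefined behaviour.
inline uint32_t bitsOf(float F) {
  uint32_t U;
  std::memcpy(&U, &F, sizeof U);
  return U;
}

inline float floatOf(uint32_t U) {
  float F;
  std::memcpy(&F, &U, sizeof F);
  return F;
}

// Before the fold: mask out the sign bit of X and OR it into Y, where Y is an
// integer whose sign bit is known to be clear (the "nneg" condition).
inline float signTransferIdiom(float X, uint32_t NonNegY) {
  const uint32_t SignMask = 0x80000000u;
  return floatOf((bitsOf(X) & SignMask) | NonNegY);
}

// After the fold: the same value expressed as a copysign.
inline float signTransferFolded(float X, uint32_t NonNegY) {
  return std::copysign(floatOf(NonNegY), X);
}
```
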
  32. [SandboxIR] Implement AddrSpaceCastInst (#101260)

    This patch implements sandboxir::AddrSpaceCastInst which mirrors
    llvm::AddrSpaceCastInst.
    vporpo authored Jul 31, 2024
    Commit d36c9f8
  33. Commit 3715035
  34. Add llvm::Error C API, LLVMCantFail

    It's barely testable - the test does exercise the code, but wouldn't
    fail on an empty implementation. It would cause a memory leak though
    (because the error handle wouldn't be unwrapped/reowned) which could be
    detected by asan and other leak detectors.
    dwblaikie committed Jul 31, 2024
    Commit 45ef0d4
  35. Merge from 'main' to 'sycl-web' (200 commits)

      CONFLICT (content): Merge conflict in clang/lib/Sema/SemaDecl.cpp
    calebwat committed Jul 31, 2024
    Commit f712941
  36. [SandboxIR] Implement IntToPtrInst (#101359)

    This patch implements sandboxir::IntToPtrInst which mirrors
    llvm::IntToPtrInst.
    vporpo authored Jul 31, 2024
    Commit f0197a7
  37. [SCEV] Add coverage for flag inference with vscale strided IVs

    Given vscale is a power of two, we should be able to prove no-self-wrap
    in these cases.  We currently don't, but an upcoming change will fix this.
    preames committed Jul 31, 2024
    Commit faf3333
  38. [lldb] Unify the way we get the Target in CommandObject (#101208)

    Currently, CommandObjects are obtaining a target in a variety of ways.
    Often the command incorrectly operates on the selected target. As an
    example, when a breakpoint command is running, the current target is
    passed into the command but the target that hit the breakpoint is not
    the selected target. In other places we use the CommandObject's
    execution context, which is frozen during the execution of the command,
    and comes with its own limitations. Finally, we often want to fall back
    to the dummy target if no real target is available.
    
    Instead of having to guess how to get the target, this patch introduces
    one helper function in CommandObject to get the most relevant target. In
    order of priority, that's the target from the command object's execution
    context, from the interpreter's execution context, the selected target
    or the dummy target.
    
    rdar://110846511
    JDevlieghere authored Jul 31, 2024
    Commit 8398ad9
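
The priority order described above is a simple fallback chain. The sketch below models it with plain pointers; `Target` and `pickTarget` are hypothetical and do not correspond to the LLDB classes or to the helper added by the patch.

```cpp
// Hypothetical stand-in for a debugger target.
struct Target {
  const char *Name;
};

// Return the most relevant target: command context first, then interpreter
// context, then the selected target, and finally the always-present dummy.
inline Target &pickTarget(Target *FromCommandContext,
                          Target *FromInterpreterContext, Target *Selected,
                          Target &Dummy) {
  if (FromCommandContext)
    return *FromCommandContext;
  if (FromInterpreterContext)
    return *FromInterpreterContext;
  if (Selected)
    return *Selected;
  return Dummy;
}
```

Taking the dummy by reference rather than by pointer encodes the assumption that a fallback always exists, so callers never have to handle a null result.
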
  39. [libc++][NFC] Add missing license headers

    Also standardize the license comment in several files where it was
    different from what we normally do.
    ldionne committed Jul 31, 2024
    Commit 6a54dfb
  40. Remove already implemented target independent optimization opportunity (#101233)
    
    Fixes #101127
    
    See this working example: https://godbolt.org/z/z15oj15eP
    marcauberer authored Jul 31, 2024
    Commit a847b0f
  41. Fix typo: tyep -> type.

    Brian Yahn authored and d0k committed Jul 31, 2024
    Commit 28a0792
  42. [mlir][math] Fix polynomial math.asin approximation (#101247)

    The polynomial approximation for asin is only good between [-9/16,
    9/16]. Values beyond that range must be remapped to achieve good numeric
    results. This is done by the equation below:
    
    `arcsin(x) = PI/2 - arcsin(sqrt(1.0 - x*x))`
    rsuderman authored Jul 31, 2024
    Commit a3fb301
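
The remapping identity can be checked numerically with a short sketch. Here `std::asin` stands in for the polynomial kernel purely so the code runs, and the sign handling for negative inputs is an assumption of this example. Note that the remapped argument `sqrt(1 - x*x)` only falls back inside [-9/16, 9/16] when |x| is at least about 0.827; how the actual patch treats inputs between 9/16 and that value is not stated in the commit message.

```cpp
#include <cmath>

// Stand-in for the polynomial kernel that is only accurate on [-9/16, 9/16].
inline double asinKernel(double X) { return std::asin(X); }

// Applies arcsin(x) = PI/2 - arcsin(sqrt(1 - x*x)) outside the kernel range.
// The identity holds for x in [0, 1]; negative inputs reuse it through the
// odd symmetry of arcsin.
inline double asinWithRemap(double X) {
  const double HalfPi = 1.5707963267948966;
  const double Limit = 9.0 / 16.0;
  double Abs = std::fabs(X);
  if (Abs <= Limit)
    return asinKernel(X);
  double Remapped = HalfPi - asinKernel(std::sqrt(1.0 - Abs * Abs));
  return std::copysign(Remapped, X);
}
```
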
  43. [SandboxIR] Implement FPToSIInst (#101362)

    This patch implements sandboxir::FPToSIInst which mirrors
    llvm::FPToSIInst.
    vporpo authored Jul 31, 2024
    Commit 9718f3d
  44. [MVT][TableGen] Extend Machine Value Type to uint16_t (#99657)

    RFC:
    https://discourse.llvm.org/t/rfc-extend-machine-value-type-from-uint8-t-to-uint16-t/80274
    compile-time-tracker:
    https://llvm-compile-time-tracker.com/compare.php?from=4b9fab591916eec9fd1942f37afe3b137b564089&to=177d28247efe5a4d59a8d8150b4daf01e4f57d74&stat=wall-time
    
    Currently 208 out of 256 MVTs are used and the remaining values will run
    out soon, so ultimately we need to extend the original
    `MVT::SimpleValueType` from `uint8_t` to `uint16_t` to accommodate more
    types.
    The `MatcherTable` uses `unsigned char` for encoding the matcher code,
    so the extended MVTs no longer fit into the table; thus we need to
    use VBR to encode them, as we do for other values that are wider than 8 bits.
    
    The statistics below show the difference in "Total Array size" of the
    matcher table that appears in every file:
    ```
    Table                       Before     After     Change(%)
    WebAssemblyGenDAGISel.inc   23576      23775     0.844
    NVPTXGenDAGISel.inc         173498     173498    0
    RISCVGenDAGISel.inc         2179121    2369929   8.756
    AVRGenDAGISel.inc           2754       2754      0
    PPCGenDAGISel.inc           163315     163617    0.185
    MipsGenDAGISel.inc          47280      47447     0.353
    SystemZGenDAGISel.inc       56243      56461     0.388
    AArch64GenDAGISel.inc       467893     487830    4.261
    MSP430GenDAGISel.inc        8069       8069      0
    LoongArchGenDAGISel.inc     78928      79131     0.257
    XCoreGenDAGISel.inc         3432       3432      0
    BPFGenDAGISel.inc           3733       3733      0
    VEGenDAGISel.inc            65174      66456     1.967
    LanaiGenDAGISel.inc         2067       2067      0
    X86GenDAGISel.inc           628787     636987    1.304
    ARMGenDAGISel.inc           170968     171036    0.040
    HexagonGenDAGISel.inc       155764     155764    0
    SparcGenDAGISel.inc         5762       5798      0.625
    AMDGPUGenDAGISel.inc        504356     504463    0.021
    R600GenDAGISel.inc          29785      29785     0
    ```
    
    The statistics below show the runtime peak memory usage when compiling a
    simple C program:
    `/bin/time -v clang -target $TARGET -O3 -c test.c`
    ```
      int test(int a) {
        return a * 3;
      }
    ```
    ```
    Target        Before(kbytes)    After(kbytes)    Change(%)
    wasm64        110172            110088           -0.076
    nvptx64       109784            109980            0.179
    riscv64       114020            113656           -0.319
    avr           110352            110068           -0.257
    ppc64         112612            112476           -0.120
    mips64        113588            113668            0.070
    systemz       110860            110760           -0.090
    aarch64       113704            113432           -0.239
    msp430        110284            110200           -0.076
    loongarch64   111052            110756           -0.267
    xcore         108340            108020           -0.295
    bpf           110620            110708            0.080
    ve            110960            110920           -0.036
    lanai         110180            109960           -0.200
    x86_64        113640            113304           -0.296
    arm64         113540            113172           -0.324
    hexagon       114620            114684            0.056
    sparc         110412            110136           -0.250
    amdgcn        118164            117144           -0.863
    r600          111200            110508           -0.622
    ```
    4vtomat authored Jul 31, 2024
    Commit a4c6ebe
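
For readers unfamiliar with VBR (variable bit rate) encoding in a byte table, the sketch below stores a value in 7-bit groups, low bits first, with the high bit of each byte marking that more bytes follow. This layout is an assumption about the general technique; the function names are invented here and the generated matcher tables may differ in detail.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append Value to Table using 7 payload bits per byte. Values below 128 take
// one byte, which is why small types cost nothing extra.
inline void emitVBR(uint64_t Value, std::vector<unsigned char> &Table) {
  while (Value >= 128) {
    Table.push_back(static_cast<unsigned char>((Value & 127) | 128));
    Value >>= 7;
  }
  Table.push_back(static_cast<unsigned char>(Value));
}

// Decode one value starting at Pos and advance Pos past it.
inline uint64_t readVBR(const std::vector<unsigned char> &Table,
                        std::size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  unsigned char Byte;
  do {
    Byte = Table[Pos++];
    Value |= static_cast<uint64_t>(Byte & 127) << Shift;
    Shift += 7;
  } while (Byte & 128);
  return Value;
}
```

Under this scheme a type number such as 300 occupies two bytes instead of one, which is consistent with the table growth being largest for targets that use many of the higher-numbered types.
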
  45. [Support] Erase blocks after DomTree::eraseNode (#101195)

    Change eraseNode to require that the basic block is still contained
    inside the function. This is a preparation for using numbers of basic
    blocks inside the dominator tree, which are invalid for blocks that are
    not inside a function.
    aengelke authored Jul 31, 2024
    Commit 6d103d7
  46. [lldb] Add constant value mode for RegisterLocation in UnwindPlans (#100624)
    
    This is useful for language runtimes that compute register values by
    inspecting the state of the currently running process. Currently, there
    are no mechanisms enabling these runtimes to set register values to
    arbitrary values.
    
    The alternative considered would involve creating a dwarf expression
    that produces an arbitrary integer (e.g. using OP_constu). However, the
    current data structure for Rows is such that they do not own any memory
    associated with dwarf expressions, which implies any such expression
    would need to have static storage and therefore could not contain a
    runtime value.
    
    Adding a new rule for constants leads to a simpler implementation. It's
    also worth noting that this does not make the "Location" union any
    bigger, since it already contains a pointer+size pair.
    felipepiovezan authored Jul 31, 2024
    Commit 9fe455f
  47. [SandboxIR] Implement FPToUIInst (#101369)

    This patch implements sandboxir::FPToUIInst which mirrors
    llvm::FPToUIInst.
    vporpo authored Jul 31, 2024
    Commit 8b17b12
  48. Commit 35a2e6d
  49. [Modules][Diagnostic] Don't claim a METADATA mismatch is always in PCH file. (#101280)
    
    You can provide more than one AST file as an input. Emit a path for a
    file with a problem, so you can disambiguate between multiple files.
    
    rdar://65005546
    vsapsai authored Jul 31, 2024
    Commit f9827e6
  50. [flang][OpenMP] Reland Fix copyprivate semantic checks (#95799) (#101009)
    
    There are some cases in which variables used in OpenMP constructs
    are predetermined as private. The semantic checks for copyprivate
    were not handling those cases.
    
    Besides that, shared symbols were not being properly represented
    in some cases. When there was no previously declared private
    (implicit) symbol, no new association symbols, representing
    shared ones, were being created.
    
    These symbols must always be inserted in constructs that may
    privatize the original symbol: parallel, teams and task
    generating constructs.
    
    Fixes #87214 and #86907
    luporl authored Jul 31, 2024
    Commit 366eade
  51. [AMDGPU][True16][MC] duplicate vop1 tests to fake16 and update real-true16 flags for GFX12 (#100849)
    
    duplicate vop1 tests to fake16 and update real-true16 flags for GFX12
    
    creating duplications here to avoid bulk copy in the following true16
    patches
    
    ---------
    
    Co-authored-by: guochen2 <[email protected]>
    broxigarchen and broxigarchen authored Jul 31, 2024
    Commit 055893f
  52. Commit 6aa723d
  53. [lldb] Allow mapping object file paths (#101361)

    This introduces a `target.object-map` which allows us to remap module
    locations, much in the same way as source mapping works today. This is
    useful, for instance, when debugging coredumps, so we can replace some
    of the locations where LLDB attempts to load shared libraries and
    executables from, without having to setup an entire sysroot.
    aperez authored Jul 31, 2024
    Commit 0a01e8f
  54. [SandboxIR] Implement SIToFPInst (#101374)

    This patch implements sandboxir::SIToFPInst which mirrors
    llvm::SIToFPInst.
    vporpo authored Jul 31, 2024
    Commit 6d3317e
  55. Commit 496feda
  56. Commit d0b4b6b
  57. [Sema][sycl] Restore additional SYCL condition in alignas handling

    bf02f41 changed Sema handling of alignas to accommodate C23, which implements alignas as a type specifier instead of an attribute. When merged, the SYCL-specific conditions that were previously applied for CXX11 weren't brought over. This patch re-adds them, which addresses a number of test regressions in SemaSYCL.
    calebwat committed Jul 31, 2024
    Commit eab0074
  58. [BOLT][DWARF] Sort GDBIndexTUEntryVector (#101264)

    Sorts GDBIndexTUEntryVector in decreasing order by hash to ensure
    determinism when parallelized.
    sayhaan authored Jul 31, 2024
    Commit 33960ce
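
Sorting by a stable key is a common way to make output independent of the order in which parallel workers produced it. A minimal sketch, assuming a hypothetical `TUEntry` with a hash and a unit index (not BOLT's actual entry type), with a tie-break added here so that equal hashes also order deterministically:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical entry type for illustration.
struct TUEntry {
  uint64_t Hash;
  uint32_t UnitIndex;
};

// Order entries by decreasing hash; break ties by unit index (an assumption
// of this sketch) so the result never depends on insertion order.
inline void sortByHashDescending(std::vector<TUEntry> &Entries) {
  std::sort(Entries.begin(), Entries.end(),
            [](const TUEntry &A, const TUEntry &B) {
              if (A.Hash != B.Hash)
                return A.Hash > B.Hash;
              return A.UnitIndex < B.UnitIndex;
            });
}
```
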
  59. [Offload] Allow to record kernel launch stack traces (#100472)

    Similar to (de)allocation traces, we can record kernel launch stack
    traces and display them in case of an error. However, the AMD GPU plugin
    signal handler, which is invoked on memory faults, cannot pinpoint the
    offending kernel. Instead, we print `<NUM>` traces, where `<NUM>` is set
    via `OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=<NUM>`. The
    recording uses a ring buffer of fixed size (for now 8).
    For `trap` errors, we print the actual kernel name, and the trace if
    recorded.
    jdoerfert authored Jul 31, 2024
    Commit 9a10132
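
A fixed-size ring buffer for launch traces can be sketched as follows. `TraceRing` is a hypothetical class written for this illustration, with traces modelled as strings; returning the newest traces first is also an assumption, not something the commit message specifies.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Keeps the last N recorded traces, overwriting the oldest when full.
template <std::size_t N>
class TraceRing {
public:
  void record(std::string Trace) {
    Slots[Next % N] = std::move(Trace);
    ++Next;
  }

  // Return up to Count stored traces, newest first.
  std::vector<std::string> mostRecent(std::size_t Count) const {
    std::size_t Stored = Next < N ? Next : N;
    if (Count > Stored)
      Count = Stored;
    std::vector<std::string> Out;
    for (std::size_t I = 0; I < Count; ++I)
      Out.push_back(Slots[(Next - 1 - I) % N]);
    return Out;
  }

private:
  std::array<std::string, N> Slots;
  std::size_t Next = 0; // total number of traces ever recorded
};
```
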
  60. [libc][math][c23] Refactor expf16 (#101373)

    Also updates and sorts CMake target dependencies, and corrects the smoke
    test that expected expf16(sNaN) to return sNaN instead of aNaN, although
    the test still passed, as FPMatcher only checks whether both sides are
    NaN, not whether they're the same NaN value.
    overmighty authored Jul 31, 2024
    Commit b66aa3b
  61. AMDGPU: Add testcase for materializing sgpr frame indexes (#101306)

    These add some IR tests for 57d10b4.
    These do rely on some lucky MIR placement to test the scc input, but I
    haven't found a better way to do it. Also, scc handling in inline asm
    is extremely buggy.
    arsenm authored Jul 31, 2024
    Commit ef67664
  62. [mlir][Linalg] Deprecate `linalg::tileToForallOp` and `linalg::tileToForallOpUsingTileSizes` (#91878)
    
    The implementation of these methods is legacy, and they are removed in
    favor of using the `scf::tileUsingSCF` methods as replacements. To get
    the latter on par with requirements of the deprecated methods, the
    tiling allows one to specify the maximum number of tiles to use instead
    of specifying the tile sizes. When tiling to `scf.forall` this
    specification is used to generate the `num_threads` version of the
    operation.
    
    A slight deviation from previous implementation is that the deprecated
    method always generated the `num_threads` variant of the `scf.forall`
    operation. Instead now this is driven by the tiling options specified.
    This reduces the indexing math generated when the tile sizes are
    specified.
    
    **Moving from `linalg::tileToForallOp` to `scf::tileUsingSCF`**
    
    ```
    OpBuilder b;
    TilingInterface op;
    ArrayRef<OpFoldResult> numThreads;
    ArrayAttr mapping;
    FailureOr<ForallTilingResult> result = linalg::tileToForallOp(b, op, numThreads, mapping);
    ```
    
    can be replaced by
    ```
    scf::SCFTilingOptions options;
    options.setNumThreads(numThreads);
    options.setLoopType(scf::SCFTilingOptions::LoopType::ForallOp);
    options.setMapping(mapping.getValue()); /*note the difference that setMapping takes an ArrayRef<Attribute> */
    FailureOr<scf::SCFTilingResult> result = scf::tileUsingSCF(b, op, options);
    ```
    
    This generates the `numThreads` version of the `scf.forall` for the
    inter-tile loops, i.e.
    
    ```
    ... = scf.forall (%arg0, %arg1) in (%nt0, %nt1) shared_outs(...)
    ```
    
    **Moving from `linalg::tileToForallOpUsingTileSizes` to
    `scf::tileUsingSCF`**
    
    ```
    OpBuilder b;
    TilingInterface op;
    ArrayRef<OpFoldResult> tileSizes;
    ArrayAttr mapping;
    FailureOr<ForallTilingResult> result = linalg::tileToForallOpUsingTileSizes(b, op, tileSizes, mapping);
    ```
    
    can be replaced by
    ```
    scf::SCFTilingOptions options;
    options.setTileSizes(tileSizes);
    options.setLoopType(scf::SCFTilingOptions::LoopType::ForallOp);
    options.setMapping(mapping.getValue()); /*note the difference that setMapping takes an ArrayRef<Attribute> */
    FailureOr<scf::SCFTilingResult> result = scf::tileUsingSCF(b, op, options);
    ```
    
    Also note that `linalg::tileToForallOpUsingTileSizes` would effectively
    call the `linalg::tileToForallOp` by computing the `numThreads` from the
    `op` and `tileSizes` and generate the `numThreads` version of the
    `scf.forall`. That is not the case anymore. Instead this will directly
    generate the `tileSizes` version of the `scf.forall` op
    
    ```
    ... = scf.forall(%arg0, %arg1) = (%lb0, %lb1) to (%ub0, %ub1) step(%step0, %step1) shared_outs(...)
    ```
    
    If you actually want to use the `numThreads` version, it is up to the
    caller to compute the `numThreads` and set `options.setNumThreads`
    instead of `options.setTileSizes`. Note that there is a slight
    difference between the num threads version and the tile size version. The former
    requires an additional `affine.max` on the tile size to ensure
    non-negative tile sizes. When lowering to `numThreads` version this
    `affine.max` is not needed since by construction the tile sizes are
    non-negative. In previous implementations, the `numThreads` version
    generated when using the `linalg::tileToForallOpUsingTileSizes` method
    would avoid generating the `affine.max` operation. To get the same
    state, downstream users will have to additionally normalize the
    `scf.forall` operation.
    
    **Changes to `transform.structured.tile_using_forall`**
    
    The transform dialect op that called into `linalg::tileToForallOp` and
    `linalg::tileToForallOpUsingTileSizes` has been modified to call
    `scf::tileUsingSCF`. The transform dialect op always generates the
    `numThreads` version of the `scf.forall` op. So when `tile_sizes` are
    specified for the transform dialect op, first the `tile_sizes` version
    of the `scf.forall` is generated by the `scf::tileUsingSCF` method which
    is then further normalized to get back to the same state. So there is no
    functional change to `transform.structured.tile_using_forall`. It always
    generates the `numThreads` version of the `scf.forall` op (as it did
    before this change).
    
    ---------
    
    Signed-off-by: MaheshRavishankar <[email protected]>
    MaheshRavishankar authored Jul 31, 2024
    6740d70
  63. [Clang] Suppress missing architecture error when doing LTO (#100652)

    Summary:
    The `nvlink-wrapper` can do LTO now, which means we can still create
    some LLVM-IR without needing an architecture. In the case that we try to
    invoke `nvlink` internally, that will still fail. This patch simply
    defers the error until later so we can use `--lto-emit-llvm` to get the
    IR without specifying an architecture.
    jhuber6 authored Jul 31, 2024
    2bf58f5
  64. [ELF] Add -z nosectionheader

    GNU ld since 2.41 supports this option, which is mildly useful. It omits
    the section header table and non-ALLOC sections (including
    .symtab/.strtab (--strip-all)).
    
    This option is simple to implement and might be used by LLDB to test
    program headers parsing without the section header table (#100900).
    
    -z sectionheader, which is the default, is also added.
    
    Pull Request: llvm/llvm-project#101286
    MaskRay authored Jul 31, 2024
    5d972c5
  65. [NFC][LLVM] Add RealtimeSanitizer LLVM code owners (#101231)

    Split from #100596
    cjappl authored Jul 31, 2024
    bf5e56d
  66. [RISCV] Use X0 for VLMax for slide1up/slide1down in lowerVectorIntrin…

    …sicScalars. (#101384)
    
    Previously, we created a vsetvlimax intrinsic. Using X0 simplifies the
    code and enables some optimizations to kick in when the exact value of
    vlmax is known.
    topperc authored Jul 31, 2024
    3626443
  67. [libc][math][c23] Add dfma{l,f128} and dsub{l,f128} C23 math function…

    …s (#101089)
    
    Co-authored-by: OverMighty <[email protected]>
    aaryanshukla and overmighty authored Jul 31, 2024
    30b5d4a
  68. [cmake] Replace remaining uses of llvm_check_linker_flag with CMake b…

    …uiltin
    
    89946bd replaced llvm_check_{compiler,linker} calls with equivalent CMake builtins and removed the llvm versions. Some references to llvm_check_linker_flag still existed, so this commit replaces those.
    calebwat committed Jul 31, 2024
    b287447
  69. [BOLT][DWARF][NFC] Split DIEBuilder::finish (#101244)

    Split DIEBuilder::finish so that code updating .debug_names is in a
    separate function.
    sayhaan authored Jul 31, 2024
    910012e
  70. [sanitizer] Make file headers more conventional

    Add "-*- C++ -*-"
    MaskRay committed Jul 31, 2024
    c6a3f4e
  71. Revert "[lldb] Reland 2402b32 with /H to debug the windows build is…

    …sue"
    
    This reverts commit e72cdae, which broke
    LLVM's lldb builder for Windows msvc.
    zeroomega committed Jul 31, 2024
    9effefb
  72. [SCEV] Use power of two facts involving vscale when inferring wrap fl…

    …ags (#101380)
    
    SCEV has logic for inferring wrap flags on AddRecs which are known to
    control an exit based on whether the step is a power of two. This logic
    only considered constants, and thus did not trigger for steps such as (4
    x vscale) which are common in scalably vectorized loops.
    
    The net effect is that we were very sensitive to the preservation of
    nsw/nuw flags on such IVs, and could not infer trip counts if they got
    lost for any reason.
    
    ---------
    
    Co-authored-by: Nikita Popov <[email protected]>
    preames and nikic authored Jul 31, 2024
    7583c48
  73. [mlir][Transforms] Dialect conversion: Skip materializations when run…

    …ning without converter (#101318)
    
    TODO: test case
    matthias-springer authored Jul 31, 2024
    2aa96fc
  74. [mlir][sparse] implement sparse_tensor.extract_value operation. (#1…

    …01220)
    Peiming Liu authored Jul 31, 2024
    951a363
  75. [TableGen] Pass ValueTypeByHwMode by const reference in a couple plac…

    …es. NFC
    
    ValueTypeByHwMode contains a std::map. We shouldn't copy it if
    we don't need to.
    
    Fixes #101406.
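
    As a rough illustration of the cost being avoided, the sketch below
    counts copy constructions of a type that owns a std::map.
    `ByModeSketch`, `CopyCount`, `sizeByValue` and `sizeByConstRef` are
    hypothetical names for this example only; they are not part of TableGen.

    ```cpp
    #include <cstddef>
    #include <map>

    // Counts how many times the sketch type has been copy-constructed.
    int CopyCount = 0;

    // Hypothetical stand-in for a class that owns a std::map.
    struct ByModeSketch {
      std::map<unsigned, int> Map;
      ByModeSketch() = default;
      ByModeSketch(const ByModeSketch &Other) : Map(Other.Map) { ++CopyCount; }
    };

    // Taking the parameter by value copies the whole map on every call
    // made with an lvalue argument.
    std::size_t sizeByValue(ByModeSketch T) { return T.Map.size(); }

    // Taking it by const reference only binds a reference; nothing is copied.
    std::size_t sizeByConstRef(const ByModeSketch &T) { return T.Map.size(); }
    ```

    Calling `sizeByConstRef` leaves `CopyCount` unchanged, while each call
    to `sizeByValue` with a named object increments it by one.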
    topperc committed Jul 31, 2024
    c2dc46c
  76. [TableGen] Add an explicit cast to allow one TypeSetByHwMode construc…

    …tor to be removed. NFC
    
    This constructor was taking a ValueTypeByHwMode by value to create
    an ArrayRef. By adding an explicit cast from ValueTypeByHwMode
    to TypeSetByHwMode we allow the ArrayRef to be implicitly converted
    from a single element.
    topperc committed Jul 31, 2024
    24f8d10
  77. [libc++] Drop support for the C++20 Synchronization Library before C+…

    …+20 (#82008)
    
    When we initially implemented the C++20 synchronization library, we
    reluctantly accepted for the implementation to be backported to C++03
    upon request from the person who provided the patch. This was when we
    were only starting to have experience with the issues this can create,
    so we flinched. Nowadays, we have a much stricter stance about not
    backporting features to previous standards.
    
    We have recently started fixing several bugs (and near bugs) in our
    implementation of the synchronization library. A recurring theme during
    these reviews has been how difficult to understand the current code is,
    and upon inspection it becomes clear that being able to use a few recent
    C++ features (in particular lambdas) would help a great deal. The code
    would still be pretty intricate, but it would be a lot easier to reason
    about the flow of callbacks through things like
    __thread_poll_with_backoff.
    
    As a result, this patch drops support for the synchronization library
    before C++20. This makes us more strictly conforming and opens the door
    to major simplifications, in particular around atomic_wait which was
    supported all the way to C++03.
    
    This change will probably have some impact on downstream users. However,
    since the C++20 synchronization library was added only in LLVM 10 (~3
    years ago) and it's quite a niche feature, the set of people trying to
    use this part of the library before C++20 should be reasonably small.
    ldionne authored Jul 31, 2024
    bf1666f
  78. [libc] Add vsscanf function (#101402)

    Summary:
    Adds support for the `vsscanf` function similar to `sscanf`.
    Based off of llvm/llvm-project#97529.
    jhuber6 authored Jul 31, 2024
    38ef692
  79. [mlir][sparse] introduce sparse_tensor.coiterate operation. (#101100)

    This PR introduces `sparse_tensor.coiterate` operation, which represents
    a loop that traverses multiple sparse iteration spaces.
    Peiming Liu authored Jul 31, 2024
    785a24f
  80. [RISCV] Remove unnecessary FP extensions from some integer only vector…

    … tests.
    
    I'm going to do a review to make sure we are testing Zvfhmin instead of
    Zvfh where clang expects it to work for half types, like loads/stores. Removing
    unnecessary FP extensions leaves fewer things to review.
    topperc committed Jul 31, 2024
    26766a0
  81. 74f9579
  82. [Clang] [NFC] Fix potential dereferencing of nullptr (#101405)

    This patch replaces getAs with castAs and dyn_cast with cast to ensure
    type safety and prevent potential null pointer dereferences. These
    changes enforce compile-time checks for correct type casting in
    ASTContext and CodeGenModule.
    smanna12 authored Jul 31, 2024
    cf79aba
  83. [NVPTX] Make minimum/maximum work on older GPUs

    We want to use newer instructions if we are targeting sufficiently new
    SM and PTX versions. If we cannot use those newer instructions, let LLVM
    synthesize the sequence from more fundamental instructions.
    majnemer committed Jul 31, 2024
    6f318d4
  84. [SandboxIR][NFC] Move BasicBlock class definition up (#101422)

    To make future PRs smaller.
    aeubanks authored Jul 31, 2024
    ee0f43a
  85. [RISCV][GlobalISel] Legalize Scalable Vector Loads and Stores (#84965)

    This patch supports legalizing load and store instructions for scalable
    vectors in RISC-V.
    jiahanxie353 authored Jul 31, 2024
    a0d8fa5
  86. [GISEL][RISCV] RegBank Select for Scalable Vector Load/Store (#99932)

    This patch supports GlobalISel for register bank selection for scalable vector
    load and store instructions in RISC-V
    jiahanxie353 authored Jul 31, 2024
    1c66ef9
  87. [NFC][Clang] Clean up VisitUnaryPlus by removing unused FP feature ch…

    …eck (#101412)
    
    This commit removes an unnecessary call to `E->hasStoredFPFeatures()`
    within the `VisitUnaryPlus` function. The method's return value was not
    being used, leading to a redundant operation. The removal of this line
    streamlines the function and eliminates an unneeded check for stored
    floating-point features.
    smanna12 authored Jul 31, 2024
    d5d1cf0
  88. 65d3c22
  89. [SandboxIR] Implement PHINodes (#101111)

    This patch implements sandboxir::PHINode which mirrors llvm::PHINode.
    
    Based almost entirely on work by vporpo.
    Sterling-Augustine authored Jul 31, 2024
    3403b59
  90. Forward declare OSSpinLockLock on MacOS since it's not shipped on the…

    … system. (#101392)
    
    Fixes build errors on some SDKs.
    
    rdar://132607572
    aemerson authored Jul 31, 2024
    3a4c7cc

Commits on Aug 1, 2024

  1. [LegalizeTypes][RISCV][LoongArch] Optimize promotion of ucmp. (#101366)

    ucmp can be promoted with either sext or zext. RISC-V and LoongArch
    prefer sext for promoting i32 to i64 unless the inputs are known to be
    zero extended already.
    
    This patch uses the existing SExtOrZExtPromotedOperands function that is
    used by SETCC promotion to intelligently handle this.
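
    The first claim can be checked with a small sketch: an unsigned
    three-way comparison of two 32-bit values gives the same result whether
    both operands are zero-extended or both are sign-extended to 64 bits,
    because either extension preserves unsigned ordering. The helpers
    `ucmp32`, `ucmp64`, `zext` and `sext` are hypothetical names for this
    example and model only the arithmetic, not the LegalizeTypes code.

    ```cpp
    #include <cstdint>

    // Unsigned three-way compare: -1 if a < b, 0 if equal, 1 if a > b.
    int ucmp32(std::uint32_t a, std::uint32_t b) { return (a > b) - (a < b); }
    int ucmp64(std::uint64_t a, std::uint64_t b) { return (a > b) - (a < b); }

    // Zero-extend a 32-bit value to 64 bits.
    std::uint64_t zext(std::uint32_t v) { return static_cast<std::uint64_t>(v); }

    // Sign-extend a 32-bit value to 64 bits, keeping the result as raw bits.
    // The uint32_t -> int32_t conversion is modular in C++20 and behaves the
    // same way on two's complement targets with earlier standards.
    std::uint64_t sext(std::uint32_t v) {
      return static_cast<std::uint64_t>(
          static_cast<std::int64_t>(static_cast<std::int32_t>(v)));
    }
    ```

    Values below 2^31 are unchanged by `sext`, and values at or above 2^31
    all move to the top of the 64-bit range in the same relative order, so
    `ucmp64(sext(a), sext(b))` agrees with `ucmp32(a, b)`.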
    topperc authored Aug 1, 2024
    307d124
  2. [DirectX] Rename backend DXIL resource analysis passes to DXILResourc…

    …eMD*. NFC
    
    These passes will be replaced soon as we move to the target extension based
    resource handling in the DirectX backend, but removing them now before the
    replacement stuff is all up and running would be very disruptive. However, we
    do need to move these passes out of the way to avoid symbol conflicts with the
    new DXILResourceAnalysis in the Analysis library.
    
    Note: I tried an even simpler hack in #100698 but it doesn't really work. A
    rename is the most expedient path forward here.
    
    Pull Request: llvm/llvm-project#101393
    bogner authored Aug 1, 2024
    1c5f6cf
  3. [lldb] Use Target references instead of pointers in CommandObject (NFC)

    The GetTarget helper returns a Target reference so there's no reason to
    convert it to a pointer and check its validity.
    JDevlieghere committed Aug 1, 2024
    5dbbc3b
  4. [RISCV] Use experimental.vp.splat to splat specific vector length ele…

    …ments. (#101329)
    
    Previously, it was hard to create a scalable vector splat with a
    specific vector length in LLVM IR, so we used riscv.vmv.v.x and
    riscv.vmv.v.f to do this work. But these two RVV intrinsics need strict
    type constraints, which cannot support fixed vector types and illegal
    vector types. Using vp.splat preserves the old functionality and also
    generates more optimized code for vector types and illegal vectors.
    This patch also fixes a crash caused by getEVT not serving ptr types.
    yetingk authored Aug 1, 2024
    87af9ee
  5. [TableGen][MVT] Lower the maximum 16-bit MVT from 16384 to 511. (#101…

    …401)
    
    MachineValueTypeSet in tablegen allocates an array with a bit per MVT.
    This used to be 256 bits; with the introduction of 16-bit MVTs it
    ballooned to 65536 bits. I suspect this is increasing the memory usage
    of many of the data structures used by CodeGenDAGPatterns.
    
    Since we don't need the full 16-bit range yet, this patch proposes
    lowering the maximum MVT to 511 and using only 512 bits for
    MachineValueTypeSet's storage.
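
    The storage figures above work out as follows. This is only a sketch of
    the arithmetic with a one-bit-per-type set built on std::bitset;
    `bytesForBits` and `TypeSetSketch` are hypothetical names and do not
    reflect how MachineValueTypeSet is actually implemented.

    ```cpp
    #include <bitset>
    #include <cstddef>

    // Bytes needed to store one bit per possible type value.
    constexpr std::size_t bytesForBits(std::size_t Bits) { return (Bits + 7) / 8; }

    static_assert(bytesForBits(256) == 32, "original 256-bit set");
    static_assert(bytesForBits(65536) == 8192, "full 16-bit range");
    static_assert(bytesForBits(512) == 64, "proposed 512-bit set");

    // Hypothetical one-bit-per-type set capped at 512 entries (values 0..511).
    struct TypeSetSketch {
      static constexpr std::size_t Capacity = 512;
      std::bitset<Capacity> Bits;

      // std::bitset::set throws std::out_of_range when T >= Capacity.
      void insert(std::size_t T) { Bits.set(T); }
      bool count(std::size_t T) const { return Bits.test(T); }
    };
    ```

    Under these assumptions each set shrinks from 8192 bytes to 64 bytes of
    bit storage, a factor of 128, while still covering every value up to 511.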
    topperc authored Aug 1, 2024
    e2c74aa
  6. [RISCV][GISel] Slightly simplify the regbank selection for G_LOAD/STO…

    …RE. NFC (#101431)
    
    Merge the isVector early out with the previous check for isVector.
    topperc authored Aug 1, 2024
    a1ba4fb
  7. [mlir][spirv] Fix tablegen generator script's stripping of prefixes (…

    …#101378)
    
    This script looks for existing definitions with the `SPIRV_` prefix, so
    that it can preserve them when updating the file. When the commit
    2d62833 changed the prefix from `SPV_`,
    the number of characters to strip from matched names was not updated,
    which broke this feature. This commit fixes remaining cases that weren't
    fixed by 339c87a.
    
    The relationship of this script to the files it is meant to maintain is
    still bitrotten in other ways.
    andfau-amd authored Aug 1, 2024
    bc6834f
  8. [MemProf] Fix when function has indirect call (#101170)

    When a function has an indirect call in LTO mode, it triggers the
    `assert(Alias)` in `findProfiledCalleeThroughTailCalls`.
    lifengxiang1025 authored Aug 1, 2024
    e6aeb3f
  9. [SandboxIR][NFC] Factor out common test for CastInst subclasses (#101…

    …410)
    
    The tests for most CastInst sub-classes, except AddrSpaceCastInst, are
    very similar.
    This patch creates a common template function for all of them.
    vporpo authored Aug 1, 2024
    9227fd7
  10. [mlir][Transforms] Preserve all analysis in print passes (#101315)

    PrintIRPass, PrintOpStatsPass and PrintOpGraphPass don't mutate the IR, so
    preserve all analyses to save a bit of computation.
    uenoku authored Aug 1, 2024
    42c413b
  11. ed12f80
  12. [nsan][NFC] Use cast when dyn_cast is not needed. (#101147)

    Use `cast` instead of `dyn_cast` where `dyn_cast` is not needed or its
    result is not checked.
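
    The difference between the two casts can be sketched with a toy
    hierarchy. The example below uses standard `dynamic_cast` as a
    stand-in; LLVM's real `cast` and `dyn_cast` rely on LLVM's own
    `classof`-based RTTI instead, and every name here (`ValueSketch`,
    `castSketch`, `dynCastSketch`, and so on) is hypothetical.

    ```cpp
    #include <cassert>

    // Toy class hierarchy standing in for an IR value hierarchy.
    struct ValueSketch { virtual ~ValueSketch() = default; };
    struct InstructionSketch : ValueSketch {};
    struct ConstantSketch : ValueSketch {};

    // dyn_cast-like: returns nullptr on a type mismatch, so the caller
    // is expected to check the result before using it.
    template <typename To> To *dynCastSketch(ValueSketch *V) {
      return dynamic_cast<To *>(V);
    }

    // cast-like: the caller already knows the type; a mismatch is a bug
    // and is caught by the assertion in builds with assertions enabled.
    template <typename To> To *castSketch(ValueSketch *V) {
      To *Result = dynamic_cast<To *>(V);
      assert(Result && "castSketch<To>() argument of incompatible type");
      return Result;
    }
    ```

    Using the checking form without testing its result hides a possible
    null dereference, whereas the asserting form documents that the type is
    known at that point.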
    yingcong-wu authored Aug 1, 2024
    430b90f
  13. [RISCV] Increase default tail duplication threshold to 6 at -O3 (#98873)

    This is just like AArch64.
    
    Changing the threshold to 6 will increase the code size, but will
    also decrease unconditional branches. CPUs with wide fetch/issue units
    can benefit from it.
    
    The value 6 may be debatable; we can set it to `SchedModel.IssueWidth`.
    wangpc-pp authored Aug 1, 2024
    27b6080
  14. [TargetLowering] Remove weird use of MVT::isVoid in an assert. (#101436)

    At the time this was written there were no vector types in MVT. The
    order was:
    -scalar integer types
    -scalar FP types
    -isVoid
    
    I believe this isVoid check was meant to catch walking off the end of the
    scalar FP types, while the isInteger() == isInteger() check caught walking
    off the end of the scalar integer types.
    
    These days we have:
    -scalar integer types
    -scalar FP types
    -fixed vector integer types
    -fixed vector FP types
    -scalable vector integer types
    -scalable vector FP types.
    -Glue
    -isVoid
    
    So checking isVoid doesn't detect what it used to. I've changed it to
    check isFloatingPoint() == isFloatingPoint() instead.
    topperc authored Aug 1, 2024
    991a621
  15. [BOLT][NFC] Add timers for MetadataManager invocations

    Test Plan: added bolt/test/timers.c
    
    Reviewers: ayermolo, maksfb, rafaelauler, dcci
    
    Reviewed By: dcci
    
    Pull Request: llvm/llvm-project#101267
    aaupov authored Aug 1, 2024
    fb97b4f
  16. [BOLT][NFC] Print timers in perf2bolt invocation

    When BOLT is run in AggregateOnly mode (perf2bolt), it exits with code
    zero, so destructors are not run and TimerGroup never prints the timers.
    
    Add explicit printing just before the exit to honor options requesting
    timers (`--time-rewrite`, `--time-aggr`).
    
    Test Plan: updated bolt/test/timers.c
    
    Reviewers: ayermolo, maksfb, rafaelauler, dcci
    
    Reviewed By: dcci
    
    Pull Request: llvm/llvm-project#101270
    aaupov authored Aug 1, 2024
    3f51bec
  17. 9d068f7
  18. AMDGPU/GlobalISel: Permit mapping G_FRAME_INDEX to sgprs (#101325)

    eliminateFrameIndex should now properly handle materializing
    frame indices in SGPRs, so treat this like the other constant
    operand types.
    
    On average this will produce worse code; we need to detect
    VGPR uses, and improve SGPR->VGPR frame index folds.
    arsenm authored Aug 1, 2024
    86815a1
  19. [MIR] Remove separate Size variable from parseMachineMemoryOperand. N…

    …FC (#101453)
    
    Size is updated in sync with MemoryType. Instead of maintaining a
    separate Size, use the size from MemoryType where needed.
    topperc authored Aug 1, 2024
    72ed808
  20. 129a8e1
  21. [GlobalISel][TableGen] MIR Pattern Variadics (#100563)

    Allow for matching & rewriting a variable number of arguments in an
    instruction.
    
    Solves #87459
    Pierre-vh authored Aug 1, 2024
    972c029
  22. [RISCV] Add vector bf16 load/store intrinsic tests. NFC

    This adds bf16 to the unit stride, strided, and index load and
    store intrinsics. clang already assumes these work with Zvfbfmin.
    topperc committed Aug 1, 2024
    04e8433
  23. [RISCV] Replace Zvfh with Zvfhmin on vector load/store intrinsic test…

    …s. NFC
    
    clang uses these with Zvfhmin so we should test them.
    topperc committed Aug 1, 2024
    84a3739
  24. [GlobalISel][TableGen] Make variadic-errors.td test more robust

    Use a regex instead of hardcoded numbers for anonymous pattern suffixes.
    Pierre-vh committed Aug 1, 2024
    ab33c3d
  25. [C++20] [Modules] Always emit the inline builtins (#101278)

    See the attached test for a motivating example. If we are too aggressive
    about not emitting the definitions of inline builtins, we may hit a
    middle-end crash. It should be fine to always emit inline builtins.
    ChuanqiXu9 authored Aug 1, 2024
    e167f75
  26. AMDGPU: Cleanup extract_subvector actions (NFC) (#101454)

    The base AMDGPUISelLowering was setting a custom action on 16-bit
    vector types, but it was also set in SIISelLowering.
    arsenm authored Aug 1, 2024
    1d2b2d2
  27. [RISCV] Add back missing vmv_v_x_vl pattern predicates (#101455)

    Looks like these got left behind in
    17e2d07
    lukel97 authored Aug 1, 2024
    fdce0bf
  28. [lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm,mips64,powerpc}…

    … declarations (#101403)
    
    Similar to #97796, fix the type of the `native_thread` parameter for the
    arm, mips64 and powerpc variants of `NativeRegisterContextFreeBSD_*`.
    
    Otherwise, this leads to compile errors similar to:
    
    ```
    lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.cpp:85:39: error: out-of-line definition of 'NativeRegisterContextFreeBSD_powerpc' does not match any declaration in 'lldb_private::process_freebsd::NativeRegisterContextFreeBSD_powerpc'
       85 | NativeRegisterContextFreeBSD_powerpc::NativeRegisterContextFreeBSD_powerpc(
          |                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ```
    DimitryAndric authored Aug 1, 2024
    7088a5e
  29. Revert "[mlir][Transforms] Dialect conversion: Skip materializations …

    …when running without converter (#101318)"
    
    This reverts commit 2aa96fc.
    
    This was merged without a test. Also it seems it was only fixing an
    issue for users which used a particular workaround that is not actually
    needed anymore (skipping UnrealizedConversionCast operands).
    akuegel committed Aug 1, 2024
    17ba4f4
  30. [libc++][NFC] Avoid opening namespace std in the tests (#94160)

    This also adds a few FIXMEs where we use UB in the tests.
    philnik777 authored Aug 1, 2024
    5dfdac7
  31. f51a479
  32. 4f42deb
  33. [C++20][Modules] Allow using stdarg.h with header units (#100739)

    Summary:
    Macros like `va_start`/`va_end` are marked as builtin functions, which
    makes these identifiers special. As a result, the identifiers are
    redefined as builtins, which hides the macro definitions while preloading
    C++ modules. For modules, Clang ignores special identifiers, but
    `PP.getCurrentModule()` was not set. This diff fixes the IsModule
    detection logic for this particular case.
    
    Test Plan: check-clang
    
    ---------
    
    Co-authored-by: Chuanqi Xu <[email protected]>
    dmpolukhin and ChuanqiXu9 authored Aug 1, 2024
    f3761a4
  34. 65c000a
  35. [AMDGPU] SIWholeQuadMode: avoid execz effects in exact regions (#101157)

    Exact mode regions within WQM may have EXEC=0 in divergent control flow.
    This occurs if a branch is only taken by helper lanes and an instruction
    requiring WQM disabling is encountered.
    
    The current code extends the exact region as far as possible; however,
    this can result in it including instructions with unwanted side effects
    at EXEC=0.
    In particular readfirstlane combined with scalar loads can produce
    invalid memory accesses in this circumstance.
    
    Work around this by shrinking exact regions to only the instructions
    requiring WQM disabling when unwanted side effects are present.
    Eventually we should branch over these regions when EXEC=0, but this
    requires visibility of CFG/divergence information not currently
    available.
    perlfu authored Aug 1, 2024
    3611c0b
  36. [flang][OpenMP] Delayed privatization for variables with `equivalence…

    …` association (#100531)
    
    Handles variables that are storage associated via `equivalence`. The
    problem is that these variables are declared as `fir.ptr`s while their
    privatized storage is declared as `fir.ref` which was triggering a
    validation error in the OpenMP dialect.
    ergawy authored Aug 1, 2024
    bbadbf7
  37. 3b3b891
  38. [mlir][vector] Add tests xfer-permute-lowering (nfc)(2/n) (#96033)

    Adds more tests to:
      * vector-transfer-permutation-lowering.mlir
    
    Specifically, adds tests for:
      * out-of-bounds access for the `TransferWritePermutationLowering`
        pattern
      * in-bounds access for `TransferWriteNonPermutationLowering` +
        `TransferWritePermutationLowering`
    
    Also renames `@permutation_with_mask_xfer_write_fixed_width` as
    `@xfer_write_non_transposing_permutation_map`.
    
    This is a part of a larger effort to make sure that all key cases for
    patterns under populateVectorTransferPermutationMapLoweringPatterns
    (*) are tested. I also want to make sure that tests use consistent
    function and variable names.
    
    (*) transform.apply_patterns.vector.transfer_permutation_patterns in
    TD parlance
    banach-space authored Aug 1, 2024
    85fbc4f
  39. Revert "Simplify hot-path size computations in BumpPtrAllocator. (#10…

    …1312)"
    
    This reverts commit 65c000a.
    resistor committed Aug 1, 2024
    67730ae
  40. [LowerMatrixIntrinsics] Fix type suffix for matrix.multiply.* (#100940)

    Based on the [proposal
    PDF](https://llvm.org/devmtg/2020-09/slides/Hahn-Matrix_Support_in_LLVM_and_Clang.pdf)
    and the test code under
    [llvm/test/Transforms/LowerMatrixIntrinsics](https://github.com/llvm/llvm-project/tree/main/llvm/test/Transforms/LowerMatrixIntrinsics),
    the suffix for the `@llvm.matrix.multiply.*` intrinsic should be {output
    matrix type}.{input matrix 1 type}.{input matrix 2 type} (e.g.,
    `@llvm.matrix.multiply.v4i32.v4i32.v4i32`).
    
    This PR corrects the places where these suffixes do not follow the
    aforementioned format.
    sbite0138 authored Aug 1, 2024
    05d3f5e
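To make the naming rule in the message above concrete, here is a minimal Python sketch that builds the intrinsic name from matrix shapes. The helpers `vec_type` and `matrix_multiply_name` are hypothetical and not part of LLVM. The sketch assumes matrices are flattened into fixed-width vectors, so an R x K by K x C multiply has operands of R*K and K*C elements and a result of R*C elements.

```python
def vec_type(num_elements: int, elem: str) -> str:
    """Return the type suffix of a flattened matrix, e.g. v4i32."""
    return f"v{num_elements}{elem}"


def matrix_multiply_name(rows: int, inner: int, cols: int, elem: str) -> str:
    """Build the name as {output type}.{input 1 type}.{input 2 type}."""
    out = vec_type(rows * cols, elem)   # result: rows x cols
    lhs = vec_type(rows * inner, elem)  # first operand: rows x inner
    rhs = vec_type(inner * cols, elem)  # second operand: inner x cols
    return f"llvm.matrix.multiply.{out}.{lhs}.{rhs}"
```

For a 2x2 by 2x2 multiply of `i32` elements, this yields the name quoted in the commit message.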
  41. [clang][analyzer] Improve PointerSubChecker (#96501)

    The checker could report false positives if pointer arithmetic was done
    on pointers to non-array data before pointer subtraction. This patch also
    fixes a problem that could cause false positives when members of the same
    structure, but in different memory objects, are subtracted.
    balazske authored Aug 1, 2024
    cab91ec
  42. [NFC][libc++][libc++abi][libunwind][test] Fix/unify AIX triples used …

    …in LIT tests (#101196)
    
    This patch fixes/unifies AIX target triples used in libc++, libc++abi,
    and libunwind LIT tests.
    xingxue-ibm authored Aug 1, 2024
    2d36550
  43. [Inliner] Fix bugs for partial inlining with vector

    The partial inlining cost model applies a cost to intrinsics. However, some vector intrinsics have an invalid cost, which partial inlining does not allow. Instead of asserting, we skip partial inlining in this circumstance to avoid compilation errors.
    joshua-arch1 authored Aug 1, 2024
    0a5e572
  44. [libc] Change the GPU loaders to LLVM executables (#101442)

    Summary:
    I am going to rework these tools to just be LLVM tools. This patch is
    pretty much NFC to set up the CMake for that.
    jhuber6 authored Aug 1, 2024
    feeb833
  45. Revert "[Inliner] Fix bugs for partial inlining with vector"

    This reverts commit llvm/llvm-project@0a5e572,
    since I forgot to start a pull request.
    joshua-arch1 authored Aug 1, 2024
    241a05a
  46. [libc] Remove extra parens

    jhuber6 committed Aug 1, 2024
    097a1d2
  47. AMDGPU: Add baseline test for copysign combine

    We can use known bits information to avoid masking out one or
    both of the operands.
    arsenm committed Aug 1, 2024
    2feb058
  48. [NVPTX][NFC] Remove unneeded declarations in test (#101167)

    Only the bf16 declarations are needed, as only they are lowered in
    AutoUpgrade.cpp.
    f16 and other builtins have LLVM intrinsics already defined.
    hdelan authored Aug 1, 2024
    3d1e1d9
  49. [libc++] Remove dedicated namespaces for ranges functions (#76543)

    We originally put implementation-detail function objects into individual
    namespaces for `std::ranges` without a good reason for doing so. This
    practice was continued, presumably because there was prior art. Since
    there's no reason to keep these namespaces, this commit removes them,
    which will slightly impact binary size.
    
    This commit does not apply to CPOs, some of which need additional work.
    cjdb authored Aug 1, 2024
    d10dc5a
  50. [libc++] Fix missing declarations of uses_allocator_construction_args…

    … (#67044)
    
    We were not declaring `__uses_allocator_construction_args` helper 
    functions, leading to several valid uses failing to compile. This
    patch solves the problem by moving these helper functions into a
    struct, which also reduces the amount of redundant SFINAE we need
    to perform since most overloads are checking for a cv-qualified pair.
    
    Fixes #66714
    
    Co-authored-by: Louis Dionne <[email protected]>
    phyBrackets and ldionne authored Aug 1, 2024
    beecf2c
  51. [libc++] Avoid using **this in error messages for expected monadic op…

    …erations (#84840)
    
    Instead of using **this in error messages for std::expected monadic
    operations, use value(). As shown in LWG3969, **this can trigger
    unintended ADL and while it's only an error message, we might as
    well be ADL-correct there too.
    
    Co-authored-by: Louis Dionne <[email protected]>
    ZERO-N and ldionne authored Aug 1, 2024
    3891468
  52. [NFC] [Clang] Some core issues have changed status from tentatively r…

    …eady -> ready / review (#97200)
    
    Also classes the "ready" status similarly to "tentatively ready" in
    make_cxx_dr_status
    MitalAshok authored Aug 1, 2024
    14c8feb
  53. [LLVM][ISel][SVE] Remove redundant merging fp patterns. (#101351)

    Since "vselect cond, (binop, x, y), x" became the canonical form the
    equivalent PatFrags for "binop x, (vselect cond, y, 0)" are no longer
    required.
    paulwalker-arm authored Aug 1, 2024
    1fbd7be
  54. [lldb][test] Disable vla test on Windows

    For the same reasons as 6cfac49.
    
    This test was added in llvm/llvm-project#100710.
    
    It fails because when we're linking with link.exe, -gdwarf has no
    effect and we get a PDB file anyway. The Windows on Arm lldb bot
    uses link.exe.
    
     "C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.34.31933\\bin\\Hostx86\\arm64\\link.exe" <...>
    
    08/01/2024  01:47 PM         2,956,488 vla.cpp.ilk
    08/01/2024  01:47 PM         6,582,272 vla.cpp.pdb
    08/01/2024  01:47 PM           734,208 vla.cpp.tmp
    DavidSpickett committed Aug 1, 2024
    229a165
  55. [Flang][Driver] Introduce -fopenmp-targets offloading option (#100152)

    This patch modifies the flang driver to introduce the `-fopenmp-targets`
    option to the frontend compiler invocations corresponding to the OpenMP
    host device on offloading-enabled compilations.
    
    This option holds the list of offloading triples associated to the
    compilation and is used by clang to determine whether offloading calls
    should be generated for the host.
    skatrak authored Aug 1, 2024
    e145123
  56. [AIX] Turn on #pragma mc_func check by default (#101336)

    llvm/llvm-project#99888 added a check (and
    corresponding options) to flag uses of `#pragma mc_func` on AIX.
    
    This PR turns on the check by default.
    qiongsiwu authored Aug 1, 2024
    b933517
  57. [clang] Fix crash with multiple non-parenthesized sizeof (#101297)

    There are 5 unary operators that can be followed by a non-parenthesized
    expression: `sizeof`, `__datasizeof`, `__alignof`, `alignof`,
    `_Alignof`. When we nest them too deep, `BalancedDelimiterTracker` does
    not help, because there are no parentheses, and we crash. Instead, this
    patch recognizes chains of those operators and parses them with
    sufficient stack space.
    
    Fixes #45061
    Endilll authored Aug 1, 2024
    130c135
  58. [Clang] Fix definition of layout-compatible to ignore empty classes (…

    …#92103)
    
    Also changes the behaviour of `__builtin_is_layout_compatible`
    
    Neither the historic nor the current definition of layout-compatible
    classes mentions anything about base classes (other than implicitly
    through being standard-layout); both are defined in terms of members,
    not direct members.
    MitalAshok authored Aug 1, 2024
    5d7357c
  59. [libc++] Increase atomic_ref's required alignment for small types (#9…

    …9654)
    
    This patch increases the alignment requirement for std::atomic_ref
    such that we can guarantee lockfree operations more often. Specifically,
    we require types that are 1, 2, 4, 8, or 16 bytes in size to be aligned
    to at least their size to be used with std::atomic_ref.
    
    This is the case for most types, however a notable exception is
    `long long` on x86, which is 8 bytes in length but has an alignment
    of 4.
    
    As a result of this patch, one has to be more careful about the
    alignment of objects used with std::atomic_ref. Failure to provide
    a properly-aligned object to std::atomic_ref is a precondition 
    violation and is technically UB. On the flipside, this allows us
    to provide an atomic_ref that is actually lockfree more often, 
    which is an important QOI property.
    
    More information in the discussion at llvm/llvm-project#99570 (comment).
    
    Co-authored-by: Louis Dionne <[email protected]>
    dalg24 and ldionne authored Aug 1, 2024
    59ca618
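The alignment rule described in the message above can be modeled in a few lines. This is a simplified Python sketch of the stated rule, not libc++ code; the names `required_alignment` and `is_valid_address` are hypothetical, and the "natural" alignment of each type is passed in as an assumption.

```python
# Sizes for which the commit message says alignment must be at least the size.
LOCK_FREE_SIZES = {1, 2, 4, 8, 16}


def required_alignment(size: int, natural_alignment: int) -> int:
    """Alignment an object needs under the rule described in the message."""
    if size in LOCK_FREE_SIZES:
        return max(size, natural_alignment)
    return natural_alignment


def is_valid_address(address: int, size: int, natural_alignment: int) -> bool:
    """True if an object at `address` satisfies the required alignment."""
    return address % required_alignment(size, natural_alignment) == 0
```

With the `long long` on x86 example from the message (size 8, natural alignment 4), `required_alignment(8, 4)` is 8, so an object placed at an address that is only 4-byte aligned would violate the precondition.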
  60. [InstCombine] Convert mem intrinsic with null into a noop (#100388)

    When src/dest passed into memset/memcpy is null: 
    ```
    len == 0: this call is a noop.
    len != 0: the behavior is undefined.
    ```
    See also https://llvm.org/docs/LangRef.html#llvm-memset-intrinsics
    Alive2: https://alive2.llvm.org/ce/z/tJeRNL
    
    This patch converts these mem intrinsic calls into an assumption `len ==
    0` to mitigate code-size bloat caused by JumpThreading.
    dtcxzyw authored Aug 1, 2024
    4e89d11
  61. [libc++][stringbuf] Test and document LWG2995. (#100879)

    As mentioned in the LWG issue libc++ has already implemented the
    optimization. This adds tests and documents the implementation defined
    behaviour.
    
    Drive-by fixes an initialization.
    mordante authored Aug 1, 2024
    d5a6ec1
  62. 5ad15e5
  63. e7630a0
  64. [RISCV] Support f16 vmv.v.v and vmerge.vvm intrinsics with Zvfhmin. (…

    …#101457)
    
    Clang expects that this works.
    topperc authored Aug 1, 2024
    d2c0459
  65. [Mem2Reg] Replace block maps with block numbers (#101391)

    Very minor performance improvement.
    aengelke authored Aug 1, 2024
    e833e8b
  66. [CodeGen] Merge lowerConstantIntrinsics into pre-isel lowering (#97727)

    Currently, the LowerConstantIntrinsics pass does an RPO traversal of
    every function... only to find that many functions don't have constant
    intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is
    already a pre-isel intrinsic lowering pass, which iterates over
    intrinsic declarations and lowers all users. Call
    lowerConstantIntrinsics from this pass to avoid the extra iteration over
    the entire IR and the RPO traversal.
    aengelke authored Aug 1, 2024
    b5fc083
  67. [ConstantRange] Add support for shlWithNoWrap (#100594)

    This patch adds initial support for `ConstantRange::shlWithNoWrap` to
    fold dtcxzyw/llvm-tools#22. However, this
    patch cannot fix the original issue. Improvements will be submitted in subsequent patches.
    dtcxzyw authored Aug 1, 2024
    1a5d892
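As a rough illustration of what a no-wrap shift range means, the following Python sketch enumerates every `x << y` that stays within the bit width and returns the smallest and largest result. It is a brute-force model under the assumption of unsigned no-wrap semantics, not the algorithm used in LLVM; the function name `shl_nuw_range` is hypothetical.

```python
def shl_nuw_range(lhs, rhs, bits):
    """Bounds of x << y for x in lhs, y in rhs, keeping only non-wrapping shifts.

    lhs and rhs are inclusive (low, high) unsigned ranges. Returns None when
    every combination would wrap, i.e. the result set is empty.
    """
    limit = 1 << bits
    results = [
        x << y
        for x in range(lhs[0], lhs[1] + 1)
        for y in range(rhs[0], rhs[1] + 1)
        if y < bits and (x << y) < limit  # discard shifts that would wrap
    ]
    if not results:
        return None
    return min(results), max(results)
```

For 8-bit values, `shl_nuw_range((1, 3), (0, 7), 8)` keeps results such as `3 << 6 = 192` but drops `3 << 7 = 384`, which shows how the no-wrap condition tightens the upper bound.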
  68. [Hexagon] Do not optimize address of another function's block (#101209)

    When the constant extender optimization pass encounters an instruction
    that uses an extended address pointing to another function's block,
    avoid adding the instruction to the extender list for the current
    machine function.
    
    Fixes llvm/llvm-project#99714
    yandalur authored Aug 1, 2024
    68df06a
  69. [libc] Remove verbose printing from hdrgen tool (#101376)

    Summary:
    This fills the terminal with information already present from the
    `add_custom_command(COMMENT ...)` field, so it breaks everything into
    new lines. Remove this print to clean that up.
    jhuber6 authored Aug 1, 2024
    6d40580
  70. [Hexagon] Fix concat lowering for HVX for 64B vector length (#98318)

    When a concatenation of vector instructions is formed, a vector rotation
    is performed as part of it. The direction of the shift was not
    calculated correctly. This fixes the rotation factor.
    quic-santdas authored Aug 1, 2024
    2771ea4
  71. [mlir][vector] Update tests for xfer-permute-lowering (nfc) (#101468)

    Updates formatting and variable names in:
      * vector-transfer-permutation-lowering.mlir
    
    This is primarily to improve consistency, both within this particular
    test file as well as across tests. In particular, with this PR I'm
    adopting a naming convention similar to the one already present in
    vector-transfer-flatten.mlir.
    
    Overview of changes:
      * All memref input arguments are re-named as `%mem`.
      * All vector input arguments are re-named as `%vec`.
      * All tensor input arguments are re-named as `%dest`.
      * LIT variables are updated to be consistent with input arguments.
      * Renamed all output arguments as `%res`.
      * Updated indentation to be more C-like.
    banach-space authored Aug 1, 2024
    98e4413
  72. [flang][runtime] Avoid call recursion in CopyElement runtime. (#101421)

    Device compilers may fail to identify the maximum stack size required
    by a kernel that calls CopyElement due to potential recursive calls.
    To avoid this, we can use a dynamically allocated Stack. To avoid
    dynamic allocations on the host for simple cases, the Stack
    implementation has a reserved space (that ends up being allocated on
    the program stack).
    I tested both pre-allocated and 0-reserve implementations on the host,
    and all passed. The actual reserve values might be tuned as needed.
    vzakhari authored Aug 1, 2024
    2177a17
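The general technique of replacing call recursion with an explicit stack can be sketched as follows. This is a generic Python illustration, not the Flang runtime code: `copy_nested` is a hypothetical helper that deep-copies nested lists, and the reserved on-stack space mentioned in the message is not modeled.

```python
def copy_nested(value):
    """Deep-copy nested lists using an explicit work stack, not recursion."""
    if not isinstance(value, list):
        return value
    root = []
    stack = [(value, root)]  # pairs of (source list, destination list)
    while stack:
        src, dst = stack.pop()
        for item in src:
            if isinstance(item, list):
                child = []
                dst.append(child)            # keep element order in the copy
                stack.append((item, child))  # fill the child in later
            else:
                dst.append(item)
    return root
```

Because the pending work lives in a heap-allocated list, nesting depth no longer translates into call-stack depth, which is the property the commit message is after for device code.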
  73. [flang] Add ability to have special allocator for descriptor data (#1…

    …00690)
    
    This patch enhances the descriptor with the ability to have a specialized
    allocator. The allocators are registered in a dedicated registry and the
    index of the desired allocator is stored in the descriptor. The default
    allocator, std::malloc, is registered at index 0.
    
    In order to have this allocator index in the descriptor, the f18Addendum
    field is repurposed to be able to hold the presence flag for the
    addendum (lsb) and the allocator index.
    
    Since this is a change in the semantics and name of the 7th field of the
    descriptor, the CFI_VERSION is bumped to the date of the initial change.
    
    This patch only adds the ability to have this feature as part of the
    descriptor but does not add a specific allocator yet. CUDA Fortran will
    be the first user of this feature, allocating descriptor data in
    different types of device memory based on the CUDA attribute.
    
    ---------
    
    Co-authored-by: Slava Zakharin <[email protected]>
    clementval and vzakhari authored Aug 1, 2024
    6df4e7c
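A registry indexed by a small integer, plus a field that packs a presence flag with that index, could look like the Python sketch below. Only two details come from the message: the addendum flag sits in the least significant bit and the default allocator is at index 0. The exact position of the index bits (`INDEX_SHIFT`) and all names here are assumptions made for illustration.

```python
ADDENDUM_BIT = 0x1  # lsb: addendum present (as stated in the message)
INDEX_SHIFT = 1     # assumption: the allocator index uses the bits above the lsb


class AllocatorRegistry:
    """Maps small integer indices to allocator callables."""

    def __init__(self, default_allocator):
        self._allocators = [default_allocator]  # index 0 is the default

    def register(self, allocator) -> int:
        """Store an allocator and return the index to keep in the descriptor."""
        self._allocators.append(allocator)
        return len(self._allocators) - 1

    def get(self, index: int):
        return self._allocators[index]


def pack(has_addendum: bool, allocator_index: int) -> int:
    """Combine the presence flag and the allocator index into one field."""
    return (allocator_index << INDEX_SHIFT) | (ADDENDUM_BIT if has_addendum else 0)


def unpack(field: int):
    """Return (has_addendum, allocator_index) from a packed field."""
    return bool(field & ADDENDUM_BIT), field >> INDEX_SHIFT
```

Storing an index rather than a function pointer keeps the descriptor field small and lets the same descriptor layout work wherever the registry is populated consistently.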
  74. c7c5e05
  75. [NFC][asan][odr] Use IntrusiveList for a ListOfGlobals

    Extracted from #100923.
    artempyanykh authored and vitalybuka committed Aug 1, 2024
    2a5f7e5
  76. [libc] Implement vasprintf and asprintf (#98824)

    [libc] Implement vasprintf and asprintf
    
    ---------
    
    Co-authored-by: Izaak Schroeder <[email protected]>
    tszhin-swe and izaakschroeder authored Aug 1, 2024
    a5e67fb
  77. 0c31123
  78. [MachO] Remove redundant bounds check (#100176)

    The condition was duplicated, the correct one for this message would
    have been `ImportsEnd > SymbolsEnd`. However, this is a subset of
    `ImportsEnd > Symbols` (since `Symbols <= SymbolsEnd`), so it can be
    removed altogether.
    
    I made this thinko in 686d8ce.
    
    Note that that change wasn't intended to be permanent, and served as a
    quick stopgap to facilitate testing chained fixups in LLD before Apple
    upstreamed their implementation.
    
    Fixes #90662
    Fixes #87203
    BertalanD authored Aug 1, 2024
    7da1dbb
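The redundancy argument in the message above is a simple implication: whenever `Symbols <= SymbolsEnd`, an imports end that lies past `SymbolsEnd` also lies past `Symbols`. The Python sketch below checks this exhaustively over small integers; the function name and the use of plain integers in place of pointers are assumptions for illustration.

```python
from itertools import product


def implication_holds(limit: int = 12) -> bool:
    """Check that imports_end > symbols_end implies imports_end > symbols."""
    for symbols, symbols_end, imports_end in product(range(limit), repeat=3):
        if symbols > symbols_end:
            continue  # precondition from the message: Symbols <= SymbolsEnd
        if imports_end > symbols_end and not imports_end > symbols:
            return False  # counterexample found
    return True
```

Since no counterexample exists, a check against `SymbolsEnd` can never fire unless the check against `Symbols` fires as well, which is why the message concludes it can be dropped.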
  79. [ELF] Support relocatable files using CREL with explicit addends

    ... using the temporary section type code 0x40000020
    (`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the
    code and break compatibility (Clang and lld of different versions are
    not guaranteed to cooperate, unlike other features). CREL with implicit
    addends is not supported.
    
    ---
    
    Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and
    update users to check `crels`.
    
    (The decoding performance is critical and error checking is difficult.
    Follow `skipLeb` and `R_*LEB128` handling, do not use
    `llvm::decodeULEB128`, which compiles to a lot of code.)
    
    A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass
    `/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and
    convert CREL to RELA (`relas` instead of `crels` will be used). Since
    allocating a buffer increases memory usage, the conversion is only performed when
    absolutely necessary.
    
    ---
    
    Non-alloc SHT_CREL sections may be created in -r and --emit-relocs
    links. SHT_CREL and SHT_RELA components need reencoding since
    r_offset/r_symidx/r_type/r_addend may change. (r_type may change because
    relocations referencing a symbol in a discarded section are converted to
    `R_*_NONE`).
    
    * SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`)
    * SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section.
    * SHT_REL components: print an error for now.
    
    SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and
    not yet supported.
    
    Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600
    
    Pull Request: llvm/llvm-project#98115
    MaskRay authored Aug 1, 2024
    0af07c0
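Since the message refers to LEB128 decoding, a generic ULEB128 decoder may help as background. This Python sketch shows the standard encoding only (7 payload bits per byte, high bit as a continuation flag). It is not lld's decoder, it does no error checking, and it does not cover the CREL record layout, which is described in the linked RFC.

```python
def decode_uleb128(data: bytes, offset: int = 0):
    """Decode one ULEB128 value; return (value, offset just past it)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]  # raises IndexError on truncated input
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if (byte & 0x80) == 0:            # high bit clear: last byte
            return result, pos
        shift += 7
```

For example, `decode_uleb128(bytes([0xE5, 0x8E, 0x26]))` returns `(624485, 3)`.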
  80. [SandboxIR][NFC] Introduce templated CastInstImpl to simplify subclas…

    …ses (#101427)
    
    The CastInst subclasses all have pretty much the same implementation.
    Add a helper templated class to help stamp out the subclasses more
    succinctly.
    aeubanks authored Aug 1, 2024
    d68a4d5
  81. [SystemZ][z/OS] Fix incorrect codegen for ADA_ENTRY pseudo instructio…

    …n (#101415)
    
    The current MCInstBuilder for generating an ALGFI when loading something
    from the ADA is incorrect and will crash the compiler.
    
    r0 must also be excluded from the registers returned as the result,
    since it is treated as the value "0" on z/OS.
    
    Also add some tests to properly test the paths where LLILF and ALGFI are
    generated.
    
    ---------
    
    Co-authored-by: Tony Tao <[email protected]>
    tltao and Tony Tao authored Aug 1, 2024
    bc747c3
  82. [libc] created fuzz test for sin function (#101411)

    Verifies that sin function output is correct by comparing with MPFR
    output. NaN and inf are not tested (as our output will vary compared to
    MPFR), and signed zeroes are already tested in unit tests.
    RoseZhang03 authored Aug 1, 2024
    90065da
  83. [libc] Fix math fuzzers (#101529)

    Fix minor typos that accumulated while the math fuzzers were disabled.
    michaelrj-google authored Aug 1, 2024
    3497211
  84. [libc] heap_sort_fuzz deleted unnecessary includes (#101535)

    Including src/__support/macros/config.h is unnecessary
    RoseZhang03 authored Aug 1, 2024
    83e6d87
  85. AMDGPU: Handle remote/fine-grained memory in atomicrmw fmin/fmax lowe…

    …ring (#96759)
    
    Consider the new atomic metadata when choosing to expand as cmpxchg
    instead.
    arsenm authored Aug 1, 2024
    41439d5
  86. [libc++] Check correctly ref-qualified __is_callable in algorithms (#…

    …73451)
    
    We were only checking that the comparator was rvalue callable,
    when in reality the algorithms always call comparators as lvalues.
    This patch also refactors the tests for callable requirements and
    expands it to a few missing algorithms.
    
    Fixes #69554
    changkhothuychung authored Aug 1, 2024
    8d151f8
  87. [AMDGPU][True16][MC] Support v_swap_b16. (#100442)

    support V_SWAP_B16 true16 encoding in asm/disasm for GFX11/12
    
    Co-authored-by: guochen2 <[email protected]>
    broxigarchen and broxigarchen authored Aug 1, 2024
    ab91371
  88. [lld][InstrProf] Add "Separate" irpgo-profile-sort option (#101084)

    Add the "Separate" option `--irpgo-profile-sort <profile>` instead of
    just the "Joined" option `--irpgo-profile-sort=<profile>`. This is
    useful if the path has a `,` for some reason which would break when
    trying to use `-Wl,--irpgo-profile-sort=<profile-with-comma>`.
    
    While I'm here, use `static_cast<>` instead of the C style cast
    introduced in llvm/llvm-project#100627
    ellishg authored Aug 1, 2024
    f95bd62
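The comma problem mentioned in the message comes from how compiler drivers treat `-Wl,`: the text after the prefix is split at commas and each piece is passed to the linker as a separate argument. The Python sketch below models only that splitting step; `expand_wl` is a hypothetical helper and the file name `a,b.profdata` is made up for illustration.

```python
def expand_wl(driver_arg: str):
    """Return the linker arguments a driver derives from one -Wl, argument."""
    prefix = "-Wl,"
    if not driver_arg.startswith(prefix):
        raise ValueError("expected an argument starting with -Wl,")
    # Everything after the prefix is split at commas.
    return driver_arg[len(prefix):].split(",")
```

With a profile path containing a comma, `expand_wl("-Wl,--irpgo-profile-sort=a,b.profdata")` gives `["--irpgo-profile-sort=a", "b.profdata"]`, so the linker sees a truncated path followed by a stray argument.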
  89. workflows: Fix libclc-tests (#101524)

    The old out-of-tree build configuration stopped working and in-tree
    builds are supported now, so we should use the in-tree configuration.
    The only downside is we can't run the tests any more, but at least we
    will be able to test the build again.
    tstellar authored Aug 1, 2024
    0512ba0
  90. [LV] Add more tests with switches.

    Extra tests for
    llvm/llvm-project#99808, including cost model
    tests.
    fhahn committed Aug 1, 2024
    8557035
  91. [SandboxIR] Implement the remaining CastInst sub-classes (#101537)

    This patch implements:
    sandboxir::UIToFPInst
    sandboxir::FPExtInst
    sandboxir::FPTruncInst
    sandboxir::SExtInst
    sandboxir::ZExtInst
    sandboxir::TruncInst
    vporpo authored Aug 1, 2024
    b6b0a24
  92. [libc] Use LLVM CommandLine for loader tool (#101501)

    Summary:
    This patch removes the ad-hoc parsing that I used previously and
    replaces it with the LLVM CommandLine interface. This doesn't change any
    functionality, but makes it easier to maintain.
    jhuber6 authored Aug 1, 2024
    5e32698
  93. [clang-format] Rename variable more sensitively (#100943)

    Renaming to `Disallowed`.
    urnathan authored Aug 1, 2024
    18b58d4
  94. ea46e20
  95. [Clang][NFC] Improve generation of GEP and RecordDecl loop (#101434)

    As with other loops, we need only look at a RecordDecl's FieldDecls.
    Convert to using them. In the meantime, we can improve the generation of
    the 'counted_by' FieldDecl's GEP by creating one GEP instead of a series
    of GEPs.
    bwendling authored Aug 1, 2024
    160fb11
  96. [flang] Add allocator_idx attribute on fir.embox and fircg.ext_embox …

    …(#101212)
    
    #100690 introduces allocator registry with the ability to store
    allocator index in the descriptor. This patch adds an attribute to
    fir.embox and fircg.ext_embox to be able to set the allocator index
    while populating the descriptor fields.
    clementval authored Aug 1, 2024
    0def9a9
  97. [libc++] Revert "Check correctly ref-qualified __is_callable in algor…

    …ithms (#73451)"
    
    This reverts commit 8d151f8, which
    broke some build bots. I think that is caused by an invalid argument
    order when checking __is_comparable in upper_bound.
    ldionne committed Aug 1, 2024
    451bba6
  98. [Clang] Fix nomerge attribute not working with __builtin_trap(), __de…

    …bugbreak(), __builtin_verbose_trap() (#101549)
    
    1. It fixes the problem of llvm.trap() not getting the nomerge
    attribute.
    2. It sets the nomerge flag for the node if the instruction has the nomerge
    attribute.
    
    This is a copy of https://reviews.llvm.org/D146164. This only attempts
    to fix `nomerge` for `__builtin_trap()`, `__debugbreak()`, and
    `__builtin_verbose_trap()`; it does not fix `nomerge` for non-trap builtins.
    
    Fixes #53011
    ZequanWu authored Aug 1, 2024
    5e84646
  99. e89129e
  100. [SCEV] Prove no-self-wrap from negative power of two step (#101416)

    We have existing code which reasons about a step evenly dividing the
    iteration space in a finite loop with a single exit implying
    no-self-wrap. The sign of the step doesn't affect this.
    
    ---------
    
    Co-authored-by: Nikita Popov <[email protected]>
    preames and nikic authored Aug 1, 2024
    f0944f4
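The argument in the SCEV commit above can be checked by brute force in 8-bit arithmetic. This is only an illustrative sketch, not SCEV code: `StepsUntilSelfWrap` and `TripCount` are hypothetical helpers, and "self-wrap" is taken here to mean that the induction variable returns to its start value.

```cpp
#include <cassert>
#include <cstdint>

// Number of additions of `step` (mod 256) before an 8-bit induction
// variable first returns to its starting value.
unsigned StepsUntilSelfWrap(std::uint8_t start, std::uint8_t step) {
  assert(step != 0 && "a zero step never advances");
  std::uint8_t iv = start;
  unsigned n = 0;
  do {
    iv = static_cast<std::uint8_t>(iv + step);
    ++n;
  } while (iv != start);
  return n;
}

// Trip count of `for (iv = start; iv != end; iv += step)`, or -1 if the
// induction variable comes back to `start` without ever reaching `end`.
int TripCount(std::uint8_t start, std::uint8_t end, std::uint8_t step) {
  std::uint8_t iv = start;
  int n = 0;
  while (iv != end) {
    iv = static_cast<std::uint8_t>(iv + step);
    ++n;
    if (iv == start)
      return -1;  // self-wrapped: the single exit is never taken
  }
  return n;
}
```

With a step of 4 or of -4 the variable needs 64 additions to come back to its start, so the sign makes no difference. A loop from 40 down to 0 with step -4 exits after 10 iterations, long before that point; the same loop with an exit value of 1 never exits, which is the case the finiteness precondition excludes.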
  101. 97f723b
  102. [libc][math][c23] Add dadd{l,f128} and ddiv{l,f128} C23 math function…

    …s (#100456)
    
    - fadd removed because I need to add for different input types
    - finishing rest of basic operations
    - noticed duplicates will remove
    
    ---------
    
    Co-authored-by: OverMighty <[email protected]>
    aaryanshukla and overmighty authored Aug 1, 2024
    8f33f1d
  103. [asan] Speed up ASan ODR indicator-based checking (#100923)

    **Summary**:
    When ASan checks for a potential ODR violation on a global it loops over
    a linked list of all globals to find those with the matching value of an
    indicator. With the default setting 'detect_odr_violation=1', ASan
    doesn't report violations on same-size globals but it still has to
    traverse the list. For larger binaries with a ton of shared libs and
    globals (and a non-trivial volume of same-sized duplicates) this gets
    extremely expensive.
    
    This patch adds an indicator indexed (multi-)map of globals to speed up
    the search.
    
    > Note: asan used to use a map to store globals a while ago which was
    replaced with a list when the codebase [moved off of
    STL](llvm/llvm-project@e4bada2).
    
    Internally we see many examples where ODR checking takes *seconds* (even
    double digits). With this patch it's practically free and
    `__asan_register_globals` doesn't show up prominently in the perf
    profile anymore.
    
    There are several high-level questions:
    1. I understand that the intent is that we hit the slow path rarely,
    ideally once before the process dies with an error. But in practice we
    hit the slow path a lot. It feels reasonable to keep the amount of work
    bounded even in the worst case, even if it requires a bit of extra
    memory. But if not, it'd be great to learn about the tradeoffs.
    2. Poisoning based ODR checking remains on the slow path. Internally we
    build everything with `-fsanitize-address-use-odr-indicator` so I'm not
    sure if poisoning-based check would exhibit the same behavior (looking
    at the code, the shape looks very similar, so it might?).
    3. Globals with an ODR indicator of `-1` need to be skipped for the
    purposes of ODR checking (cf.
    llvm/llvm-project@a257639).
    But they are still getting added to the list of globals and hence take
    up space and slow down the iteration over the list of globals. It would
    be a good saving if we could avoid adding them to the globals list.
    4. Any reason to use a linked list instead of e.g. a vector to store
    globals?
    
    **Test Plan**:
    
    * `cmake --build build --target check-asan` looks good
    * Perf-wise things look good when linking against this version of
    compiler-rt.
    
    ---------
    
    Co-authored-by: Vitaly Buka <[email protected]>
    artempyanykh and vitalybuka authored Aug 1, 2024
    c584c42
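The idea behind the ASan change above, replacing a scan over every registered global with a lookup keyed by the ODR indicator, can be sketched roughly as follows. This is not compiler-rt code: `Global`, `GlobalRegistry`, and `kNoOdrCheck` are hypothetical names, and `std::unordered_multimap` stands in for whatever container the patch actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a registered global; not ASan's real struct.
struct Global {
  std::uintptr_t odr_indicator;  // shared by duplicate definitions
  std::size_t size;
  const char *name;
};

// Assumed sentinel: an indicator of -1 means "skip ODR checking".
constexpr std::uintptr_t kNoOdrCheck = ~static_cast<std::uintptr_t>(0);

class GlobalRegistry {
public:
  // Returns the previously registered globals that share g's indicator,
  // then records g. The caller must keep g alive while it is registered.
  std::vector<const Global *> Register(const Global &g) {
    std::vector<const Global *> matches;
    if (g.odr_indicator == kNoOdrCheck)
      return matches;  // never indexed, so it costs nothing on later lookups
    auto range = by_indicator_.equal_range(g.odr_indicator);
    for (auto it = range.first; it != range.second; ++it)
      matches.push_back(it->second);
    by_indicator_.emplace(g.odr_indicator, &g);
    return matches;
  }

private:
  std::unordered_multimap<std::uintptr_t, const Global *> by_indicator_;
};
```

Each registration then touches only the globals that share its indicator, instead of walking the full list, at the cost of the extra memory for the index.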
  104. 289c049
  105. [libc++] Improve code gen for string's operator== (#100926)

    If the string is too long for a short string, we can simply check for
    the long bit. If that's false we can do an early return. This improves
    the code gen slightly.
    philnik777 authored Aug 1, 2024
    3af26be
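The early return described in the libc++ commit above can be illustrated with a toy small-string class. This is a sketch under assumptions, not libc++'s layout or code: `ToyString`, `kShortCapacity`, and `Equals` are hypothetical, and the 15-character inline capacity is arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Toy small-string type: short contents live inline, long ones on the heap.
class ToyString {
public:
  static constexpr std::size_t kShortCapacity = 15;

  explicit ToyString(std::string_view s)
      : is_long_(s.size() > kShortCapacity), size_(s.size()) {
    if (is_long_)
      heap_.assign(s.begin(), s.end());
    else
      std::copy(s.begin(), s.end(), inline_);
  }

  bool is_long() const { return is_long_; }

  std::string_view view() const {
    return is_long_ ? std::string_view(heap_.data(), size_)
                    : std::string_view(inline_, size_);
  }

private:
  bool is_long_;
  std::size_t size_;
  char inline_[kShortCapacity] = {};
  std::vector<char> heap_;
};

// If rhs cannot fit in the inline buffer, a short lhs can never be equal,
// so checking the "long" bit alone is enough to return early.
bool Equals(const ToyString &lhs, std::string_view rhs) {
  if (rhs.size() > ToyString::kShortCapacity && !lhs.is_long())
    return false;
  return lhs.view() == rhs;
}
```

The early return only pays off when the length of the right-hand side is known cheaply, for example when it is a string literal.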
  106. [libc++][NFC] Fix inconsistent quoting and spacing in our CSV files

    There were a few places where we didn't properly quote entries in the
    CSV status pages, or where we followed inconsistent spacing. This causes
    issues when trying to synchronize status pages with Github issues.
    ldionne committed Aug 1, 2024
    64946fd
  107. [libc++] Add status page consistency change to git-blame-ignore-revs

    To avoid breaking searchability of when a paper was implemented.
    ldionne committed Aug 1, 2024
    8d83fae
  108. Revert "[Clang] Fix nomerge attribute not working with __builtin_trap…

    …(), __debugbreak(), __builtin_verbose_trap() (#101549)"
    
    This reverts commit 5e84646, which
    broke 'nomerge.ll' test on llvm bots.
    zeroomega committed Aug 1, 2024
    667598d
  109. b45d362
  110. 7471387
  111. [Offload][OpenMP] Prettify error messages by "demangling" the kernel …

    …name (#101400)
    
    The kernel names for OpenMP are manually mangled and not ideal when we
    report something to the user. We demangle them now, providing the
    function and line number of the target region, together with the actual
    kernel name.
    jdoerfert authored Aug 1, 2024
    f3bfc56
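A rough sketch of what "demangling" such a kernel name could look like is given below. It assumes, and this is an assumption rather than something stated in the commit, that kernel names follow the pattern `__omp_offloading_<hex id>_<hex id>_<function>_l<line>`. `KernelInfo` and `ParseKernelName` are hypothetical names, not the Offload runtime's API.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

struct KernelInfo {
  std::string function;  // enclosing function, possibly still C++-mangled
  unsigned line;         // source line of the target region
};

// Parses names of the assumed form
//   __omp_offloading_<hex>_<hex>_<function>_l<line>
// and returns std::nullopt for anything else.
std::optional<KernelInfo> ParseKernelName(std::string_view name) {
  constexpr std::string_view kPrefix = "__omp_offloading_";
  if (name.substr(0, kPrefix.size()) != kPrefix)
    return std::nullopt;
  name.remove_prefix(kPrefix.size());

  // Skip the two id fields, each terminated by an underscore.
  for (int i = 0; i < 2; ++i) {
    std::size_t underscore = name.find('_');
    if (underscore == std::string_view::npos)
      return std::nullopt;
    name.remove_prefix(underscore + 1);
  }

  // The line number is the trailing "_l<digits>" suffix.
  std::size_t pos = name.rfind("_l");
  if (pos == std::string_view::npos || pos == 0 || pos + 2 >= name.size())
    return std::nullopt;
  unsigned line = 0;
  for (char c : name.substr(pos + 2)) {
    if (c < '0' || c > '9')
      return std::nullopt;
    line = line * 10 + static_cast<unsigned>(c - '0');
  }
  return KernelInfo{std::string(name.substr(0, pos)), line};
}
```

An error message can then report the function and line alongside the raw kernel name, falling back to the raw name alone when parsing fails.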
  112. Reapply "[Clang] Fix nomerge attribute not working with __builtin_tra…

    …p(), __debugbreak(), __builtin_verbose_trap() (#101549)"
    
    This reverts commit 667598d and fixes failed tests: llvm/test/CodeGen/X86/nomerge.ll and llvm/test/MC/AArch64/local-bounds-single-trap.ll.
    ZequanWu committed Aug 1, 2024
    ae6dc64
  113. Fix codegen of consteval functions returning an empty class, and rela…

    …ted issues (#93115)
    
    Fix codegen of consteval functions returning an empty class, and related
    issues
    
    If a class is empty, don't store it to memory: the store might overwrite
    useful data. Similarly, if a class has tail padding that might overlap
    other fields, don't store the tail padding to memory.
    
    The problem here turned out a bit more general than I initially thought:
    basically all uses of EmitAggregateStore were broken. Call lowering had
    a method that did mostly the right thing, though: CreateCoercedStore.
    Adapt CreateCoercedStore so it always does the conservatively right
    thing, and use it for both calls and ConstantExpr.
    
    Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was
    set incorrectly for empty classes in some cases.
    
    Fixes #93040.
    efriedma-quic authored Aug 1, 2024
    1762e01
  114. Add support for verifying local type units in .debug_names. (#101133)

    This patch adds support for verifying local type units in .debug_names
    section. It adds a test to test if the TU index is valid, and a test
    that tests that an error is found inside the name entry for a type unit.
    We don't need to test all other errors in the name entry because these
    are essentially identical to compile unit entries, they just use a
    different DWARF unit offset index.
    clayborg authored Aug 1, 2024
    b6a2eb0
  115. [libc] created tan function fuzzer (#101570)

    Also edited file header formatting on sin_fuzz and cos_fuzz
    RoseZhang03 authored Aug 1, 2024
    0142bd6
  116. [mlir][emitc] Fix EmitC dialect's operations' descriptions (#101523)

    - Added the dialect's prefix to operations' descriptions to follow the
    same style inside the TableGen file.
    - Minor changes in the 'emitc.yield' operation's description.
    EtoAndruwa authored Aug 1, 2024
    c89e9e7
  117. Add a tutorial on mlir-opt (#96105)

    This tutorial gives an introduction to the `mlir-opt` tool, focusing on
    how to run basic passes with and without options, run pass pipelines
    from the CLI, and point out particularly useful flags.
    
    ---------
    
    Co-authored-by: Jeremy Kun <[email protected]>
    Co-authored-by: Mehdi Amini <[email protected]>
    3 people authored Aug 1, 2024
    7f19686

Commits on Aug 2, 2024

  1. [SandboxIR] Implement UnaryInstruction class (#101541)

    This patch implements sandboxir::UnaryInstruction class and updates
    sandboxir::LoadInst and sandboxir::CastInst to inherit from it instead
    of sandboxir::Instruction.
    vporpo authored Aug 2, 2024
    f9392fc
  2. [M68k] Fix compilation pipeline check

    - After 'lowerConstantIntrinsics' is merged into pre-isel lowering
    darkbuck committed Aug 2, 2024
    7b0f143
  3. [asan] Avoid global ~DenseMap()

    Follow up to #100923
    vitalybuka committed Aug 2, 2024
    54c9404
  4. [lldb] Change Module to have a concrete UnwindTable, update (#101130)

    Currently a Module has a std::optional<UnwindTable> which is created
    when the UnwindTable is requested from outside the Module. The idea is
    to delay its creation until the Module has an ObjectFile initialized,
    which will have been done by the time we're doing an unwind.
    
    However, Module::GetUnwindTable wasn't doing any locking, so it was
    possible for two threads to ask for the UnwindTable for the first time:
    one would be created and returned while another thread would create one
    and destroy the first in the process of emplacing it. It was an uncommon
    crash, but it was possible.
    
    Grabbing the Module's mutex would be one way to address it, but when
    loading ELF binaries, we start creating the SymbolTable on one thread
    (ObjectFileELF) grabbing the Module's mutex, and then spin up worker
    threads to parse the individual DWARF compilation units, which then try
    to also get the UnwindTable and deadlock if they try to get the Module's
    mutex.
    
    This changes Module to have a concrete UnwindTable as an ivar, and when
    it adds an ObjectFile or SymbolFileVendor, it will call the Update
    method on it, which will re-evaluate which sections exist in the
    ObjectFile/SymbolFile. UnwindTable used to have an Initialize method
    which set all the sections, and an Update method which would set some of
    them if they weren't set. I unified these with the Initialize method
    taking a `force` option to re-initialize the section pointers even if
    they had been done already before.
    
    This is addressing a rare crash report we've received, and also a
    failure Adrian spotted on the -fsanitize=address CI bot last week. It's
    still uncommon with ASAN but it can happen with the standard testsuite.
    
    rdar://128876433
    jasonmolenda authored Aug 2, 2024
    7ad073a
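The design described in the lldb commit above, a concrete UnwindTable member that is refreshed in place instead of a lazily emplaced `std::optional`, might look roughly like this. The classes below are heavily simplified, hypothetical stand-ins rather than lldb's real `Module` and `UnwindTable`. Sections are reduced to names, and giving the table its own mutex is an assumption of this sketch.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

class UnwindTable {
public:
  // Records which unwind sections exist. Without `force`, an already
  // initialized table is left untouched.
  void Initialize(const std::vector<std::string> &sections,
                  bool force = false) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (initialized_ && !force)
      return;
    has_eh_frame_ = false;
    has_debug_frame_ = false;
    for (const std::string &s : sections) {
      if (s == ".eh_frame")
        has_eh_frame_ = true;
      if (s == ".debug_frame")
        has_debug_frame_ = true;
    }
    initialized_ = true;
  }

  bool HasEhFrame() {
    std::lock_guard<std::mutex> guard(mutex_);
    return has_eh_frame_;
  }

private:
  std::mutex mutex_;  // the table's own lock, independent of the module's
  bool initialized_ = false;
  bool has_eh_frame_ = false;
  bool has_debug_frame_ = false;
};

class Module {
public:
  // Adding an object file re-evaluates the sections on the existing table.
  void AddObjectFile(std::vector<std::string> sections) {
    sections_ = std::move(sections);
    unwind_table_.Initialize(sections_, /*force=*/true);
  }

  // The table always exists, so there is no first-use creation to race on.
  UnwindTable &GetUnwindTable() { return unwind_table_; }

private:
  std::vector<std::string> sections_;
  UnwindTable unwind_table_;
};
```

Because the getter only returns a reference to a member that lives as long as the module, two threads calling it concurrently can no longer destroy each other's table.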
  5. [Bazel] Port f3bfc56

    MaskRay committed Aug 2, 2024
    c4fac0e
  6. c5f1395
  7. [X86_32][C++] fix 0 sized struct case in vaarg. (#86388)

    struct SuperEmpty { struct{ int a[0];} b;};
    Such zero-sized structs cannot be ignored on i386 in C++ mode, because
    C++ fields are never considered empty. But in EmitVAArg the size is 0,
    so the va_list does not advance. Maybe we can just ignore this kind of
    argument, as X86_64 does. Fixes #86385.
    CoTinker authored Aug 2, 2024
    4461b69
  8. [mlir][bufferization] Improve performance of DropEquivalentBufferResu…

    …ltsPass (#101281)
    
    Use DenseMap to minimize the traversal time of callOps, which greatly
    improves the efficiency of running this pass.
    CoTinker authored Aug 2, 2024
    6867324
  9. e3d9b01
  10. ca26ea2
  11. [Attributor] Indicate optimistic fixed point if an instruction alread…

    …y has non-zero address space (#101589)
    shiltian authored Aug 2, 2024
    9373a43
  12. 6c375ae
  13. [Attributor] Use getPointerAddressSpace to replace a cast followed …

    …by a `getAddressSpace`
    shiltian committed Aug 2, 2024
    e7f73c0
  14. [RISCV] Use Zvhmin instead of Zvfh on RUN lines for some intrinsic te…

    …sts. NFC (#101540)
    
    Loads/stores/reinterpret/vfncvt.f.f.w/vfwcvt.f.f.v/vmerge/vmv.v.v are
    all expected to work for f16 vectors with Zvfhmin.
    
    Remove the handcrafted Zvfhmin test that partially tested this.
    
    Splits the vfwcvt.f.f.v and vfncvt.f.f.w tests into their own file so we
    can have a separate RUN line from the float<->int conversions.
    topperc authored Aug 2, 2024
    7a134f5
  15. [LoongArch] Align stack objects passed to memory intrinsics (#101309)

    Memcpy, and other memory intrinsics, typically try to use wider
    load/store if the source and destination addresses are aligned. In
    CodeGenPrepare, look for calls to memory intrinsics and, if the object
    is on the stack, align it to 4-byte (32-bit) or 8-byte (64-bit)
    boundaries if it is large enough that we expect memcpy to use wider
    load/store instructions to copy it.
    
    Fixes #101295
    heiher authored Aug 2, 2024
    8b26c02
  16. [SPARC][IAS] Add v8plus feature bit (#101367)

    Implement handling for `v8plus` feature bit to allow the user to switch
    between V8 and V8+ mode with 32-bit code.
    Currently this only sets the appropriate ELF machine type and flags;
    codegen changes will be done in future patches.
    
    This is done as a prerequisite for `-mv8plus` flag on clang (#98713).
    koachan authored Aug 2, 2024
    aca971d
  17. Merge from 'sycl' to 'sycl-web'

    iclsrc committed Aug 2, 2024
    668af1c
  18. [HLSL] cleanup builtin names elementwise usage (#101543)

    Remove elementwise description for builtins that don't perform
    elementwise operations.
    farzonl authored Aug 2, 2024
    96e6255

Commits on Aug 5, 2024

  1. Merge from 'main' to 'sycl-web' (207 commits)

      CONFLICT (content): Merge conflict in clang/lib/CodeGen/CGExprAgg.cpp
    jsji committed Aug 5, 2024
    9fe070a
  2. Merge from 'sycl' to 'sycl-web' (12 commits)

      CONFLICT (content): Merge conflict in sycl/CMakeLists.txt
    jsji committed Aug 5, 2024
    5685396
  3. Merge from 'sycl' to 'sycl-web' (5 commits)

      CONFLICT (content): Merge conflict in llvm/lib/SYCLLowerIR/SYCLVirtualFunctionsAnalysis.cpp
    AlexeySachkov committed Aug 5, 2024
    c281123
  4. ab66ccc
  5. Merge from 'sycl' to 'sycl-web' (3 commits)

    iclsrc committed Aug 5, 2024
    9c4aab8

Commits on Aug 15, 2024

  1. 5e66b4f
  2. 7e1776d
  3. 8a14027
  4. Update llvm.memmove test after LLVM change (#2655)

    Update a test after llvm-project commit 92a0654
    ("[LowerMemIntrinsics] Lower llvm.memmove to wide memory accesses
    (#100122)", 2024-07-26).
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@84f525abd741c30
    svenvh authored and sys-ce-bb committed Aug 15, 2024
    30b827e
  5. Upgrade in-tree job to Ubuntu 22.04 (#2658)

    The spirv-tools package used by the job seems no longer available for
    Ubuntu 20.04.
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@88e546a689b2679
    svenvh authored and sys-ce-bb committed Aug 15, 2024
    d0c3bea
  6. Fix addrspace generation in reverse translation for global annotations (

    #2656)
    
    This change fixes the assertion:
    
    Assertion `C->getType() == Ty->getElementType() && "Wrong type in array element initializer"' failed
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@e099f77cc6d02b9
    vmaksimo authored and sys-ce-bb committed Aug 15, 2024
    e7bfe86
  7. Add translation for Intrinsic::{atan,acos,asin,cosh,sinh,tanh} (#2657)

    Add translation for atan, acos, asin, cosh, sinh and tanh LLVM
    intrinsics which are mapped to corresponding OpenCL extended
    instructions.
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@95605477e7fe635
    linehill authored and sys-ce-bb committed Aug 15, 2024
    9479076
  8. Removed OpAtomicCompareExchangeWeak (#2665)

    Verified locally by changing the version from `65536` to `66560` in `test/transcoding/atomics.spt`.
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@62ea823e64307e8
    vmaksimo authored and sys-ce-bb committed Aug 15, 2024
    3fa5900
  9. Translate floating-point atomic_compare_exchange as integer (#2668)

    The OpenCL spec supports the atomic_float/atomic_double types for
    atomic_compare_exchange* functions. However, the value and return type
    of OpAtomicCompareExchange in the SPIR-V spec must be an integer type.
    Therefore, in OCLToSPIRV translation we need to translate the
    floating-point type to the corresponding integer variant that has the
    same type size. The floating-point value is bitcast so that its bits
    remain the same.
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@e5544014fba77d3
    wenju-he authored and sys-ce-bb committed Aug 15, 2024
    c6519ef
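The same trick, doing a floating-point compare-exchange on the integer bit pattern, can be shown in host C++. This is an analogy to the translation described above, not translator code: `FloatBits`, `BitsToFloat`, and `CompareExchangeFloat` are hypothetical helpers, and `std::memcpy` plays the role of the bitcast.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "the sketch assumes a 32-bit float");

// Reinterpret a float as its 32-bit pattern without changing any bits.
std::uint32_t FloatBits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

float BitsToFloat(std::uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Compare-exchange on a float that is stored as an integer. On failure,
// `expected` receives the observed value, mirroring
// std::atomic::compare_exchange_strong.
bool CompareExchangeFloat(std::atomic<std::uint32_t> &object, float &expected,
                          float desired) {
  std::uint32_t expected_bits = FloatBits(expected);
  bool exchanged =
      object.compare_exchange_strong(expected_bits, FloatBits(desired));
  if (!exchanged)
    expected = BitsToFloat(expected_bits);
  return exchanged;
}
```

Note that the comparison is bitwise in this sketch, so `+0.0f` and `-0.0f` do not compare equal, while two NaNs with identical bits do.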

Commits on Aug 16, 2024

  1. Add missing fpbuiltin math functions. (#15039)

    This change is due to llvm/llvm-project#98949.
    zahiraam authored and jsji committed Aug 16, 2024
    22f9f89
  2. [CodeGenCUDA] Update module flag value in test

    We overwrite the value in 8096a6f from llvm::Module::Override (4)
    to llvm::Module::Max (7).
    jsji committed Aug 16, 2024
    8dd5df1
  3. Ensure -W<warning> gets HelpHidden

    Align with community commit: 0953fb4
    jsji committed Aug 16, 2024
    2b80605
  4. [sycl-web] Undo bad conflict resolution and adjust tests to new upstr…

    …eam behavior. (#15051)
    
    Signed-off-by: Marcos Maronas <[email protected]>
    maarquitos14 authored and jsji committed Aug 16, 2024
    616728e

Commits on Aug 17, 2024

  1. 43f6fd3

Commits on Aug 20, 2024

  1. 86385ed
  2. 9fd763d

Commits on Aug 21, 2024

  1. 0cd0a6a
  2. Revert "[SYCL][Driver] Enable SPV_INTEL_memory_access_aliasing extens…

    …ion (#14992)"
    
    This reverts commit 0a9db37.
    jsji committed Aug 21, 2024
    200d768

Commits on Aug 22, 2024

  1. 78703d9

Commits on Aug 23, 2024

  1. a142ad3