forked from llvm/llvm-project
Allow link to llvm shared library for current distros #68
Open
littlewu2508 wants to merge 10,000 commits into ROCm:amd-stg-open from littlewu2508:device-libs-link-llvm-dylib
…lvm#90486) Noticed that there already was a function in APInt that updated a FoldingSet so there was no need for me to add it in llvm#84617.
This ensures the explicit value is generated (and not a load into the values array). Note that actually not storing values array at all is still TBD, this is just the very first step.
…imental late parsing mode "extension" (llvm#88596)
This patch changes the `LateParsed` field of `Attr` in `Attr.td` to be an instantiation of the new `LateAttrParseKind` class. The instantiation can be one of the following:
* `LateAttrParsingNever` - Corresponds with the false value of `LateParsed` prior to this patch (the default for an attribute).
* `LateAttrParseStandard` - Corresponds with the true value of `LateParsed` prior to this patch.
* `LateAttrParseExperimentalExt` - A new mode described below.
`LateAttrParseExperimentalExt` is an experimental extension to `LateAttrParseStandard`. Essentially this allows `Parser::ParseGNUAttributes(...)` to distinguish between these cases:
1. Only `LateAttrParseExperimentalExt` attributes should be late parsed.
2. Both `LateAttrParseExperimentalExt` and `LateAttrParseStandard` attributes should be late parsed.
Callers (and indirect callers) of `Parser::ParseGNUAttributes(...)` indicate the desired behavior by setting a flag in the `LateParsedAttrList` object that is passed to the function. In addition to the above, a new driver and frontend flag (`-fexperimental-late-parse-attributes`) with a corresponding LangOpt (`ExperimentalLateParseAttributes`) is added that changes how `LateAttrParseExperimentalExt` attributes are parsed.
* When the flag is disabled (default), in cases where only `LateAttrParsingExperimentalOnly` late parsing is requested, the attribute will be parsed immediately (i.e. **NOT** late parsed). This allows the attribute to act just like a `LateAttrParseStandard` attribute when the flag is disabled.
* When the flag is enabled, in cases where only `LateAttrParsingExperimentalOnly` late parsing is requested, the attribute will be late parsed.
The motivation behind this change is to allow the new `counted_by` attribute (part of `-fbounds-safety`) to support late parsing but **only** when `-fexperimental-late-parse-attributes` is enabled.
This attribute needs to support late parsing to allow it to refer to fields later in a struct definition (or function parameters declared later). However, there isn't a precedent for supporting late attribute parsing in C so this flag allows the new behavior to exist in Clang but not be on by default. This behavior was requested as part of the `-fbounds-safety` RFC process (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854/68). This patch doesn't introduce any uses of `LateAttrParseExperimentalExt`. This will be added for the `counted_by` attribute in a future patch (llvm#87596). A consequence is the new behavior added in this patch is not yet testable. Hence, the lack of tests covering the new behavior. rdar://125400257
In the new pass system, `MachineFunction` can be an analysis result again, so a machine module pass can now fetch it from the analysis manager. `MachineModuleInfo` no longer owns the machine functions. `FreeMachineFunctionPass` is removed and replaced by `InvalidateAnalysisPass<MachineFunctionAnalysis>`. With that replacement, the workaround in `MachineFunctionPassManager` is no longer needed, and there is no difference between `unittests/MIR/PassBuilderCallbacksTest.cpp` and `unittests/IR/PassBuilderCallbacksTest.cpp`.
This is used when -march=native is run on a CPU that is unknown to an old version of LLVM.
Skip updating references for operands that do not directly refer to jump table symbols but fall within a jump table's address range to prevent unintended modifications.
within module purview Close llvm#90259 Technically, static declarations shouldn't be leaked from the module interface; otherwise the program is illegal according to the spec. So we can drop the static declarations from the reduced BMI and close the above issue. However, there is too much `static inline` code in existing headers, so doing this globally would be a pretty big breaking change.
…g/ec/llvm-project into amd-staging
Change-Id: Icf8748fff11482f16cbeb1f19baf5a3404b57c6e
Disable this test on x86_64h for LSan. This test is failing with malformed object only on x86_64h. Disabling for now. rdar://125052424
…elism with multi-frame parallelism https://reviews.llvm.org/D133679 utilizes zstd's multithread API to create one single frame. This provides a higher compression ratio but is significantly slower than concatenating multiple frames. With manual parallelism, it is easier to parallelize memcpy in OutputSection::writeTo. In addition, as the individually allocated decompression buffers are much smaller, we can make a wild guess (compressed_size/4) without worrying that a resize (due to a wrong guess) would waste memory.
…ng-parentheses` (llvm#90279) When a binary operator is the last operand of a macro, the end location that is past the `BinaryOperator` will be inside the macro and therefore an invalid location to insert a `FixIt` into, which is why the check bails when encountering such a pattern. However, the end location is only required for the `FixIt` and the diagnostic can still be emitted, just without an attached fix.
…te module file for C++20 modules instead of PCHGenerator Previously we were reusing PCHGenerator to generate the module file for C++20 modules, which is slightly odd. This patch tries to use a new class 'CXX20ModulesGenerator' to generate the module file for C++20 modules.
llvm#90522) LegalizeVectorType is responsible for legalizing nodes that perform an operation on each element and may need to be scalarized. This is not true for nodes like VP_REDUCE.*, BUILD_VECTOR, SHUFFLE_VECTOR, EXTRACT_SUBVECTOR, etc. This patch drops any nodes with a scalar result from LegalizeVectorOps and handles them in LegalizeDAG instead. This required moving the reduction promotion to LegalizeDAG. I have removed the support for integer promotion as it was incorrect for integer min/max reductions. Since it was untested, it was best to assert on it until it was really needed. There are a couple of regressions that can be fixed with a small DAG combine, which I will do as a follow-up.
Close llvm#75057 Previously, I incorrectly thought that diagnostic mappings were not meaningful with modules. This problem was revealed by another recent change, so this patch partially reverts the previous "optimization".
…uctions Marking them as `hasSideEffects=1` stops some optimizations. According to `Target.td`: > // Does the instruction have side effects that are not captured by any > // operands of the instruction or other flags? > bit hasSideEffects = ?; It seems we don't need to set `hasSideEffects` for vleNff since we have modelled `vl` as an output operand. As for saturating instructions, I think that explicit Def/Use list is kind of side effects captured by any operands of the instruction, so we don't need to set `hasSideEffects` either. And I have just investigated AArch64's implementation, they don't set this flag and don't add `Def` list. These changes make optimizations like `performCombineVMergeAndVOps` and MachineCSE possible for these instructions. As a consequence, `copyprop.mir` can't test what we want to test in https://reviews.llvm.org/D155140, so we replace `vssra.vi` with a VCIX instruction (it has side effects). Reviewers: jacquesguan, topperc, preames, asb, lukel97 Reviewed By: topperc, lukel97 Pull Request: llvm#90049
Simultaneously implemented parsing support for the `%desc_*` modifiers. Reviewers: SixWeining, heiher, xen0n Reviewed By: xen0n, SixWeining Pull Request: llvm#90158
Close llvm#75057 Previously, I incorrectly thought that diagnostic mappings were not meaningful with modules. This problem was revealed by another recent change, so this patch partially reverts the previous "optimization".
Extends `omp.private` with a new region: `dealloc` where deallocation logic for Fortran deallocatables will be outlined (this will happen in later PRs).
This removes various subtitles or converts them to bold text so that the table of contents is less cluttered. This includes "Example", "Notes", "Priority To Implement" and "Response".
The implementation is only enabled when the `-enable-tlsdesc` option is passed and the TLS model is `dynamic`. LoongArch's GCC has the same option (-mtls-dialect=) as RISC-V. Reviewers: heiher, MaskRay, SixWeining Reviewed By: SixWeining, MaskRay Pull Request: llvm#90159
…ixed. (llvm#90484) The original PR llvm#90083 had to be reverted in PR llvm#90444 as it caused one of the gfortran tests to fail. The issue was using `isIntOrIndex` to check for an integer type. It allowed the index type, which later caused an assertion when calling `getIntOrFloatBitWidth`. I have now replaced it with `isInteger`, which should fix this regression.
…90471) In the debug intrinsic class hierarchy, a dbg.assign is a (inherits from) dbg.value, so `findDbgValues` returns dbg.values and dbg.assigns (by design). That hierarchy doesn't exist for DbgRecords - fix findDbgValues to return dbg_assign records as well as dbg_values and add unittest.
Previously if you passed an ELF binary it would be silently copied with no changes.
Section unification cannot just use names, because it's valid for ELF binaries to have multiple sections with the same name. We should check other section properties too. Fixes llvm#88001. rdar://124467787
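As a rough illustration of why a name alone is not a sufficient key, the sketch below compares two section descriptions on name plus other properties. `SectionInfo`, `sameSection`, and the choice of `type` and `flags` as the extra properties are assumptions made for this example; they are not taken from the patch.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical, simplified view of an ELF section header.
struct SectionInfo {
  std::string_view name;
  std::uint32_t type;   // e.g. the sh_type value
  std::uint64_t flags;  // e.g. the sh_flags bitmask
};

// Two sections are treated as the same only if the name AND the other
// properties match; a name match alone is not enough.
constexpr bool sameSection(const SectionInfo &a, const SectionInfo &b) {
  return a.name == b.name && a.type == b.type && a.flags == b.flags;
}
```

With a comparison like this, two `.text` sections that differ in flags stay distinct instead of being merged by name.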
I strongly suspect nobody ever used that macro since it wasn't very well known. Furthermore, it only affects a handful of diagnostics and I think it makes sense to either provide them unconditionally, or to not provide them at all.
…n emulation (llvm#89131) This PR builds on llvm#79494 with an additional path for efficient unsigned `i4 ->i8` type extension for 1D/2D operations. This will impact any i4 -> i8/i16/i32/i64 unsigned extensions as well as sitofp i4 -> f8/f16/f32/f64.
…0508) Empty ISG1 and OSG1 parts are generated for compute shader since there's no signature for compute shader. Fixes llvm#88778
The output on eel.is has similar oddities, so I expect this was copy pasted.
…m#89992) The dependency scanner only puts top-level affecting module map files on the command line for explicitly building a module. This is done because any affecting child module map files should be referenced by the top-level one, meaning listing them explicitly does not have any meaning and only makes the command lines longer. However, a problem arises whenever the definition of an affecting module lives in a module map that is not top-level. Considering the rules explained above, such module map file would not make it to the command line. That's why 83973cf started marking the parents of an affecting module map file as affecting too. This way, the top-level file does make it into the command line. This can be problematic, though. On macOS, for example, the Darwin module lives in "/usr/include/Darwin.modulemap" one of many module map files included by "/usr/include/module.modulemap". Reporting the parent on the command line forces explicit builds to parse all the other module map files included by it, which is not necessary and can get expensive in terms of file system traffic. This patch solves that performance issue by stopping marking parent module map files as affecting, and marking module map files as top-level whenever they are top-level among the set of affecting files, not among the set of all known files. This means that the top-level "/usr/include/module.modulemap" is now not marked as affecting and "/usr/include/Darwin.modulemap" is.
In case of functions without a stack frame no "stack" field is serialized into MIR which leads to isCalleeSavedInfoValid being false when reading a MIR file back in. To fix this we should serialize MachineFrameInfo::isCalleeSavedInfoValid() into MIR.
ROTL and ROTR can take a shift amount larger than the element size, in which case the effective shift amount should be the shift amount modulo the element size. This patch adds the modulo step when the shift amount isn't known at compile time. Without it the existing implementation would end up shifting beyond the type size and give incorrect results.
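The modulo step can be sketched on a scalar. This is only an illustration for an 8-bit element, not the lowering code from the patch; `rotl8` is a hypothetical helper name.

```cpp
#include <cstdint>

// Rotate an 8-bit value left by an arbitrary amount.
// The amount is first reduced modulo the element width, so an amount
// larger than 8 never shifts past the type size.
constexpr std::uint8_t rotl8(std::uint8_t value, unsigned amount) {
  constexpr unsigned kBits = 8;
  amount %= kBits;  // effective rotate amount
  if (amount == 0)
    return value;   // avoids a shift by the full width below
  return static_cast<std::uint8_t>((value << amount) |
                                   (value >> (kBits - amount)));
}
```

Here `rotl8(0x81, 9)` gives the same result as `rotl8(0x81, 1)`, because 9 modulo 8 is 1.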
…lvm#90700) Instead of hardcoding the 4 current profile prefixes, treat profile selection as a fallback if we don't find "rv32" or "rv64". Update the error message accordingly.
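A minimal sketch of the fallback order described here, assuming the input is an architecture string. `ArchKind`, `hasPrefix`, and `classifyArch` are hypothetical names for illustration and are not the functions touched by the patch.

```cpp
#include <string_view>

// Hypothetical result of looking at the start of an architecture string.
enum class ArchKind { RV32, RV64, Profile };

constexpr bool hasPrefix(std::string_view s, std::string_view prefix) {
  return s.substr(0, prefix.size()) == prefix;
}

// Check for "rv32" / "rv64" first. Anything else falls through to profile
// lookup, which is where an unknown name would be rejected.
constexpr ArchKind classifyArch(std::string_view arch) {
  if (hasPrefix(arch, "rv32"))
    return ArchKind::RV32;
  if (hasPrefix(arch, "rv64"))
    return ArchKind::RV64;
  return ArchKind::Profile;
}
```

The point of this ordering is that no list of profile prefixes has to be kept in sync by hand; a new profile family is handled by the fallback path.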
…function. (llvm#90665) This simplifies the callers.
…O summaries" (llvm#90610) (llvm#90692) This reverts commit 2aabfc8. Add fixes to LLD and Gold tests missed in original change. Co-authored-by: Jan Voung <[email protected]>
It was pointed out in post commit review of llvm#90597 that the pass should never have been run in parallel over all functions (and now other top level operations) in the first place. The mutex used in the pass was ineffective at preventing races since each instance of the pass would have a different mutex.
Fixes llvm#84968. Implements the `fcntl()` function defined in the `fcntl.h` header.
…ons. (llvm#90414) Treat a compound operator such as |=, array subscription, sizeof, and non-type template parameter as trivial so long as subexpressions are also trivial. Also treat true/false boolean literal as trivial.
The function may return Z_MEM_ERROR or Z_STREAM_ERROR. The former does not have a good way of testing. The latter will be possible with a pending change that allows setting the compression level, which will come with a test.
zstd excels at scaling from low-ratio-very-fast to high-ratio-pretty-slow. Some users prioritize speed and prefer disk read speed, while others focus on achieving the highest compression ratio possible, similar to traditional high-ratio codecs like LZMA. Add an optional `level` to `--compress-sections` (llvm#84855) to cater to these diverse needs. While we initially aimed for a one-size-fits-all approach, this no longer seems to work. (https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html) When --compress-debug-sections is used together, make --compress-sections take precedence since --compress-sections is usually more specific. Remove the level distinction between -O/-O1 and -O2 for --compress-debug-sections=zlib for a more consistent user experience. Pull Request: llvm#90567
…g/ec/llvm-project into amd-staging
Change-Id: I4968e32ce2fcf8592f4ab65f9b2eb89b5fbb67dc
Change-Id: I8d57fc9053f1ee71230ac48337f73b474581188f
…g/ec/llvm-project into amd-staging
…g/ec/llvm-project into amd-staging
littlewu2508 force-pushed the device-libs-link-llvm-dylib branch from 1c7e7f8 to 509146d (May 2, 2024 09:58)
Signed-off-by: "Yiyang Wu <[email protected]>"
littlewu2508 force-pushed the device-libs-link-llvm-dylib branch from 509146d to 7311c1b (May 2, 2024 10:27)
Did you mean to make a PR against amd-staging (amd-stg-open is deprecated now)?
Current distros build llvm shared libs. Same as comgr: 1322aa0