Age | Commit message | Author
2022-08-06 | [Clang][Lex] Extend HeaderSearch::LookupFile to control OpenFile behavior. | Jun Zhang
In the case of static compilation the file system is pretty much read-only, and taking a snapshot of it is usually sufficient. In the interactive C++ case the compilation is longer and people can create and include files, etc. In that case we often do not want to open files or cache failures unless it is absolutely necessary. This patch extends the original API call by forwarding some optional flags, so we can continue to use it in the previous way with no breakage. Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D131241
2022-08-05 | [NFC] add test case for D129558 | Chen Zheng
2022-08-05 | [LLDB] Missing break in a switch statement alters the execution flow. | Slava Gurevich
Looks like a typo from the past code changes. Differential Revision: https://reviews.llvm.org/D131244
2022-08-05 | [ELF][AArch64] Fix potentially corrupted section content for PAC | Fangrui Song
D74537 introduced a bug: if `(config->andFeatures & GNU_PROPERTY_AARCH64_FEATURE_1_PAC) != 0` with -z pac-plt unspecified, we incorrectly use AArch64BtiPac, whose writePlt will make an out-of-bounds write after the .plt section. This is often benign because the output section after .plt will usually overwrite the content. This is very difficult to test without D131247 (Parallelize writes of different OutputSections).
2022-08-05 | [ELF] Keep only getTarget() call. NFC | Fangrui Song
The place from D61712 seems unneeded now. We can just use the place added by D62609 (support AArch64 BTI/PAC).
2022-08-05 | [test/Modules/cxx20-export-import.cpp] Pre-clean the modules cache directory of the test, NFC | Argyrios Kyrtzidis
2022-08-05 | [ELF] mergeCmp: work around irreflexivity bug | Fangrui Song
Some tests (e.g. aarch64-feature-pac.s) segfault in libstdc++ _GLIBCXX_DEBUG builds (enabled by LLVM_ENABLE_EXPENSIVE_CHECKS). dyn_cast<ThunkSection> is incorrectly true for any SyntheticSection. std::merge transitively calls mergeCmp(x, x) (due to __glibcxx_requires_irreflexive_pred) and will segfault in `ta->getTargetInputSection()`. The dyn_cast<ThunkSection> issue should eventually be fixed properly, but `a != b` is robust enough for now.
2022-08-05 | unbreak Modules/cxx20-export-import.cpp with LLVM_APPEND_VC_REV=OFF after 6635f48e4aba | Nico Weber
See revision b8b7a9dcdcbc for prior art.
2022-08-05 | [HLSL] emit-obj when set output. | Xiang Li
When no output is set, default the output to stdout. When the output is set with -Fo and -fcgl is not given, set -emit-obj to generate a DX container. Reviewed By: beanz Differential Revision: https://reviews.llvm.org/D130858
2022-08-05 | [CUDA] Fix output name being replaced in device-only mode | Joseph Huber
When performing device-only compilation, there was an issue where `cubin` outputs were being renamed to `cubin` regardless of the output name the user specified. This renaming is required in a normal compilation flow, as the Nvidia tools only understand specific filenames instead of checking magic bytes for some unknown reason. We do not want to perform this transformation when the user is performing device-only compilation. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D131278
2022-08-05 | [Serialization] Remove `ORIGINAL_PCH_DIR` record | Argyrios Kyrtzidis
Use of the `ORIGINAL_PCH_DIR` record has been superseded by making PCH/PCM files with relocatable paths at write time. Removing this record is useful for producing an output-path-independent PCH file and enables sharing of the same PCH file even when it was intended for a different output path. Differential Revision: https://reviews.llvm.org/D131124
2022-08-05 | [Sanitizer][Darwin] Support OS versions before DRIVERKIT | Keith Smiley
Fixes https://github.com/llvm/llvm-project/issues/56960 Differential Revision: https://reviews.llvm.org/D131288
2022-08-05 | [ELF][PPC64] Fix potentially corrupted section content with empty .got | Fangrui Song
D91426 makes .got possibly empty while needed. If .got and .data have the same address, and .got's content is written after .data, the first word of .data will be corrupted. The bug is not testable without D131247.
2022-08-05 | [mlir] Use SymbolTableCollection to lookup referenced symbol in AddressOfOp | Eugene Zhulenev
Depends On D131285 Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D131291
2022-08-05 | [libc++][ranges][NFC] Mark the completed Ranges papers and issues as done. | Konstantin Varlamov
The newly-completed papers:
- P0896R4 ("The One Ranges Proposal");
- P1243R4 ("Rangify New Algorithms");
- P1252R2 ("Ranges Design Cleanup");
- P1716R3 ("Range Comparison Algorithms Are Over-Constrained");
- P1871R1 ("Concept traits should be named after concepts");
- P2106R0 ("Alternative wording for GB315 and GB316").
Differential Revision: https://reviews.llvm.org/D131234
2022-08-05 | [DAGCombiner] Hoist funnel shifts from logic operation | Filipp Zhinkin
Hoist funnel shift from logic op:
logic_op (FSH x0, x1, s), (FSH y0, y1, s) --> FSH (logic_op x0, y0), (logic_op x1, y1), s
The transformation improves the code generated for some cases related to issue https://github.com/llvm/llvm-project/issues/49541. A reduced number of funnel shifts can also improve throughput on x86 CPUs by utilizing more available ports: https://quick-bench.com/q/gC7AKkJJsDZzRrs_JWDzm9t_iDM
Transformation correctness checks: https://alive2.llvm.org/ce/z/TKPULH https://alive2.llvm.org/ce/z/UvTd_9 https://alive2.llvm.org/ce/z/j8qW3_ https://alive2.llvm.org/ce/z/7Wq7gE https://alive2.llvm.org/ce/z/Xr5w8R https://alive2.llvm.org/ce/z/D5xe_E https://alive2.llvm.org/ce/z/2yBZiy
Differential Revision: https://reviews.llvm.org/D130994
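The identity behind this hoisting can be spot-checked with a small model. The sketch below is an illustration only, not code from the patch: it assumes 8-bit values, models only the left funnel shift, and the helper names `fshl`, `hoisting_holds`, and `check_all` are hypothetical. It relies on the fact that the two halves of a funnel shift land in disjoint bit positions, so a bitwise AND/OR/XOR can be applied before or after the shift.

```python
import operator
from itertools import product

WIDTH = 8                      # assumed bit width for this model
MASK = (1 << WIDTH) - 1


def fshl(hi, lo, s):
    """Funnel shift left: take the top WIDTH bits of (hi:lo) << (s mod WIDTH)."""
    s %= WIDTH
    if s == 0:
        return hi & MASK
    return ((hi << s) | (lo >> (WIDTH - s))) & MASK


def hoisting_holds(op, x0, x1, y0, y1, s):
    """Compare logic_op(FSH x, s; FSH y, s) with FSH(logic_op ...), s."""
    before = op(fshl(x0, x1, s), fshl(y0, y1, s))
    after = fshl(op(x0, y0), op(x1, y1), s)
    return before == after


def check_all():
    """Check the identity on a small sample of values for and/or/xor."""
    samples = [0x00, 0x01, 0x5A, 0x80, 0xC3, 0xFF]
    ops = [operator.and_, operator.or_, operator.xor]
    return all(
        hoisting_holds(op, x0, x1, y0, y1, s)
        for op in ops
        for x0, x1, y0, y1 in product(samples, repeat=4)
        for s in range(WIDTH)
    )


print(check_all())
```

This only samples a handful of inputs; the alive2 links in the commit message are the actual correctness checks.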
2022-08-05 | [libc++][ranges][NFC] Make sure all implemented algorithms are enabled in "robust" tests. | Konstantin Varlamov
Also fix `std::find_first_of` (which accidentally copied the predicate in the implementation). Differential Revision: https://reviews.llvm.org/D131235
2022-08-05 | [ORC] Fix a memory leak in LLVMOrcIRTransformLayerSetTransform. | Lang Hames
This function heap-allocates a ThreadSafeModule (the current C bindings assume that TSMs are always heap-allocated), but was failing to free it. Should fix http://llvm.org/PR56953.
2022-08-05 | [examples][ORC] Add missing call to LLVMDisposeBuilder to example. | Lang Hames
The missing call was pointed out in https://llvm.org/PR56953, though it's not the focus of that issue.
2022-08-05 | [mlir] Implement SymbolUserOpInterface in LLVM::CallOp | Eugene Zhulenev
Avoid expensive calls to `SymbolTable::lookupNearestSymbolFrom` in verifier Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D131285
2022-08-05 | [mlir][spirv] Define spv.IAddCarry | Jakub Kuderski
Based on `spv.ISubBorrow` from D127909. Also resolved some clang-tidy warnings. Reviewed By: antiagainst, ThomasRaoux Differential Revision: https://reviews.llvm.org/D131281
2022-08-05 | [NFC][Inliner] Add Load/Store handler | Vitaly Buka
This is an additional signal which may benefit sanitizers. Reviewed By: kda Differential Revision: https://reviews.llvm.org/D131129
2022-08-05 | [test][SimpleLoopUnswitch] Precommit test for D129599 | Ruobing Han
2022-08-05 | [HWASan] Remove incorrect unreachable. | Florian Mayer
This function could be called with access_info & 0x20 or with flags()->halt_on_error, in which case HandleTagMismatch returns (is not fatal). Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D131279
2022-08-05 | [NFC] Regenerates X86's win64-bool.ll | Amaury Séchet
2022-08-05 | [flang] Lower MOD to Fortran runtime call. | Slava Zakharin
This change removes dependency on pgmath mod, and also allows Fortran runtime to issue a diagnostic message in case of zero denominator. Differential Revision: https://reviews.llvm.org/D131192
2022-08-05 | [RISCV] Don't use li+sh3add for constants that can use lui+add. | Craig Topper
If we're adding a constant that can't use addi we try a few tricks, one of which is using li+sh3add. We should not do this if lui+add would work. For example adding 8192. Using sh3add prevents folding a sext.w to form addw, thus increasing instruction count.
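The preference order described above can be illustrated with a small predicate model. This is a minimal sketch under stated assumptions, not the backend's actual selection logic: the function names are hypothetical, and the conditions only encode the general encoding facts that `addi` takes a 12-bit signed immediate, `lui` materializes a value whose low 12 bits are zero, and `sh3add` adds a register shifted left by 3.

```python
def fits_simm12(value):
    """True if value fits a 12-bit signed immediate (the addi range)."""
    return -2048 <= value <= 2047


def can_use_lui_add(c):
    """lui builds a constant with zero low 12 bits from a 20-bit signed immediate."""
    return c % 4096 == 0 and -(1 << 19) <= c // 4096 < (1 << 19)


def can_use_li_sh3add(c):
    """li of c/8 (a single addi) followed by sh3add, which adds (c/8) << 3."""
    return c % 8 == 0 and fits_simm12(c // 8)


def choose(c):
    """Pick a sequence for 'add constant c', preferring lui+add over li+sh3add."""
    if fits_simm12(c):
        return "addi"
    if can_use_lui_add(c):
        return "lui+add"
    if can_use_li_sh3add(c):
        return "li+sh3add"
    return "other"


for constant in (100, 8192, 8200):
    print(constant, choose(constant))
```

For 8192 both `can_use_lui_add` and `can_use_li_sh3add` hold in this model, which is exactly the overlap the commit message describes; checking `lui+add` first resolves it in favour of the sequence that still allows the sext.w fold.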
2022-08-05 | [llvm][macos] Fix usage of std::shared_mutex on old macOS SDK versions | Tobias Hieta
When setting CMAKE_CXX_STANDARD to 17 and targeting a macOS version under 10.12, the ifdefs would try to use std::shared_mutex because of the C++ standard. This should also check the targeted SDK. See discussion in: https://reviews.llvm.org/D130689 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D131063
2022-08-05 | fixes clang-tidy/checks/list.rst: a line was accidentally removed in 95a92995d45fc6fada43ecd91eba3e7aea90487a | Rashmi Mudduluru
2022-08-05 | [clang][modules] Don't depend on sharing FileManager during module build | Ben Langmuir
Sharing the FileManager between the importer and the module build should only be an optimization. Add a cc1 option -fno-modules-share-filemanager to allow us to test this. Fix the path to modulemap files, which previously depended on the shared FileManager when using a path mapped to an external file in a VFS. Differential Revision: https://reviews.llvm.org/D131076
2022-08-05 | [clang] Fix redirection behaviour for cached FileEntryRef | Ben Langmuir
In 6a79e2ff1989b we changed FileManager::getEntryRef() to return the redirecting FileEntryRef instead of looking through the redirection. This commit fixes the case when looking up a cached file path to also return the redirecting FileEntryRef. This mainly affects the behaviour of calling getNameAsRequested() on the resulting entry ref. Differential Revision: https://reviews.llvm.org/D131273
2022-08-05 | [CUDA] Fixed sm version constrain for __bmma_m8n8k128_mma_and_popc_b1. | Jack Kirk
As stated in https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-mma: ".and operation in single-bit wmma requires sm_80 or higher." tra@: Fixed a bug in builtins-nvptx-mma.py test generator and regenerated the tests. Differential Revision: https://reviews.llvm.org/D131265
2022-08-05 | [RISCVInsertVSETVLI] Remove an unsound optimization | Philip Reames
This fixes a bug reported privately by @craig.topper. Here's an example which illustrates the problem:
  vsetivli a1, a0, e32, m1, ta, mu # both DefInfo and PrevInfo
  vsetivli a2, a1, e32, m4, ta, mu
With the unsound result being:
  vsetivli a1, a0, e32, m1, ta, mu
  vsetivli a2, a0, e32, m4, ta, mu
Consider the case where this is running on a machine with VLEN=512. For this case, the VLMAXs are 16 and 64 respectively. Consider a0 = 33. The correct result is: a1 = 16 and a2 = 16. After the unsound optimization: a1 = 16 and a2 = 33.
This particular example used VLMAXs which differed by more than a power of two. With a difference of only one power of two, there's another form of this bug which involves the AVL < 2 x VLMAX special case, but that one's more complicated to construct, as many examples turn out accidentally sound. This patch takes the approach of simply removing the unsound optimization, but there are multiple sound sub-cases of it. I plan to return to at least a couple of them, but figured it was cleaner to remove the unsound optimization (for ease of backporting), and then review the new optimizations on their own. Differential Revision: https://reviews.llvm.org/D131264
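The numbers in the example above can be reproduced with a simplified model of the vector-length computation. The sketch assumes `vl = min(AVL, VLMAX)` with `VLMAX = VLEN / SEW * LMUL` and integer LMUL; it deliberately ignores the latitude implementations have when AVL lies between VLMAX and 2 x VLMAX, which does not affect these particular values. The function names are hypothetical.

```python
def vlmax(vlen, sew, lmul):
    """Maximum vector length for a given VLEN, element width, and integer LMUL."""
    return vlen // sew * lmul


def vsetvli(avl, vlen, sew, lmul):
    """Simplified model: the resulting vl is the requested AVL capped at VLMAX."""
    return min(avl, vlmax(vlen, sew, lmul))


VLEN = 512
a0 = 33

# Original sequence: the second instruction uses a1 (the first result) as AVL.
a1 = vsetvli(a0, VLEN, sew=32, lmul=1)
a2_correct = vsetvli(a1, VLEN, sew=32, lmul=4)

# After the unsound rewrite: the second instruction uses a0 as AVL instead.
a2_unsound = vsetvli(a0, VLEN, sew=32, lmul=4)

print(a1, a2_correct, a2_unsound)  # prints "16 16 33"
```

Replacing the AVL operand `a1` with `a0` is only safe when the first instruction did not clamp the value, which is not the case here because 33 exceeds the first VLMAX of 16.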
2022-08-05 | [WinEH][ARM64] Split Unwind Info for Fucntions Larger than 1MB | Zhaoshi Zheng
Create function segments and emit unwind info for them. A segment must be less than 1MB, and no prolog or epilog is split between two segments. This patch should generate correct, though not optimal, unwind info for large functions. Currently it generates packed info (.pdata) only for functions that are less than 1MB (single-segment functions). This is NFC from before this patch. The next step is to enable (.pdata) only unwind info for the first segment or for segments that have neither prolog nor epilog in a multi-segment function. Another future work item is to further split segments that require more than 255 code words or have more than 65535 epilogs. Reference: https://docs.microsoft.com/en-us/cpp/build/arm64-exception-handling#function-fragments Differential Revision: https://reviews.llvm.org/D130049
2022-08-05 | [flang] Propagate lowering options from driver. | Slava Zakharin
This commit addresses concerns raised in D129497. Propagate lowering options from driver to expressions lowering via AbstractConverter instance. A single use case so far is using optimized TRANSPOSE lowering with O1/O2/O3. bbc does not support optimization level switches, so it uses default LoweringOptions (e.g. optimized TRANSPOSE lowering is enabled by default, but an engineering -opt-transpose=false option can still override this). Differential Revision: https://reviews.llvm.org/D130204
2022-08-05 | [lldb] Improve EXC_RESOURCE exception reason | Jonas Devlieghere
Jason noted that the stop message we print for a memory high water mark notification (EXC_RESOURCE) could be clearer. Currently, the stop reason looks like this:
  * thread #3, queue = 'com.apple.CFNetwork.LoaderQ', stop reason = EXC_RESOURCE RESOURCE_TYPE_MEMORY (limit=14 MB, unused=0x0)
It's hard to read the message because the exception and the type (EXC_RESOURCE RESOURCE_TYPE_MEMORY) blend together. Additionally, the "observed=0x0" should not be printed for memory limit exceptions. I wanted to continue to include the resource type from <kern/exc_resource.h> while also explaining what it actually is. I used the wording from the comments in the header. With this patch, the stop reason now looks like this:
  * thread #5, stop reason = EXC_RESOURCE (RESOURCE_TYPE_MEMORY: high watermark memory limit exceeded) (limit=14 MB)
rdar://40466897 Differential revision: https://reviews.llvm.org/D131130
2022-08-05 | [libc] Update look and feel of libc.llvm.org | Jeff Bailey
This design is borrowed from the lldb folks (thank you!) to declutter the page.
* The version number at the top is removed.
* Links are pushed over to a sidebar.
* The sidebar has headings.
There are other minor changes:
* The warning about this project not being ready is now an RST "warning".
* Links to the Bug Reports and the Source Code are added.
* Refer to this project as either "The LLVM C Library" or "The libc".
Tested: Built locally Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D131242
2022-08-05 | Reapply the commits to enable accurate hit-count detection for watchpoints. | Jim Ingham
This commit combines the initial commit (7c240de609af), a fix for x86_64 Linux (3a0581501e76) and a fix for a thinko in a last-minute rewrite that I really should have run the testsuite on. Also, make sure that all the "I need to step over watchpoint" plans execute before we call a public stop. Otherwise, e.g. if you have N watchpoints and a Signal, the signal stop info will get us to stop with the watchpoints in a half-done state. Differential Revision: https://reviews.llvm.org/D130674
2022-08-05 | [mlir] Use SymbolUserOpInterface in LLVM::AddressOfOp verifier | Eugene Zhulenev
Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D131271
2022-08-05 | [mlir][spirv] Add default Vulkan memory space to storage class mapping | Lei Zhang
Reviewed By: ThomasRaoux, kuhar Differential Revision: https://reviews.llvm.org/D131128
2022-08-05 | [mlir][spirv] Add a pass to map memref memory space | Lei Zhang
MemRef types now can carry an attribute to represent the memory space. Still, upper layers in the compilation stack mostly use numeric values. They don't mean much (other than differentiating separate memory domains) in MLIR's multi-level settings. Those numeric memory spaces inside MemRef types need to be translated into concrete SPIR-V storage classes during lowering to pin down concrete memory types. Thus far we have been hardcoding an arbitrary mapping from memory space to storage class for converting MemRef types. This works fine when only targeting Vulkan; it falls apart if we want to target other SPIR-V consumers like OpenCL, as different consumers might want different storage classes for the buffer/variable of the same lifetime. For example, StorageBuffer in Vulkan vs. CrossWorkgroup in OpenCL. So this puts up a new pass to let the user control how to map MemRef memory spaces into SPIR-V storage classes. This provides more flexibility and can address the awkwardness in the current SPIR-V type converter. This pass should be the preliminary step towards lowering MemRef related types/ops into SPIR-V. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D130317
2022-08-05 | [InstSimplify] make uses of isImpliedCondition more efficient (NFCI) | Sanjay Patel
As suggested in the post-commit comments for 019d76196f79fcff3c148, this makes the usage symmetric with the 'and' patterns and should be more efficient.
2022-08-05 | [SVE] Expand DUPM patterns to handle all integer vector types. | Paul Walker
NOTE: i8 vector splats are ignored because the immediate range of DUP already has full coverage. Differential Revision: https://reviews.llvm.org/D131078
2022-08-05 | tsan: fix bug in shadow reset introduced in D128909 | Than McIntosh
Correct a bug in the code that resets shadow memory introduced as part of a previous change for the Go race detector (D128909). The bug was that only the most recently added shadow segment was being reset, as opposed to the entire extent of the segment created so far. This fixes a bug identified in Google internal testing (b/240733951). Differential Revision: https://reviews.llvm.org/D131256
2022-08-05 | [InstSimplify] use isImpliedCondition() instead of semi-duplicated code | Sanjay Patel
We get a couple of improvements from recognizing swapped operand patterns that were not handled by the replicated code. This should also enable simplifying larger patterns as seen in issue #56653 and issue #56654, but that requires enhancements to isImpliedCondition() itself.
2022-08-05 | [x86] add tests for bitwise logic of funnel shifts; NFC | Filipp Zhinkin
Baseline tests for D130994
2022-08-05 | Revert "[compiler-rt][CMake] Enable TF intrinsics on powerpc32 Linux" | Nikita Popov
As mentioned in https://reviews.llvm.org/D121379#3690593, this change broke the build of compiler-rt targeting powerpc using GCC. The 32-bit powerpc target is not supposed to emit 128-bit libcalls -- if it does, then that's a backend bug and needs to be fixed there. This reverts commit 8f24a56a3a9363f353c8da318d97491a6818781d. Differential Revision: https://reviews.llvm.org/D130988
2022-08-05 | [libc] Implement sincosf function correctly rounded to all rounding modes. | Tue Ly
Refactor common range reductions and evaluations for sinf, cosf, and sincosf. Added exhaustive tests for sincosf.
Performance before the patch:
```
System LIBC reciprocal throughput : 30.205
LIBC reciprocal throughput : 30.533
System LIBC latency : 67.961
LIBC latency : 61.564
```
Performance after the patch:
```
System LIBC reciprocal throughput : 30.409
LIBC reciprocal throughput : 20.273
System LIBC latency : 67.527
LIBC latency : 61.959
```
Reviewed By: orex Differential Revision: https://reviews.llvm.org/D130901
2022-08-05 | [AMDGPU] Remove unused MIMG tablegen variants | Mirko Brkusanin
There are no AMDGPUSampleVariant versions for _G16; it is treated more like a modifier for derivatives (_D) (also for intrinsics, where it is an overloaded type instead of part of the intrinsic name), so we ended up making more variants for these instructions than we actually needed. 32-bit derivatives need 6 dwords at most, while 16-bit need 4 at most. Using the same AMDGPUSampleVariant for both, we ended up creating 2 more variants per instruction than were necessary. In total this deletes 260 unused tablegen records. Differential Revision: https://reviews.llvm.org/D131252
2022-08-05 | Removing redundant code; NFC | Aaron Ballman
The same predicate is checked on line 12962 just above the removed code.