llvmspirv pulldown #9479

Closed
wants to merge 617 commits into from

jsji
Contributor

@jsji jsji commented May 16, 2023

  • [Clang] Change default triple to LLVM_HOST_TRIPLE for the CUDA toolchain
  • [NFC][msan] Rename function parameter
  • [msan] Add pthread_*join_np interceptors
  • [flang][runtime] Initialize uninitialized pointer components
  • cmake: add missing dependencies on Attributes.inc
  • [BOLT][DWARF] Fix dwarf5-one-loclists-two-bases test
  • [flang] Procedure pointers are not descriptors
  • [AIX][Clang][K] Create -K Option for AIX.
  • [flang] Semantics for ISO_C_BINDING's C_LOC()
  • [NFC][sanitizer] Rename internal function
  • [NFC][HWASAN] Hide thread_list_placeholder
  • [NFC][HWASAN] Reformat the file
  • [NFC][HWASAN] Move HwasanThreadStartFunc
  • Revert "[CodeGen][ShrinkWrap] Split restore point"
  • [NFC][ASAN] Hide placeholder buffer
  • [NFC][HWASAN] Use InternalAlloc for ThreadStartArg
  • [flang] Prevent character length setting with dangling ac-do-variable.
  • Do not optimize debug locations across section boundaries
  • Remove -Wpacked false positive for non-pod types where the layout isn't directly changed
  • Fix for release notes (follow-up to D149182/a8b0c6fa)
  • [BOLT] Use MCInstPrinter in createRetpolineFunctionTag
  • [BOLT] Use opcode name in hashBlock
  • [OpenMP] Fix incorrect interop type for number of dependencies
  • [clang-format] Fix consecutive alignments in #else blocks
  • [BOLT][test] Fix retpoline-synthetic.test
  • [mlir] Replace None with std::nullopt in comments (NFC)
  • [Hexagon] Remove unused struct AlignVectors::Segment
  • [clang] Modernize LoopHint (NFC)
  • Add a new report_load_commands option to jGetLoadedDynamicLibrariesInfos
  • [llvm-profdata] ProfileReader cleanup - preparation for MD5 refactoring - 2
  • [ValutTracking] Use isGuaranteedNotToBePoison in impliesPoison
  • CMake: add missing dependency on intrinsics_gen
  • [GlobalISel] Implement commuting shl (add/or x, c1), c2 -> add/or (shl x, c2), c1 << c2
  • Set mayLoad = 1 for shift/rotate with a memory operand
  • [libc++][ranges] Implement the changes to vector from P1206 (ranges::to):
  • [LoongArch] Support fcc* (condition flag) registers in inlineasm clobbers
  • [AMDGPU] Recompute liveness in SIOptimizeExecMaskingPreRA
  • Revert "[ValutTracking] Use isGuaranteedNotToBePoison in impliesPoison"
  • Undo include order work-around in Regex.cpp
  • [flang][hlfir] Lower forall to HLFIR
  • [flang][hlfir] Lower WHERE to HLFIR
  • [flang][hlfir] Lower left-hand side vector subscripts to HLFIR
  • [gn build] Port 17bbb22
  • [clang][dataflow][NFC] Remove SkipPast param from getValue(const ValueDecl &).
  • [CodeGen] Only consider innermost cast for !heapallocsite
  • [clang] Evaluate non-type default template argument when it is required
  • fix stack probe lowering for x86_intrcc
  • [mlir] Fix missing dep on MLIRX86VectorTransforms
  • Reland "[mlir][mem2reg] Expose algorithm internals."
  • [mlir][transform] SplitHandleOp: add additional distribution options
  • [Sema] Lambdas are not part of immediate context for deduction
  • [bazel] Port BUILD rules for 92cc30a.
  • [C++20] [Modules] Handle modules visible relationship properly
  • [FuncSpec][NFC] Add an alias for InstructionCost.
  • [clangd] Support macro evaluation on hover
  • [tidy][IdentifierNaming] Fix crashes on non-identifiers
  • [FuncSpec][NFC] Rename cryptic variable to better describe it.
  • [VPlan] Use VPRecipeWithIRFlags for VPWidenGEPRecipe (NFCI).
  • [PartialInlining] Fix incorrect costing when IR has unreachable BBs
  • [libc++] Provide an assignment operator from pair<U, V> in C++03
  • [NFC][AMDGPU] Pre-commit test.
  • [clangd][NFX][FIX] Fix conflicting symbol name Expr
  • [libc++] Add assertions for potential OOB reads in std::sort
  • [OpenMP][libomptarget] Improve device info printing in NextGen plugins
  • Adopt Properties to store operations inherent Attributes in the Bufferization dialect
  • Adopt Properties to store operations inherent Attributes in the Complex dialect
  • Adopt Properties to store operations inherent Attributes in the ControlFlow dialect
  • Adopt Properties to store operations inherent Attributes in the DLTI dialect
  • Adopt Properties to store operations inherent Attributes in the EmitC dialect
  • Adopt Properties to store operations inherent Attributes in the GPU dialect
  • [mlir][mem2reg] Add mem2reg rewrite pattern.
  • [VPlan] Address missed suggestions from D149082.
  • [AggressiveInstCombine] folding load for constant global patterened arrays and structs by GEP-indices Differential Revision: https://reviews.llvm.org/D146622 Fixes clang x86 missed optimization for 2 dimension array accessed through always zero enum class llvm/llvm-project#61615
  • [IRGen] Change annotation metadata to support inserting tuple of strings into annotation metadata array.
  • [NFC][AMDGPU] Add option to test.
  • [HIP] Detect HIP for Ubuntu, Mint, Gentoo, etc.
  • libclc: clspv: fix fma, add vstore and fix inlining issues
  • [flang][openacc] Lower if_present clause correctly on acc update
  • [mlir][openacc] Cleanup acc.enter_data from old data clause operands
  • [flang] Ensure pointer components are always established
  • [mlir][python] Allow specifying block arg locations
  • [clang-tidy] Fix bugprone-assert-side-effect to actually give warnings
  • [OpenMP][libomptarget] Init device when printing device info
  • [lldb] Simplify predicates of find_if in BroadcastManager
  • [clang][deps] Teach dep directive scanner about _Pragma
  • [clang] Fix initializer_list matching failures with modules
  • [flang][openacc] Lower self clause on acc update as host clause
  • [AArch64] Remove global constructors from AArch64Disassembler.cpp.
  • Revert "[AggressiveInstCombine] folding load for constant global patterened arrays and structs by GEP-indices Differential Revision: https://reviews.llvm.org/D146622 Fixes clang x86 missed optimization for 2 dimension array accessed through always zero enum class llvm/llvm-project#61615"
  • [lldb] Simplify Log::PutString (NFC)
  • [M68k] Register MIR Passes with the PassRegistry
  • Do not link asan_rtl_x86_64.S for non x86_64 platforms.
  • [Clang][Sema] Fix comparison of constraint expressions
  • [mlir][openacc] Cleanup acc.exit_data from old data clause operands
  • [BOLT][DWARF][NFC] Fixed an assertion check
  • [CodeGen][KCFI] Move cfi-type lowering to TargetLowering
  • [flang][openacc] Fix lowerbound when there is no subscripts
  • Fix LLVM sphinx build
  • Wrap debug code with the LLVM_DEBUG macro; NFC
  • Revert "[flang][openacc] Fix lowerbound when there is no subscripts"
  • [clang] Prevent creation of new submodules in ASTWriter
  • Fix test failure from 945f6e6
  • Further amend 945f6e6
  • [mlir][openacc] Cleanup acc.data from old data clause operands
  • PrologEpilogInserter: Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds
  • [flang][hlfir] Allow passing null() to dummy class argument.
  • [flang] Added missing type cast for the implied-do index.
  • [lldb][NFCI] Remove custom dwarf LEB128 types
  • [FuzzMutate] Module size heuristics
  • [flang][openacc] Fix lowerbound when there is no subscripts
  • [ARM] ARMMachObjectWriter::recordRelocation: reduce strength on a condition
  • [clang-format] Put a "trailing" space back in a unit test
  • [VPlan] Add printing test with fast-math flags.
  • [Driver] Add -dumpdir and change -gsplit-dwarf .dwo names for linking
  • [gn build] Manually port e956974
  • [mlir][openacc] Cleanup acc.parallel from old data clause operands
  • [test] [llvm-config] Assume unix style lib names on mingw targets
  • [clang] [test] Narrow down MSVC specific behaviours from "any windows" to only MSVC/clang-cl
  • [clang-tidy] [test] Narrow down a special case to MSVC mode
  • Revert "[mlir][python] Allow specifying block arg locations"
  • [mlir][gpu] Reduction ops canonicalizatios
  • [scudo] Change secondary StatsAllocated update
  • [NFC] Refactor loop metadata movement
  • [mlir] Remove unused variable 'kCopyFlag' in OpenACCToLLVMIRTranslation.cpp (NFC)
  • [KCFI] Expand the KCFI term in comments (NFC)
  • [Driver][test] Exclude -o /dev/null test for Windows
  • [nfc] Remove dead code from ObjectFileMachO
  • Re-revert "[ValueTracking] Use knownbits interface for determining if div/rem are safe to speculate"
  • [clangd] Fix a build failure. NFC
  • Adopt Properties to store operations inherent Attributes in the Index dialect
  • Adopt Properties to store operations inherent Attributes in the IRDL dialect
  • Adopt Properties to store operations inherent Attributes in the Math dialect
  • Adopt Properties to store operations inherent Attributes in the Memref dialect
  • Adopt Properties to store operations inherent Attributes in the MLProgram dialect
  • Adopt Properties to store operations inherent Attributes in the NVGPU dialect
  • Revert "[test] [llvm-config] Assume unix style lib names on mingw targets"
  • [TableGen] Fix null pointer dereferences in TreePattern::ParseTreePattern()
  • [DebugInfo] add test case for D147506, NFC
  • [DebugLine] save one debug line entry for empty prologue
  • [NFC][HWASAN] replace redundant calls to IRBuilder::get*Ty() with saved types
  • [RISCV] Make Zvfh depend on Zfhmin.
  • [docs] [C++20] [Modules] Remove the section 'Source content consistency'
  • fix bot failure https://lab.llvm.org/buildbot/#/builders/38/builds/11709
  • [spirv][math] Fix sign propagation for math.powf conversion
  • [X86] Add test cases for fminimum/fmaximum with vector zero operands.
  • [mlir][Linalg] Avoid collapsing dimensions of linalg op that arent foldable.
  • [LegalizeTypes] Use ISD::isTrueWhenEqual to simplify code. NFC
  • [lli] Improve support for MinGW by implementing __main as a no-op.
  • [LegalizeTypes] Simplify code for UndefinedBooleanContent in PromoteIntOp_VECREDUCE.
  • [lli] Honor -mtriple option in -jit-kind=orc mode.
  • Revert "Revert "[ValutTracking] Use isGuaranteedNotToBePoison in impliesPoison""
  • TableGen: Fix missing C++ mode comments
  • GlobalISel: Fold out G_FPTRUNC(G_FPEXT)
  • AMDGPU: Add baseline tests for fmed3 shrinking combine
  • [CombinerHelper] Fix gcc warning [NFC]
  • [test] Clean up Driver/check-time-trace*
  • [SimpleLoopUnswitch] unswitch selects
  • Reland "[PowerPC] Add target feature requirement to builtins"
  • [NFC] [C++20] [Modules] Refactor Sema::isModuleUnitOfCurrentTU into Decl::isInCurrentModuleUnit
  • [libc] Add optimized memcpy for RISCV
  • [SystemZ] Bugfix in expansion of memmem operations.
  • Revert "[SystemZ] Bugfix in expansion of memmem operations."
  • [AMDGPU][MC] Clean up DPP bound_ctrl handling
  • [C++20] [Modules] Don't generate unused variables in other module units even if its initializer has side effects
  • [NFC] [C++20] [Modules] Code cleanups when checking modules in ADL
  • [lldb][NFCI] Remove n^2 loops and simplify iterator usage
  • [clangd] Initialize clang-tidy modules only once
  • [gn build] Port 62a090f
  • Reapply "[SystemZ] Bugfix in expansion of memmem operations."
  • [tidy] Expose getID to tidy checks
  • [MLIR][LLVM] Support inlining of LLVM atomic operations.
  • Add -no-canonical-prefixes to test that matches the binary name
  • [AArch64] Emit FNMADD instead of FNEG(FMADD)
  • Revert "[SimpleLoopUnswitch] unswitch selects"
  • clang-format: [JS] support import/export type
  • Adopt Properties to store operations inherent Attributes in the OpenACC dialect
  • Adopt Properties to store operations inherent Attributes in the OpenMP dialect
  • Adopt Properties to store operations inherent Attributes in the PDL dialect
  • Adopt Properties to store operations inherent Attributes in the PDLInterp dialect
  • Adopt Properties to store operations inherent Attributes in the Quant dialect
  • Add support of the next Ubuntu (Ubuntu 23.10 - Mantic Minotaur)
  • Revert "[clang] [test] Narrow down MSVC specific behaviours from "any windows" to only MSVC/clang-cl"
  • [RISCV] Enable signed truncation check transforms for i8
  • [flang][openacc] Preserve user order for entry data operand on data construct
  • [mlir][openacc][NFC] Add missing check lines for acc.update tests
  • [flang][openacc] Preserve user order for entry data operand on compute construct
  • [mlir][openacc] Cleanup acc.serial from old data clause operands
  • [mlir][openacc] Cleanup acc.kernels from old data clause operands
  • AMDGPU: Add basic gfx941 target
  • AMDGPU: Add basic gfx942 target
  • AMDGPU: Factor out GFX9.4 common features into a feature set
  • Revert "[lli] Honor -mtriple option in -jit-kind=orc mode."
  • [libc++][PSTL] Move the remaining configuration into __config
  • [gn build] Port f041b34
  • [Object] Fix handling of Elf_Nhdr with sh_addralign=8
  • [SLP][NFC] Rename a couple of variables and replace an if-else with an std::min
  • Revert "[SCEV] Replace IsAvailableOnEntry with block disposition"
  • libclang: declare blocks interfaces always
  • [RISCV] Remove redundant F and D extension implication from V. NFC
  • [libcxx] Fix pstl _init identifier after 9c4717a
  • libclang: add missing struct in the declaration
  • [RISCV] Add support for V extenstion in SiFive7
  • [clang][modules] Avoid unnecessary writes of .timestamp files
  • fix typos to cycle bots
  • Reland [clang] Make predefined expressions string literals under -fms-extensions
  • Revert "[RISCV][InsertVSETVLI] Avoid VL toggles for extractelement patterns"
  • [lldb][NFCI] Replace dw_form_t with llvm::dwarf::Form
  • [CodeGen] Fix nomerge attribute not working in tail calls.
  • [PseudoProbe] Clean up dwarf discriminator and avoid duplicating factor.
  • [PseudoProbe] Encode/Decode FS discriminator
  • [FS-AFDO] Generate pseudo-probe-based profiles with FS-discriminators.
  • [FS-AFDO] Load pseudo probe profile on MIR
  • [Verifier] Allow DW_OP_LLVM_entry_value in IR
  • [Corosplit] Prepend entry_value in swift async dbg values
  • Reapply "[RISCV][InsertVSETVLI] Avoid VL toggles for extractelement patterns"
  • [LangRef] Fix sphinx label syntax
  • [Libomptarget] Fix AMDGPU Note handling after D150022
  • [lldb] Mark most SBAPI methods involving private types as protected or private
  • [libc++][NFC] Fix slightly incorrect instructions for testing with Ninja
  • [Flang] Syntax support for OMP Allocators Construct
  • [Headers][doc] Add "shift" intrinsic descriptions to avx2intrin.h
  • [scudo] Skip releaseToOSMaybe if there's no byte in freelist
  • [scudo] Drain caches when release with M_PURGE_ALL
  • [libc++][NFC] Remove duplicate declaration of __iter_value_type
  • [scudo] Lock FallbackTSD before draining it
  • [TableGen] Print message about dropped patterns with -debug
  • [Hexagon] Add patterns for bspap/bitreverse for scalar vectors
  • [MemProf] Update hot/cold information after importing
  • Revert "[RISCV] Fix extract_vector_elt on i1 at idx 0 being inverted"
  • [FS-AFDO] Fix a pseudo probe test issue.
  • When the Debugger runs HandleProcessEvent it should allow selecting the "Most relevant" frame.
  • [libc++] Consistently enable __CORRECT_ISO_CPP_WCHAR_H_PROTO in mbstate.
  • [libc++][PSTL] Add missing includes to PSTL headers
  • Adopt Properties to store operations inherent Attributes in the SCF dialect
  • Adopt Properties to store operations inherent Attributes in the Shape dialect
  • Adopt Properties to store operations inherent Attributes in the SparseTensor dialect
  • Adopt Properties to store operations inherent Attributes in the SPIRV dialect
  • Adopt Properties to store operations inherent Attributes in the Tensor dialect
  • [libc] Prevent changing ownership of the port once opened
  • [FS-AFDO] Do not load non-FS profile in MIR loader.
  • [libc][rpc] Allocate locks array within process
  • Prioritize using a segment with the name TEXT instead off fileoff 0
  • [libc] Fix RPC interface when sending and recieving aribtrary packets
  • [ORC-RT] Add REQUIRES: jit-compatible-osx-swift-runtime to testcase.
  • [OpenMP][Flang][Semantics] Add semantics support for USE_DEVICE_ADDR clause on OMP TARGET DATA directive.
  • [mlir][Linalg] Use ReifyRankedShapedTypeOpInterface for pad transforms.
  • Remove accidentally committed empty file
  • [libc][rpc] Allocate a single block of shared memory instead of three
  • [C++] Don't filter using declaration when we perform qualified look up
  • [AIX] enable enable OrcCAPITest, NFC
  • [clang][CodeGenPGO] Don't use an invalid index when region counts disagree
  • Support critical edge splitting for jump tables
  • [X86] Add lowering of fminimum/fmaximum for vector operands.
  • [NFC][AMDGPU] Pre-commit test.
  • [mlir][doc] Fix the EBNF description of mlir syntax in language reference doc
  • [libc] Allows cross compilation of membenchmarks
  • [libc][benchmark] Do not force static linking
  • [AArch64] Handle vector with two different values with efficient vector mask
  • [libc++] Adjust tests using ext/* headers that undefine __DEPRECATED
  • [MachineFunction][DebugInfo][nfc] Introduce EntryValue variable kind
  • [Flang] Change complex divide lowering
  • [mlir][bytecode] Fix dialect version parsing.
  • [CodeGen][ShrinkWrap] Split restore point
  • [mlir][llvm] Improve lookups in LLVM IR import (NFC).
  • [mlir][llvm] Improve LLVM IR constant import.
  • Fix CRTP partial specialization instantiation crash.
  • [libc][NFC] Clean up some code in the RPC implementation.
  • Fixed NATVIS debug visualizers for LLVM
  • Fixed NATVIS debug visualizers for Clang
  • [libc][obvious] Fix undefined variable after name change
  • [YamlMF] Serialize EntryValueObjects
  • llvm/lib: Use explicitly since D146395 has hidden errno
  • [flang] Use internal linkage for string literals
  • [LAA/LV] Simplify stride speculation logic [NFC]
  • [flang][hlfir] Establish <storage, mustFree> tuple for ApplyOp and NoReassocOp.
  • [LV] Use VPValue to get expanded value for SCEV step expressions.
  • [WPD] Update llvm.public.type.test after importing functions
  • Revert "[LAA/LV] Simplify stride speculation logic [NFC]"
  • [libc] Implement a generic streaming interface in the RPC
  • [LAA/LV] Simplify stride speculation logic [NFC] (try 2)
  • [mlir][flang][openacc] Remove obsolete operand legalization passes
  • [mlir][sparse] add util for ToCoordinatesBuffer for COO AoS
  • [mlir][Linalg] Add support for lowerPack on dynamic outer shapes.
  • [libc] Fix undeclared 'free' function in stream test
  • [mlgo] Fix reference files / values post - D140975
  • [LV/LAA] Use PSE to identify stride multiplies which simplify [mostly nfc]
  • [Propeller] Use a bit-field struct for the metdata fields of BBEntry.
  • [ShrinkWrap] Allow shrinkwrapping past memory accesses to jump tables
  • [LAA] Simplify identification of speculatable strides [nfc]
  • [BOLT] Fix flush pending relocs
  • [clang] Document extensions from later standards
  • [AArch64] Update Changed status in AArch64MIPeepholeOpt
  • [bazel][NFC] Add missing dep after 5ac48ef
  • [lldb-vscode] Fix handling of RestartRequest arguments.
  • [lldb] Correct elision of line zero in mixed disassembly
  • Relax test to not rely on the variable being optimized out
  • [ObjC][ARC] Fix non-deterministic behavior in ProvenanceAnalysis
  • [RISCV] RISCVELFTargetObjectFile: use 2-byte alignment for .text if RVC
  • [libc++][PSTL] Add more specialized backend customization points
  • [mlir][spirv] Remove duplicated tests in MemRefToSPIRV conversions
  • [LV] Reuse SCEV expansion results for epilogue vectorization.
  • [Clang] Respect -L options when compiling directly for AMDGPU
  • [mlir][spirv] NFC: Clean up MemRefToSPIRV tests with CSE
  • [VPlan] Remove dangling comment and newlines (NFC).
  • Remove outdated sentence in SourceBasedCodeCoverage.rst
  • [MLIR] Add InferShapedTypeOpInterface bindings
  • [NFC][sanitizers] Extract BlockSignals function
  • [gn build] Port 8e2d09c
  • [NFC][sanitizer] Add class to track thread arg and retval
  • We can't let GetStackFrameCount get interrupted or it will give the wrong answer. Plus, it's useful in some places to have a way to force the full stack to be created even in the face of interruption. Moreover, most of the time when you're just getting frames, you don't need to know the number of frames in the stack to start with. You just keep calling Thread::GetStackFrameAtIndex(index++) and when you get a null StackFrameSP back, you're done. That's also more amenable to interruption if you are doing some work frame by frame.
  • [flang] Inline array size call when dim is compile time constant
  • [mlir][openacc] Add host_data operation
  • [flang][hlfir] Fixed invalid fir.convert generated by AssociateOp codegen. Differential Revision: https://reviews.llvm.org/D150393
  • [mlir][spirv] Support sub-byte integer types in type conversion
  • [HWSAN] Use ThreadArgRetval in HWSAN
  • [flang][openacc][NFC] Update _OPENACC definition to 202011
  • [ADT][NFC] Fix compilation of headers under C++23
  • [bazel] fix bazel
  • [ASAN] Use ThreadArgRetval in ASAN
  • [IPO] Opt-in local clones for thinlto imports
  • [LSAN] Use ThreadArgRetval in LSAN
  • [HWASAN] Prevent crashes on thread exit
  • [NFC][LSAN] Move ThreadCreate into child thread
  • [SelectionDAG] Correct AddNodeIDCustom for MemIntrinsicSDNodes.
  • [RISCV] Fix crash if you use an immediate as part of a vtype operand list.
  • [NFC][AST] Return void from setUseQualifiedLookup
  • Declare _availability_version_check as weak_import instead of looking it up at runtime using dlsym
  • [gn] port c45ee7c
  • [NFC] Refactor SuffixTree to use LLVM-style RTTI
  • [SystemZ][z/OS] Save (and restore) R3 to avoid clobbering parameter when call stack frame extension is invoked
  • [CMake][fuzzer] Add riscv64 to fuzzer supported arch list
  • [NFC][xray] Initialize XRayFileHeader
  • [RISCV][NFC] Remove unused class defination.
  • [NFC][LLLexer] Consistently initialize *Val fields
  • [LoongArch] clang-format LoongArchISelLowering.cpp. NFC
  • Revert "[SystemZ][z/OS] Save (and restore) R3 to avoid clobbering parameter when call stack frame extension is invoked"
  • [LiveDebugValues] Temporarily initialize MLocTracker::CurBB
  • [NFC][LiveDebugValues] Clang-format b135df0
  • [MLIR][Memref] Remove unnecessary #include
  • This patch adds doc for __builtin_flt_rounds and __builtin_set_flt_rounds and also adds description for default fp environment.
  • [mlir][tosa] Fold exp(log) operation into no-op
  • [xray] Ignore -Wc++20-extensions in xray_records.h [NFC]
  • [NFC] SuffixTree: Move EmptyIdx into SuffixTreeNode and add a root allocator
  • [NFC] SuffixTree: Split out SuffixTreeNodes into their own files
  • [gn build] Port 6cf993e
  • [NFC] Tidy SuffixTree.h
  • SuffixTree: Don't save entire leaf nodes in advance()
  • [SimpleLoopUnswitch][reland 2] unswitch selects
  • [NFC] SuffixTree: Move advance() into SuffixTree.cpp + more cleanup
  • [lli] Add new testcases for lli.
  • [NFC][xray] Initialize XRayFileHeader
  • [ASAN][LSAN] Ignore main or uninitialized thead in pthread_exit
  • [test] Remove Python<3.3 workaround without shlex.quote
  • [DFSAN] Add support for strnlen
  • [Serialization] Don't try to complete the redeclaration chain in ASTReader after we start writing
  • [test] Use autogenerated assertions
  • Revert "[NFC][xray] Initialize XRayFileHeader" Revert "[xray] Ignore -Wc++20-extensions in xray_records.h [NFC]"
  • [RISCV] Fold (select setcc, setcc, setcc) into and/or instructions
  • [clang][analyzer] Cleanup tests of StdCLibraryFunctionsChecker (NFC)
  • [mlir][Linalg] NFC - fail gracefully instead of asserting in HoistPadding
  • [RISCV][CodeGen] Support Zhinx and Zhinxmin
  • [AArch64][SME2/SVE2p1] Add predicate-as-counter intrinsics for ptrue/cntp
  • [AArch64][SME2/SVE2p1] Add predicate-as-counter intrinsics for while*
  • [mlir] Move casting method calls to function calls
  • [mlir] Move casting calls from methods to function calls
  • [mlir] Update method cast calls to function calls
  • [AggressiveInstCombine] folding load for constant global patterened arrays and structs by GEP-indices Differential Revision: https://reviews.llvm.org/D146622 Fixes https://github.com/llvm/llvm-project/issues/61615 Reviewed By: nikic
  • [lldb] Don't write to source directory in test
  • AMDGPU: Fix issue in shl(or) combine
  • [mlir] Add timings to mlir translate.
  • [AsmPrinter] Use EntryValue object info to emit Dwarf
  • [lldb][nfc] Simplify DebugRanges class
  • [mlir][transform] TrackingListener: Allow existing ops as replacements
  • [SystemZ][z/OS] Save (and restore) R3 to avoid clobbering parameter when call stack frame extension is invoked
  • [AArch64] Add shrink-wrapping test with missing memoperands.
  • [mlir][linalg] Add channel-first variants of convolution
  • [mlir][NFC] Fix broken sidebar and improve documentation
  • Revert "[X86][AsmParser] Refactor code in AsmParser"
  • [X86][AsmParser] Refactor code and optimize more instructions from VEX3 to VEX2
  • [ShrinkWrap] Conservatively treat MIs without memory operands.
  • [mlir][memref] Lower copy of memrefs with outer size-1 dims to intrinsic memcpy.
  • Precommit test for D149873
  • DestinationPassingStyle: allow additional non-tensor results
  • [X86] narrowShuffle - only narrow from legal vector types
  • [clang][ci] Improves buildkite artifacts.
  • [clang] Restores some -std=c++2b tests.
  • [NFC][libc++][format] Uses uniform member signatures.
  • AMDGPU: Force sc0 and sc1 on stores for gfx940 and gfx941
  • [IRTranslator][DebugInfo] Implement translation of entry_value vars
  • [AMDGPU] Fix crash with 160-bit p7's by manually defining getPointerTy
  • [SelectionDAG][DebugInfo] Implement translation of entry_value vars
  • [DAGCombiner][AArch64][VE] Teach BuildUDIV/SDIV to use 2x mul when mulh/mul_lohi are not available.
  • [mlir][sparse] minor reorg of sparse tensor tablegen defs
  • [RISCV][llvm-mca] Add mca tests for riscv lmul instruments
  • [GlobalISel] Handle ptr size != index size in IRTranslator, CodeGenPrepare
  • Revert "[RISCV][llvm-mca] Add mca tests for riscv lmul instruments"
  • [mlir][Linalg] NFC - Retire dead FusionOnTensors.cpp
  • [mlir][Linalg] NFC - Retire dead tilePadOp
  • [RISCV] Fix typo in comment. NFC
  • [RISCV][llvm-mca] Add mca tests for riscv lmul instruments
  • [lldb-vscode] Skip restart tests on ARM
  • Fix libstdc++ data formatter for reference/pointer to std::string
  • [sanitizers] Remove assert from ThreadArgRetval::Finish
  • [RISCVGatherScatterLowering] Minor code cleanup [NFC]
  • [mlir][irdl] Add verification of IRDL ops
  • [mlir][gpu][sparse] add gpu ops for sparse matrix computations
  • [Driver] -ftime-trace: derive trace file names from -o and -dumpdir
  • Revert "[mlir][irdl] Add verification of IRDL ops"
  • [libc] Check the RPC server once again after the kernel exits
  • [OpenMP] Naturally align internal global variables in the OpenMPIRBuilder
  • [EarlyIfCvt] Don't if-convert if condition has only loop-invariant ops.
  • [libc++][ranges] Fix iota_view's constructor's incorrect constraint
  • [flang][hlfir] Fixed AssociateOp codegen for 0-dim variables.
  • [-Wunsafe-buffer-usage] Move the whole analysis to the end of a translation unit
  • Add additional criteria for hoisting vector.transfer_reads
  • profilie inference changes for stale profile matching
  • [lldb][NFCI] Redefine dw_attr_t typedef with llvm::dwarf::Attribute
  • [RISCVGatherScatterLowering] Remove restriction that shift must have constant operand
  • [lldb][NFCI] Change return type of DWARFDebugInfoEntry::GetAttributes
  • [PowerPC] Adjust tests after e351b9b.
  • [RISCVGatherScatterLowering] Support shl in non-recursive matching
  • [test] Fix ftime-trace.cpp on Windows
  • [lldb][NFCI] Delete commented out method OptionValueProperties::GetQualifiedName
  • [RISCV] Move VFMADD_VL DAG combine to a function. NFC
  • [AMDGPU][GFX908] IndirectCopyToAGPR: Confirm modified register is dst reg of accvgpr_write
  • [libc++][PSTL] Move the already implemented functions to the new dispatching scheme
  • [lldb][NFCI] Replace use of DWARFAttribute in DWARFAbbreviationDecl
  • [mlir][tosa] Add accumulator type attribute to TOSA dialect
  • [flang] Fixed global name creation for literal constants.
  • [llvm-profdata] ProfileReader cleanup - preparation for MD5 refactoring - 3
  • [AArch64] Add test for #62620.
  • [test][sanitizers] Disable new test on Android
  • ObjCopy: support --dump-section on COFF
  • [-Wunsafe-buffer-usage] Remove an unnecessary const-qualifier
  • Fix mlir trait documentation typo
  • [HWASan] unflake test
  • [OpenMP] Fix GCC build issues and restore "Additional APIs used by the MSVC compiler for loop collapse (rectangular and non-rectangular loops)"
  • [llvm] Migrate {starts,ends}with_insensitive to {starts,ends}_with_insensitive (NFC)
  • [OpenMP] remove an erroneous assert on the location argument
  • [MemProf] Set hot/cold new values with option
  • [AMDGPU] Emit predefined macro __AMDGCN_CUMODE__
  • ASan: add testcase for backtrace interceptor
  • Revert "[RISCVGatherScatterLowering] Minor code cleanup [NFC]"
  • Revert "[X86][AsmParser] Refactor code and optimize more instructions from VEX3 to VEX2"
  • [LV] Use interface routines instead of internal variables
  • [mlir][openacc] Add canonicalization pattern for acc.host_data
  • [SuffixTree] Add suffix tree statistics
  • Revert "[SuffixTree] Add suffix tree statistics"
  • Reapply "[RISCVGatherScatterLowering] Minor code cleanup [NFC]"
  • [RISCVGatherScatterLowering] Use InstSimplifyFolder
  • [libc][math] Implement fast division / modulus for UInt / (uint32_t * 2^e).
  • [X86][AsmParser] Reapply "Refactor code and optimize more instructions from VEX3 to VEX2"
  • Replace None with std::nullopt in comments (NFC)
  • Add 'REQUIRES: asserts' to test added in D150002 (53a4adc) because it tests for a crash that is caused by an assertion failure.
  • [Clang][LoongArch] Add GPR alias handling without $ prefix
  • ASan: add backtrace_symbols test and clarify code is correct
  • ASan: unbreak Windows build by limiting backtrace* tests to glibc
  • [clang] Fix typos in documentation
  • [clang-tidy] Modernize RangeDescriptor (NFC)
  • workflows/repo-lockdown: Ignore libcxx and related sub-directories
  • [ELF] Remove remnant ranks for PPC64 ELFv1 special sections
  • github: Remove pull request template
  • Revert "[RISCV][llvm-mca] Add mca tests for riscv lmul instruments"
  • workflows/release-tasks: Remove stray backslash
  • docs: Document procedure for updating pull requests
  • [RISCV] Teach doPeepholeMaskedRVV to handle FMA instructions.
  • [gn build] Port b97859b
  • [llvm] Fix typos in documentation
  • [ELF] Simplify getSectionRank and rewrite comments
  • [test] Driver/ftime-trace.cpp: work around -Wmsvc-not-found
  • [Matrix] Add shape verification.
  • [Clang][Docs] Fix man page build
  • [llvm-exegesis] Remove Assembler Tests
  • [Docs][llvm-exegesis] Specify supported platforms and architectures
  • [LV] Move getVScaleForTuning out of LoopVectorizationCostModel (NFC).
  • [NFC][libc++][format] Tests formatter requirements.
  • [llvm-jitlink] Pass object features when creating MCSubtargetInfo
  • Reland "[CMake] Bumps minimum version to 3.20.0."
  • [MLIR] NFC. Pass affine copy options by const ref
  • [VPlan] Change LoopVectorizationPlanner::TTI to be const reference (NFC)
  • [LV] Move selecting vectorization factor logic to LVP (NFC).
  • [gn] port 88c1242 (begone, LLVMExegesisARMTests)
  • [Clang][CMake] Use perf-training for Clang-BOLT
  • [cmake] Disable GCC lifetime DSE
  • [X86] Add tests for inverting (x * (Pow2_Ceil(C1) - (1 << C0))) & C1 -> (-x << C0) & C1; NFC
  • [X86] Invert transforming (x * (Pow2_Ceil(C1) - (1 << C0))) & C1 -> (-x << C0) & C1
  • [InstCombine] Add simplifications for div/rem with i1 operands; PR62607
  • [SelectionDAG] Limit max recursion in isKnownNeverZero and isKnownToBeAPowerOfTwo
  • [SelectionDAG] Use computeKnownBits if Op is not recognized by isKnownNeverZero
  • [Docs] Minor Fixups in Advanced Builds Documentation
  • ASan: fix potential use-after-free in backtrace interceptor
  • [libc++][NFC] Use _LIBCPP_STD_VER instead of __cpp_lib_atomic_is_always_lock_free
  • MCSymbol: Split FragmentAndHasName to Fragment and HasName
  • Revert "[cmake] Disable GCC lifetime DSE" (to fix authorship)
  • [cmake] Disable GCC lifetime DSE
  • [M68k] Update divide-by-constant.ll after D150333.
  • [libc++][PSTL] Make the PSTL submodules only have one header
  • [LegalizeVectorOps][AArch64][RISCV][X86] Use OpVT for ISD::SETCC in LegalizeVectorOps.
  • Revert "[cmake] Disable GCC lifetime DSE"
  • [IntervalTree] Initialize find_iterator::Point
  • [test][sanitizer] Disable create_thread_loop on Android
  • [Coverity] Fix unchecked return value, NFC
  • [MLIR] NFC. Make affine analysis utils method const correct
  • [MLIR] NFC. Add missing const on affine analysis utils methods
  • [X86] Improve handling on zero constant for fminimum/fmaximum lowering
  • [Coverity] Fix unchecked return value, NFC
  • [X86] Fix the bug of pr62625
  • [LV] Add test case for #51677.
  • [clang] Convert a few tests to opaque pointers
  • [libc++] Moves unwrap_reference to type_traits.
  • [MC][X86] Fix != result for two register operands
  • [MC] Remove redundant classof definitions for MCTargetDesc's derived classes
  • Revert "[LV] Add test case for #51677."
  • [gn build] Port b793280
  • [Matrix] Remove redundant transpose with dot product lowering.
  • [clang-tidy][test] Add trailing -- to suppress compile_commands.json read
  • [AArch64] Update FP16 vector cmp costs
  • [NFC][Clang] Fix Coverity issues of copy without assign
  • [NFC][CLANG] Fix Static Code Analysis Concerns
  • [lldb] Complete OptionValue cleanup (NFC)
  • Revert "[Serialization] Don't try to complete the redeclaration chain in"
  • [lldb] Cleanup OptionValue header and implementation (NFC)
  • [cmake] Disable GCC lifetime DSE
  • [clang][dataflow] Eliminate SkipPast::ReferenceThenPointer.
  • [RISCV] Add test cases for forming vfwmacc when widening from f16 to f64. NFC
  • [RISCV] Add RISCVISD nodes for VWFMADD_VL.
  • [AIX][tests] XFAIL -ftime-trace test for now
  • [Driver][test] Add -fintegrated-as after D150282
  • [AMDGPU][MC] Don't accept attr > 32 for param_load
  • [AMDGPU] Improve PHI-breaking heuristics in CGP
  • [IR] Drop const in DILocation::getMergedLocation
  • [NFC] Refactor GlobalVariable Ctor
  • [libc] Add optimized memset for RISCV
  • [mlir][transform] Use TrackingListener-aware iterator for getPayloadOps
  • [TableGen][SubtargetEmitter] Add the StartAtCycles field in the WriteRes class.
  • [docs] Add Python coding standard to documentation
  • [LLD][ELF] Add missing program header parsing to OVERLAY
  • [flang] add hlfir.any intrinsic
  • [flang] lower any intrinsic to hlfir.any operation
  • [flang][hlfir] lower hlfir.any into fir runtime call
  • [X86] LowerRotate: prefer unpack-based algorithm
  • [RegScavenger] Simplify forward(MachineBasicBlock::iterator). NFC.
  • [VPlan] Use VPRecipeWithIRFlags for VPReplicateRecipe, retire poison map
  • [clang][dataflow] Don't analyze templated declarations.
  • [libc] Make the bump pointer explicitly return null on buffer overrun
  • [libc] Cache ownership of the shared buffer in the port
  • [clang][parser] Fix namespace dropping after malformed declarations
  • [clangd] Fix fixAll not shown when there is only one unused-include and missing-include diagnostics.
  • [mlir][bufferization] Fix unknown ops in BufferViewFlowAnalysis
  • [mlir][IR][tests] Fix incorrect API usage in RewritePatterns
  • [ValueTracking] Fix computeKnownFPClass with canonicalize
  • [Pipelines] Don't skip GlobalDCE in ThinLTO pre-link
  • Fix build error caused by https://reviews.llvm.org/D149718
  • [AMDGPU] Simplify liveins in some MIR tests
  • [mlir][bufferization] Improve findValueInReverseUseDefChain signature
  • clang-format: [JS] terminate import sorting on export type X = Y
  • [mlir][bufferization] Add option to dump alias sets
  • [mlir][scf][bufferize] Fix bug in WhileOp analysis verification
  • [libc++][PSTL] Implement std::transform
  • [unittests][llvm-exegesis] Remove build warnings [NFCI]
  • Revert "[libc++][PSTL] Implement std::transform"
  • [ConstantFold] use StoreSize for VectorType folding Differential Revision: https://reviews.llvm.org/D150515 Reviewed By: nikic
  • [AArch64] Add test case where widening mull could be used.
  • [X86] Use the CFA as the DWARF frame base for better variable locations around calls.
  • [mlir] allow repeated payload in structured.fuse_into_containing
  • [MLIR][ROCDL] add gpu to rocdl erf support
  • [LLVM][Uniformity] Propagate temporal divergence explicitly
  • Update __cplusplus for C++23, add C++23 diag group alias.
  • [KnownBitsTest] Align with ConstantRange test infrastructure (NFC)
  • [OpenMP] Implement task record and replay mechanism
  • [mlir][memref] Extract isStaticShapeAndContiguousRowMajor as a util function.
  • [KnownBitsTest] Remove stray semicolons
  • [AIX][clang] Storage Locations for Constant Pointers
  • [mlir][sparse][gpu] first implementation of the GPU libgen approach
  • Revert "[X86] Use the CFA as the DWARF frame base for better variable locations around calls."
  • Add C++26 compile flags.
  • [clang][USR] Prevent crashes on incomplete FunctionDecls
  • [AArch64][CostModel] Add costs for fixed operations when using fixed vectors over SVE.
  • [AMDGPU] Trim zero components from buffer and image stores
  • [libc][NFC] Clean up the memory buffer handling for RPC
  • [Mips] Remove MipsRegisterInfo::requiresRegisterScavenging. NFC.
  • Fix test from b763d6a
  • [flang][hlfir] Fixed lowering for intrinsic calls with null() box argument.
  • [flang][hlfir] Fixed copy-in for polymorphic arguments.
  • [clang][AIX] Remove Newly Added Target Dependent Test Case
  • [test] Fix const-str-array-decay.cl failure on PowerPC
  • [libc++][PSTL] Implement std::transform
  • [gn build] Port 6851d07
  • [mlir] Fix a warning
  • [SLP][NFC] Cleanup: Separate vectorization of Inserts and CmpInsts.
  • [clang] Convert a few OpenMP tests to use opaque pointers
  • Fix build failure caused by https://reviews.llvm.org/D150352
  • Enable frame pointer for all non-leaf functions on riscv64 Android
  • [libc++][PSTL] Implement std::copy{,_n}
  • [libc++][docs] Move the pre-release check-list
  • [gn build] Port b049fc0
  • [flang][runtime] Fixed dimension offset computation for MayAlias.
  • [flang][runtime] Fixed memory leak in Assign().
  • Revert "[AIX][tests] XFAIL -ftime-trace test for now"
  • [mlir][sparse][gpu] end-to-end integration test of GPU libgen approach
  • Revert "[libc++][PSTL] Implement std::copy{,_n}"
  • [LLD][ELF] change CHECK to CHECK-NEXT in overlay-phdr.test NFCI
  • [libc++] Implement ranges::starts_with
  • Fix global-variable-alignment.ll test
  • Update SemaSYCL/intel-fpga-loops.cpp after "[Sema] Lambdas are not part of immediate context for deduction" (#13094)
  • Add option to control builtin format for reverse translation (#1986)
  • Support the spirv.BufferSurfaceINTEL target extension type (#1995)
  • Add nonsemantic-shader-100/200 to X86 tests (#2005)

d0k and others added 30 commits May 12, 2023 12:26
The code is doing the optimization:
`((a | c1) << c2)` ==> `(a << c2) + (c1 << c2)`
But this is only valid if `a` and `c1` have no common bits being set.
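The precondition can be checked with a small sketch. This is plain Python on 32-bit wrapped integers, not the LLVM code; the helper names `shl`, `shifted_or`, and `shifted_sum` and the bit width are assumptions made for illustration. It relies on the identity `a | c1 == a + c1 - (a & c1)`, so the two forms differ by `(a & c1) << c2`.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1


def shl(x: int, c: int) -> int:
    """Left shift with wraparound to WIDTH bits."""
    return (x << c) & MASK


def shifted_or(a: int, c1: int, c2: int) -> int:
    """Original form: ((a | c1) << c2)."""
    return shl(a | c1, c2)


def shifted_sum(a: int, c1: int, c2: int) -> int:
    """Rewritten form: (a << c2) + (c1 << c2)."""
    return (shl(a, c2) + shl(c1, c2)) & MASK


# Disjoint bits: a & c1 == 0, so OR behaves like ADD and the forms agree.
DISJOINT = (0b1010, 0b0101, 3)
# Overlapping bits: bit 2 is set in both operands, so the sum double-counts it.
OVERLAP = (0b0110, 0b0100, 1)

print(shifted_or(*DISJOINT) == shifted_sum(*DISJOINT))
print(shifted_or(*OVERLAP) == shifted_sum(*OVERLAP))
```

The overlapping case is the kind of input for which the rewrite would change the result.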

Differential Revision: https://reviews.llvm.org/D150246
The revision adds basic timing to the mlir-translate tool.

Reviewed By: Dinistro

Differential Revision: https://reviews.llvm.org/D150434
This patch consumes the EntryValueObjects in a MachineFunction's table, using
them to emit the appropriate debug information for these variables.

Depends on D149880

Differential Revision: https://reviews.llvm.org/D149881
Most of the code changed here dates back to 2010, when LLDB was first
introduced upstream, as such it benefits from a slight cleanup.

The method "dump" is not used anywhere nor is it tested, so this commit removes
it.

The "findRanges" method returns a boolean which is never checked and indicates
whether the method found anything/assigned a range map to the out parameter.
This commit folds the out parameter into the return type of the method.

A handful of typedefs were also never used and therefore removed.

Differential Revision: https://reviews.llvm.org/D150363
The TrackingListener was unnecessarily strict. Existing ops are now allowed when updating payload ops mappings due to `replaceOp` in the TrackingListener.

Differential Revision: https://reviews.llvm.org/D150429
…hen call stack frame extension is invoked

When the stack frame extension routine is used, the contents of r3 are overwritten.
However, if r3 is live in the prologue (i.e. one of the function's parameters
resides in r3), it needs to be saved. We save r3 in r0 if r0 is available
(i.e. r0 is not used as temporary storage for r4), and in the corresponding
stack slot for the third parameter otherwise.

Differential Revision: https://reviews.llvm.org/D150332

Reviewed By: uweigand
The newly added compiler_pop_stack_no_memoperands has no memory operands
on the memory instructions but accesses the same locations as
compiler_pop_stack. At the moment, accesses to the stack are missed by
shrink-wrapping. Test case for the issue pointed out by @jpenix-quic in
D149668 post-commit.
This change adds the following three operations and unit tests for them:

- conv_3d_ncdhw_fcdhw
- depthwise_conv_1d_ncw_cw
- depthwise_conv_3d_ncdhw_cdhw

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D150054
- Added missing TensorTransformOps to the Transform doc
- Added missing AMDGPUPasses to the Passes doc
- Place `async dialect` in alphabetical order in the Passes doc

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D150341
This reverts commit 8d657c4.

Reverts it due to the regression reported in D150068.
…X3 to VEX2

1. Share code `optimizeInstFromVEX3ToVEX2` with MCInstLower
2. Move the code of optimization for shift/rotate to a separate file
3. Since the function is shared, a side effect is that more encoding
   optimizations are done on the Asmparser side. Considering we already
   use reverse-encoding for optimization in AsmParser before this patch,
   I believe the change is positive and expected.

This is a reland of D150068 with the fix D150440.
As pointed out by @jpenix-quic in D149668 post-commit, machine instructions
without memory operands need to be treated conservatively.
…sic memcpy.

With this change, more `memref.copy` will be lowered to the efficient `memcpy`. For example,

```
memref.copy %subview, %alloc : memref<1x576xf32, strided<[704, 1]>> to memref<1x576xf32>
```

Differential Revision: https://reviews.llvm.org/D150448
Change-Id: I608f14ac3a504cc668f93f130a17dea3950fa554
Also some simplifications:

* `outputBufferOperands` was unused.
* The condition that the number of operands equals the number of inputs
  plus the number of inits seemed vacuously true (?).

Differential Revision: https://reviews.llvm.org/D150376
The financial cost of the network I/O for the Clang install artifacts is
quite significant. afd3478 improved this by creating tarballs. This
commit improves the tarball by using xz compression instead of gzip. This
option is the slowest, but gives the smallest size.

      size  time           time
            (compression)  (decompression)
gzip  51 M  7  s           1.2 s
bz2   44 M  17 s           5.8 s
xz    33 M  76 s           3.1 s
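
A comparison of this shape can be reproduced on a small scale with Python's standard `gzip`, `bz2`, and `lzma` modules. This is only a sketch: the synthetic `SAMPLE` payload and the `compare_codecs` helper are made up for illustration and are not the CI tooling, so the sizes and timings it prints will not match the table above.

```python
import bz2
import gzip
import lzma
import time


def compare_codecs(payload: bytes) -> dict:
    """Return {codec: (compressed size, compress seconds, decompress seconds)}."""
    codecs = {
        "gzip": (gzip.compress, gzip.decompress),
        "bz2": (bz2.compress, bz2.decompress),
        "xz": (lzma.compress, lzma.decompress),  # lzma defaults to the .xz format
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        start = time.perf_counter()
        packed = compress(payload)
        compress_s = time.perf_counter() - start

        start = time.perf_counter()
        restored = decompress(packed)
        decompress_s = time.perf_counter() - start

        # Each codec must round-trip the payload exactly.
        assert restored == payload
        results[name] = (len(packed), compress_s, decompress_s)
    return results


# Hypothetical stand-in for an install tree: many similar path-like lines.
SAMPLE = b"".join(b"install/bin/clang-%04d\n" % i for i in range(20000))
RESULTS = compare_codecs(SAMPLE)

for name, (size, compress_s, decompress_s) in RESULTS.items():
    print(f"{name:5s} {size:8d} B  {compress_s:.3f} s  {decompress_s:.3f} s")
```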

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D150062
These tests should have added -std=c++23 instead of replacing -std=c++2b
in D149553.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D150063
The newer formatters for (tuple, vector<bool>::reference) specify the
formatter's parse and format member function. This signature is slightly
different from the signature for existing formatters. Adapt the existing
formatters to the new style.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D150034
This commit implements IRTranslator lowering of dbg.declare intrinsics targeting
swiftasync Arguments, by putting them in the MachineFunction's table of
variables whose location doesn't change throughout the function.

Depends on D149881

Differential Revision: https://reviews.llvm.org/D149882
While pointers in address space 7 (128 bit rsrc + 32 bit offset)
should be rewritten out of the code before IR translation on AMDGPU,
higher-level analyses may still call MVT getPointerTy() and the like
on the target machine. Currently, since there is no MVT::i160, this
operation ends up causing crashes.

The changes to the data layout that caused such crashes were D149776.

This patch causes getPointerTy() to return the type MVT::v5i32
and getPointerMemTy() to be MVT::v8i32. These are accurate types,
but mean that we can't use vectors of address space 7 pointers during
codegen. This is mostly OK, since vectors of buffers aren't supported
in LPC anyway, but it's a noticeable limitation.

Potential alternative solutions include adjusting getPointerTy() to return
an EVT or adding MVT::i160 and MVT::i256, both of which are rather
disruptive to the rest of the compiler.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D150002
This commit implements SelectionDAG lowering of dbg.declare intrinsics targeting
swiftasync Arguments, by putting them in the MachineFunction's table of
variables whose location doesn't change throughout the function.

Depends on D149882

Differential Revision: https://reviews.llvm.org/D149883
…lh/mul_lohi are not available.

Correct the legality of i32 mul_lohi on AArch64.

Previously, AArch64 incorrectly reported i32 mul_lohi as Legal.
This allowed BuildUDIV/SDIV to use them. A later DAGCombiner would
replace them with MULHS/MULHU because only the high half was used.
This conversion does not check the legality of MULHS/MULHU under
the assumption that LegalizeDAG can turn it back into MUL_LOHI later.

After they are converted to MULHS/MULHU, DAGCombine ran and saw that
these operations aren't supported but an i64 MUL is. So they get
converted to that plus a shift. Without this, LegalizeDAG would
convert back MUL_LOHI and isel would fail to find a pattern.

This patch teaches BuildUDIV/SDIV to create the wide mul and shift
so that we can report the correct operation legality on AArch64. It
also enables div by constant folding for more cases on VE.
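
The "wide mul and shift" idea can be sketched outside the compiler. The following is plain Python, not SelectionDAG code; `mulhu32` and `udiv3` are names invented for this illustration. It computes the high 32 bits of a 32x32 multiply through a 64-bit product, then uses that for the well-known unsigned divide-by-3 with the multiplier 0xAAAAAAAB.

```python
MASK32 = 0xFFFFFFFF


def mulhu32(a: int, b: int) -> int:
    """High half of an unsigned 32x32 multiply, via a 64-bit product and a shift."""
    wide = (a & MASK32) * (b & MASK32)  # fits in 64 bits
    return (wide >> 32) & MASK32


def udiv3(x: int) -> int:
    """Unsigned 32-bit division by 3 without a divide instruction.

    0xAAAAAAAB is ceil(2**33 / 3), so x // 3 == (x * 0xAAAAAAAB) >> 33,
    i.e. the high half of the product shifted right by one more bit.
    """
    return mulhu32(x, 0xAAAAAAAB) >> 1


SAMPLES = [0, 1, 2, 3, 4, 5, 6, 7, 100, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFE, 0xFFFFFFFF]
print(all(udiv3(x) == x // 3 for x in SAMPLES))
```

Only the high half of the product is consumed here, which is the situation the message describes for divide-by-constant expansion.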

I don't know if VE wants this div by constant optimization or not. If they
don't want it, they can use the isIntDivCheap hook to disable it.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D150333
Add llvm-mca tests for RISCV LMUL instruments to show that llvm-mca RISCV LMUL
instruments work.

Differential Revision: https://reviews.llvm.org/D149496
…epare

While the original motivation for this patch (address space 7 on
AMDGPU) has been reworked and is not presently planned to reach IR
translation, the incorrect (by the spec) handling of index offset
width in IR translation and CodeGenPrepare is likely to trip someone
- possibly future AMD, since we have a p7:160:256:256:32 now, so we
convert to the other API now.

Reviewed By: aemerson, arsenm

Differential Revision: https://reviews.llvm.org/D143526
This commit passed buildable tests in phabricator, but fails once
committed.

This reverts commit 1dedc96.
ldionne and others added 21 commits May 15, 2023 10:34
It was confusing to some contributors because it appeared in a
prominent place on the Contributing page.
The temporary descriptor must be either Pointer or Allocatable,
otherwise its memory will not be freed.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D150534
The test was fixed by 2f99932.

This reverts commit 25dc215.
This reverts commit b049fc0.

The wrong patch was landed.
A code-review comment to change a couple of CHECK to CHECK-NEXT that I
forgot to apply prior to committing.

Differential Revision: https://reviews.llvm.org/D150445
…rt of immediate context for deduction" (intel#13094)

Updated test after 629170f
  CONFLICT (content): Merge conflict in clang/test/CodeGenSYCL/field-annotate-addr-space.cpp
  CONFLICT (content): Merge conflict in clang/test/CodeGenSYCL/unique_stable_name.cpp
  CONFLICT (content): Merge conflict in clang/lib/Frontend/CompilerInvocation.cpp
)

Currently, we always convert SPIR-V builtins to globals for forward translation and to functions for reverse translation.

I have a use case where I want to keep them as globals for reverse translation, so I added this mode.

Implementations for both cases already existed, I just consolidated them and added the option.

Signed-off-by: Sarnie, Nick <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@730eaf0
This target extension type is created here: https://github.com/intel/vc-intrinsics/blob/master/GenXIntrinsics/lib/GenXIntrinsics/GenXSPIRVWriterAdaptor.cpp#L245

As with other target extension types, reverse translation is not yet supported.

Signed-off-by: Sarnie, Nick <[email protected]>
Co-authored-by: Victor Mustya <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@60746d5
Currently only to DebugInfo/X86

Currently failing tests can be noticed by RUNx line

Signed-off-by: Sidorov, Dmitry <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@772c7be
@jsji jsji requested review from a team and bader as code owners May 16, 2023 16:03
@jsji jsji requested a review from steffenlarsen May 16, 2023 16:03
@jsji jsji closed this May 16, 2023
Development

Successfully merging this pull request may close these issues.

clang x86 missed optimization for 2 dimension array accessed through always zero enum class