Bump LLVM to green commit 4546397e39589f0a6a707218349d1bf65fe54645 from Oct. 17 (redone) #4

gargaroff · 2022-11-09T10:20:56Z

See #3 for details. This PR redoes the same effort but tries to make the git history make sense again.


Currently emscripten makes assumptions about the memory layout, assuming the stack is between `__data_end` and `__heap_base`: https://github.com/emscripten-core/emscripten/blob/af961ad5c4c278ec510f0b7f7d522a95ee5a90f8/system/lib/compiler-rt/stack_limits.S#L42-L61 With this change we can be more precise: emscripten-core/emscripten#18057 Differential Revision: https://reviews.llvm.org/D135910

avoiding an assertion. A BB with a nonzero count, whose successor blocks all have 0 counts, could cause an assertion. Don't create any branch weights in this case. Reviewed By: xur Differential Revision: https://reviews.llvm.org/D134203
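The guard described in this commit message can be sketched as follows. This is a minimal Python illustration, not LLVM's implementation: the function name `branch_weights`, the `UINT32_MAX` bound, and the scaling step are assumptions made for the example.

```python
from typing import List, Optional

UINT32_MAX = 2**32 - 1  # assumed upper bound for a single branch weight


def branch_weights(successor_counts: List[int]) -> Optional[List[int]]:
    """Return scaled branch weights, or None when none should be created.

    Hypothetical helper: when every successor count is zero there is
    nothing meaningful to scale, so no branch weights are produced.
    """
    max_count = max(successor_counts, default=0)
    if max_count == 0:
        return None  # all successors have a 0 count: skip branch weights
    # Scale counts down only when the largest one does not fit in 32 bits.
    scale = max_count // UINT32_MAX + 1
    return [count // scale for count in successor_counts]
```

Returning `None` rather than a list of zeros makes the "no weights" case explicit, so a caller cannot accidentally attach all-zero metadata.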


Fix a small thinko in https://reviews.llvm.org/D133534 . Normally DynamicLoaderDarwinKernels are created via the CreateInstance plugin method, and that plugin method sets the Process CanJIT to false. In the above patch, I added a new code path that can call the DynamicLoaderDarwinKernel ctor directly, without going through CreateInstance, and CanJIT was not being correctly set for the process. rdar://101148552

The Count/MaxCount used in TransferBatch and PerClass can fit in u16 in current configurations, and a u16 limit is also reasonable. The spare 16 bits will be used for additional status, such as page mapping status in a TransferBatch. Reviewed By: cryptoad, cferris, vitalybuka Differential Revision: https://reviews.llvm.org/D133145

PageReleaseContext contains all the information needed for determining if a page can be released. Splitting out the context increases the flexibility of heterogeneous free lists in the future. Also rename PackedCounterArray to PageMap. Reviewed By: cryptoad, cferris Differential Revision: https://reviews.llvm.org/D133895

Scudo is supposed to allocate blocks across the entire set of mapped pages, with each page equally likely to be selected. This means Scudo tends to touch as many pages as possible. That brings better security, but it also sacrifices the chance of releasing dirty pages. To alleviate the unmanageable footprint growth, this CL introduces the BatchGroup concept. Each block is classified into a BatchGroup according to its address. During allocation, we prefer to allocate blocks from the same group first. Note that the blocks selected from a group are still random over several pages. At the same time, we can better predict how fast dirty pages grow. In addition, we can do partial page releasing by examining only part of the BatchGroups. Reviewed By: cryptoad, cferris Differential Revision: https://reviews.llvm.org/D133897
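As a rough illustration of classifying blocks by address and allocating from one group first, here is a toy Python sketch. It is not Scudo's code: `GROUP_SIZE_LOG`, `group_id`, and `GroupedFreeList` are hypothetical names, the group size is an assumed value, and the random selection within a group that the commit message mentions is omitted for brevity.

```python
from collections import defaultdict
from typing import Dict, List, Optional

GROUP_SIZE_LOG = 18  # assumed: each group spans 256 KiB of address space


def group_id(block_addr: int) -> int:
    """Classify a block into a group by the high bits of its address."""
    return block_addr >> GROUP_SIZE_LOG


class GroupedFreeList:
    """Toy free list that keeps free blocks bucketed by group."""

    def __init__(self) -> None:
        self.groups: Dict[int, List[int]] = defaultdict(list)

    def push(self, block_addr: int) -> None:
        """Return a block to the bucket of the group it belongs to."""
        self.groups[group_id(block_addr)].append(block_addr)

    def pop(self) -> Optional[int]:
        """Allocate from the lowest-numbered non-empty group first."""
        if not self.groups:
            return None
        gid = min(self.groups)
        blocks = self.groups[gid]
        block = blocks.pop()
        if not blocks:
            del self.groups[gid]  # drop empty groups so min() stays valid
        return block
```

Because consecutive allocations drain one group before moving on, the pages touched stay concentrated in that group's address range.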

Block grouping enables partial page releasing, so we can release pages at a finer granularity. This means we don't need to visit all blocks to determine which pages are unused. It also means we can do incremental page releasing depending on the number of free blocks. Reviewed By: cryptoad, cferris Differential Revision: https://reviews.llvm.org/D134226
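To make the idea of examining a single group concrete, the sketch below counts, per page, how many blocks overlap the page and how many of those are free; a page is releasable when the two counts match. This is an assumption-laden illustration rather than Scudo's algorithm: `releasable_pages`, `PAGE_SIZE`, and the layout (blocks packed back to back from the group base, all free blocks lying inside the group) are hypothetical.

```python
from typing import Iterable, List

PAGE_SIZE = 4096  # assumed page size


def releasable_pages(group_base: int, group_size: int, block_size: int,
                     free_blocks: Iterable[int]) -> List[int]:
    """Return indices of pages in one group that only hold free blocks."""
    num_pages = group_size // PAGE_SIZE
    total = [0] * num_pages  # blocks overlapping each page
    free = [0] * num_pages   # free blocks overlapping each page

    def pages_of(block_addr: int) -> range:
        first = (block_addr - group_base) // PAGE_SIZE
        last = (block_addr + block_size - 1 - group_base) // PAGE_SIZE
        return range(first, min(last, num_pages - 1) + 1)

    # Blocks are assumed to be laid out contiguously from the group base.
    end = group_base + group_size - block_size + 1
    for addr in range(group_base, end, block_size):
        for page in pages_of(addr):
            total[page] += 1
    for addr in free_blocks:
        for page in pages_of(addr):
            free[page] += 1
    return [p for p in range(num_pages)
            if total[p] > 0 and free[p] == total[p]]
```

Only the blocks of the chosen group are visited, which is the saving the commit message describes.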


…ore files. Prior to this fix, no shared libraries would be loaded for a core file, even if they exist on the current machine. The issue was that DYLDRendezvous would read a DYLDRendezvous::Rendezvous from the memory of the process in DYLDRendezvous::Resolve(), which would read some ld.so structures as they existed in the middle of a process' lifetime. In core files we see, the DYLDRendezvous::Rendezvous::state would be set to eAdd for running processes. When ProcessELFCore.cpp would load the core file, it would call DynamicLoaderPOSIXDYLD::DidAttach(), which would call the above Rendezvous functions. The issue arose when, during the DidAttach function, it called DYLDRendezvous::GetAction(), which would return eNoAction if the DYLDRendezvous::m_current.state was read from memory as eAdd. This caused no shared libraries to be loaded for any ELF core files. We now detect if we have a core file, and after reading the DYLDRendezvous::m_current.state from memory we set it to eConsistent, which causes DYLDRendezvous::GetAction() to return the correct action of eTakeSnapshot, so shared libraries get loaded. We also improve the DynamicLoaderPOSIXDYLD class to not try to set any breakpoints to catch shared library loads/unloads when we have a core file, which saves a bit of time. Differential Revision: https://reviews.llvm.org/D134842
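The state handling described above can be modelled in a few lines of Python. This is a deliberately simplified sketch, not LLDB's code: the real `DYLDRendezvous::GetAction()` handles more states and actions, and the names `State`, `Action`, `get_action`, and `resolve_state` are hypothetical stand-ins.

```python
from enum import Enum, auto


class State(Enum):
    CONSISTENT = auto()
    ADD = auto()
    DELETE = auto()


class Action(Enum):
    NO_ACTION = auto()
    TAKE_SNAPSHOT = auto()


def get_action(current_state: State) -> Action:
    """Simplified stand-in for DYLDRendezvous::GetAction()."""
    if current_state is State.CONSISTENT:
        return Action.TAKE_SNAPSHOT
    return Action.NO_ACTION  # mid-update states (e.g. ADD) are not acted on


def resolve_state(state_from_memory: State, is_core_file: bool) -> State:
    """For core files, treat the state read from memory as consistent."""
    if is_core_file:
        return State.CONSISTENT
    return state_from_memory
```

With this model, a core file captured while the state was `ADD` still yields `TAKE_SNAPSHOT`, whereas a live process in the same state yields `NO_ACTION`.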

Inputs to crnor can come from operands with chains so if it is being used simply to negate such an operand, the repeated input cannot be CSE'd. This patch just adds a code-gen only instruction for this that takes a single input and duplicates it in the encoding of the underlying crnor. Differential revision: https://reviews.llvm.org/D133577


This fixes an issue in One-Shot Bufferize that could lead to missing buffer copies in the future. This bug can currently not be triggered because of the order in which ops are analyzed (always bottom-to-top). However, if we consider different traversal orders for the analysis in the future, this bug can cause subtle issues that are difficult to debug.

Example:

```
%0 = ...
%1 = tensor.insert ... into %0
%2 = tensor.extract_slice %0
tensor.extract %2[...]
```

In case of a top-to-bottom analysis of the above IR, the `tensor.insert` is analyzed before the `tensor.extract_slice`. In that case, the `tensor.insert` will bufferize in-place because %2 is not yet known to become an alias of %0 (and therefore causing a conflict). With this change, the `tensor.insert` will bufferize out-of-place, regardless of the traversal order.

Differential Revision: https://reviews.llvm.org/D135049

In the Linux PIC model, there are 4 cases of value/label addressing:

Case 1: Function call or label jmp inside the module.
Case 2: Data access (such as a global variable or static variable) inside the module.
Case 3: Function call or label jmp outside the module.
Case 4: Data access (such as a global variable) outside the module.

Because the current LLVM inline asm architecture is designed not to "recognize" the asm code, it is quite troublesome to treat memory addressing differently for the same value/address used in different instructions. For example, in the PIC model, calling a function may go through the PLT or be directly PC-relative, but lea/mov of a function address may use the GOT. This patch fixes/refines case 1 and case 2 in inline asm. Because inline asm currently does not support jumping to an outside label, this patch mainly focuses on fixing the function call addressing bugs in inline asm.

Reviewed By: Pengfei, RKSimon Differential Revision: https://reviews.llvm.org/D133914


…anches Prior to inserting an unconditional branch from X to its fall through basic block, check if X has any terminators to avoid inserting additional branches. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134557

Simplify the logic of handling sections in BOLT. This change brings more direct and predictable mapping of BinarySection instances to sections in the input and output files.

* Only sections from the input binary will have a non-null SectionRef. When a new section is created as a copy of the input section, its SectionRef is reset to null.
* RewriteInstance::getOutputSectionName() is removed as the section name in the output file is now defined by BinarySection::getOutputName().
* Querying BinaryContext for sections by name uses their original name. E.g., getUniqueSectionByName(".rodata") will return the original section even if the new .rodata section was created.
* Input file sections (with relocations applied) are emitted via MC with ".bolt.org" prefix. However, their name in the output binary is unchanged unless a new section with the same name is created.
* New sections are emitted internally with ".bolt.new" prefix if there's a name conflict with an input file section. Their original name is preserved in the output file.
* Section header string table is properly populated with section names that are actually used. Previously we used to include discarded section names as well.
* Fix the problem when dynamic relocations were propagated to a new section with a name that matched a section in the input binary. E.g., the new .rodata with jump tables had dynamic relocations from the original .rodata.

Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D135494

This fixes two issues related to the loading address:

1) When multiple MMAPs occur and their loading addresses differ, only the first MMAP was previously used as the base address, so all perf addresses after it used the wrong base address.
2) For a pseudo probe profile, the address is always based on the preferred loading address. If the base address is not equal to the preferred loading address, the pseudo probe address query will be wrong.

Solution: Instead of converting the address to an offset lazily, all addresses are now converted on the fly at parsing time, based on the preferred loading address. There is no "offset" used in the profile generator any more.

Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D126827
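The conversion described above amounts to rebasing each sampled address from the mapping that contains it onto the preferred loading address. A minimal Python sketch follows. It is not the llvm-profgen implementation: `to_preferred_address` and the `MMap` tuple are hypothetical, and it assumes each MMAP event maps the binary from its start at a different runtime base.

```python
from typing import List, Optional, Tuple

# (runtime_start, size) of each MMAP event for the binary; hypothetical layout.
MMap = Tuple[int, int]


def to_preferred_address(addr: int, mmaps: List[MMap],
                         preferred_base: int) -> Optional[int]:
    """Rebase a runtime address onto the preferred loading address.

    The address is matched against the mapping that contains it, so each
    MMAP event contributes its own base instead of the first one winning.
    """
    for runtime_start, size in mmaps:
        if runtime_start <= addr < runtime_start + size:
            return addr - runtime_start + preferred_base
    return None  # address does not belong to any known mapping
```

Doing the conversion once at parse time means every later consumer sees addresses in the same preferred-base coordinate system and no separate "offset" representation is needed.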


The revision imports the extract and insert value operations using tablegen generated builders. Additionally, it moves the tests to the instructions.ll test file. Reviewed By: ftynse, dcaballe Differential Revision: https://reviews.llvm.org/D135874

joker-eph and others added 30 commits October 13, 2022 21:49

Apply clang-tidy fixes for performance-for-range-copy in VectorOps.cp…

f7cd3fc

…p (NFC)

[NFC][FuncSpec] Add a test to show redundant function cloning.

2516241

Happens when we find identical specializations. Differential Revision: https://reviews.llvm.org/D135459

Driver: Change default Android linker to lld.

8d9c4a7

The clang distributed with the Android NDK has defaulted to lld since r22, so let's update the driver to match. Differential Revision: https://reviews.llvm.org/D135421

[SPIRV] Fix formatting of function tests

14ea4f5

Differential Revision: https://reviews.llvm.org/D135624

[gn build] port 1fda6f6 (lld driver_executable)

aaecabe

[mlir] Update CallInterfaceCallable to use the new casting infra.

e750c41

This enables casting LLVM style for mlir::CallInterfaceCallable usage. Differential Revision: https://reviews.llvm.org/D135823

[clang][PowerPC][NFC] Add base test case for PPC64 VAArg aggregate sm…

00b9bed

…aller than a slot Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D133488

[flang] Add a semantics test for atomic_ref

57974c2

Reviewed By: rouson Differential Revision: https://reviews.llvm.org/D135840

[AArch64] add tests for ccmp with negative constant op1; NFC

07c5270

[AArch64][BuildErrorFix] Add compatible classifyGlobalFunctionReference

d0269dd

AsmPrinter: Remove pointless code in inline asm emission

c427ee9

This was scanning through def operands looking for the symbol operand. This is pointless because the symbol is always the first operand as enforced by the verifier, and all operands are implicit.

AMDGPU: Add __builtin_amdgcn_permlane64

f59f116

AtomicExpand: Avoid some operations if the atomic is overaligned

d0750ec

Let some of the pointer bithacking fold away if we know the LSB are 0.

AMDGPU: Fix failing test with expensive checks

99dff82

Fixes failure after d383ade

Add f16 nearbyint support.

6370bc2

Enable lowering of FNEARBYINT for f16 and extend existing tests. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D135124
