LLVM and SPIRV-LLVM-Translator pulldown (WW20-21) #3779

vmaksimo · 2021-05-18T17:11:00Z

LLVM: llvm/llvm-project@d30dfa867
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@c62ef5e

1. [bool, char, short] bitfields have the same alignment as unsigned int
2. Adjust alignment on typedef field decls/honor align attribute
3. Fix alignment for scoped enum class
4. Long long bitfield has 4 bytes alignment and StorageUnitSize under 32 bit compile mode

Differential Revision: https://reviews.llvm.org/D87029

This is to allow disasm with any bits in the unused fields. Differential Revision: https://reviews.llvm.org/D102526

This patch adds a new test for loop-unrolling with multiple exiting blocks, where the latch does not exit, but the header does. This can happen when the loop has not been rotated, e.g. due to minsize.

Inspired by the following end-to-end test, using -Oz https://godbolt.org/z/fP6sna8qK

    bool foo(int *ptr, int limit) {
    #pragma clang loop unroll(full)
      for (unsigned int i = 0; i < 4; i++) {
        if (ptr[i] > limit)
          return false;
        ptr[i]++;
      }
      return true;
    }

Bug 49356 (https://bugs.llvm.org/show_bug.cgi?id=49356) reports crash in the test case `tasking/bug_taskwait_detach.cpp`, which is caused by the wrong function declaration. `gtid` in `__kmpc_omp_task` should be `kmp_int32`. Reviewed By: AndreyChurbanov Differential Revision: https://reviews.llvm.org/D102584

Since we have both aliasing mode and Intel LAM on x86_64, we need to choose the mode at either run time or compile time. This patch implements the plumbing to build both and choose between them at compile time. Reviewed By: vitalybuka, eugenis Differential Revision: https://reviews.llvm.org/D102286

Multi-line headers are not allowed in RST; reformat the header to be a single wide line.

…c instructions Adds NVPTX builtins and intrinsics for the CUDA PTX `cp.async` instructions for `sm_80` architecture or newer. PTX ISA description of `cp.async`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-asynchronous-copy https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive Authored-by: Stuart Adams <[email protected]> Co-Authored-by: Alexander Johnston <[email protected]> Differential Revision: https://reviews.llvm.org/D100394

…ync instructions Adds NVPTX builtins and intrinsics for the CUDA PTX `redux.sync` instructions for `sm_80` architecture or newer. PTX ISA description of `redux.sync`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-redux-sync Authored-by: Steffen Larsen <[email protected]> Differential Revision: https://reviews.llvm.org/D100124

The initial version of pooling assumed normalization was across all elements equally. TOSA actually requires that the normalization be performed by how many elements were summed (edges are not artificially dimmer). Updated the lowering to reflect this change with corresponding tests. Reviewed By: NatashaKnk Differential Revision: https://reviews.llvm.org/D102540
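To illustrate the normalization rule described above, here is a minimal sketch of 1-D average pooling that divides each window by the number of in-bounds elements it actually summed. It is not the TOSA-to-Linalg lowering from the patch: the function name `avgPool1D`, the stride of 1, and the symmetric `pad` parameter are assumptions made for illustration only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 1-D average pool (stride 1) with `pad` implicit padding
// positions on each side. Padding positions contribute nothing to the sum
// and are not counted in the divisor, so edge outputs are not dimmed.
std::vector<float> avgPool1D(const std::vector<float> &in, int kernel, int pad) {
  const int n = static_cast<int>(in.size());
  const int outLen = n + 2 * pad - kernel + 1;
  std::vector<float> out;
  for (int o = 0; o < outLen; ++o) {
    float sum = 0.0f;
    int count = 0;  // number of in-bounds elements in this window
    for (int k = 0; k < kernel; ++k) {
      const int i = o - pad + k;
      if (i >= 0 && i < n) {
        sum += in[i];
        ++count;
      }
    }
    // Normalize by the elements actually summed, not by the kernel size.
    out.push_back(count > 0 ? sum / static_cast<float>(count) : 0.0f);
  }
  return out;
}
```

With a constant input such as `{2, 2, 2, 2}`, a kernel of 3 and a pad of 1, every output stays at 2; dividing by the kernel size instead would make the two edge outputs smaller than the interior ones.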

A missing or duplicate spack package should not cause an error, since users may have installed only the llvm/clang package, or may have installed duplicate HIP packages but will use an environment variable or compiler option to choose the HIP path. The message about a missing or duplicate spack package is informational and should therefore be emitted only when -v is specified. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D102556

This change makes the conversion of an mlir::OpState to bool `explicit`. Idiomatic boolean uses continue to work as before, but questionable implicit uses (e.g. accumulating over a range of OpStates to count "true" states) become ill-formed. This makes the class interface a little less error-prone. I tested this change on our internal (fairly large) codebase, and only one fix was needed, which was ultimately an improvement of the affected code. Reviewed By: rriddle, mehdi_amini Differential Revision: https://reviews.llvm.org/D101989
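The effect of an `explicit` conversion to bool can be sketched with a small stand-in class. `Handle` and `countValid` below are hypothetical names used only for illustration; this is not the real `mlir::OpState` interface.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a handle-like class; NOT mlir::OpState itself.
class Handle {
public:
  explicit Handle(void *p = nullptr) : ptr(p) {}
  // `explicit` keeps contextual conversions (if, while, !, &&, ||) working
  // but rejects implicit conversions such as Handle -> bool -> int.
  explicit operator bool() const { return ptr != nullptr; }

private:
  void *ptr;
};

// Idiomatic use: the condition of `if` is a contextual conversion to bool.
int countValid(const std::vector<Handle> &handles) {
  int n = 0;
  for (const Handle &h : handles)
    if (h)
      ++n;
  return n;
}

// By contrast, something like
//   std::accumulate(handles.begin(), handles.end(), 0);
// relies on an implicit Handle -> bool -> int conversion and no longer
// compiles once the conversion operator is explicit.
```

Counting "true" states therefore has to be spelled out, as in `countValid`, which is arguably clearer about intent than summing handles.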

Alias mode is not expected to work on non-x86, so don't build it there. Should fix the aarch64 bot.

…n steroids" idiom recognition.

I think i've added exhaustive test coverage, and i have verified that alive2 is happy with all the tests, so in principle i'm fine with landing this without review, but just in case..

This adds support for the "count active bits" pattern, i.e.:
```
int countActiveBits(unsigned val) {
  int cnt = 0;
  for( ; (val >> cnt) != 0; ++cnt)
    ;
  return cnt;
}
```
but a somewhat more general one, since that is what i need:
```
int countActiveBits(unsigned val, int start, int off) {
  int cnt;
  for (cnt = start; val >> (cnt + off); cnt++)
    ;
  return cnt;
}
```

I've followed in footstep of 'left-shift until bittest' idiom (D91038), in the sense that iff the `ctlz` intrinsic is cheap, we'll transform, regardless of all other factors.

This can have a shocking effect on certain benchmarks:
```
raw.pixls.us-unique/Olympus/XZ-1$ /repositories/googlebenchmark/tools/compare.py -a benchmarks ~/rawspeed/build-{old,new}/src/utilities/rsbench/rsbench --benchmark_counters_tabular=true --benchmark_min_time=0.00000001 --benchmark_repetitions=128 p1319978.orf
RUNNING: /home/lebedevri/rawspeed/build-old/src/utilities/rsbench/rsbench --benchmark_counters_tabular=true --benchmark_min_time=0.00000001 --benchmark_repetitions=128 p1319978.orf --benchmark_display_aggregates_only=true --benchmark_out=/tmp/tmp49_28zcm
2021-05-09T01:06:05+03:00
Running /home/lebedevri/rawspeed/build-old/src/utilities/rsbench/rsbench
Run on (32 X 3600.24 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x16)
  L1 Instruction 32 KiB (x16)
  L2 Unified 512 KiB (x16)
  L3 Unified 32768 KiB (x2)
Load Average: 5.26, 6.29, 3.49
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                  Time       CPU  Iterations  CPUTime,s  CPUTime/WallTime    Pixels  Pixels/CPUTime  Pixels/WallTime  Raws/CPUTime  Raws/WallTime  WallTime,s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
p1319978.orf/threads:32/process_time/real_time_mean      145 ms    145 ms         128   0.145319          0.999981  10.1568M        69.8949M         69.8936M       6.88159        6.88146    0.145322
p1319978.orf/threads:32/process_time/real_time_median    145 ms    145 ms         128   0.145317          0.999986  10.1568M        69.8941M         69.8931M       6.88151        6.88141    0.145319
p1319978.orf/threads:32/process_time/real_time_stddev  0.766 ms  0.766 ms         128   766.586u          15.1302u         0        354.167k         354.098k     0.0348699      0.0348631    766.469u
RUNNING: /home/lebedevri/rawspeed/build-new/src/utilities/rsbench/rsbench --benchmark_counters_tabular=true --benchmark_min_time=0.00000001 --benchmark_repetitions=128 p1319978.orf --benchmark_display_aggregates_only=true --benchmark_out=/tmp/tmpwb9sw2x0
2021-05-09T01:06:24+03:00
Running /home/lebedevri/rawspeed/build-new/src/utilities/rsbench/rsbench
Run on (32 X 3599.95 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x16)
  L1 Instruction 32 KiB (x16)
  L2 Unified 512 KiB (x16)
  L3 Unified 32768 KiB (x2)
Load Average: 4.05, 5.95, 3.43
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                  Time       CPU  Iterations  CPUTime,s  CPUTime/WallTime    Pixels  Pixels/CPUTime  Pixels/WallTime  Raws/CPUTime  Raws/WallTime  WallTime,s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
p1319978.orf/threads:32/process_time/real_time_mean     99.8 ms   99.8 ms         128  0.0997758          0.999972  10.1568M        101.797M         101.794M       10.0225        10.0222   0.0997786
p1319978.orf/threads:32/process_time/real_time_median   99.7 ms   99.7 ms         128  0.0997165          0.999985  10.1568M        101.857M         101.854M       10.0284        10.0281   0.0997195
p1319978.orf/threads:32/process_time/real_time_stddev  0.224 ms  0.224 ms         128   224.166u           34.345u         0         226.81k         227.231k     0.0223309      0.0223723    224.586u
Comparing /home/lebedevri/rawspeed/build-old/src/utilities/rsbench/rsbench to /home/lebedevri/rawspeed/build-new/src/utilities/rsbench/rsbench
Benchmark                                                  Time       CPU    Time Old    Time New     CPU Old     CPU New
-------------------------------------------------------------------------------------------------------------------------
p1319978.orf/threads:32/process_time/real_time_pvalue    0.0000    0.0000    U Test, Repetitions: 128 vs 128
p1319978.orf/threads:32/process_time/real_time_mean     -0.3134   -0.3134         145         100         145         100
p1319978.orf/threads:32/process_time/real_time_median   -0.3138   -0.3138         145         100         145         100
p1319978.orf/threads:32/process_time/real_time_stddev   -0.7073   -0.7078           1           0           1           0
```

Reviewed By: craig.topper, zhuhan0 Differential Revision: https://reviews.llvm.org/D102116
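As a rough illustration of why a cheap `ctlz` makes this transform attractive, the simple "count active bits" loop can be compared with a closed form based on counting leading zeros. This is only a sketch of the equivalence for the basic pattern, not the LoopIdiomRecognize implementation. The function names are hypothetical, the 32-bit width is an assumption, and `__builtin_clz` is a GCC/Clang builtin, so the sketch assumes one of those compilers.

```cpp
#include <cassert>
#include <cstdint>

// Loop form of the basic pattern. Callers in this sketch only pass values
// whose top bit is clear, so the shift amount never reaches the bit width.
int countActiveBitsLoop(std::uint32_t val) {
  int cnt = 0;
  for (; (val >> cnt) != 0; ++cnt)
    ;
  return cnt;
}

// Closed form: bit width minus the number of leading zero bits.
// Assumes a 32-bit value and the GCC/Clang builtin __builtin_clz.
int countActiveBitsClz(std::uint32_t val) {
  if (val == 0)
    return 0;  // __builtin_clz(0) is undefined, so handle zero separately
  return 32 - __builtin_clz(val);
}
```

For example, both functions return 3 for the value 5 (binary `101`); the loop needs one iteration per active bit, while the closed form does a constant amount of work.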

…pe->GetByteSize() in ParseSingleMember We have a bug in which using member_clang_type.GetByteSize() triggers record layout; during this process, since the record was not yet complete, we ended up reaching a record that had not been laid out yet. Using member_type->GetByteSize() avoids this situation since it relies on the size from DWARF and will not trigger record layout. For reference: rdar://77293040 Differential Revision: https://reviews.llvm.org/D102445

This patch contains the bare minimum to run the new Pass Manager from the LLVM-C APIs. It does not feature PGOOptions, PassPlugins or Debugify in its current state. Bugzilla: PR48499 Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D102136

Differential Revision: https://reviews.llvm.org/D102562

This reverts commit cd220a0. Doesn't build.

These are intended to mimic warnings available in gcc. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D100581

MSVC has a `try-except` statement. This statement can contain a `__leave` keyword, which is similar to a `goto` to the end of the try block. The semantics of this keyword are not implemented. We should at least parse such code without crashing. https://docs.microsoft.com/en-us/cpp/cpp/try-except-statement?view=msvc-160 Patch By: AbbasSabra! Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D102280

Differential Revision: https://reviews.llvm.org/D102636

Has the effect that `__mh_execute_header` stays in the symbol table of outputs even after running `strip` on the output. I don't know if that's important for anything -- my motivation for the patch is just to make the output more similar to ld64. (Corresponds to symbolTableInAndNeverStrip in ld64.) Differential Revision: https://reviews.llvm.org/D102619

CONFLICT (content): Merge conflict in llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp CONFLICT (content): Merge conflict in llvm/lib/Transforms/IPO/PassManagerBuilder.cpp

The experimental flag for "inplace" bufferization in the sparse compiler can be replaced with the new inplace attribute. This gives a uniform way of expressing the more efficient way of bufferization. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D102538

This patch contains the bare minimum to run the new Pass Manager from the LLVM-C APIs. It does not feature PGOOptions, PassPlugins or Debugify in its current state. Bugzilla: PR48499 Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D102136

- Enables inferring return type for ConstShape, takes into account valid return types; - The compatible return type function could be reused, leaving that for next use refactoring; Differential Revision: https://reviews.llvm.org/D102182

The LAM mode is currently untested by check-hwasan, so we only need to build the runtime in aliasing mode. Because LAM mode will always need to be conditional (because only certain hardware will support it) we can always just disable the LAM lit tests if it ever starts being tested.

Follow up to D88631 but for aarch64; the Linux kernel uses the command line flags: 1. -mstack-protector-guard=sysreg 2. -mstack-protector-guard-reg=sp_el0 3. -mstack-protector-guard-offset=0 to use the system register sp_el0 for the stack canary, enabling the kernel to have a unique stack canary per task (like a thread, but not limited to userspace as the kernel can preempt itself). Address pr/47341 for aarch64. Fixes: ClangBuiltLinux/linux#289 Signed-off-by: Nick Desaulniers <[email protected]> Reviewed By: xiangzhangllvm, DavidSpickett, dmgreen Differential Revision: https://reviews.llvm.org/D100919

This is one of the folds requested in: https://llvm.org/PR39480 https://alive2.llvm.org/ce/z/NczU3V Note - this uses the normal FMF propagation logic (flags transfer from the final value to new/intermediate ops). It's not clear if this matches what Alive2 implements, so we may want to adjust one or the other.

vmaksimo · 2021-05-21T08:53:46Z

/summary:run

vmaksimo · 2021-05-21T11:22:50Z

Hi @steffenlarsen, could you please help to investigate a build failure (check-sycl target) on CUDA?
Example of failure: http://ci.llvm.intel.com:8010/#/builders/37/builds/8925/steps/16/logs/stdio

Possibly, it can be related to google test update in LLORG: d4d80a2

vmaksimo · 2021-05-21T11:55:44Z

/summary:run

steffenlarsen · 2021-05-21T13:08:41Z

> Possibly, it can be related to google test update in LLORG: d4d80a2

Good intuition! That was exactly it. steffenlarsen@0331956 did the trick on my machine.

Signed-off-by: Steffen Larsen <[email protected]>

vmaksimo · 2021-05-24T16:46:05Z

> Good intuition! That was exactly it. steffenlarsen@0331956 did the trick on my machine.

Thanks @steffenlarsen! This helped to fix the failure

vmaksimo · 2021-05-24T16:46:13Z

/summary:run

bader

LGTM, except typo in the comments.

bader · 2021-05-25T09:33:52Z

clang/test/OpenMP/remarks_parallel_in_multiple_target_state_machines.c

@@ -1,3 +1,7 @@
+// XFAIL: *
+// Failure is expected untill fixed in LLORG upstream.


untill -> until

bader · 2021-05-25T09:33:58Z

clang/test/OpenMP/remarks_parallel_in_target_state_machine.c

@@ -1,3 +1,7 @@
+// XFAIL: *
+// Failure is expected untill fixed in LLORG upstream.


untill -> until

xling-liao and others added 30 commits May 17, 2021 11:30

[AMDGPU] Set unused dst_sel to '?' in the encoding

f4c0fdc

This is to allow disasm with any bits in the unused fields. Differential Revision: https://reviews.llvm.org/D102526

[llvm][doc] fix header for read/write_register intrinsics in LangRef

1417dda

Multi-line headers are not allowed in RST; reformat the header to be a single wide line.

[HWASan] Don't build alias mode on non-x86.

d97bab6

Alias mode is not expected to work on non-x86, so don't build it there. Should fix the aarch64 bot.

Reset the wakeup timeout when we re-enter the continue wait.

bd5751f

Differential Revision: https://reviews.llvm.org/D102562

Revert "[NewPM] Add C bindings for new pass manager"

0b33977

This reverts commit cd220a0. Doesn't build.

[Clang] -Wunused-but-set-parameter and -Wunused-but-set-variable

14dfb38

These are intended to mimic warnings available in gcc. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D100581

Merge with mainline.

648f34a

Differential Revision: https://reviews.llvm.org/D102636

Merge from 'sycl' to 'sycl-web' (#1)

84c4715

CONFLICT (content): Merge conflict in llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp CONFLICT (content): Merge conflict in llvm/lib/Transforms/IPO/PassManagerBuilder.cpp

Add type function for ConstShape op.

24bf554

- Enables inferring return type for ConstShape, takes into account valid return types; - The compatible return type function could be reused, leaving that for next use refactoring; Differential Revision: https://reviews.llvm.org/D102182

[InstCombine] add tests for fneg-of-select; NFC

e9f600f

[gn build] Port 0c557db

11c857c

[SPIRV] Align fshl.ll test with the Khronos version

4c65c65

vmaksimo force-pushed the llvmspirv_pulldown branch from 3068e96 to 4c65c65 Compare May 21, 2021 11:21

vmaksimo added 2 commits May 21, 2021 14:54

Disable OpenMP remarks* tests until they fixed in upstream

44c3aa5

Fix CUDA build

87b317f

Fix parameterized PI CUDA unittests with empty parameter

0621c56

Signed-off-by: Steffen Larsen <[email protected]>

vmaksimo force-pushed the llvmspirv_pulldown branch from dcea17c to 0621c56 Compare May 24, 2021 11:05

Merge remote-tracking branch 'intel_llvm/sycl' into llvmspirv_pulldown

0365bd4

vladimirlaz mentioned this pull request May 25, 2021

[SYCL] XFAIL test failing after LLVM pulldown intel/llvm-test-suite#290

Merged

vladimirlaz marked this pull request as ready for review May 25, 2021 07:49

vladimirlaz requested review from AaronBallman, AGindinson, AlexeySachkov, AlexeySotkin, bader, elizabethandrews, mdtoguchi, mlychkov, premanandrao and sndmitriev as code owners May 25, 2021 07:49

bader previously approved these changes May 25, 2021


Fix typo

fa2d83f

vmaksimo dismissed bader’s stale review via fa2d83f May 25, 2021 12:47

vladimirlaz merged commit 611746c into intel:sycl May 25, 2021

vmaksimo mentioned this pull request Jun 15, 2021

llvm.fshr.i32 missing #3308

Closed
