Closed
Description
There is a codegen regression caused by #89966: https://godbolt.org/z/G3hjh4KTn
; bin/llc -mtriple=riscv64 -mattr=+zba,+m test.ll -o -
define i64 @test(i64 %0) {
entry:
%1 = lshr i64 %0, 18
%2 = and i64 %1, 4294967295
%3 = mul i64 %2, 24
ret i64 %3
}
Before:
test: # @test
srli a0, a0, 18
slli.uw a0, a0, 3
sh1add a0, a0, a0
ret
After:
test: # @test
srli a0, a0, 15
srli a0, a0, 3
slli.uw a0, a0, 3
sh1add a0, a0, a0
ret
These two cascaded shifts (srli by 15, then srli by 3) can be folded into a single srli by 18, as in the "Before" output.
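As a sanity check, here is a minimal Python sketch that models the three instructions on 64-bit values and compares both sequences against the IR. The helper names (`srli`, `slli_uw`, `sh1add`, `ir_test`, `before`, `after`) are illustrative only and are not taken from LLVM; the instruction semantics are assumed to be the usual RV64 and Zba definitions.

```python
MASK64 = (1 << 64) - 1

def srli(x, sh):
    # Logical right shift on a 64-bit register.
    return (x & MASK64) >> sh

def slli_uw(x, sh):
    # Zba slli.uw: zero-extend the low 32 bits, then shift left.
    return ((x & 0xFFFFFFFF) << sh) & MASK64

def sh1add(rs1, rs2):
    # Zba sh1add: (rs1 << 1) + rs2, wrapped to 64 bits.
    return ((rs1 << 1) + rs2) & MASK64

def ir_test(x):
    # Mirrors the IR: lshr 18, and 0xffffffff, mul 24.
    t = ((x & MASK64) >> 18) & 4294967295
    return (t * 24) & MASK64

def before(x):
    a0 = srli(x, 18)
    a0 = slli_uw(a0, 3)
    return sh1add(a0, a0)

def after(x):
    a0 = srli(x, 15)
    a0 = srli(a0, 3)  # redundant: could be merged into the shift above
    a0 = slli_uw(a0, 3)
    return sh1add(a0, a0)

SAMPLES = [0, 1, 1 << 18, 0x123456789ABCDEF0, MASK64]
assert all(before(x) == after(x) == ir_test(x) for x in SAMPLES)
```

Both sequences compute the same value, so the regression is only the extra instruction, not a miscompile.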
SDAG before ISel:
Optimized legalized selection DAG: %bb.0 'test:entry'
SelectionDAG has 12 nodes:
t0: ch,glue = EntryToken
t15: i64 = RISCVISD::SHL_ADD t18, Constant:i64<1>, t18
t10: ch,glue = CopyToReg t0, Register:i64 $x10, t15
t2: i64,ch = CopyFromReg t0, Register:i64 %0
t20: i64 = srl t2, Constant:i64<15>
t18: i64 = and t20, Constant:i64<34359738360> ; 34359738360 = 0xffffffff << 3
t11: ch = RISCVISD::RET_GLUE t10, Register:i64 $x10, t10:1
t18 will be lowered into srli + slli_uw:
llvm-project/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
Lines 685 to 689 in 991192b
// Match a shifted 0xffffffff mask. Use SRLI to clear the LSBs and SLLI_UW to
// mask and shift.
def : Pat<(i64 (and GPR:$rs1, Shifted32OnesMask:$mask)),
          (SLLI_UW (XLenVT (SRLI GPR:$rs1, Shifted32OnesMask:$mask)),
                   Shifted32OnesMask:$mask)>;
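The identity behind this pattern can be checked with a small Python sketch. It assumes the `Shifted32OnesMask` operand is emitted as the mask's trailing-zero count `k` when used as a shift amount; that is an inference from the quoted pattern, not something shown in the snippet. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def srli(x, sh):
    # Logical right shift on a 64-bit register.
    return (x & MASK64) >> sh

def slli_uw(x, sh):
    # Zba slli.uw: zero-extend the low 32 bits, then shift left.
    return ((x & 0xFFFFFFFF) << sh) & MASK64

def and_shifted_mask(x, k):
    # The matched node: and x, (0xffffffff << k).
    return x & ((0xFFFFFFFF << k) & MASK64)

def lowered(x, k):
    # The selected sequence: slli.uw (srli x, k), k.
    return slli_uw(srli(x, k), k)

# The mask from the DAG dump above corresponds to k = 3.
assert (0xFFFFFFFF << 3) == 34359738360

SAMPLES = [0, 1, 0xFFFF, 0x123456789ABCDEF0, MASK64]
for k in range(1, 32):  # illustrative range of shift amounts
    for x in SAMPLES:
        assert and_shifted_mask(x, k) == lowered(x, k)
```

Because the pattern always emits its own srli, an srl that already feeds the `and` ends up as a second, separate shift.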
Two solutions:
- Fold and (srl X, C), Shifted32OnesMask into slli_uw (srli X, C+ShAmt), ShAmt.
- Do some peephole cleanup which folds cascaded slli/srli pairs in RISCVDAGToDAGISel::PostprocessISelDAG. (Preferred)
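Both options can be illustrated with a rough Python sketch. The first part checks the identity that option 1 relies on, assuming `ShAmt` is the trailing-zero count of the mask and `C + ShAmt < 64`. The second part is a toy peephole over a linear list of `(op, rd, rs, imm)` tuples; the real `PostprocessISelDAG` works on SelectionDAG nodes, so this only mirrors the idea, and `fold_cascaded_srli` is a hypothetical name.

```python
MASK64 = (1 << 64) - 1

def srli(x, sh):
    return (x & MASK64) >> sh

def slli_uw(x, sh):
    return ((x & 0xFFFFFFFF) << sh) & MASK64

def unfolded(x, c, sh_amt):
    # and (srl X, C), (0xffffffff << ShAmt)
    return srli(x, c) & ((0xFFFFFFFF << sh_amt) & MASK64)

def folded(x, c, sh_amt):
    # slli_uw (srli X, C + ShAmt), ShAmt
    return slli_uw(srli(x, c + sh_amt), sh_amt)

def fold_cascaded_srli(insts):
    # Merge "srli rd, rs, a; srli rd, rd, b" into "srli rd, rs, a + b".
    # The intermediate value is overwritten, so it has no other user.
    out = []
    for op, rd, rs, imm in insts:
        if out:
            p_op, p_rd, p_rs, p_imm = out[-1]
            if (op == "srli" and p_op == "srli" and rs == p_rd
                    and rd == p_rd and p_imm + imm < 64):
                out[-1] = ("srli", rd, p_rs, p_imm + imm)
                continue
        out.append((op, rd, rs, imm))
    return out

# Option 1: the identity holds for the case in this issue (C = 15, ShAmt = 3).
for x in [0, 1 << 18, 0x123456789ABCDEF0, MASK64]:
    assert unfolded(x, 15, 3) == folded(x, 15, 3)

# Option 2: the "After" shift sequence collapses to a single srli by 18.
seq = [("srli", "a0", "a0", 15), ("srli", "a0", "a0", 3),
       ("slli.uw", "a0", "a0", 3)]
assert fold_cascaded_srli(seq) == [("srli", "a0", "a0", 18),
                                   ("slli.uw", "a0", "a0", 3)]
```

Option 1 avoids creating the extra shift in the first place, while option 2 also cleans up cascaded shifts that arise from other patterns.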
Any thoughts? @preames @topperc @wangpc-pp
Related issues:
dtcxzyw/llvm-codegen-benchmark#25
dtcxzyw/llvm-codegen-benchmark#40