Skip to content

[RISCV] Add isel special case for (and (srl X, c2), c1) -> (slli_uw (srli x, c2+c3), c3). #100966

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 29, 2024

Conversation

topperc
Copy link
Collaborator

@topperc topperc commented Jul 29, 2024

Where c1 is a shifted mask with 32 set bits and c3 trailing zeros.

Fixes #100936.

…srli x, c2+c3), c3).

Where c1 is a shifted mask with 32 set bits and c3 trailing zeros.

Fixes llvm#100936.
@llvmbot
Copy link
Member

llvmbot commented Jul 29, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Changes

Where c1 is a shifted mask with 32 set bits and c3 trailing zeros.

Fixes #100936.


Full diff: https://github.com/llvm/llvm-project/pull/100966.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (+12)
  • (modified) llvm/test/CodeGen/RISCV/rv64zba.ll (+2-4)
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index eef6ae677ac85..01b2bc08d3ba0 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -1393,6 +1393,18 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
           ReplaceNode(Node, SLLI);
           return;
         }
+        // If we have 32 bits in the mask, we can use SLLI_UW instead of SLLI.
+        if (Trailing > 0 && Leading + Trailing == 32 && C2 + Trailing < XLen &&
+            OneUseOrZExtW && Subtarget->hasStdExtZba()) {
+          SDNode *SRLI = CurDAG->getMachineNode(
+              RISCV::SRLI, DL, VT, X,
+              CurDAG->getTargetConstant(C2 + Trailing, DL, VT));
+          SDNode *SLLI_UW = CurDAG->getMachineNode(
+              RISCV::SLLI_UW, DL, VT, SDValue(SRLI, 0),
+              CurDAG->getTargetConstant(Trailing, DL, VT));
+          ReplaceNode(Node, SLLI_UW);
+          return;
+        }
       }
 
       // Turn (and (shl x, c2), c1) -> (slli (srli x, c3-c2), c3) if c1 is a
diff --git a/llvm/test/CodeGen/RISCV/rv64zba.ll b/llvm/test/CodeGen/RISCV/rv64zba.ll
index 61be5ee458e9d..20a0484464018 100644
--- a/llvm/test/CodeGen/RISCV/rv64zba.ll
+++ b/llvm/test/CodeGen/RISCV/rv64zba.ll
@@ -2962,8 +2962,7 @@ define i64 @srli_slliuw_2(i64 %1) {
 ;
 ; RV64ZBA-LABEL: srli_slliuw_2:
 ; RV64ZBA:       # %bb.0: # %entry
-; RV64ZBA-NEXT:    srli a0, a0, 15
-; RV64ZBA-NEXT:    srli a0, a0, 3
+; RV64ZBA-NEXT:    srli a0, a0, 18
 ; RV64ZBA-NEXT:    slli.uw a0, a0, 3
 ; RV64ZBA-NEXT:    ret
 entry:
@@ -2985,8 +2984,7 @@ define i64 @srli_slliuw_canonical_2(i64 %0) {
 ;
 ; RV64ZBA-LABEL: srli_slliuw_canonical_2:
 ; RV64ZBA:       # %bb.0: # %entry
-; RV64ZBA-NEXT:    srli a0, a0, 15
-; RV64ZBA-NEXT:    srli a0, a0, 3
+; RV64ZBA-NEXT:    srli a0, a0, 18
 ; RV64ZBA-NEXT:    slli.uw a0, a0, 3
 ; RV64ZBA-NEXT:    ret
 entry:

@dtcxzyw
Copy link
Member

dtcxzyw commented Jul 29, 2024

I don't think it is a good way to fix these kinds of issues. I prefer to do a final clean up for slli+slli/srli+srli pairs in RISCVDAGToDAGISel::PostprocessISelDAG.

Another special case with slli+slli (sampled from pybind11): https://godbolt.org/z/rc95a37ej

; bin/llc -mtriple=riscv64 -mattr=+zbb test.ll -o -
define i64 @func0000000000000005(i64 %0, i16 signext %1) #0 {
entry:
  %2 = shl i16 %1, 9
  %sext = ashr i16 %2, 15
  %3 = sext i16 %sext to i64
  %4 = add nsw i64 %3, %0
  ret i64 %4
}
func0000000000000005:                   # @func0000000000000005
        slli    a1, a1, 9
        slli    a1, a1, 48
        srai    a1, a1, 63
        add     a0, a1, a0
        ret

@topperc
Copy link
Collaborator Author

topperc commented Jul 29, 2024

Another special case with slli+slli (sampled from pybind11): https://godbolt.org/z/rc95a37ej

Is that something we should handle in generic DAGCombine? The X86 code is also bad. https://godbolt.org/z/7Tv3Gsc5f

@dtcxzyw
Copy link
Member

dtcxzyw commented Jul 29, 2024

Another special case with slli+slli (sampled from pybind11): https://godbolt.org/z/rc95a37ej

Is that something we should handle in generic DAGCombine? The X86 code is also bad. https://godbolt.org/z/7Tv3Gsc5f

Optimized legalized selection DAG: %bb.0 'func0000000000000005:entry'
SelectionDAG has 16 nodes:
  t0: ch,glue = EntryToken
              t4: i64,ch = CopyFromReg t0, Register:i64 %1
            t6: i64 = AssertSext t4, ValueType:ch:i16
          t19: i64 = shl t6, Constant:i64<9>
        t20: i64 = sign_extend_inreg t19, ValueType:ch:i16
      t21: i64 = sra t20, Constant:i64<15>
      t2: i64,ch = CopyFromReg t0, Register:i64 %0
    t15: i64 = add nsw t21, t2
  t17: ch,glue = CopyToReg t0, Register:i64 $x10, t15
  t18: ch = RISCVISD::RET_GLUE t17, Register:i64 $x10, t17:1


===== Instruction selection begins: %bb.0 'entry'

Unfortunately we cannot handle this in DAGCombine since it happens in ISel.
Do you mean to expand sign_extend_inreg X, i16 into (X << 48) s>> 48 in DAGCombine?

@topperc
Copy link
Collaborator Author

topperc commented Jul 29, 2024

Do you mean to expand sign_extend_inreg X, i16 into (X << 48) s>> 48 in DAGCombine?

I wouldn't want to expand sign_extend_inreg by itself.

But maybe we should expand this into a sra+shl during DAGCombine.

        t20: i64 = sign_extend_inreg t19, ValueType:ch:i16
      t21: i64 = sra t20, Constant:i64<15>

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

(The discussion about a generic option can continue even if this lands.)

@topperc topperc merged commit 7647f88 into llvm:main Jul 29, 2024
9 checks passed
@topperc topperc deleted the pr/slliuw branch July 29, 2024 16:08
@dtcxzyw
Copy link
Member

dtcxzyw commented Jul 29, 2024

Do you mean to expand sign_extend_inreg X, i16 into (X << 48) s>> 48 in DAGCombine?

I wouldn't want to expand sign_extend_inreg by itself.

But maybe we should expand this into a sra+shl during DAGCombine.

        t20: i64 = sign_extend_inreg t19, ValueType:ch:i16
      t21: i64 = sra t20, Constant:i64<15>

I filed #101040 to track this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RISCV] Missing fold with cascade shifts
4 participants