-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[SelectionDAG] Fix and improve TargetLowering::SimplifySetCC #87646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -40,7 +40,9 @@ define i1 @test_129_15_0(ptr %y) { | |
; | ||
; CHECK-BE-LABEL: test_129_15_0: | ||
; CHECK-BE: @ %bb.0: | ||
; CHECK-BE-NEXT: ldrh r0, [r0, #14] | ||
; CHECK-BE-NEXT: ldr r1, [r0, #12] | ||
; CHECK-BE-NEXT: ldrb r0, [r0, #16] | ||
; CHECK-BE-NEXT: orr r0, r0, r1, lsl #8 | ||
; CHECK-BE-NEXT: mov r1, #255 | ||
; CHECK-BE-NEXT: orr r1, r1, #32512 | ||
; CHECK-BE-NEXT: ands r0, r0, r1 | ||
|
@@ -49,7 +51,7 @@ define i1 @test_129_15_0(ptr %y) { | |
; | ||
; CHECK-V7-BE-LABEL: test_129_15_0: | ||
; CHECK-V7-BE: @ %bb.0: | ||
; CHECK-V7-BE-NEXT: ldrh r0, [r0, #14] | ||
; CHECK-V7-BE-NEXT: ldrh r0, [r0, #15] | ||
; CHECK-V7-BE-NEXT: bfc r0, #15, #17 | ||
; CHECK-V7-BE-NEXT: cmp r0, #0 | ||
; CHECK-V7-BE-NEXT: movwne r0, #1 | ||
|
@@ -119,14 +121,14 @@ define i1 @test_33_8_0(ptr %y) { | |
; | ||
; CHECK-BE-LABEL: test_33_8_0: | ||
; CHECK-BE: @ %bb.0: | ||
; CHECK-BE-NEXT: ldrb r0, [r0, #3] | ||
; CHECK-BE-NEXT: ldrb r0, [r0, #4] | ||
; CHECK-BE-NEXT: cmp r0, #0 | ||
; CHECK-BE-NEXT: movne r0, #1 | ||
; CHECK-BE-NEXT: mov pc, lr | ||
; | ||
; CHECK-V7-BE-LABEL: test_33_8_0: | ||
; CHECK-V7-BE: @ %bb.0: | ||
; CHECK-V7-BE-NEXT: ldrb r0, [r0, #3] | ||
; CHECK-V7-BE-NEXT: ldrb r0, [r0, #4] | ||
; CHECK-V7-BE-NEXT: cmp r0, #0 | ||
; CHECK-V7-BE-NEXT: movwne r0, #1 | ||
; CHECK-V7-BE-NEXT: bx lr | ||
|
@@ -179,13 +181,13 @@ define i1 @test_33_1_31(ptr %y) { | |
; | ||
; CHECK-BE-LABEL: test_33_1_31: | ||
; CHECK-BE: @ %bb.0: | ||
; CHECK-BE-NEXT: ldrb r0, [r0] | ||
; CHECK-BE-NEXT: ldrb r0, [r0, #1] | ||
; CHECK-BE-NEXT: lsr r0, r0, #7 | ||
; CHECK-BE-NEXT: mov pc, lr | ||
; | ||
; CHECK-V7-BE-LABEL: test_33_1_31: | ||
; CHECK-V7-BE: @ %bb.0: | ||
; CHECK-V7-BE-NEXT: ldrb r0, [r0] | ||
; CHECK-V7-BE-NEXT: ldrb r0, [r0, #1] | ||
; CHECK-V7-BE-NEXT: lsr r0, r0, #7 | ||
; CHECK-V7-BE-NEXT: bx lr | ||
%a = load i33, ptr %y | ||
|
@@ -209,13 +211,13 @@ define i1 @test_33_1_0(ptr %y) { | |
; | ||
; CHECK-BE-LABEL: test_33_1_0: | ||
; CHECK-BE: @ %bb.0: | ||
; CHECK-BE-NEXT: ldrb r0, [r0, #3] | ||
; CHECK-BE-NEXT: ldrb r0, [r0, #4] | ||
; CHECK-BE-NEXT: and r0, r0, #1 | ||
; CHECK-BE-NEXT: mov pc, lr | ||
; | ||
; CHECK-V7-BE-LABEL: test_33_1_0: | ||
; CHECK-V7-BE: @ %bb.0: | ||
; CHECK-V7-BE-NEXT: ldrb r0, [r0, #3] | ||
; CHECK-V7-BE-NEXT: ldrb r0, [r0, #4] | ||
; CHECK-V7-BE-NEXT: and r0, r0, #1 | ||
; CHECK-V7-BE-NEXT: bx lr | ||
%a = load i33, ptr %y | ||
|
@@ -309,7 +311,7 @@ define i1 @test_48_16_8(ptr %y) { | |
; CHECK-LE-LABEL: test_48_16_8: | ||
; CHECK-LE: @ %bb.0: | ||
; CHECK-LE-NEXT: ldrh r0, [r0, #1] | ||
; CHECK-LE-NEXT: cmp r0, #0 | ||
; CHECK-LE-NEXT: lsls r0, r0, #8 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't know if lsls or cmp is better than the other here for ARM? What happens is that after load narrowing we get:
and then the DAG combiner triggers on the AND and changes it into
I think the optimization in this patch first avoids introducing a misaligned 16-bit load from Makes me wonder if the 16 bit load with align 1 a bad thing here? It also seems like we lack some optimization that removes the redundant SHL :-( There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe the difference is that this patch checks the Fast result from the allowsMemoryAccess check. But DAGCombiner::isLegalNarrowLdSt is only checking if it is legal (not if the narrowed, not naturally aligned, access also is considered as Fast). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Have you looked into this any further? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
After that I also made sure the test is run both with I'm also not sure why this code in DAGCombiner.cpp
is skipping to consider the "Fast" result from allowsMemoryAccess that would be given if using the extra bool argument. I kind of think the new behavior is more correct in some sense (taking the "Fast" part of the TLI hook into account). But then I guess we want to do that in DAGCombiner::isLegalNarrowLdSt as well (potentially getting even more diffs that looks like "regressions", but that in fact might be optimizations in case unaligned accesses are costly). |
||
; CHECK-LE-NEXT: movne r0, #1 | ||
; CHECK-LE-NEXT: mov pc, lr | ||
; | ||
|
@@ -444,9 +446,7 @@ define i1 @test_48_17_0(ptr %y) { | |
; | ||
; CHECK-V7-BE-LABEL: test_48_17_0: | ||
; CHECK-V7-BE: @ %bb.0: | ||
; CHECK-V7-BE-NEXT: ldr r1, [r0] | ||
; CHECK-V7-BE-NEXT: ldrh r0, [r0, #4] | ||
; CHECK-V7-BE-NEXT: orr r0, r0, r1, lsl #16 | ||
; CHECK-V7-BE-NEXT: ldr r0, [r0, #2] | ||
; CHECK-V7-BE-NEXT: bfc r0, #17, #15 | ||
; CHECK-V7-BE-NEXT: cmp r0, #0 | ||
; CHECK-V7-BE-NEXT: movwne r0, #1 | ||
|
@@ -506,15 +506,14 @@ define i1 @test_40_1_32(ptr %y) { | |
; | ||
; CHECK-BE-LABEL: test_40_1_32: | ||
; CHECK-BE: @ %bb.0: | ||
; CHECK-BE-NEXT: ldr r0, [r0] | ||
; CHECK-BE-NEXT: mov r1, #1 | ||
; CHECK-BE-NEXT: and r0, r1, r0, lsr #24 | ||
; CHECK-BE-NEXT: ldrb r0, [r0] | ||
; CHECK-BE-NEXT: and r0, r0, #1 | ||
; CHECK-BE-NEXT: mov pc, lr | ||
; | ||
; CHECK-V7-BE-LABEL: test_40_1_32: | ||
; CHECK-V7-BE: @ %bb.0: | ||
; CHECK-V7-BE-NEXT: ldr r0, [r0] | ||
; CHECK-V7-BE-NEXT: ubfx r0, r0, #24, #1 | ||
; CHECK-V7-BE-NEXT: ldrb r0, [r0] | ||
; CHECK-V7-BE-NEXT: and r0, r0, #1 | ||
; CHECK-V7-BE-NEXT: bx lr | ||
%a = load i40, ptr %y | ||
%b = and i40 %a, u0x100000000 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a regression. It is supposed to be fixing a miscompile.
The 15bits masked out with the AND is at [r0, #15] and [r0, #16]. I think the optimization to narrow to a 16-bit load is blocked due to loading 16 bits starting at [r0, #15] would be an unaligned load.