-
Notifications
You must be signed in to change notification settings - Fork 14.5k
AMDGPU: Delete spills of undef values #119684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AMDGPU: Delete spills of undef values #119684
Conversation
@llvm/pr-subscribers-llvm-regalloc @llvm/pr-subscribers-backend-amdgpu Author: Matt Arsenault (arsenm) ChangesAMDGPU: Delete spills of undef values It would be a bit more logical to preserve the undef and do the normal https://reviews.llvm.org/D122607 Move where undef sgpr spills are deleted Full diff: https://github.com/llvm/llvm-project/pull/119684.diff 5 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp b/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
index f868843318ff6c..d27c523425feb2 100644
--- a/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
+++ b/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
@@ -422,6 +422,13 @@ bool SILowerSGPRSpills::run(MachineFunction &MF) {
if (!TII->isSGPRSpill(MI))
continue;
+ if (MI.getOperand(0).isUndef()) {
+ if (Indexes)
+ Indexes->removeMachineInstrFromMaps(MI);
+ MI.eraseFromParent();
+ continue;
+ }
+
int FI = TII->getNamedOperand(MI, AMDGPU::OpName::addr)->getIndex();
assert(MFI.getStackID(FI) == TargetStackID::SGPRSpill);
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index 296c32fa4e0d09..8a23cf5e214967 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -1954,6 +1954,9 @@ bool SIRegisterInfo::spillSGPR(MachineBasicBlock::iterator MI, int Index,
RegScavenger *RS, SlotIndexes *Indexes,
LiveIntervals *LIS, bool OnlyToVGPR,
bool SpillToPhysVGPRLane) const {
+ assert(!MI->getOperand(0).isUndef() &&
+ "undef spill should have been deleted earlier");
+
SGPRSpillBuilder SB(*this, *ST.getInstrInfo(), isWave32, MI, Index, RS);
ArrayRef<SpilledReg> VGPRSpills =
@@ -2375,6 +2378,11 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
case AMDGPU::SI_SPILL_WWM_AV32_SAVE: {
const MachineOperand *VData = TII->getNamedOperand(*MI,
AMDGPU::OpName::vdata);
+ if (VData->isUndef()) {
+ MI->eraseFromParent();
+ return true;
+ }
+
assert(TII->getNamedOperand(*MI, AMDGPU::OpName::soffset)->getReg() ==
MFI->getStackPtrOffsetReg());
diff --git a/llvm/test/CodeGen/AMDGPU/sgpr-spill-partially-undef.mir b/llvm/test/CodeGen/AMDGPU/sgpr-spill-partially-undef.mir
index 774785fb3966fc..d352e8a13da9f1 100644
--- a/llvm/test/CodeGen/AMDGPU/sgpr-spill-partially-undef.mir
+++ b/llvm/test/CodeGen/AMDGPU/sgpr-spill-partially-undef.mir
@@ -54,3 +54,45 @@ body: |
SI_SPILL_S64_SAVE renamable $sgpr4_sgpr5, %stack.0, implicit $exec, implicit $sgpr96_sgpr97_sgpr98_sgpr99, implicit $sgpr32 :: (store (s64) into %stack.0, align 4, addrspace 5)
...
+
+---
+name: sgpr_spill_s32_undef
+tracksRegLiveness: true
+machineFunctionInfo:
+ isEntryFunction: true
+ hasSpilledSGPRs: true
+ scratchRSrcReg: '$sgpr96_sgpr97_sgpr98_sgpr99'
+ stackPtrOffsetReg: '$sgpr32'
+stack:
+ - { id: 0, type: spill-slot, size: 4, alignment: 4, stack-id: sgpr-spill }
+body: |
+ bb.0:
+ ; CHECK-LABEL: name: sgpr_spill_s32_undef
+ ; CHECK: body:
+ ; CHECK-NEXT: bb.0:
+ ; CHECK-NOT: {{.+}}
+ ; CHECK: ...
+ SI_SPILL_S32_SAVE undef $sgpr8, %stack.0, implicit $exec, implicit $sgpr96_sgpr97_sgpr98_sgpr99, implicit $sgpr32 :: (store (s32) into %stack.0, align 4, addrspace 5)
+
+...
+
+---
+name: sgpr_spill_s64_undef
+tracksRegLiveness: true
+machineFunctionInfo:
+ isEntryFunction: true
+ hasSpilledSGPRs: true
+ scratchRSrcReg: '$sgpr96_sgpr97_sgpr98_sgpr99'
+ stackPtrOffsetReg: '$sgpr32'
+stack:
+ - { id: 0, type: spill-slot, size: 8, alignment: 4, stack-id: sgpr-spill }
+body: |
+ bb.0:
+ ; CHECK-LABEL: name: sgpr_spill_s64_undef
+ ; CHECK: body:
+ ; CHECK-NEXT: bb.0:
+ ; CHECK-NOT: {{.+}}
+ ; CHECK: ...
+ SI_SPILL_S64_SAVE undef $sgpr8_sgpr9, %stack.0, implicit $exec, implicit $sgpr96_sgpr97_sgpr98_sgpr99, implicit $sgpr32 :: (store (s64) into %stack.0, align 4, addrspace 5)
+
+...
diff --git a/llvm/test/CodeGen/AMDGPU/spill-agpr-partially-undef.mir b/llvm/test/CodeGen/AMDGPU/spill-agpr-partially-undef.mir
index c825674de7652c..b02b6e79d7a76f 100644
--- a/llvm/test/CodeGen/AMDGPU/spill-agpr-partially-undef.mir
+++ b/llvm/test/CodeGen/AMDGPU/spill-agpr-partially-undef.mir
@@ -71,3 +71,37 @@ body: |
; CHECK-NEXT: BUFFER_STORE_DWORD_OFFSET killed $vgpr0, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec, implicit killed $agpr0_agpr1 :: (store (s32) into %stack.0 + 4, addrspace 5)
SI_SPILL_A64_SAVE killed $agpr0_agpr1, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, addrspace 5)
...
+
+---
+name: spill_a32_undef
+tracksRegLiveness: true
+stack:
+ - { id: 0, type: spill-slot, size: 4, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: '$sgpr0_sgpr1_sgpr2_sgpr3'
+ stackPtrOffsetReg: '$sgpr32'
+ frameOffsetReg: '$sgpr33'
+body: |
+ bb.0:
+ ; CHECK-LABEL: name: spill_a32_undef
+ ; CHECK: S_ENDPGM 0
+ SI_SPILL_A32_SAVE undef $agpr0, %stack.0, $sgpr32, 0, implicit $exec :: (store (s32) into %stack.0, addrspace 5)
+ S_ENDPGM 0
+...
+
+---
+name: spill_a64_undef
+tracksRegLiveness: true
+stack:
+ - { id: 0, type: spill-slot, size: 8, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: '$sgpr0_sgpr1_sgpr2_sgpr3'
+ stackPtrOffsetReg: '$sgpr32'
+ frameOffsetReg: '$sgpr33'
+body: |
+ bb.0:
+ ; CHECK-LABEL: name: spill_a64_undef
+ ; CHECK: S_ENDPGM 0
+ SI_SPILL_A64_SAVE undef $agpr0_agpr1, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, addrspace 5)
+ S_ENDPGM 0
+...
diff --git a/llvm/test/CodeGen/AMDGPU/vgpr-spill.mir b/llvm/test/CodeGen/AMDGPU/vgpr-spill.mir
index c2badc5720f149..edea344a66a3cd 100644
--- a/llvm/test/CodeGen/AMDGPU/vgpr-spill.mir
+++ b/llvm/test/CodeGen/AMDGPU/vgpr-spill.mir
@@ -153,3 +153,37 @@ body: |
; CHECK-NEXT: BUFFER_STORE_DWORD_OFFSET killed $vgpr3, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 12, 0, 0, implicit $exec, implicit killed $vgpr0_vgpr1_vgpr2_vgpr3 :: (store (s32) into %stack.0 + 12, addrspace 5)
SI_SPILL_V128_SAVE killed $vgpr0_vgpr1_vgpr2_vgpr3, %stack.0, $sgpr32, 0, implicit $exec :: (store (s128) into %stack.0, addrspace 5)
...
+
+---
+name: spill_v32_undef
+tracksRegLiveness: true
+stack:
+ - { id: 0, type: spill-slot, size: 4, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: '$sgpr0_sgpr1_sgpr2_sgpr3'
+ stackPtrOffsetReg: '$sgpr32'
+ frameOffsetReg: '$sgpr33'
+body: |
+ bb.0:
+ ; CHECK-LABEL: name: spill_v32_undef
+ ; CHECK: S_NOP 0, implicit undef $vgpr0
+ SI_SPILL_V32_SAVE undef $vgpr0, %stack.0, $sgpr32, 0, implicit $exec :: (store (s32) into %stack.0, addrspace 5)
+ S_NOP 0, implicit undef $vgpr0
+...
+
+---
+name: spill_v64_undef
+tracksRegLiveness: true
+stack:
+ - { id: 0, type: spill-slot, size: 8, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: '$sgpr0_sgpr1_sgpr2_sgpr3'
+ stackPtrOffsetReg: '$sgpr32'
+ frameOffsetReg: '$sgpr33'
+body: |
+ bb.0:
+ ; CHECK-LABEL: name: spill_v64_undef
+ ; CHECK: S_NOP 0, implicit undef $vgpr0_vgpr1
+ SI_SPILL_V64_SAVE undef $vgpr0_vgpr1, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, addrspace 5)
+ S_NOP 0, implicit undef $vgpr0_vgpr1
+...
|
You can test this locally with the following command:git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 28de01787331dc5f4c1e39c239efdd07ffc5d324 62956df9c3aadd057dcccff36321070209a2946d llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp The following files introduce new uses of undef:
Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields In tests, avoid using For example, this is considered a bad practice: define void @fn() {
...
br i1 undef, ...
} Please use the following instead: define void @fn(i1 %cond) {
...
br i1 %cond, ...
} Please refer to the Undefined Behavior Manual for more information. |
8a05688
to
2bd0046
Compare
03cca4c
to
31bfcc4
Compare
2bd0046
to
0f84c63
Compare
Currently, regalloc won't do subreg spilling. But it will be fixed in the foreseeable future. What will happen if you write a test that spills the |
There's no change in interpretation of an undef use with subregisters. The register is just unused, it doesn't matter what the subregister index is. |
31bfcc4
to
9af0962
Compare
0f84c63
to
08b4cd5
Compare
9af0962
to
28de017
Compare
08b4cd5
to
62956df
Compare
28de017
to
726e468
Compare
It would be a bit more logical to preserve the undef and do the normal expansion, but this is less work. This avoids verifier errors in a future patch which starts deleting liveness from registers after allocation failures which results in spills of undef values. https://reviews.llvm.org/D122607
62956df
to
13d828c
Compare
AMDGPU: Delete spills of undef values
It would be a bit more logical to preserve the undef and do the normal
expansion, but this is less work. This avoids verifier errors in a
future patch which starts deleting liveness from registers after
allocation failures which results in spills of undef values.
https://reviews.llvm.org/D122607
Move where undef sgpr spills are deleted