Skip to content

AMDGPU: Delete spills of undef values #119684

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 17, 2024

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Dec 12, 2024

AMDGPU: Delete spills of undef values

It would be a bit more logical to preserve the undef and do the normal
expansion, but this is less work. This avoids verifier errors in a
future patch which starts deleting liveness from registers after
allocation failures which results in spills of undef values.

https://reviews.llvm.org/D122607

Move where undef sgpr spills are deleted

@llvmbot
Copy link
Member

llvmbot commented Dec 12, 2024

@llvm/pr-subscribers-llvm-regalloc

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

AMDGPU: Delete spills of undef values

It would be a bit more logical to preserve the undef and do the normal
expansion, but this is less work. This avoids verifier errors in a
future patch which starts deleting liveness from registers after
allocation failures which results in spills of undef values.

https://reviews.llvm.org/D122607

Move where undef sgpr spills are deleted


Full diff: https://github.com/llvm/llvm-project/pull/119684.diff

5 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp (+7)
  • (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (+8)
  • (modified) llvm/test/CodeGen/AMDGPU/sgpr-spill-partially-undef.mir (+42)
  • (modified) llvm/test/CodeGen/AMDGPU/spill-agpr-partially-undef.mir (+34)
  • (modified) llvm/test/CodeGen/AMDGPU/vgpr-spill.mir (+34)
diff --git a/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp b/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
index f868843318ff6c..d27c523425feb2 100644
--- a/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
+++ b/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
@@ -422,6 +422,13 @@ bool SILowerSGPRSpills::run(MachineFunction &MF) {
         if (!TII->isSGPRSpill(MI))
           continue;
 
+        if (MI.getOperand(0).isUndef()) {
+          if (Indexes)
+            Indexes->removeMachineInstrFromMaps(MI);
+          MI.eraseFromParent();
+          continue;
+        }
+
         int FI = TII->getNamedOperand(MI, AMDGPU::OpName::addr)->getIndex();
         assert(MFI.getStackID(FI) == TargetStackID::SGPRSpill);
 
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index 296c32fa4e0d09..8a23cf5e214967 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -1954,6 +1954,9 @@ bool SIRegisterInfo::spillSGPR(MachineBasicBlock::iterator MI, int Index,
                                RegScavenger *RS, SlotIndexes *Indexes,
                                LiveIntervals *LIS, bool OnlyToVGPR,
                                bool SpillToPhysVGPRLane) const {
+  assert(!MI->getOperand(0).isUndef() &&
+         "undef spill should have been deleted earlier");
+
   SGPRSpillBuilder SB(*this, *ST.getInstrInfo(), isWave32, MI, Index, RS);
 
   ArrayRef<SpilledReg> VGPRSpills =
@@ -2375,6 +2378,11 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
     case AMDGPU::SI_SPILL_WWM_AV32_SAVE: {
       const MachineOperand *VData = TII->getNamedOperand(*MI,
                                                          AMDGPU::OpName::vdata);
+      if (VData->isUndef()) {
+        MI->eraseFromParent();
+        return true;
+      }
+
       assert(TII->getNamedOperand(*MI, AMDGPU::OpName::soffset)->getReg() ==
              MFI->getStackPtrOffsetReg());
 
diff --git a/llvm/test/CodeGen/AMDGPU/sgpr-spill-partially-undef.mir b/llvm/test/CodeGen/AMDGPU/sgpr-spill-partially-undef.mir
index 774785fb3966fc..d352e8a13da9f1 100644
--- a/llvm/test/CodeGen/AMDGPU/sgpr-spill-partially-undef.mir
+++ b/llvm/test/CodeGen/AMDGPU/sgpr-spill-partially-undef.mir
@@ -54,3 +54,45 @@ body:             |
     SI_SPILL_S64_SAVE renamable $sgpr4_sgpr5, %stack.0, implicit $exec, implicit $sgpr96_sgpr97_sgpr98_sgpr99, implicit $sgpr32 :: (store (s64) into %stack.0, align 4, addrspace 5)
 
 ...
+
+---
+name:  sgpr_spill_s32_undef
+tracksRegLiveness: true
+machineFunctionInfo:
+  isEntryFunction: true
+  hasSpilledSGPRs:     true
+  scratchRSrcReg:  '$sgpr96_sgpr97_sgpr98_sgpr99'
+  stackPtrOffsetReg: '$sgpr32'
+stack:
+  - { id: 0, type: spill-slot, size: 4, alignment: 4, stack-id: sgpr-spill }
+body:             |
+  bb.0:
+    ; CHECK-LABEL: name: sgpr_spill_s32_undef
+    ; CHECK: body:
+    ; CHECK-NEXT: bb.0:
+    ; CHECK-NOT: {{.+}}
+    ; CHECK: ...
+    SI_SPILL_S32_SAVE undef $sgpr8, %stack.0, implicit $exec, implicit $sgpr96_sgpr97_sgpr98_sgpr99, implicit $sgpr32 :: (store (s32) into %stack.0, align 4, addrspace 5)
+
+...
+
+---
+name:  sgpr_spill_s64_undef
+tracksRegLiveness: true
+machineFunctionInfo:
+  isEntryFunction: true
+  hasSpilledSGPRs:     true
+  scratchRSrcReg:  '$sgpr96_sgpr97_sgpr98_sgpr99'
+  stackPtrOffsetReg: '$sgpr32'
+stack:
+  - { id: 0, type: spill-slot, size: 8, alignment: 4, stack-id: sgpr-spill }
+body:             |
+  bb.0:
+    ; CHECK-LABEL: name: sgpr_spill_s64_undef
+    ; CHECK: body:
+    ; CHECK-NEXT: bb.0:
+    ; CHECK-NOT: {{.+}}
+    ; CHECK: ...
+    SI_SPILL_S64_SAVE undef $sgpr8_sgpr9, %stack.0, implicit $exec, implicit $sgpr96_sgpr97_sgpr98_sgpr99, implicit $sgpr32 :: (store (s64) into %stack.0, align 4, addrspace 5)
+
+...
diff --git a/llvm/test/CodeGen/AMDGPU/spill-agpr-partially-undef.mir b/llvm/test/CodeGen/AMDGPU/spill-agpr-partially-undef.mir
index c825674de7652c..b02b6e79d7a76f 100644
--- a/llvm/test/CodeGen/AMDGPU/spill-agpr-partially-undef.mir
+++ b/llvm/test/CodeGen/AMDGPU/spill-agpr-partially-undef.mir
@@ -71,3 +71,37 @@ body:             |
     ; CHECK-NEXT: BUFFER_STORE_DWORD_OFFSET killed $vgpr0, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec, implicit killed $agpr0_agpr1 :: (store (s32) into %stack.0 + 4, addrspace 5)
     SI_SPILL_A64_SAVE killed $agpr0_agpr1, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, addrspace 5)
 ...
+
+---
+name: spill_a32_undef
+tracksRegLiveness: true
+stack:
+  - { id: 0, type: spill-slot, size: 4, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  stackPtrOffsetReg: '$sgpr32'
+  frameOffsetReg: '$sgpr33'
+body:             |
+  bb.0:
+    ; CHECK-LABEL: name: spill_a32_undef
+    ; CHECK: S_ENDPGM 0
+    SI_SPILL_A32_SAVE undef $agpr0, %stack.0, $sgpr32, 0, implicit $exec :: (store (s32) into %stack.0, addrspace 5)
+    S_ENDPGM 0
+...
+
+---
+name: spill_a64_undef
+tracksRegLiveness: true
+stack:
+  - { id: 0, type: spill-slot, size: 8, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  stackPtrOffsetReg: '$sgpr32'
+  frameOffsetReg: '$sgpr33'
+body:             |
+  bb.0:
+    ; CHECK-LABEL: name: spill_a64_undef
+    ; CHECK: S_ENDPGM 0
+    SI_SPILL_A64_SAVE undef $agpr0_agpr1, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, addrspace 5)
+    S_ENDPGM 0
+...
diff --git a/llvm/test/CodeGen/AMDGPU/vgpr-spill.mir b/llvm/test/CodeGen/AMDGPU/vgpr-spill.mir
index c2badc5720f149..edea344a66a3cd 100644
--- a/llvm/test/CodeGen/AMDGPU/vgpr-spill.mir
+++ b/llvm/test/CodeGen/AMDGPU/vgpr-spill.mir
@@ -153,3 +153,37 @@ body:             |
     ; CHECK-NEXT: BUFFER_STORE_DWORD_OFFSET killed $vgpr3, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 12, 0, 0, implicit $exec, implicit killed $vgpr0_vgpr1_vgpr2_vgpr3 :: (store (s32) into %stack.0 + 12, addrspace 5)
     SI_SPILL_V128_SAVE killed $vgpr0_vgpr1_vgpr2_vgpr3, %stack.0, $sgpr32, 0, implicit $exec :: (store (s128) into %stack.0, addrspace 5)
 ...
+
+---
+name: spill_v32_undef
+tracksRegLiveness: true
+stack:
+  - { id: 0, type: spill-slot, size: 4, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  stackPtrOffsetReg: '$sgpr32'
+  frameOffsetReg: '$sgpr33'
+body:             |
+  bb.0:
+    ; CHECK-LABEL: name: spill_v32_undef
+    ; CHECK: S_NOP 0, implicit undef $vgpr0
+    SI_SPILL_V32_SAVE undef $vgpr0, %stack.0, $sgpr32, 0, implicit $exec :: (store (s32) into %stack.0, addrspace 5)
+    S_NOP 0, implicit undef $vgpr0
+...
+
+---
+name: spill_v64_undef
+tracksRegLiveness: true
+stack:
+  - { id: 0, type: spill-slot, size: 8, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  stackPtrOffsetReg: '$sgpr32'
+  frameOffsetReg: '$sgpr33'
+body:             |
+  bb.0:
+    ; CHECK-LABEL: name: spill_v64_undef
+    ; CHECK: S_NOP 0, implicit undef $vgpr0_vgpr1
+    SI_SPILL_V64_SAVE undef $vgpr0_vgpr1, %stack.0, $sgpr32, 0, implicit $exec :: (store (s64) into %stack.0, addrspace 5)
+    S_NOP 0, implicit undef $vgpr0_vgpr1
+...

@arsenm arsenm marked this pull request as ready for review December 12, 2024 09:51
Copy link

github-actions bot commented Dec 12, 2024

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 28de01787331dc5f4c1e39c239efdd07ffc5d324 62956df9c3aadd057dcccff36321070209a2946d llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp

The following files introduce new uses of undef:

  • llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@arsenm arsenm force-pushed the users/arsenm/amdgpu-delete-spills-of-undef-values branch from 8a05688 to 2bd0046 Compare December 12, 2024 10:46
@arsenm arsenm changed the base branch from users/arsenm/regalloc-fix-undef-uses-all-registers-reserved-in-class to users/arsenm/stackmap-undef December 12, 2024 10:46
@arsenm arsenm force-pushed the users/arsenm/stackmap-undef branch from 03cca4c to 31bfcc4 Compare December 12, 2024 12:07
@arsenm arsenm force-pushed the users/arsenm/amdgpu-delete-spills-of-undef-values branch from 2bd0046 to 0f84c63 Compare December 12, 2024 12:08
@cdevadas
Copy link
Collaborator

Currently, regalloc won't do subreg spilling. But it will be fixed in the foreseeable future. What will happen if you write a test that spills the undef subreg of a tuple?
Do you want to cover it anyway?

@arsenm
Copy link
Contributor Author

arsenm commented Dec 12, 2024

Currently, regalloc won't do subreg spilling. But it will be fixed in the foreseeable future. What will happen if you write a test that spills the undef subreg of a tuple? Do you want to cover it anyway?

There's no change in interpretation of an undef use with subregisters. The register is just unused, it doesn't matter what the subregister index is.

@arsenm arsenm force-pushed the users/arsenm/stackmap-undef branch from 31bfcc4 to 9af0962 Compare December 13, 2024 01:01
@arsenm arsenm force-pushed the users/arsenm/amdgpu-delete-spills-of-undef-values branch from 0f84c63 to 08b4cd5 Compare December 13, 2024 01:01
@arsenm arsenm force-pushed the users/arsenm/stackmap-undef branch from 9af0962 to 28de017 Compare December 16, 2024 01:58
@arsenm arsenm force-pushed the users/arsenm/amdgpu-delete-spills-of-undef-values branch from 08b4cd5 to 62956df Compare December 16, 2024 01:59
Copy link
Contributor Author

arsenm commented Dec 17, 2024

Merge activity

  • Dec 17, 12:51 AM EST: A user started a stack merge that includes this pull request via Graphite.
  • Dec 17, 1:03 AM EST: Graphite rebased this pull request as part of a merge.
  • Dec 17, 1:08 AM EST: A user merged this pull request with Graphite.

@arsenm arsenm force-pushed the users/arsenm/stackmap-undef branch from 28de017 to 726e468 Compare December 17, 2024 05:54
Base automatically changed from users/arsenm/stackmap-undef to main December 17, 2024 05:59
It would be a bit more logical to preserve the undef and do the normal
expansion, but this is less work. This avoids verifier errors in a
future patch which starts deleting liveness from registers after
allocation failures which results in spills of undef values.

https://reviews.llvm.org/D122607
@arsenm arsenm force-pushed the users/arsenm/amdgpu-delete-spills-of-undef-values branch from 62956df to 13d828c Compare December 17, 2024 06:02
@arsenm arsenm merged commit 8387cbd into main Dec 17, 2024
4 of 8 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu-delete-spills-of-undef-values branch December 17, 2024 06:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants