[VPlan] Remove VPBlendRecipe and replace with select VPInstructions #150369
Conversation
If a phi is widened with tail folding, all of its predecessors will have a mask of the form

    %x = logical-and %active-lane-mask, %foo
    %y = logical-and %active-lane-mask, %bar
    %z = logical-and %active-lane-mask, %baz
    ...

We can remove the common %active-lane-mask from all of these edge masks, which in turn simplifies many of the masks. This allows us to remove VPBlendRecipe and directly emit VPInstruction::Select in another patch.
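The factoring described above can be sketched in standalone Python (this is illustrative only, not LLVM code; the list-of-conjuncts representation and the function names are invented for the sketch):

```python
# Sketch of the common-edge-mask factoring idea: each edge mask is
# modeled as a list of conjuncts of a logical-and chain. Under tail
# folding every edge mask starts with the active-lane-mask, so that
# shared conjunct can be stripped from all of them.

def find_common_edge_mask(edge_masks):
    """Return the conjunct shared as the first operand of every
    logical-and edge mask, or None if there is no common factor."""
    if not edge_masks or len(edge_masks[0]) < 2:
        return None
    common = edge_masks[0][0]
    if all(len(m) >= 2 and m[0] == common for m in edge_masks):
        return common
    return None

def strip_common(edge_masks):
    """Drop the shared conjunct from every mask, keeping the rest."""
    if find_common_edge_mask(edge_masks) is None:
        return edge_masks
    return [m[1:] for m in edge_masks]

masks = [["%active-lane-mask", "%foo"],
         ["%active-lane-mask", "%bar"],
         ["%active-lane-mask", "%baz"]]
print(strip_common(masks))  # [['%foo'], ['%bar'], ['%baz']]
```

After stripping, each edge mask is just the per-edge condition, which is what lets downstream simplification collapse the selects further.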
This helps simplify VPBlendRecipes that are expanded to selects in another patch.
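The expansion the patches combine to perform can be sketched outside LLVM: a predicated phi with incoming (value, mask) pairs becomes a chain of N-1 selects, the first incoming acting as the unmasked default, mirroring how convertPhisToSelects in this patch starts from incoming value 0 and folds each later edge in. The tuple representation below is a stand-in for real IR:

```python
# Sketch (not LLVM code): expand a predicated phi into a select chain.
# `incomings` is a list of (value, mask) pairs; the first incoming has
# no mask and serves as the default value of the chain.

def phi_to_selects(incomings):
    """Return a nested select expression equivalent to the blend."""
    result = incomings[0][0]          # default value, mask implied
    for value, mask in incomings[1:]:
        # select(mask, value, result): later edges take priority.
        result = ("select", mask, value, result)
    return result

expr = phi_to_selects([("a", None), ("b", "m1"), ("c", "m2")])
print(expr)  # ('select', 'm2', 'c', ('select', 'm1', 'b', 'a'))
```

A phi with N incoming values thus yields N-1 selects, which is the count the legacy-cost comparison in the patch checks against.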
@llvm/pr-subscribers-vectorizers @llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

Stacked on #150368 and #150357. In #133993 we converted VPBlendRecipes to a series of select VPInstructions just before execution. This patch changes VPlanPredicator to directly emit selects, which allows us to simplify the selects even further and remove VPBlendRecipe altogether.

Patch is 92.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150369.diff

33 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 99a96a8beb9f4..07643062865df 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -4195,7 +4195,6 @@ static bool willGenerateVectors(VPlan &Plan, ElementCount VF,
case VPDef::VPWidenIntrinsicSC:
case VPDef::VPWidenSC:
case VPDef::VPWidenSelectSC:
- case VPDef::VPBlendSC:
case VPDef::VPFirstOrderRecurrencePHISC:
case VPDef::VPHistogramSC:
case VPDef::VPWidenPHISC:
@@ -6952,6 +6951,7 @@ static bool planContainsAdditionalSimplifications(VPlan &Plan,
};
DenseSet<Instruction *> SeenInstrs;
+ SmallDenseMap<PHINode *, unsigned> BlendPhis;
auto Iter = vp_depth_first_deep(Plan.getVectorLoopRegion()->getEntry());
for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(Iter)) {
for (VPRecipeBase &R : *VPBB) {
@@ -6979,6 +6979,15 @@ static bool planContainsAdditionalSimplifications(VPlan &Plan,
if (isa<VPPartialReductionRecipe>(&R))
return true;
+ // VPBlendRecipes are converted to selects and may have been simplified.
+ // Keep track of how many selects each phi has been converted to.
+ using namespace VPlanPatternMatch;
+ if (match(&R, m_VPInstruction<Instruction::Select>(
+ m_VPValue(), m_VPValue(), m_VPValue())))
+ if (auto *Phi = dyn_cast_if_present<PHINode>(
+ R.getVPSingleValue()->getUnderlyingValue()))
+ BlendPhis[Phi]++;
+
/// If a VPlan transform folded a recipe to one producing a single-scalar,
/// but the original instruction wasn't uniform-after-vectorization in the
/// legacy cost model, the legacy cost overestimates the actual cost.
@@ -7002,6 +7011,12 @@ static bool planContainsAdditionalSimplifications(VPlan &Plan,
}
}
+  // If a phi has been simplified then it will have fewer selects than the number
+ // of incoming values.
+ for (auto [Phi, NumSelects] : BlendPhis)
+ if (NumSelects != Phi->getNumIncomingValues() - 1)
+ return true;
+
// Return true if the loop contains any instructions that are not also part of
// the VPlan or are skipped for VPlan-based cost computations. This indicates
// that the VPlan contains extra simplifications.
@@ -8717,9 +8732,11 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(
// latter are added above for masking.
// FIXME: Migrate code relying on the underlying instruction from VPlan0
// to construct recipes below to not use the underlying instruction.
- if (isa<VPCanonicalIVPHIRecipe, VPWidenCanonicalIVRecipe, VPBlendRecipe>(
- &R) ||
- (isa<VPInstruction>(&R) && !UnderlyingValue))
+ if (isa<VPCanonicalIVPHIRecipe, VPWidenCanonicalIVRecipe>(&R) ||
+ (isa<VPInstruction>(&R) && !UnderlyingValue) ||
+ (match(&R, m_VPInstruction<Instruction::Select>(
+ m_VPValue(), m_VPValue(), m_VPValue())) &&
+ isa_and_nonnull<PHINode>(UnderlyingValue)))
continue;
// FIXME: VPlan0, which models a copy of the original scalar loop, should
@@ -9005,20 +9022,20 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
// the phi until LoopExitValue. We keep track of the previous item
// (PreviousLink) to tell which of the two operands of a Link will remain
// scalar and which will be reduced. For minmax by select(cmp), Link will be
- // the select instructions. Blend recipes of in-loop reduction phi's will
+ // the select instructions. Blend selects of in-loop reduction phi's will
// get folded to their non-phi operand, as the reduction recipe handles the
// condition directly.
VPSingleDefRecipe *PreviousLink = PhiR; // Aka Worklist[0].
for (VPSingleDefRecipe *CurrentLink : drop_begin(Worklist)) {
- if (auto *Blend = dyn_cast<VPBlendRecipe>(CurrentLink)) {
- assert(Blend->getNumIncomingValues() == 2 &&
- "Blend must have 2 incoming values");
- if (Blend->getIncomingValue(0) == PhiR) {
- Blend->replaceAllUsesWith(Blend->getIncomingValue(1));
+ using namespace VPlanPatternMatch;
+ VPValue *T, *F;
+ if (match(CurrentLink, m_VPInstruction<Instruction::Select>(
+ m_VPValue(), m_VPValue(T), m_VPValue(F)))) {
+ if (T == PhiR) {
+ CurrentLink->replaceAllUsesWith(F);
} else {
- assert(Blend->getIncomingValue(1) == PhiR &&
- "PhiR must be an operand of the blend");
- Blend->replaceAllUsesWith(Blend->getIncomingValue(0));
+ assert(F == PhiR && "PhiR must be an operand of the select");
+ CurrentLink->replaceAllUsesWith(T);
}
continue;
}
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 99fd97eb71cad..459ed637adff6 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -545,7 +545,6 @@ class VPSingleDefRecipe : public VPRecipeBase, public VPValue {
case VPRecipeBase::VPWidenIntrinsicSC:
case VPRecipeBase::VPWidenSC:
case VPRecipeBase::VPWidenSelectSC:
- case VPRecipeBase::VPBlendSC:
case VPRecipeBase::VPPredInstPHISC:
case VPRecipeBase::VPCanonicalIVPHISC:
case VPRecipeBase::VPActiveLaneMaskPHISC:
@@ -2294,72 +2293,6 @@ class VPReductionPHIRecipe : public VPHeaderPHIRecipe,
}
};
-/// A recipe for vectorizing a phi-node as a sequence of mask-based select
-/// instructions.
-class LLVM_ABI_FOR_TEST VPBlendRecipe : public VPSingleDefRecipe {
-public:
- /// The blend operation is a User of the incoming values and of their
- /// respective masks, ordered [I0, M0, I1, M1, I2, M2, ...]. Note that M0 can
- /// be omitted (implied by passing an odd number of operands) in which case
- /// all other incoming values are merged into it.
- VPBlendRecipe(PHINode *Phi, ArrayRef<VPValue *> Operands)
- : VPSingleDefRecipe(VPDef::VPBlendSC, Operands, Phi, Phi->getDebugLoc()) {
- assert(Operands.size() > 0 && "Expected at least one operand!");
- }
-
- VPBlendRecipe *clone() override {
- SmallVector<VPValue *> Ops(operands());
- return new VPBlendRecipe(cast<PHINode>(getUnderlyingValue()), Ops);
- }
-
- VP_CLASSOF_IMPL(VPDef::VPBlendSC)
-
- /// A normalized blend is one that has an odd number of operands, whereby the
- /// first operand does not have an associated mask.
- bool isNormalized() const { return getNumOperands() % 2; }
-
- /// Return the number of incoming values, taking into account when normalized
- /// the first incoming value will have no mask.
- unsigned getNumIncomingValues() const {
- return (getNumOperands() + isNormalized()) / 2;
- }
-
- /// Return incoming value number \p Idx.
- VPValue *getIncomingValue(unsigned Idx) const {
- return Idx == 0 ? getOperand(0) : getOperand(Idx * 2 - isNormalized());
- }
-
- /// Return mask number \p Idx.
- VPValue *getMask(unsigned Idx) const {
- assert((Idx > 0 || !isNormalized()) && "First index has no mask!");
- return Idx == 0 ? getOperand(1) : getOperand(Idx * 2 + !isNormalized());
- }
-
- void execute(VPTransformState &State) override {
- llvm_unreachable("VPBlendRecipe should be expanded by simplifyBlends");
- }
-
- /// Return the cost of this VPWidenMemoryRecipe.
- InstructionCost computeCost(ElementCount VF,
- VPCostContext &Ctx) const override;
-
-#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- /// Print the recipe.
- void print(raw_ostream &O, const Twine &Indent,
- VPSlotTracker &SlotTracker) const override;
-#endif
-
- /// Returns true if the recipe only uses the first lane of operand \p Op.
- bool onlyFirstLaneUsed(const VPValue *Op) const override {
- assert(is_contained(operands(), Op) &&
- "Op must be an operand of the recipe");
- // Recursing through Blend recipes only, must terminate at header phi's the
- // latest.
- return all_of(users(),
- [this](VPUser *U) { return U->onlyFirstLaneUsed(this); });
- }
-};
-
/// VPInterleaveRecipe is a recipe for transforming an interleave group of load
/// or stores into one wide load/store and shuffles. The first operand of a
/// VPInterleave recipe is the address, followed by the stored values, followed
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
index 3499e650ae853..234d82a3d42a4 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
@@ -41,17 +41,6 @@ VPTypeAnalysis::VPTypeAnalysis(const VPlan &Plan)
CanonicalIVTy = cast<VPExpandSCEVRecipe>(TC)->getSCEV()->getType();
}
-Type *VPTypeAnalysis::inferScalarTypeForRecipe(const VPBlendRecipe *R) {
- Type *ResTy = inferScalarType(R->getIncomingValue(0));
- for (unsigned I = 1, E = R->getNumIncomingValues(); I != E; ++I) {
- VPValue *Inc = R->getIncomingValue(I);
- assert(inferScalarType(Inc) == ResTy &&
- "different types inferred for different incoming values");
- CachedTypes[Inc] = ResTy;
- }
- return ResTy;
-}
-
Type *VPTypeAnalysis::inferScalarTypeForRecipe(const VPInstruction *R) {
// Set the result type from the first operand, check if the types for all
// other operands match and cache them.
@@ -290,7 +279,7 @@ Type *VPTypeAnalysis::inferScalarType(const VPValue *V) {
.Case<VPInstructionWithType, VPWidenIntrinsicRecipe,
VPWidenCastRecipe>(
[](const auto *R) { return R->getResultType(); })
- .Case<VPBlendRecipe, VPInstruction, VPWidenRecipe, VPReplicateRecipe,
+ .Case<VPInstruction, VPWidenRecipe, VPReplicateRecipe,
VPWidenCallRecipe, VPWidenMemoryRecipe, VPWidenSelectRecipe>(
[this](const auto *R) { return inferScalarTypeForRecipe(R); })
.Case<VPInterleaveRecipe>([V](const VPInterleaveRecipe *R) {
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
index cd86d27cf9122..b8007b346d4d9 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
@@ -18,7 +18,6 @@ namespace llvm {
class LLVMContext;
class VPValue;
-class VPBlendRecipe;
class VPInstruction;
class VPWidenRecipe;
class VPWidenCallRecipe;
@@ -48,7 +47,6 @@ class VPTypeAnalysis {
Type *CanonicalIVTy;
LLVMContext &Ctx;
- Type *inferScalarTypeForRecipe(const VPBlendRecipe *R);
Type *inferScalarTypeForRecipe(const VPInstruction *R);
Type *inferScalarTypeForRecipe(const VPWidenCallRecipe *R);
Type *inferScalarTypeForRecipe(const VPWidenRecipe *R);
diff --git a/llvm/lib/Transforms/Vectorize/VPlanPredicator.cpp b/llvm/lib/Transforms/Vectorize/VPlanPredicator.cpp
index f0cab79197b4d..aaec2d68d823f 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanPredicator.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanPredicator.cpp
@@ -14,6 +14,7 @@
#include "VPRecipeBuilder.h"
#include "VPlan.h"
#include "VPlanCFG.h"
+#include "VPlanPatternMatch.h"
#include "VPlanTransforms.h"
#include "VPlanUtils.h"
#include "llvm/ADT/PostOrderIterator.h"
@@ -65,6 +66,10 @@ class VPPredicator {
return EdgeMaskCache[{Src, Dst}] = Mask;
}
+ /// Given a widened phi \p PhiR, try to see if its incoming blocks all share a
+ /// common edge and return its mask.
+ VPValue *findCommonEdgeMask(const VPWidenPHIRecipe *PhiR) const;
+
public:
/// Returns the precomputed predicate of the edge from \p Src to \p Dst.
VPValue *getEdgeMask(const VPBasicBlock *Src, const VPBasicBlock *Dst) const {
@@ -78,8 +83,8 @@ class VPPredicator {
/// block of the loop is set to True, or to the loop mask when tail folding.
VPValue *createBlockInMask(VPBasicBlock *VPBB);
- /// Convert phi recipes in \p VPBB to VPBlendRecipes.
- void convertPhisToBlends(VPBasicBlock *VPBB);
+ /// Convert phi recipes in \p VPBB to selects.
+ void convertPhisToSelects(VPBasicBlock *VPBB);
const BlockMaskCacheTy getBlockMaskCache() const { return BlockMaskCache; }
};
@@ -227,7 +232,21 @@ void VPPredicator::createSwitchEdgeMasks(VPInstruction *SI) {
setEdgeMask(Src, DefaultDst, DefaultMask);
}
-void VPPredicator::convertPhisToBlends(VPBasicBlock *VPBB) {
+VPValue *VPPredicator::findCommonEdgeMask(const VPWidenPHIRecipe *PhiR) const {
+ using namespace llvm::VPlanPatternMatch;
+ VPValue *EdgeMask = getEdgeMask(PhiR->getIncomingBlock(0), PhiR->getParent());
+ VPValue *CommonEdgeMask;
+ if (!EdgeMask ||
+ !match(EdgeMask, m_LogicalAnd(m_VPValue(CommonEdgeMask), m_VPValue())))
+ return nullptr;
+ for (unsigned In = 1; In < PhiR->getNumIncoming(); In++)
+ if (!match(getEdgeMask(PhiR->getIncomingBlock(In), PhiR->getParent()),
+ m_LogicalAnd(m_Specific(CommonEdgeMask), m_VPValue())))
+ return nullptr;
+ return CommonEdgeMask;
+}
+
+void VPPredicator::convertPhisToSelects(VPBasicBlock *VPBB) {
SmallVector<VPWidenPHIRecipe *> Phis;
for (VPRecipeBase &R : VPBB->phis())
Phis.push_back(cast<VPWidenPHIRecipe>(&R));
@@ -238,24 +257,34 @@ void VPPredicator::convertPhisToBlends(VPBasicBlock *VPBB) {
// be duplications since this is a simple recursive scan, but future
// optimizations will clean it up.
- SmallVector<VPValue *, 2> OperandsWithMask;
+ VPValue *CommonEdgeMask = findCommonEdgeMask(PhiR);
+ VPValue *Select = PhiR->getIncomingValue(0);
+ if (!getEdgeMask(PhiR->getIncomingBlock(0), VPBB)) {
+ assert(all_equal(PhiR->operands()) &&
+ "Distinct incoming values with one having a full mask");
+ PhiR->replaceAllUsesWith(Select);
+ PhiR->eraseFromParent();
+ continue;
+ }
+
unsigned NumIncoming = PhiR->getNumIncoming();
- for (unsigned In = 0; In < NumIncoming; In++) {
+ for (unsigned In = 1; In < NumIncoming; In++) {
const VPBasicBlock *Pred = PhiR->getIncomingBlock(In);
- OperandsWithMask.push_back(PhiR->getIncomingValue(In));
+ VPValue *Incoming = PhiR->getIncomingValue(In);
VPValue *EdgeMask = getEdgeMask(Pred, VPBB);
- if (!EdgeMask) {
- assert(In == 0 && "Both null and non-null edge masks found");
- assert(all_equal(PhiR->operands()) &&
- "Distinct incoming values with one having a full mask");
- break;
- }
- OperandsWithMask.push_back(EdgeMask);
+
+ // If all incoming blocks share a common edge, remove it from the mask.
+ using namespace llvm::VPlanPatternMatch;
+ VPValue *X;
+ if (match(EdgeMask,
+ m_LogicalAnd(m_Specific(CommonEdgeMask), m_VPValue(X))))
+ EdgeMask = X;
+
+ Select =
+ Builder.createSelect(EdgeMask, Incoming, Select, PhiR->getDebugLoc());
+ Select->setUnderlyingValue(PhiR->getUnderlyingValue());
}
- PHINode *IRPhi = cast<PHINode>(PhiR->getUnderlyingValue());
- auto *Blend = new VPBlendRecipe(IRPhi, OperandsWithMask);
- Builder.insert(Blend);
- PhiR->replaceAllUsesWith(Blend);
+ PhiR->replaceAllUsesWith(Select);
PhiR->eraseFromParent();
}
}
@@ -281,7 +310,7 @@ VPlanTransforms::introduceMasksAndLinearize(VPlan &Plan, bool FoldTail) {
}
Predicator.createBlockInMask(VPBB);
- Predicator.convertPhisToBlends(VPBB);
+ Predicator.convertPhisToSelects(VPBB);
}
// Linearize the blocks of the loop into one serial chain.
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 241ac42b685a9..ce9f655594550 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -74,7 +74,6 @@ bool VPRecipeBase::mayWriteToMemory() const {
case VPScalarIVStepsSC:
case VPPredInstPHISC:
return false;
- case VPBlendSC:
case VPReductionEVLSC:
case VPReductionSC:
case VPVectorPointerSC:
@@ -124,7 +123,6 @@ bool VPRecipeBase::mayReadFromMemory() const {
case VPWidenStoreEVLSC:
case VPWidenStoreSC:
return false;
- case VPBlendSC:
case VPReductionEVLSC:
case VPReductionSC:
case VPVectorPointerSC:
@@ -164,7 +162,6 @@ bool VPRecipeBase::mayHaveSideEffects() const {
}
case VPWidenIntrinsicSC:
return cast<VPWidenIntrinsicRecipe>(this)->mayHaveSideEffects();
- case VPBlendSC:
case VPReductionEVLSC:
case VPReductionSC:
case VPScalarIVStepsSC:
@@ -921,6 +918,18 @@ InstructionCost VPInstruction::computeCost(ElementCount VF,
}
switch (getOpcode()) {
+ case Instruction::Select: {
+ // Handle cases where only the first lane is used the same way as the legacy
+ // cost model.
+ if (vputils::onlyFirstLaneUsed(this))
+ return Ctx.TTI.getCFInstrCost(Instruction::PHI, Ctx.CostKind);
+
+ Type *ResultTy = toVectorTy(Ctx.Types.inferScalarType(this), VF);
+ Type *CmpTy = toVectorTy(Type::getInt1Ty(Ctx.Types.getContext()), VF);
+ return Ctx.TTI.getCmpSelInstrCost(Instruction::Select, ResultTy, CmpTy,
+ CmpInst::BAD_ICMP_PREDICATE,
+ Ctx.CostKind);
+ }
case Instruction::ExtractElement: {
// Add on the cost of extracting the element.
auto *VecTy = toVectorTy(Ctx.Types.inferScalarType(getOperand(0)), VF);
@@ -2411,44 +2420,6 @@ void VPVectorPointerRecipe::print(raw_ostream &O, const Twine &Indent,
}
#endif
-InstructionCost VPBlendRecipe::computeCost(ElementCount VF,
- VPCostContext &Ctx) const {
- // Handle cases where only the first lane is used the same way as the legacy
- // cost model.
- if (vputils::onlyFirstLaneUsed(this))
- return Ctx.TTI.getCFInstrCost(Instruction::PHI, Ctx.CostKind);
-
- Type *ResultTy = toVectorTy(Ctx.Types.inferScalarType(this), VF);
- Type *CmpTy = toVectorTy(Type::getInt1Ty(Ctx.Types.getContext()), VF);
- return (getNumIncomingValues() - 1) *
- Ctx.TTI.getCmpSelInstrCost(Instruction::Select, ResultTy, CmpTy,
- CmpInst::BAD_ICMP_PREDICATE, Ctx.CostKind);
-}
-
-#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
-void VPBlendRecipe::print(raw_ostream &O, const Twine &Indent,
- VPSlotTracker &SlotTracker) const {
- O << Indent << "BLEND ";
- printAsOperand(O, SlotTracker);
- O << " =";
- if (getNumIncomingValues() == 1) {
- // Not a User of any mask: not really blending, this is a
- // single-predecessor phi.
- O << " ";
- getIncomingValue(0)->printAsOperand(O, SlotTracker);
- } else {
- for (unsigned I = 0, E = getNumIncomingValues(); I < E; ++I) {
- O << " ";
- getIncomingValue(I)->printAsOperand(O, SlotTracker);
- if (I == 0)
- continue;
- O << "/";
- getMask(I)->printAsOperand(O, SlotTracker);
- }
- }
-}
-#endif
-
void VPReductionRecipe::execute(VPTransformState &State) {
assert(!State.Lane && "Reduction being replicated.");
Value *PrevInChain = State.get(getChainOp(), /*IsScalar*/ true);
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 5da43b61c672e..09a6b4ae46970 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -1079,6 +1079,12 @@ static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
return;
}
+ if (match(Def, m_Select(m_True(), m_VPValue(X), m_VPValue())))
+ return Def->replaceAllUsesWith(X);
+
+ if (match(Def, m_Select(m_False(), m_VPValue(), m_VPValue(X))))
+ return Def->replaceAllUsesWith(X);
+
if (match(Def, m_Select(m_VPValue(), m_VPValue(X), m_Deferred(X))))
return Def->replaceAllUsesWith(X);
@@ -1254,85 +1260,6 @@ static void narrowToSingleScalarRecipes(VPlan &Plan) {
}
}
-/// Normalize and simplify VPBlendRecipes. Should be run after simplifyRecipes
-/// to make sure the masks are simplified.
-static void simplifyBlends(VPlan &Plan) {
- using namespace llvm::VPlanPatternMatch;
- for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(
- vp_depth_first_shallow(Plan.getVectorLoopRegion()->getEntry()))) {
- for (VPRecipeBase &R : make_early_inc_range(*VPBB)) {
- auto *Blend = dyn_cast<VPBlendRecipe>(&R);
- if (!Ble...
[truncated]
You can test this locally with the following command:

    git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/test/Transforms/LoopVectorize/RISCV/blend-simplified.ll llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPlan.h llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp llvm/lib/Transforms/Vectorize/VPlanAnalysis.h llvm/lib/Transforms/Vectorize/VPlanPredicator.cpp llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp llvm/lib/Transforms/Vectorize/VPlanUtils.h llvm/lib/Transforms/Vectorize/VPlanValue.h llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp llvm/lib/Transforms/Vectorize/VPlanVerifier.h llvm/test/Transforms/LoopVectorize/AArch64/masked-call-scalarize.ll llvm/test/Transforms/LoopVectorize/AArch64/masked-call.ll llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-reductions.ll llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding.ll llvm/test/Transforms/LoopVectorize/AArch64/tail-fold-uniform-memops.ll llvm/test/Transforms/LoopVectorize/RISCV/pr88802.ll llvm/test/Transforms/LoopVectorize/RISCV/uniform-load-store.ll llvm/test/Transforms/LoopVectorize/RISCV/vectorize-force-tail-with-evl-cond-reduction.ll llvm/test/Transforms/LoopVectorize/X86/constant-fold.ll llvm/test/Transforms/LoopVectorize/X86/drop-inbounds-flags-for-reverse-vector-pointer.ll llvm/test/Transforms/LoopVectorize/X86/replicate-uniform-call.ll llvm/test/Transforms/LoopVectorize/pr55167-fold-tail-live-out.ll llvm/test/Transforms/LoopVectorize/predicatedinst-loop-invariant.ll llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll llvm/test/Transforms/LoopVectorize/single-value-blend-phis.ll llvm/test/Transforms/LoopVectorize/uniform-blend.ll llvm/test/Transforms/LoopVectorize/vplan-printing.ll llvm/test/Transforms/LoopVectorize/vplan-sink-scalars-and-merge.ll llvm/unittests/Transforms/Vectorize/VPlanTest.cpp llvm/unittests/Transforms/Vectorize/VPlanVerifierTest.cpp

The following files introduce new uses of undef:

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. In tests, avoid using undef and having tests that trigger undefined behavior.

For example, this is considered a bad practice:

    define void @fn() {
      ...
      br i1 undef, ...
    }

Please use the following instead:

    define void @fn(i1 %cond) {
      ...
      br i1 %cond, ...
    }

Please refer to the Undefined Behavior Manual for more information.