-
Notifications
You must be signed in to change notification settings - Fork 14.6k
[RISCV][TTI] Cost a subvector insert at a register boundary with exact vlen #85240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -469,6 +469,22 @@ InstructionCost RISCVTTIImpl::getShuffleCost(TTI::ShuffleKind Kind, | |
return LT.first * | ||
getRISCVInstructionCost(RISCV::VSLIDEDOWN_VI, LT.second, CostKind); | ||
case TTI::SK_InsertSubvector: | ||
// If we're inserting a subvector of *exactly* m1 size at a sub-register | ||
// boundary this is a subregister insert at worst and won't require the | ||
// slideup. We require the subvec to to be exactly VLEN as otherwise | ||
// we'd have to account for tail elements in the m1 container if any. | ||
// TODO: Extend for aligned m2, m4 inserts | ||
// TODO: Extend for scalable subvector types | ||
if (std::pair<InstructionCost, MVT> SubLT = getTypeLegalizationCost(SubTp); | ||
SubLT.second.isValid() && SubLT.second.isFixedLengthVector()) { | ||
const unsigned MinVLen = ST->getRealMinVLen(); | ||
const unsigned MaxVLen = ST->getRealMaxVLen(); | ||
Comment on lines
+478
to
+481
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could also use |
||
if (MinVLen == MaxVLen && | ||
SubLT.second.getScalarSizeInBits() * Index % MinVLen == 0 && | ||
SubLT.second.getSizeInBits() == MinVLen) | ||
return TTI::TCC_Free; | ||
} | ||
|
||
// Example sequence: | ||
// vsetivli zero, 4, e8, mf2, tu, ma (ignored) | ||
// vslideup.vi v8, v9, 2 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -527,15 +527,15 @@ define void @fixed_m1_in_m2_notail(<8 x i32> %src, <8 x i32> %passthru) vscale_r | |
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %2 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 0, i32 8, i32 9, i32 10, i32 11, i32 5, i32 6, i32 7> | ||
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %3 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 0, i32 1, i32 8, i32 9, i32 10, i32 11, i32 6, i32 7> | ||
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 0, i32 1, i32 2, i32 8, i32 9, i32 10, i32 11, i32 7> | ||
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %5 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 8, i32 9, i32 10, i32 11> | ||
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %5 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 8, i32 9, i32 10, i32 11> | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to call this out because I found it mildly surprising - we only see changes for non-zero insert elements here because a insertsubvector with index zero is a select instead. (That is, it's recognized as a select by the pattern matching before costing is invoked). We may want to revisit that separately. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
+1 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is #85302 |
||
; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void | ||
; | ||
; SIZE-LABEL: 'fixed_m1_in_m2_notail' | ||
; SIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %1 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 8, i32 9, i32 10, i32 11, i32 4, i32 5, i32 6, i32 7> | ||
; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 0, i32 8, i32 9, i32 10, i32 11, i32 5, i32 6, i32 7> | ||
; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 0, i32 1, i32 8, i32 9, i32 10, i32 11, i32 6, i32 7> | ||
; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %4 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 0, i32 1, i32 2, i32 8, i32 9, i32 10, i32 11, i32 7> | ||
; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %5 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 8, i32 9, i32 10, i32 11> | ||
; SIZE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %5 = shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 8, i32 9, i32 10, i32 11> | ||
; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void | ||
; | ||
shufflevector <8 x i32> %src, <8 x i32> %passthru, <8 x i32> <i32 8, i32 9, i32 10, i32 11, i32 4, i32 5, i32 6, i32 7> | ||
|
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another case we handle as a subregister insert in #84107 is mf{2,4,8} subvector inserts, where the bottom element being inserted is aligned to a vector register and the vector being inserted into is undef.