Skip to content

[SelectionDAG] Introduce ISD::PTRADD #140017

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
May 28, 2025
Merged

Conversation

ritter-x2a
Copy link
Member

This opcode represents the addition of a pointer value (first operand) and an integer offset (second operand). PTRADD nodes are only generated if the TargetMachine opts in by overriding TargetMachine::shouldPreservePtrArith().

The PTRADD node and respective visitPTRADD() function were adapted by @rgwott from the CHERI/Morello LLVM tree.
Original authors: @davidchisnall, @jrtc27, @arichardson.

The changes in this PR were extracted from PR #105669.

davidchisnall and others added 2 commits May 15, 2025 02:29
This opcode represents the addition of a pointer value (first operand) and an
integer offset (second operand). PTRADD nodes are only generated if the
TargetMachine opts in by overriding TargetMachine::shouldPreservePtrArith().

The PTRADD node and respective visitPTRADD() function were adapted by @rgwott
from the CHERI/Morello LLVM tree.
Original authors: @davidchisnall, @jrtc27, @arichardson.

The changes in this PR were extracted from PR llvm#105669.

Co-authored-by: Jessica Clarke <[email protected]>
Co-authored-by: Alexander Richardson <[email protected]>
Co-authored-by: Rodolfo Wottrich <[email protected]>
node in the DAGCombines.
Also: remove else after return and braces around single-line blocks to
match the coding standards.
@llvmbot
Copy link
Member

llvmbot commented May 15, 2025

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-llvm-selectiondag

Author: Fabian Ritter (ritter-x2a)

Changes

This opcode represents the addition of a pointer value (first operand) and an integer offset (second operand). PTRADD nodes are only generated if the TargetMachine opts in by overriding TargetMachine::shouldPreservePtrArith().

The PTRADD node and respective visitPTRADD() function were adapted by @rgwott from the CHERI/Morello LLVM tree.
Original authors: @davidchisnall, @jrtc27, @arichardson.

The changes in this PR were extracted from PR #105669.


Full diff: https://github.com/llvm/llvm-project/pull/140017.diff

7 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/ISDOpcodes.h (+5)
  • (modified) llvm/include/llvm/Target/TargetMachine.h (+5)
  • (modified) llvm/include/llvm/Target/TargetSelectionDAG.td (+1-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+100-3)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+8-2)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+9-10)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+1)
diff --git a/llvm/include/llvm/CodeGen/ISDOpcodes.h b/llvm/include/llvm/CodeGen/ISDOpcodes.h
index 80ef32aff62ae..abae3e921117f 100644
--- a/llvm/include/llvm/CodeGen/ISDOpcodes.h
+++ b/llvm/include/llvm/CodeGen/ISDOpcodes.h
@@ -1502,6 +1502,11 @@ enum NodeType {
   // Outputs: [rv], output chain, glue
   PATCHPOINT,
 
+  // PTRADD represents pointer arithmatic semantics, for targets that opt in
+  // using shouldPreservePtrArith().
+  // ptr = PTRADD ptr, offset
+  PTRADD,
+
 // Vector Predication
 #define BEGIN_REGISTER_VP_SDNODE(VPSDID, ...) VPSDID,
 #include "llvm/IR/VPIntrinsics.def"
diff --git a/llvm/include/llvm/Target/TargetMachine.h b/llvm/include/llvm/Target/TargetMachine.h
index 906926729ed74..8536439f0a2d4 100644
--- a/llvm/include/llvm/Target/TargetMachine.h
+++ b/llvm/include/llvm/Target/TargetMachine.h
@@ -468,6 +468,11 @@ class TargetMachine {
     return false;
   }
 
+  /// True if target has some particular form of dealing with pointer arithmetic
+  /// semantics. False if pointer arithmetic should not be preserved for passes
+  /// such as instruction selection, and can fallback to regular arithmetic.
+  virtual bool shouldPreservePtrArith(const Function &F) const { return false; }
+
   /// Create a pass configuration object to be used by addPassToEmitX methods
   /// for generating a pipeline of CodeGen passes.
   virtual TargetPassConfig *createPassConfig(PassManagerBase &PM) {
diff --git a/llvm/include/llvm/Target/TargetSelectionDAG.td b/llvm/include/llvm/Target/TargetSelectionDAG.td
index 41fed692c7025..349fd003ca06e 100644
--- a/llvm/include/llvm/Target/TargetSelectionDAG.td
+++ b/llvm/include/llvm/Target/TargetSelectionDAG.td
@@ -407,7 +407,7 @@ def tblockaddress: SDNode<"ISD::TargetBlockAddress",  SDTPtrLeaf, [],
 
 def add        : SDNode<"ISD::ADD"       , SDTIntBinOp   ,
                         [SDNPCommutative, SDNPAssociative]>;
-def ptradd     : SDNode<"ISD::ADD"       , SDTPtrAddOp, []>;
+def ptradd     : SDNode<"ISD::PTRADD"    , SDTPtrAddOp, []>;
 def sub        : SDNode<"ISD::SUB"       , SDTIntBinOp>;
 def mul        : SDNode<"ISD::MUL"       , SDTIntBinOp,
                         [SDNPCommutative, SDNPAssociative]>;
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index d6e288a59b2ee..6129c1a89912e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -412,7 +412,9 @@ namespace {
     SDValue visitMERGE_VALUES(SDNode *N);
     SDValue visitADD(SDNode *N);
     SDValue visitADDLike(SDNode *N);
-    SDValue visitADDLikeCommutative(SDValue N0, SDValue N1, SDNode *LocReference);
+    SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
+                                    SDNode *LocReference);
+    SDValue visitPTRADD(SDNode *N);
     SDValue visitSUB(SDNode *N);
     SDValue visitADDSAT(SDNode *N);
     SDValue visitSUBSAT(SDNode *N);
@@ -1095,7 +1097,7 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
   // (load/store (add, (add, x, y), offset2)) ->
   // (load/store (add, (add, x, offset2), y)).
 
-  if (N0.getOpcode() != ISD::ADD)
+  if (N0.getOpcode() != ISD::ADD && N0.getOpcode() != ISD::PTRADD)
     return false;
 
   // Check for vscale addressing modes.
@@ -1852,6 +1854,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
   case ISD::TokenFactor:        return visitTokenFactor(N);
   case ISD::MERGE_VALUES:       return visitMERGE_VALUES(N);
   case ISD::ADD:                return visitADD(N);
+  case ISD::PTRADD:             return visitPTRADD(N);
   case ISD::SUB:                return visitSUB(N);
   case ISD::SADDSAT:
   case ISD::UADDSAT:            return visitADDSAT(N);
@@ -2388,7 +2391,7 @@ static bool canFoldInAddressingMode(SDNode *N, SDNode *Use, SelectionDAG &DAG,
   }
 
   TargetLowering::AddrMode AM;
-  if (N->getOpcode() == ISD::ADD) {
+  if (N->getOpcode() == ISD::ADD || N->getOpcode() == ISD::PTRADD) {
     AM.HasBaseReg = true;
     ConstantSDNode *Offset = dyn_cast<ConstantSDNode>(N->getOperand(1));
     if (Offset)
@@ -2617,6 +2620,100 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
   return SDValue();
 }
 
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+/// This function was adapted from DAGCombiner::visitPTRADD() from the Morello
+/// project, which is based on CHERI.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+  SDValue N1 = N->getOperand(1);
+  EVT PtrVT = N0.getValueType();
+  EVT IntVT = N1.getValueType();
+  SDLoc DL(N);
+
+  // fold (ptradd undef, y) -> undef
+  if (N0.isUndef())
+    return N0;
+
+  // fold (ptradd x, undef) -> undef
+  if (N1.isUndef())
+    return DAG.getUNDEF(PtrVT);
+
+  // fold (ptradd x, 0) -> x
+  if (isNullConstant(N1))
+    return N0;
+
+  if (N0.getOpcode() == ISD::PTRADD &&
+      !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+    SDValue X = N0.getOperand(0);
+    SDValue Y = N0.getOperand(1);
+    SDValue Z = N1;
+    bool N0OneUse = N0.hasOneUse();
+    bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+    bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+    bool ZOneUse = Z.hasOneUse();
+
+    // (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+    //   * x is a null pointer; or
+    //   * y is a constant and z has one use; or
+    //   * y is a constant and (ptradd x, y) has one use; or
+    //   * (ptradd x, y) and z have one use and z is not a constant.
+    if (isNullConstant(X) || (YIsConstant && ZOneUse) ||
+        (YIsConstant && N0OneUse) || (N0OneUse && ZOneUse && !ZIsConstant)) {
+      SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z});
+
+      // Calling visit() can replace the Add node with ISD::DELETED_NODE if
+      // there aren't any users, so keep a handle around whilst we visit it.
+      HandleSDNode ADDHandle(Add);
+
+      SDValue VisitedAdd = visit(Add.getNode());
+      if (VisitedAdd) {
+        // If visit() returns the same node, it means the SDNode was RAUW'd, and
+        // therefore we have to load the new value to perform the checks whether
+        // the reassociation fold is profitable.
+        if (VisitedAdd.getNode() == Add.getNode())
+          Add = ADDHandle.getValue();
+        else
+          Add = VisitedAdd;
+      }
+
+      return DAG.getMemBasePlusOffset(X, Add, DL, SDNodeFlags());
+    }
+
+    // TODO: There is another possible fold here that was proven useful.
+    // It would be this:
+    //
+    // (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+    //   * (ptradd x, y) has one use; and
+    //   * y is a constant; and
+    //   * z is not a constant.
+    //
+    // In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+    // opportunity to select more complex instructions such as SUBPT and
+    // MSUBPT. However, a hypothetical corner case has been found that we could
+    // not avoid. Consider this (pseudo-POSIX C):
+    //
+    // char *foo(char *x, int z) {return (x + LARGE_CONSTANT) + z;}
+    // char *p = mmap(LARGE_CONSTANT);
+    // char *q = foo(p, -LARGE_CONSTANT);
+    //
+    // Then x + LARGE_CONSTANT is one-past-the-end, so valid, and a
+    // further + z takes it back to the start of the mapping, so valid,
+    // regardless of the address mmap gave back. However, if mmap gives you an
+    // address < LARGE_CONSTANT (ignoring high bits), x - LARGE_CONSTANT will
+    // borrow from the high bits (with the subsequent + z carrying back into
+    // the high bits to give you a well-defined pointer) and thus trip
+    // FEAT_CPA's pointer corruption checks.
+    //
+    // We leave this fold as an opportunity for future work, addressing the
+    // corner case for FEAT_CPA, as well as reconciling the solution with the
+    // more general application of pointer arithmetic in other future targets.
+  }
+
+  return SDValue();
+}
+
 /// Try to fold a 'not' shifted sign-bit with add/sub with constant operand into
 /// a shift and add with a different constant.
 static SDValue foldAddSubOfSignBit(SDNode *N, const SDLoc &DL,
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index bbf1b0fd590ef..1483a5f4d5b95 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -5641,7 +5641,8 @@ bool SelectionDAG::isADDLike(SDValue Op, bool NoWrap) const {
 
 bool SelectionDAG::isBaseWithConstantOffset(SDValue Op) const {
   return Op.getNumOperands() == 2 && isa<ConstantSDNode>(Op.getOperand(1)) &&
-         (Op.getOpcode() == ISD::ADD || isADDLike(Op));
+         (Op.getOpcode() == ISD::ADD || Op.getOpcode() == ISD::PTRADD ||
+          isADDLike(Op));
 }
 
 bool SelectionDAG::isKnownNeverNaN(SDValue Op, bool SNaN,
@@ -8144,7 +8145,12 @@ SDValue SelectionDAG::getMemBasePlusOffset(SDValue Ptr, SDValue Offset,
                                            const SDNodeFlags Flags) {
   assert(Offset.getValueType().isInteger());
   EVT BasePtrVT = Ptr.getValueType();
-  return getNode(ISD::ADD, DL, BasePtrVT, Ptr, Offset, Flags);
+  if (!this->getTarget().shouldPreservePtrArith(
+          this->getMachineFunction().getFunction())) {
+    return getNode(ISD::ADD, DL, BasePtrVT, Ptr, Offset, Flags);
+  } else {
+    return getNode(ISD::PTRADD, DL, BasePtrVT, Ptr, Offset, Flags);
+  }
 }
 
 /// Returns true if memcpy source is constant data.
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 8e74a076cc013..b3227d9b589dd 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -4284,8 +4284,8 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
             (int64_t(Offset) >= 0 && NW.hasNoUnsignedSignedWrap()))
           Flags |= SDNodeFlags::NoUnsignedWrap;
 
-        N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N,
-                        DAG.getConstant(Offset, dl, N.getValueType()), Flags);
+        N = DAG.getMemBasePlusOffset(
+            N, DAG.getConstant(Offset, dl, N.getValueType()), dl, Flags);
       }
     } else {
       // IdxSize is the width of the arithmetic according to IR semantics.
@@ -4329,7 +4329,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
 
         OffsVal = DAG.getSExtOrTrunc(OffsVal, dl, N.getValueType());
 
-        N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, OffsVal, Flags);
+        N = DAG.getMemBasePlusOffset(N, OffsVal, dl, Flags);
         continue;
       }
 
@@ -4389,7 +4389,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       SDNodeFlags AddFlags;
       AddFlags.setNoUnsignedWrap(NW.hasNoUnsignedWrap());
 
-      N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags);
+      N = DAG.getMemBasePlusOffset(N, IdxN, dl, AddFlags);
     }
   }
 
@@ -9138,8 +9138,8 @@ bool SelectionDAGBuilder::visitMemPCpyCall(const CallInst &I) {
   Size = DAG.getSExtOrTrunc(Size, sdl, Dst.getValueType());
 
   // Adjust return pointer to point just past the last dst byte.
-  SDValue DstPlusSize = DAG.getNode(ISD::ADD, sdl, Dst.getValueType(),
-                                    Dst, Size);
+  SDNodeFlags Flags;
+  SDValue DstPlusSize = DAG.getMemBasePlusOffset(Dst, Size, sdl, Flags);
   setValue(&I, DstPlusSize);
   return true;
 }
@@ -11230,10 +11230,9 @@ TargetLowering::LowerCallTo(TargetLowering::CallLoweringInfo &CLI) const {
     MachineFunction &MF = CLI.DAG.getMachineFunction();
     Align HiddenSRetAlign = MF.getFrameInfo().getObjectAlign(DemoteStackIdx);
     for (unsigned i = 0; i < NumValues; ++i) {
-      SDValue Add =
-          CLI.DAG.getNode(ISD::ADD, CLI.DL, PtrVT, DemoteStackSlot,
-                          CLI.DAG.getConstant(Offsets[i], CLI.DL, PtrVT),
-                          SDNodeFlags::NoUnsignedWrap);
+      SDValue Add = CLI.DAG.getMemBasePlusOffset(
+          DemoteStackSlot, CLI.DAG.getConstant(Offsets[i], CLI.DL, PtrVT),
+          CLI.DL, SDNodeFlags::NoUnsignedWrap);
       SDValue L = CLI.DAG.getLoad(
           RetTys[i], CLI.DL, CLI.Chain, Add,
           MachinePointerInfo::getFixedStack(CLI.DAG.getMachineFunction(),
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
index 8faf97271d99e..45ef475dffe6e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
@@ -269,6 +269,7 @@ std::string SDNode::getOperationName(const SelectionDAG *G) const {
 
   // Binary operators
   case ISD::ADD:                        return "add";
+  case ISD::PTRADD:                     return "ptradd";
   case ISD::SUB:                        return "sub";
   case ISD::MUL:                        return "mul";
   case ISD::MULHU:                      return "mulhu";

@ritter-x2a
Copy link
Member Author

The changes in the first commit are taken verbatim from #105669, the second one contains two minor fixes by me.

@ritter-x2a ritter-x2a marked this pull request as ready for review May 15, 2025 07:38
@davidchisnall davidchisnall requested a review from resistor May 15, 2025 11:39
@davidchisnall
Copy link
Contributor

This looks very much like code I wrote (sorry). Tagging @resistor to check that it still looks like code that does the right thing for us.

Copy link
Collaborator

@resistor resistor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM and matches what we have in the cheriot downstream: https://github.com/CHERIoT-Platform/llvm-project/tree/cheriot-upstream

I have noted a couple of places where I'd like to understand some optimization divergences from what we have, but I don't think those are blockers.

That said, I would like to hear from at least one other relevant reviewer before this goes in.

@momchil-velikov momchil-velikov removed their request for review May 15, 2025 12:40
Copy link
Member

@arichardson arichardson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I'd like @jrtc27 to confirm the fold part looks good now.

ritter-x2a added a commit to ritter-x2a/llvm-project that referenced this pull request May 22, 2025
This change was split from PR llvm#140017, which is based on a part of PR llvm#105669 by
@rgwott, which was adapted from work by @jrtc27, @arichardson, @davidchisnall in
the CHERI/Morello LLVM tree.

Co-authored-by: David Chisnall <[email protected]>
Co-authored-by: Jessica Clarke <[email protected]>
Co-authored-by: Alexander Richardson <[email protected]>
Co-authored-by: Rodolfo Wottrich <[email protected]>
@s-barannikov
Copy link
Contributor

This patch adds more calls to getMemBasePlusOffset(), but the method wasn't updated to generate ISD::PTRADD, is this intended?

@topperc
Copy link
Collaborator

topperc commented May 23, 2025

This patch adds more calls to getMemBasePlusOffset(), but the method wasn't updated to generate ISD::PTRADD, is this intended?

Wasn't it updated here?

SDValue SelectionDAG::getMemBasePlusOffset(SDValue Ptr, SDValue Offset,
                                           const SDLoc &DL,
                                           const SDNodeFlags Flags) {
  assert(Offset.getValueType().isInteger());
  EVT BasePtrVT = Ptr.getValueType();
  if (TLI->shouldPreservePtrArith(this->getMachineFunction().getFunction(),
                                  BasePtrVT))
    return getNode(ISD::PTRADD, DL, BasePtrVT, Ptr, Offset, Flags);
  return getNode(ISD::ADD, DL, BasePtrVT, Ptr, Offset, Flags);
}

@@ -9138,8 +9138,8 @@ bool SelectionDAGBuilder::visitMemPCpyCall(const CallInst &I) {
Size = DAG.getSExtOrTrunc(Size, sdl, Dst.getValueType());

// Adjust return pointer to point just past the last dst byte.
SDValue DstPlusSize = DAG.getNode(ISD::ADD, sdl, Dst.getValueType(),
Dst, Size);
SDNodeFlags Flags;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the Flags argument of getMemBasePlusOffset have a default value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I've now removed the unnecessary explicit argument setting here, thanks! I think we could actually use NUW here since this operation computes the position one-past-the-end of a memcpy'd memory range (which should either be within or one-past-the-end of an allocated object, which may not wrap according to the IR LangRef), but let's not add more functionality changes to this PR.

@s-barannikov
Copy link
Contributor

This patch adds more calls to getMemBasePlusOffset(), but the method wasn't updated to generate ISD::PTRADD, is this intended?

Wasn't it updated here?

SDValue SelectionDAG::getMemBasePlusOffset(SDValue Ptr, SDValue Offset,
                                           const SDLoc &DL,
                                           const SDNodeFlags Flags) {
  assert(Offset.getValueType().isInteger());
  EVT BasePtrVT = Ptr.getValueType();
  if (TLI->shouldPreservePtrArith(this->getMachineFunction().getFunction(),
                                  BasePtrVT))
    return getNode(ISD::PTRADD, DL, BasePtrVT, Ptr, Offset, Flags);
  return getNode(ISD::ADD, DL, BasePtrVT, Ptr, Offset, Flags);
}

I looked for it three times =\

ritter-x2a added a commit to ritter-x2a/llvm-project that referenced this pull request May 26, 2025
This change was split from PR llvm#140017, which is based on a part of PR llvm#105669 by
@rgwott, which was adapted from work by @jrtc27, @arichardson, @davidchisnall in
the CHERI/Morello LLVM tree.

Co-authored-by: David Chisnall <[email protected]>
Co-authored-by: Jessica Clarke <[email protected]>
Co-authored-by: Alexander Richardson <[email protected]>
Co-authored-by: Rodolfo Wottrich <[email protected]>
@ritter-x2a ritter-x2a merged commit 8adcc8a into llvm:main May 28, 2025
11 checks passed
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
This opcode represents the addition of a pointer value (first operand)
and an integer offset (second operand). PTRADD nodes are only generated
if the TargetMachine opts in by overriding
TargetMachine::shouldPreservePtrArith().

The PTRADD node and respective visitPTRADD() function were adapted by
@rgwott from the CHERI/Morello LLVM tree.
Original authors: @davidchisnall, @jrtc27, @arichardson.

The changes in this PR were extracted from PR llvm#105669.

---------

Co-authored-by: David Chisnall <[email protected]>
Co-authored-by: Jessica Clarke <[email protected]>
Co-authored-by: Alexander Richardson <[email protected]>
Co-authored-by: Rodolfo Wottrich <[email protected]>
rgwott added a commit that referenced this pull request Jun 20, 2025
CPA stands for Checked Pointer Arithmetic and is part of the 2023 MTE
architecture extensions for A-profile.
The new CPA instructions perform regular pointer arithmetic (such as
base register + offset) but check for overflow in the most significant
bits of the result, enhancing security by detecting address tampering.

In this patch we intend to capture the semantics of pointer arithmetic
when it is not folded into loads/stores, then generate the appropriate
scalar CPA instructions. In order to preserve pointer arithmetic
semantics through the backend, we use the PTRADD SelectionDAG node type.

Use backend option `-aarch64-use-featcpa-codegen=true` to enable CPA
CodeGen (for a target with CPA enabled).

The story of this PR is that initially it introduced the PTRADD
SelectionDAG node and the respective visitPTRADD() function, adapted
from the CHERI/Morello LLVM tree. The original authors are
@davidchisnall, @jrtc27, @arichardson.
After a while, @ritter-x2a took the part of the code that was
target-independent and merged it separately in #140017. This PR thus
remains as the AArch64-part only.

Mode details about the CPA extension can be found at:

-
https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2023
- https://developer.arm.com/documentation/ddi0602/2023-09/ (e.g ADDPT
instruction)

This PR follows #79569.
It does not address vector FEAT_CPA instructions.
Jaddyen pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 23, 2025
CPA stands for Checked Pointer Arithmetic and is part of the 2023 MTE
architecture extensions for A-profile.
The new CPA instructions perform regular pointer arithmetic (such as
base register + offset) but check for overflow in the most significant
bits of the result, enhancing security by detecting address tampering.

In this patch we intend to capture the semantics of pointer arithmetic
when it is not folded into loads/stores, then generate the appropriate
scalar CPA instructions. In order to preserve pointer arithmetic
semantics through the backend, we use the PTRADD SelectionDAG node type.

Use backend option `-aarch64-use-featcpa-codegen=true` to enable CPA
CodeGen (for a target with CPA enabled).

The story of this PR is that initially it introduced the PTRADD
SelectionDAG node and the respective visitPTRADD() function, adapted
from the CHERI/Morello LLVM tree. The original authors are
@davidchisnall, @jrtc27, @arichardson.
After a while, @ritter-x2a took the part of the code that was
target-independent and merged it separately in llvm#140017. This PR thus
remains as the AArch64-part only.

Mode details about the CPA extension can be found at:

-
https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2023
- https://developer.arm.com/documentation/ddi0602/2023-09/ (e.g ADDPT
instruction)

This PR follows llvm#79569.
It does not address vector FEAT_CPA instructions.
ritter-x2a added a commit that referenced this pull request Jul 24, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 28, 2025
This flag applies to G_PTR_ADD instructions and indicates that the operation
implements an inbounds getelementptr operation, i.e., the pointer operand is in
bounds wrt. the allocated object it is based on, and the arithmetic does not
change that.

It is set when the IRTranslator lowers inbounds GEPs (currently only in some
cases, to be extended with a future PR), and in the
(build|materialize)ObjectPtrOffset functions.

Inbounds information is useful in ISel when we have instructions that perform
address computations whose intermediate steps must be in the same memory region
as the final result. A follow-up patch will start using it for AMDGPU's flat
memory instructions, where the immediate offset must not affect the memory
aperture of the address.

This is analogous to a concurrent effort in SDAG: #131862
(related: #140017, #141725).

For SWDEV-516125.
kraj pushed a commit to kraj/llvm-project that referenced this pull request Jul 28, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: llvm#131862, related: llvm#140017, llvm#141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 29, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 29, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 29, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 29, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 29, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 29, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 29, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 29, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 29, 2025
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 29, 2025
#150392)

These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 30, 2025
This flag applies to G_PTR_ADD instructions and indicates that the operation
implements an inbounds getelementptr operation, i.e., the pointer operand is in
bounds wrt. the allocated object it is based on, and the arithmetic does not
change that.

It is set when the IRTranslator lowers inbounds GEPs (currently only in some
cases, to be extended with a future PR), and in the
(build|materialize)ObjectPtrOffset functions.

Inbounds information is useful in ISel when we have instructions that perform
address computations whose intermediate steps must be in the same memory region
as the final result. A follow-up patch will start using it for AMDGPU's flat
memory instructions, where the immediate offset must not affect the memory
aperture of the address.

This is analogous to a concurrent effort in SDAG: #131862
(related: #140017, #141725).

For SWDEV-516125.
ritter-x2a added a commit that referenced this pull request Jul 30, 2025
This flag applies to G_PTR_ADD instructions and indicates that the operation
implements an inbounds getelementptr operation, i.e., the pointer operand is in
bounds wrt. the allocated object it is based on, and the arithmetic does not
change that.

It is set when the IRTranslator lowers inbounds GEPs (currently only in some
cases, to be extended with a future PR), and in the
(build|materialize)ObjectPtrOffset functions.

Inbounds information is useful in ISel when we have instructions that perform
address computations whose intermediate steps must be in the same memory region
as the final result. A follow-up patch will start using it for AMDGPU's flat
memory instructions, where the immediate offset must not affect the memory
aperture of the address.

This is analogous to a concurrent effort in SDAG: #131862
(related: #140017, #141725).

For SWDEV-516125.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:globalisel llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants