[SelectionDAG] Introduce ISD::PTRADD #140017

ritter-x2a · 2025-05-15T07:34:07Z

This opcode represents the addition of a pointer value (first operand) and an integer offset (second operand). PTRADD nodes are only generated if the TargetMachine opts in by overriding TargetMachine::shouldPreservePtrArith().

The PTRADD node and respective visitPTRADD() function were adapted by @rgwott from the CHERI/Morello LLVM tree.
Original authors: @davidchisnall, @jrtc27, @arichardson.

The changes in this PR were extracted from PR #105669.

llvmbot · 2025-05-15T07:34:30Z

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-llvm-selectiondag

Author: Fabian Ritter (ritter-x2a)

Changes

This opcode represents the addition of a pointer value (first operand) and an integer offset (second operand). PTRADD nodes are only generated if the TargetMachine opts in by overriding TargetMachine::shouldPreservePtrArith().

The PTRADD node and respective visitPTRADD() function were adapted by @rgwott from the CHERI/Morello LLVM tree.
Original authors: @davidchisnall, @jrtc27, @arichardson.

The changes in this PR were extracted from PR #105669.

Full diff: https://github.com/llvm/llvm-project/pull/140017.diff

7 Files Affected:

(modified) llvm/include/llvm/CodeGen/ISDOpcodes.h (+5)
(modified) llvm/include/llvm/Target/TargetMachine.h (+5)
(modified) llvm/include/llvm/Target/TargetSelectionDAG.td (+1-1)
(modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+100-3)
(modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+8-2)
(modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+9-10)
(modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+1)

diff --git a/llvm/include/llvm/CodeGen/ISDOpcodes.h b/llvm/include/llvm/CodeGen/ISDOpcodes.h
index 80ef32aff62ae..abae3e921117f 100644
--- a/llvm/include/llvm/CodeGen/ISDOpcodes.h
+++ b/llvm/include/llvm/CodeGen/ISDOpcodes.h
@@ -1502,6 +1502,11 @@ enum NodeType {
   // Outputs: [rv], output chain, glue
   PATCHPOINT,
 
+  // PTRADD represents pointer arithmetic semantics, for targets that opt in
+  // using shouldPreservePtrArith().
+  // ptr = PTRADD ptr, offset
+  PTRADD,
+
 // Vector Predication
 #define BEGIN_REGISTER_VP_SDNODE(VPSDID, ...) VPSDID,
 #include "llvm/IR/VPIntrinsics.def"
diff --git a/llvm/include/llvm/Target/TargetMachine.h b/llvm/include/llvm/Target/TargetMachine.h
index 906926729ed74..8536439f0a2d4 100644
--- a/llvm/include/llvm/Target/TargetMachine.h
+++ b/llvm/include/llvm/Target/TargetMachine.h
@@ -468,6 +468,11 @@ class TargetMachine {
     return false;
   }
 
+  /// True if target has some particular form of dealing with pointer arithmetic
+  /// semantics. False if pointer arithmetic should not be preserved for passes
+  /// such as instruction selection, and can fall back to regular arithmetic.
+  virtual bool shouldPreservePtrArith(const Function &F) const { return false; }
+
   /// Create a pass configuration object to be used by addPassToEmitX methods
   /// for generating a pipeline of CodeGen passes.
   virtual TargetPassConfig *createPassConfig(PassManagerBase &PM) {
diff --git a/llvm/include/llvm/Target/TargetSelectionDAG.td b/llvm/include/llvm/Target/TargetSelectionDAG.td
index 41fed692c7025..349fd003ca06e 100644
--- a/llvm/include/llvm/Target/TargetSelectionDAG.td
+++ b/llvm/include/llvm/Target/TargetSelectionDAG.td
@@ -407,7 +407,7 @@ def tblockaddress: SDNode<"ISD::TargetBlockAddress",  SDTPtrLeaf, [],
 
 def add        : SDNode<"ISD::ADD"       , SDTIntBinOp   ,
                         [SDNPCommutative, SDNPAssociative]>;
-def ptradd     : SDNode<"ISD::ADD"       , SDTPtrAddOp, []>;
+def ptradd     : SDNode<"ISD::PTRADD"    , SDTPtrAddOp, []>;
 def sub        : SDNode<"ISD::SUB"       , SDTIntBinOp>;
 def mul        : SDNode<"ISD::MUL"       , SDTIntBinOp,
                         [SDNPCommutative, SDNPAssociative]>;
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index d6e288a59b2ee..6129c1a89912e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -412,7 +412,9 @@ namespace {
     SDValue visitMERGE_VALUES(SDNode *N);
     SDValue visitADD(SDNode *N);
     SDValue visitADDLike(SDNode *N);
-    SDValue visitADDLikeCommutative(SDValue N0, SDValue N1, SDNode *LocReference);
+    SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
+                                    SDNode *LocReference);
+    SDValue visitPTRADD(SDNode *N);
     SDValue visitSUB(SDNode *N);
     SDValue visitADDSAT(SDNode *N);
     SDValue visitSUBSAT(SDNode *N);
@@ -1095,7 +1097,7 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
   // (load/store (add, (add, x, y), offset2)) ->
   // (load/store (add, (add, x, offset2), y)).
 
-  if (N0.getOpcode() != ISD::ADD)
+  if (N0.getOpcode() != ISD::ADD && N0.getOpcode() != ISD::PTRADD)
     return false;
 
   // Check for vscale addressing modes.
@@ -1852,6 +1854,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
   case ISD::TokenFactor:        return visitTokenFactor(N);
   case ISD::MERGE_VALUES:       return visitMERGE_VALUES(N);
   case ISD::ADD:                return visitADD(N);
+  case ISD::PTRADD:             return visitPTRADD(N);
   case ISD::SUB:                return visitSUB(N);
   case ISD::SADDSAT:
   case ISD::UADDSAT:            return visitADDSAT(N);
@@ -2388,7 +2391,7 @@ static bool canFoldInAddressingMode(SDNode *N, SDNode *Use, SelectionDAG &DAG,
   }
 
   TargetLowering::AddrMode AM;
-  if (N->getOpcode() == ISD::ADD) {
+  if (N->getOpcode() == ISD::ADD || N->getOpcode() == ISD::PTRADD) {
     AM.HasBaseReg = true;
     ConstantSDNode *Offset = dyn_cast<ConstantSDNode>(N->getOperand(1));
     if (Offset)
@@ -2617,6 +2620,100 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
   return SDValue();
 }
 
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+/// This function was adapted from DAGCombiner::visitPTRADD() from the Morello
+/// project, which is based on CHERI.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+  SDValue N1 = N->getOperand(1);
+  EVT PtrVT = N0.getValueType();
+  EVT IntVT = N1.getValueType();
+  SDLoc DL(N);
+
+  // fold (ptradd undef, y) -> undef
+  if (N0.isUndef())
+    return N0;
+
+  // fold (ptradd x, undef) -> undef
+  if (N1.isUndef())
+    return DAG.getUNDEF(PtrVT);
+
+  // fold (ptradd x, 0) -> x
+  if (isNullConstant(N1))
+    return N0;
+
+  if (N0.getOpcode() == ISD::PTRADD &&
+      !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+    SDValue X = N0.getOperand(0);
+    SDValue Y = N0.getOperand(1);
+    SDValue Z = N1;
+    bool N0OneUse = N0.hasOneUse();
+    bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+    bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+    bool ZOneUse = Z.hasOneUse();
+
+    // (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+    //   * x is a null pointer; or
+    //   * y is a constant and z has one use; or
+    //   * y is a constant and (ptradd x, y) has one use; or
+    //   * (ptradd x, y) and z have one use and z is not a constant.
+    if (isNullConstant(X) || (YIsConstant && ZOneUse) ||
+        (YIsConstant && N0OneUse) || (N0OneUse && ZOneUse && !ZIsConstant)) {
+      SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z});
+
+      // Calling visit() can replace the Add node with ISD::DELETED_NODE if
+      // there aren't any users, so keep a handle around whilst we visit it.
+      HandleSDNode ADDHandle(Add);
+
+      SDValue VisitedAdd = visit(Add.getNode());
+      if (VisitedAdd) {
+        // If visit() returns the same node, it means the SDNode was RAUW'd, and
+        // therefore we have to load the new value to perform the checks whether
+        // the reassociation fold is profitable.
+        if (VisitedAdd.getNode() == Add.getNode())
+          Add = ADDHandle.getValue();
+        else
+          Add = VisitedAdd;
+      }
+
+      return DAG.getMemBasePlusOffset(X, Add, DL, SDNodeFlags());
+    }
+
+    // TODO: There is another possible fold here that was proven useful.
+    // It would be this:
+    //
+    // (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+    //   * (ptradd x, y) has one use; and
+    //   * y is a constant; and
+    //   * z is not a constant.
+    //
+    // In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+    // opportunity to select more complex instructions such as SUBPT and
+    // MSUBPT. However, a hypothetical corner case has been found that we could
+    // not avoid. Consider this (pseudo-POSIX C):
+    //
+    // char *foo(char *x, int z) {return (x + LARGE_CONSTANT) + z;}
+    // char *p = mmap(LARGE_CONSTANT);
+    // char *q = foo(p, -LARGE_CONSTANT);
+    //
+    // Then x + LARGE_CONSTANT is one-past-the-end, so valid, and a
+    // further + z takes it back to the start of the mapping, so valid,
+    // regardless of the address mmap gave back. However, if mmap gives you an
+    // address < LARGE_CONSTANT (ignoring high bits), x - LARGE_CONSTANT will
+    // borrow from the high bits (with the subsequent + z carrying back into
+    // the high bits to give you a well-defined pointer) and thus trip
+    // FEAT_CPA's pointer corruption checks.
+    //
+    // We leave this fold as an opportunity for future work, addressing the
+    // corner case for FEAT_CPA, as well as reconciling the solution with the
+    // more general application of pointer arithmetic in other future targets.
+  }
+
+  return SDValue();
+}
+
 /// Try to fold a 'not' shifted sign-bit with add/sub with constant operand into
 /// a shift and add with a different constant.
 static SDValue foldAddSubOfSignBit(SDNode *N, const SDLoc &DL,
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index bbf1b0fd590ef..1483a5f4d5b95 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -5641,7 +5641,8 @@ bool SelectionDAG::isADDLike(SDValue Op, bool NoWrap) const {
 
 bool SelectionDAG::isBaseWithConstantOffset(SDValue Op) const {
   return Op.getNumOperands() == 2 && isa<ConstantSDNode>(Op.getOperand(1)) &&
-         (Op.getOpcode() == ISD::ADD || isADDLike(Op));
+         (Op.getOpcode() == ISD::ADD || Op.getOpcode() == ISD::PTRADD ||
+          isADDLike(Op));
 }
 
 bool SelectionDAG::isKnownNeverNaN(SDValue Op, bool SNaN,
@@ -8144,7 +8145,12 @@ SDValue SelectionDAG::getMemBasePlusOffset(SDValue Ptr, SDValue Offset,
                                            const SDNodeFlags Flags) {
   assert(Offset.getValueType().isInteger());
   EVT BasePtrVT = Ptr.getValueType();
-  return getNode(ISD::ADD, DL, BasePtrVT, Ptr, Offset, Flags);
+  if (!this->getTarget().shouldPreservePtrArith(
+          this->getMachineFunction().getFunction())) {
+    return getNode(ISD::ADD, DL, BasePtrVT, Ptr, Offset, Flags);
+  } else {
+    return getNode(ISD::PTRADD, DL, BasePtrVT, Ptr, Offset, Flags);
+  }
 }
 
 /// Returns true if memcpy source is constant data.
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 8e74a076cc013..b3227d9b589dd 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -4284,8 +4284,8 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
             (int64_t(Offset) >= 0 && NW.hasNoUnsignedSignedWrap()))
           Flags |= SDNodeFlags::NoUnsignedWrap;
 
-        N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N,
-                        DAG.getConstant(Offset, dl, N.getValueType()), Flags);
+        N = DAG.getMemBasePlusOffset(
+            N, DAG.getConstant(Offset, dl, N.getValueType()), dl, Flags);
       }
     } else {
       // IdxSize is the width of the arithmetic according to IR semantics.
@@ -4329,7 +4329,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
 
         OffsVal = DAG.getSExtOrTrunc(OffsVal, dl, N.getValueType());
 
-        N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, OffsVal, Flags);
+        N = DAG.getMemBasePlusOffset(N, OffsVal, dl, Flags);
         continue;
       }
 
@@ -4389,7 +4389,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
       SDNodeFlags AddFlags;
       AddFlags.setNoUnsignedWrap(NW.hasNoUnsignedWrap());
 
-      N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags);
+      N = DAG.getMemBasePlusOffset(N, IdxN, dl, AddFlags);
     }
   }
 
@@ -9138,8 +9138,8 @@ bool SelectionDAGBuilder::visitMemPCpyCall(const CallInst &I) {
   Size = DAG.getSExtOrTrunc(Size, sdl, Dst.getValueType());
 
   // Adjust return pointer to point just past the last dst byte.
-  SDValue DstPlusSize = DAG.getNode(ISD::ADD, sdl, Dst.getValueType(),
-                                    Dst, Size);
+  SDNodeFlags Flags;
+  SDValue DstPlusSize = DAG.getMemBasePlusOffset(Dst, Size, sdl, Flags);
   setValue(&I, DstPlusSize);
   return true;
 }
@@ -11230,10 +11230,9 @@ TargetLowering::LowerCallTo(TargetLowering::CallLoweringInfo &CLI) const {
     MachineFunction &MF = CLI.DAG.getMachineFunction();
     Align HiddenSRetAlign = MF.getFrameInfo().getObjectAlign(DemoteStackIdx);
     for (unsigned i = 0; i < NumValues; ++i) {
-      SDValue Add =
-          CLI.DAG.getNode(ISD::ADD, CLI.DL, PtrVT, DemoteStackSlot,
-                          CLI.DAG.getConstant(Offsets[i], CLI.DL, PtrVT),
-                          SDNodeFlags::NoUnsignedWrap);
+      SDValue Add = CLI.DAG.getMemBasePlusOffset(
+          DemoteStackSlot, CLI.DAG.getConstant(Offsets[i], CLI.DL, PtrVT),
+          CLI.DL, SDNodeFlags::NoUnsignedWrap);
       SDValue L = CLI.DAG.getLoad(
           RetTys[i], CLI.DL, CLI.Chain, Add,
           MachinePointerInfo::getFixedStack(CLI.DAG.getMachineFunction(),
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
index 8faf97271d99e..45ef475dffe6e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
@@ -269,6 +269,7 @@ std::string SDNode::getOperationName(const SelectionDAG *G) const {
 
   // Binary operators
   case ISD::ADD:                        return "add";
+  case ISD::PTRADD:                     return "ptradd";
   case ISD::SUB:                        return "sub";
   case ISD::MUL:                        return "mul";
   case ISD::MULHU:                      return "mulhu";
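
As an illustration of the reassociation fold in visitPTRADD() in the diff above, here is a minimal standalone sketch. It is not LLVM code: `PtrAdd`, `NestedPtrAdd`, `eval`, and `reassociate` are hypothetical names invented for this example, pointers and offsets are modelled as plain 64-bit integers, and the one-use and addressing-mode conditions from the patch are left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for (ptradd Base, Offset); not an LLVM type.
// The pointer is always the first operand and the offset the second.
struct PtrAdd {
  std::uint64_t Base;
  std::uint64_t Offset; // two's-complement bits of the integer offset
};

// Hypothetical stand-in for (ptradd (ptradd X, Y), Z).
struct NestedPtrAdd {
  PtrAdd Inner;    // (ptradd X, Y)
  std::uint64_t Z; // outer offset
};

// Value of a single ptradd, wrapping modulo 2^64.
std::uint64_t eval(const PtrAdd &N) { return N.Base + N.Offset; }

// Value of the nested form, computed the way the unfolded DAG would.
std::uint64_t eval(const NestedPtrAdd &N) { return eval(N.Inner) + N.Z; }

// (ptradd (ptradd X, Y), Z) -> (ptradd X, (add Y, Z)).
// Only the integer offsets are combined; X stays in the pointer position.
PtrAdd reassociate(const NestedPtrAdd &N) {
  return PtrAdd{N.Inner.Base, N.Inner.Offset + N.Z};
}
```

In this model the fold never changes the computed address. What it changes is which intermediate pointer values exist, which is why the patch treats the pointer operand and the offset operand differently from an ordinary commutative add.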

ritter-x2a · 2025-05-15T07:36:08Z

The changes in the first commit are taken verbatim from #105669; the second one contains two minor fixes by me.

davidchisnall · 2025-05-15T11:42:09Z

This looks very much like code I wrote (sorry). Tagging @resistor to check that it still looks like code that does the right thing for us.

resistor

Overall LGTM and matches what we have in the cheriot downstream: https://github.com/CHERIoT-Platform/llvm-project/tree/cheriot-upstream

I have noted a couple of places where I'd like to understand some optimization divergences from what we have, but I don't think those are blockers.

That said, I would like to hear from at least one other relevant reviewer before this goes in.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

arichardson

LGTM, but I'd like @jrtc27 to confirm the fold part looks good now.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

llvm/include/llvm/Target/TargetMachine.h

@rgwott

This change was split from PR llvm#140017, which is based on a part of PR llvm#105669 by @rgwott, which was adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello LLVM tree.

Co-authored-by: David Chisnall <[email protected]>
Co-authored-by: Jessica Clarke <[email protected]>
Co-authored-by: Alexander Richardson <[email protected]>
Co-authored-by: Rodolfo Wottrich <[email protected]>

s-barannikov · 2025-05-23T10:02:53Z

This patch adds more calls to getMemBasePlusOffset(), but the method wasn't updated to generate ISD::PTRADD, is this intended?

topperc · 2025-05-23T16:15:30Z

> This patch adds more calls to getMemBasePlusOffset(), but the method wasn't updated to generate ISD::PTRADD, is this intended?

Wasn't it updated here?

SDValue SelectionDAG::getMemBasePlusOffset(SDValue Ptr, SDValue Offset,
                                           const SDLoc &DL,
                                           const SDNodeFlags Flags) {
  assert(Offset.getValueType().isInteger());
  EVT BasePtrVT = Ptr.getValueType();
  if (TLI->shouldPreservePtrArith(this->getMachineFunction().getFunction(),
                                  BasePtrVT))
    return getNode(ISD::PTRADD, DL, BasePtrVT, Ptr, Offset, Flags);
  return getNode(ISD::ADD, DL, BasePtrVT, Ptr, Offset, Flags);
}
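
The opt-in dispatch in the snippet above can be modelled in isolation. The following is only a sketch under simplifying assumptions: `TargetHooks`, `OptInTarget`, `Opcode`, and `memBasePlusOffsetOpcode` are hypothetical names, and the hook here takes no arguments, whereas the real hook in the snippet receives the function and the pointer's value type.

```cpp
#include <cassert>

// Hypothetical opcodes standing in for ISD::ADD and ISD::PTRADD.
enum class Opcode { Add, PtrAdd };

// Hypothetical stand-in for the target hook. By default a target does not
// preserve pointer arithmetic, so plain integer addition is used.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  virtual bool shouldPreservePtrArith() const { return false; }
};

// A target that opts in by overriding the hook.
struct OptInTarget : TargetHooks {
  bool shouldPreservePtrArith() const override { return true; }
};

// Pick the opcode for "base pointer plus offset" based on the hook.
Opcode memBasePlusOffsetOpcode(const TargetHooks &T) {
  return T.shouldPreservePtrArith() ? Opcode::PtrAdd : Opcode::Add;
}
```

Because the default returns false, targets that do not override the hook keep getting ordinary additions and see no behavior change.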

topperc · 2025-05-23T16:17:50Z

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

@@ -9138,8 +9138,8 @@ bool SelectionDAGBuilder::visitMemPCpyCall(const CallInst &I) {
  Size = DAG.getSExtOrTrunc(Size, sdl, Dst.getValueType());

  // Adjust return pointer to point just past the last dst byte.
-  SDValue DstPlusSize = DAG.getNode(ISD::ADD, sdl, Dst.getValueType(),
-                                    Dst, Size);
+  SDNodeFlags Flags;


Doesn't the Flags argument of getMemBasePlusOffset have a default value?

ritter-x2a · 2025-05-26

Indeed, I've now removed the unnecessary explicit argument here, thanks! I think we could actually use NUW here, since this operation computes the position one-past-the-end of a memcpy'd memory range (which should be either within or one-past-the-end of an allocated object, and such an address computation may not wrap according to the IR LangRef), but let's not add more functionality changes to this PR.
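
For readers unfamiliar with the call being lowered: the pointer computed in visitMemPCpyCall() corresponds to the return value of mempcpy. A minimal sketch of that behavior in portable C++ follows. Since mempcpy is a GNU extension, `myMempcpy` is a hypothetical helper built on std::memcpy and is not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper modelling mempcpy: copy N bytes from Src to Dst and
// return the address just past the last byte written.
void *myMempcpy(void *Dst, const void *Src, std::size_t N) {
  std::memcpy(Dst, Src, N);
  // Dst + N is inside or one past the end of the destination object, so this
  // address computation does not wrap around the address space.
  return static_cast<char *>(Dst) + N;
}
```

The returned pointer is the "Dst plus Size" value from the hunk above, which is the reason a no-unsigned-wrap flag would be plausible there.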

s-barannikov · 2025-05-24T18:53:21Z

> > This patch adds more calls to getMemBasePlusOffset(), but the method wasn't updated to generate ISD::PTRADD, is this intended?
>
> Wasn't it updated here?
>
> SDValue SelectionDAG::getMemBasePlusOffset(SDValue Ptr, SDValue Offset,
>                                            const SDLoc &DL,
>                                            const SDNodeFlags Flags) {
>   assert(Offset.getValueType().isInteger());
>   EVT BasePtrVT = Ptr.getValueType();
>   if (TLI->shouldPreservePtrArith(this->getMachineFunction().getFunction(),
>                                   BasePtrVT))
>     return getNode(ISD::PTRADD, DL, BasePtrVT, Ptr, Offset, Flags);
>   return getNode(ISD::ADD, DL, BasePtrVT, Ptr, Offset, Flags);
> }

I looked for it three times =\

@rgwott

This opcode represents the addition of a pointer value (first operand) and an integer offset (second operand). PTRADD nodes are only generated if the TargetMachine opts in by overriding TargetMachine::shouldPreservePtrArith().

The PTRADD node and respective visitPTRADD() function were adapted by @rgwott from the CHERI/Morello LLVM tree.
Original authors: @davidchisnall, @jrtc27, @arichardson.

The changes in this PR were extracted from PR llvm#105669.

---------

Co-authored-by: David Chisnall <[email protected]>
Co-authored-by: Jessica Clarke <[email protected]>
Co-authored-by: Alexander Richardson <[email protected]>
Co-authored-by: Rodolfo Wottrich <[email protected]>

@davidchisnall

CPA stands for Checked Pointer Arithmetic and is part of the 2023 MTE architecture extensions for A-profile. The new CPA instructions perform regular pointer arithmetic (such as base register + offset) but check for overflow in the most significant bits of the result, enhancing security by detecting address tampering.

In this patch we intend to capture the semantics of pointer arithmetic when it is not folded into loads/stores, then generate the appropriate scalar CPA instructions. In order to preserve pointer arithmetic semantics through the backend, we use the PTRADD SelectionDAG node type.

Use backend option `-aarch64-use-featcpa-codegen=true` to enable CPA CodeGen (for a target with CPA enabled).

The story of this PR is that initially it introduced the PTRADD SelectionDAG node and the respective visitPTRADD() function, adapted from the CHERI/Morello LLVM tree. The original authors are @davidchisnall, @jrtc27, @arichardson. After a while, @ritter-x2a took the part of the code that was target-independent and merged it separately in #140017. This PR thus remains as the AArch64-part only.

More details about the CPA extension can be found at:
- https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2023
- https://developer.arm.com/documentation/ddi0602/2023-09/ (e.g. ADDPT instruction)

This PR follows #79569. It does not address vector FEAT_CPA instructions.
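
To make the "check for overflow in the most significant bits" idea concrete, and to replay the corner case from the TODO comment in visitPTRADD(), here is a deliberately simplified model. It is an assumption-laden sketch and not the architectural definition of FEAT_CPA: the choice of the top 8 bits as the checked field, the `HighMask` constant, and the names `CheckedResult` and `checkedPtrAdd` are all invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch only: the top 8 bits of a 64-bit pointer are
// the "checked" bits, and an addition is flagged if it changes them.
constexpr std::uint64_t HighMask = 0xFF00000000000000ULL;

// Hypothetical result type: the wrapped sum plus whether the check passed.
struct CheckedResult {
  std::uint64_t Value;
  bool Ok;
};

// Hypothetical model of a checked pointer add: add modulo 2^64, then compare
// the high bits of the result against the high bits of the base pointer.
CheckedResult checkedPtrAdd(std::uint64_t Ptr, std::uint64_t Offset) {
  std::uint64_t Sum = Ptr + Offset;
  return CheckedResult{Sum, (Sum & HighMask) == (Ptr & HighMask)};
}
```

With a mapping at a low address x and z equal to minus LARGE_CONSTANT, the original order (x + LARGE_CONSTANT) + z passes both checks in this model, while the reordered (x + z) + LARGE_CONSTANT already fails on the first add because x + z borrows from the high bits. That mirrors why the patch leaves the second fold as future work.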

These functions are for building G_PTR_ADDs when we know that the base pointer and the result are both valid pointers into (or just after) the same object. They are similar to SelectionDAG::getObjectPtrOffset. This PR also changes call sites of the generic (build|materialize)PtrAdd functions that implement pointer arithmetic to split large memory accesses to the new functions. Since memory accesses have to fit into an object in memory, pointer arithmetic to an offset into a large memory access also yields an address in that object. Currently, these (build|materialize)ObjectPtrOffset functions only add "nuw" to the generated G_PTR_ADD, but I intend to introduce an "inbounds" MIFlag in a later PR (analogous to a concurrent effort in SDAG: #131862, related: #140017, #141725) that will also be set in the (build|materialize)ObjectPtrOffset functions. Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where offsets are now folded into scratch instructions, and cases where the behavior of the check regeneration script changed, resulting, e.g., in better checks for "nusw G_PTR_ADD" instructions, matched empty lines, and the use of "CHECK-NEXT" in MIPS tests. For SWDEV-516125.
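
The argument about splitting large memory accesses can be pictured with a small standalone example. This is a sketch only, unrelated to the actual MachineIRBuilder API: `U128` and `loadSplit` are hypothetical names, and the 16-byte access split into two 8-byte halves is an arbitrary choice.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 128-bit value represented as two 64-bit halves in memory order.
struct U128 {
  std::uint64_t First;
  std::uint64_t Second;
};

// Perform one 16-byte load as two 8-byte loads.
U128 loadSplit(const unsigned char *Base) {
  U128 V;
  std::memcpy(&V.First, Base, 8);
  // Base + 8 still points into the 16-byte region being accessed, so the
  // offset computation stays within the same object and cannot wrap.
  std::memcpy(&V.Second, Base + 8, 8);
  return V;
}
```

The second address is derived from the first by a small offset that is known to stay inside the accessed object, which is the situation the description has in mind when it says such pointer arithmetic can carry "nuw".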

This flag applies to G_PTR_ADD instructions and indicates that the operation implements an inbounds getelementptr operation, i.e., the pointer operand is in bounds wrt. the allocated object it is based on, and the arithmetic does not change that. It is set when the IRTranslator lowers inbounds GEPs (currently only in some cases, to be extended with a future PR), and in the (build|materialize)ObjectPtrOffset functions. Inbounds information is useful in ISel when we have instructions that perform address computations whose intermediate steps must be in the same memory region as the final result. A follow-up patch will start using it for AMDGPU's flat memory instructions, where the immediate offset must not affect the memory aperture of the address. This is analogous to a concurrent effort in SDAG: #131862 (related: #140017, #141725). For SWDEV-516125.

davidchisnall and others added 2 commits May 15, 2025 02:29

Use AddToWorklist() instead of manually re-visiting a newly generated node in the DAGCombines. Also: remove else after return and braces around single-line blocks to match the coding standards.

ba0b2ee

ritter-x2a requested review from arsenm, davidchisnall, jrtc27, jthackray, arichardson, momchil-velikov, pratlucas, davemgreen, rgwott, kbeyls, efriedma-quic and tmatheson-arm May 15, 2025 07:34

ritter-x2a added the llvm:globalisel and llvm:SelectionDAG labels May 15, 2025

ritter-x2a marked this pull request as ready for review May 15, 2025 07:38

ritter-x2a mentioned this pull request May 15, 2025

[AArch64] Add CodeGen support for scalar FEAT_CPA #105669

Merged

davidchisnall requested a review from resistor May 15, 2025 11:39

resistor reviewed May 15, 2025

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (resolved)

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (outdated, resolved)

momchil-velikov removed their request for review May 15, 2025 12:40

arichardson approved these changes May 15, 2025

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (outdated, resolved)

Remove DAGCombine comment referring to Morello and CHERI

f6a4cda

jrtc27 reviewed May 15, 2025

llvm/include/llvm/Target/TargetMachine.h (outdated, resolved)

Add the pointer's EVT to the signature of TM::shouldPreservePtrArith()

d5f083e

arsenm reviewed May 16, 2025

llvm/include/llvm/Target/TargetMachine.h (outdated, resolved)

arsenm reviewed May 16, 2025

llvm/include/llvm/Target/TargetMachine.h (outdated, resolved)

arsenm approved these changes May 22, 2025

arichardson approved these changes May 22, 2025

topperc reviewed May 23, 2025

Remove redundant flag argument

7329a37

ritter-x2a merged commit 8adcc8a into llvm:main May 28, 2025
11 checks passed

ritter-x2a mentioned this pull request Jul 24, 2025

[GISel] Introduce MachineIRBuilder::(build|materialize)ObjectPtrOffset #150392

Merged

ritter-x2a mentioned this pull request Jul 28, 2025

[GISel] Introduce MIFlags::InBounds #150900

Merged
