Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WebAssembly] Fix rethrow's index calculation #114693

Merged
merged 8 commits into from
Nov 6, 2024
Merged

Conversation

aheejin
Copy link
Member

@aheejin aheejin commented Nov 3, 2024

So far we have assumed that we only rethrow the exception caught in the innermost EH pad. This is true in code we directly generate, but after inlining this may not be the case. For example, consider this code:

ehcleanup:
  %0 = cleanuppad ...
  call @destructor
  cleanupret from %0 unwind label %catch.dispatch

If destructor gets inlined into this function, the code can be like

ehcleanup:
  %0 = cleanuppad ...
  invoke @throwing_func
    to label %unreachale unwind label %catch.dispatch.i

catch.dispatch.i:
  catchswitch ... [ label %catch.start.i ]

catch.start.i:
  %1 = catchpad ...
  invoke @some_function
    to label %invoke.cont.i unwind label %terminate.i

invoke.cont.i:
  catchret from %1 to label %destructor.exit

destructor.exit:
  cleanupret from %0 unwind label %catch.dispatch

We lower a cleanupret into rethrow, which assumes it rethrows the exception caught by the nearest dominating EH pad. But after the inlining, the nearest dominating EH pad is not ehcleanup but catch.start.i.

The problem exists in the same manner in the new (exnref) EH, because it assumes the exception comes from the nearest EH pad and saves an exnref from that EH pad and rethrows it (using throw_ref).

This problem can be fixed easily if cleanupret has the basic block where its matching cleanuppad is. The bitcode instruction cleanupret kind of has that info (it has a token from the cleanuppad), but that info is lost when when we enter ISel, because TargetSelectionDAG.td's cleanupret node does not have any arguments:

def cleanupret : SDNode<"ISD::CLEANUPRET" , SDTNone, [SDNPHasChain]>;
Note that catchret already has two basic block arguments, even though neither of them means catchpad's BB.

This PR adds the cleanuppad's BB as an argument to cleanupret node in ISel and uses it in the Wasm backend. Because this node is also used in X86 backend we need to note its argument there too but nothing more needs to change there as long as X86 doesn't need it.


  • Details about changes in the Wasm backend:

After this PR, our pseudo RETHROW instruction takes a BB, which means the EH pad whose exception it needs to rethrow. There are currently two ways to generate a RETHROW: one is from llvm.wasm.rethrow intrinsic and the other is from CLEANUPRET we discussed above. In case of llvm.wasm.rethrow, we add a '0' as a placeholder argument when it is lowered to a RETHROW, and change it to a BB in LateEHPrepare. As written in the comments, this PR doesn't change how this BB is computed. The BB argument will be converted to an immediate argument as with other control flow instructions in CFGStackify.

In case of CLEANUPRET, it already has a BB argument pointing to an EH pad, so it is just converted to a RETHROW with the same BB argument in LateEHPrepare. This will also be lowered to an immediate in CFGStackify with other control flow instructions.


Fixes #114600.

So far we have assumed that we only rethrow the exception caught in the
innermost EH pad. This is true in code we directly generate, but after
inlining this may not be the case. For example, consider this code:
```ll
ehcleanup:
  %0 = cleanuppad ...
  call @Destructor
  cleanupret from %0 unwind label %catch.dispatch
```

If `destructor` gets inlined into this function, the code can be like
```ll
ehcleanup:
  %0 = cleanuppad ...
  invoke @throwing_func
    to label %unreachale unwind label %catch.dispatch.i

catch.dispatch.i:
  catchswitch ... [ label %catch.start.i ]

catch.start.i:
  %1 = catchpad ...
  invoke @some_function
    to label %invoke.cont.i unwind label %terminate.i

invoke.cont.i:
  catchret from %1 to label %destructor.exit

destructor.exit:
  cleanupret from %0 unwind label %catch.dispatch
```

We lower a `cleanupret` into `rethrow`, which assumes it rethrows the
exception caught by the nearest dominating EH pad. But after the
inlining, the nearest dominating EH pad is not `ehcleanup` but
`catch.start.i`.

The problem exists in the same manner in the new (exnref) EH, because it
assumes the exception comes from the nearest EH pad and saves an exnref
from that EH pad and rethrows it (using `throw_ref`).

This problem can be fixed easily if `cleanupret` has the basic block
where its matching `cleanuppad` is. The bitcode instruction `cleanupret`
kind of has that info (it has a token from the `cleanuppad`), but that
info is lost when when we enter ISel, because `TargetSelectionDAG.td`'s
`cleanupret` node does not have any arguments:
https://github.com/llvm/llvm-project/blob/5091a359d9807db8f7d62375696f93fc34226969/llvm/include/llvm/Target/TargetSelectionDAG.td#L700
Note that `catchret` already has two basic block arguments, even though
neither of them means `catchpad`'s BB.

This PR adds the `cleanuppad`'s BB as an argument to `cleanupret` node
in ISel and uses it in the Wasm backend. Because this node is also used
in X86 backend we need to note its argument there too but nothing more
needs to change there as long as X86 doesn't need it.

---

- Details about changes in the Wasm backend:

After this PR, our pseudo `RETHROW` instruction takes a BB, which means
the EH pad whose exception it needs to rethrow. There are currently two
ways to generate a `RETHROW`: one is from `llvm.wasm.rethrow` intrinsic
and the other is from `CLEANUPRET` we discussed above. In case of
`llvm.wasm.rethrow`, we add a '0' as a placeholder argument when it is
lowered to a `RETHROW`, and change it to a BB in LateEHPrepare. As
written in the comments, this PR doesn't change how this BB is computed.
The BB argument will be converted to an immediate argument as with other
control flow instructions in CFGStackify.

In case of `CLEANUPRET`, it already has a BB argument pointing to an EH
pad, so it is just converted to a `RETHROW` with the same BB argument in
LateEHPrepare. This will also be lowered to an immediate in CFGStackify
with other control flow instructions.

---

Fixes llvm#114600.
@llvmbot
Copy link
Member

llvmbot commented Nov 3, 2024

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-x86

Author: Heejin Ahn (aheejin)

Changes

So far we have assumed that we only rethrow the exception caught in the innermost EH pad. This is true in code we directly generate, but after inlining this may not be the case. For example, consider this code:

ehcleanup:
  %0 = cleanuppad ...
  call @<!-- -->destructor
  cleanupret from %0 unwind label %catch.dispatch

If destructor gets inlined into this function, the code can be like

ehcleanup:
  %0 = cleanuppad ...
  invoke @<!-- -->throwing_func
    to label %unreachale unwind label %catch.dispatch.i

catch.dispatch.i:
  catchswitch ... [ label %catch.start.i ]

catch.start.i:
  %1 = catchpad ...
  invoke @<!-- -->some_function
    to label %invoke.cont.i unwind label %terminate.i

invoke.cont.i:
  catchret from %1 to label %destructor.exit

destructor.exit:
  cleanupret from %0 unwind label %catch.dispatch

We lower a cleanupret into rethrow, which assumes it rethrows the exception caught by the nearest dominating EH pad. But after the inlining, the nearest dominating EH pad is not ehcleanup but catch.start.i.

The problem exists in the same manner in the new (exnref) EH, because it assumes the exception comes from the nearest EH pad and saves an exnref from that EH pad and rethrows it (using throw_ref).

This problem can be fixed easily if cleanupret has the basic block where its matching cleanuppad is. The bitcode instruction cleanupret kind of has that info (it has a token from the cleanuppad), but that info is lost when when we enter ISel, because TargetSelectionDAG.td's cleanupret node does not have any arguments:

def cleanupret : SDNode<"ISD::CLEANUPRET" , SDTNone, [SDNPHasChain]>;
Note that catchret already has two basic block arguments, even though neither of them means catchpad's BB.

This PR adds the cleanuppad's BB as an argument to cleanupret node in ISel and uses it in the Wasm backend. Because this node is also used in X86 backend we need to note its argument there too but nothing more needs to change there as long as X86 doesn't need it.


  • Details about changes in the Wasm backend:

After this PR, our pseudo RETHROW instruction takes a BB, which means the EH pad whose exception it needs to rethrow. There are currently two ways to generate a RETHROW: one is from llvm.wasm.rethrow intrinsic and the other is from CLEANUPRET we discussed above. In case of llvm.wasm.rethrow, we add a '0' as a placeholder argument when it is lowered to a RETHROW, and change it to a BB in LateEHPrepare. As written in the comments, this PR doesn't change how this BB is computed. The BB argument will be converted to an immediate argument as with other control flow instructions in CFGStackify.

In case of CLEANUPRET, it already has a BB argument pointing to an EH pad, so it is just converted to a RETHROW with the same BB argument in LateEHPrepare. This will also be lowered to an immediate in CFGStackify with other control flow instructions.


Fixes #114600.


Patch is 25.17 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/114693.diff

11 Files Affected:

  • (modified) llvm/include/llvm/Target/TargetSelectionDAG.td (+5-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+4-2)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp (+7-41)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp (+13)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td (+4-5)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyLateEHPrepare.cpp (+38-17)
  • (modified) llvm/lib/Target/X86/X86InstrCompiler.td (+2-1)
  • (modified) llvm/test/CodeGen/WebAssembly/cfg-stackify-eh-legacy.mir (+7-4)
  • (modified) llvm/test/CodeGen/WebAssembly/exception-legacy.ll (+108)
  • (modified) llvm/test/CodeGen/WebAssembly/exception-legacy.mir (+1-1)
  • (modified) llvm/test/CodeGen/WebAssembly/exception.ll (+115)
diff --git a/llvm/include/llvm/Target/TargetSelectionDAG.td b/llvm/include/llvm/Target/TargetSelectionDAG.td
index fa516fc9b10175..e4485d997c34cc 100644
--- a/llvm/include/llvm/Target/TargetSelectionDAG.td
+++ b/llvm/include/llvm/Target/TargetSelectionDAG.td
@@ -234,6 +234,10 @@ def SDTCatchret : SDTypeProfile<0, 2, [     // catchret
   SDTCisVT<0, OtherVT>, SDTCisVT<1, OtherVT>
 ]>;
 
+def SDTCleanupret : SDTypeProfile<0, 1, [   // cleanupret
+  SDTCisVT<0, OtherVT>
+]>;
+
 def SDTNone : SDTypeProfile<0, 0, []>;      // ret, trap
 
 def SDTUBSANTrap : SDTypeProfile<0, 1, []>;      // ubsantrap
@@ -697,7 +701,7 @@ def brind      : SDNode<"ISD::BRIND"      , SDTBrind,  [SDNPHasChain]>;
 def br         : SDNode<"ISD::BR"         , SDTBr,     [SDNPHasChain]>;
 def catchret   : SDNode<"ISD::CATCHRET"   , SDTCatchret,
                         [SDNPHasChain, SDNPSideEffect]>;
-def cleanupret : SDNode<"ISD::CLEANUPRET" , SDTNone,   [SDNPHasChain]>;
+def cleanupret : SDNode<"ISD::CLEANUPRET" , SDTCleanupret, [SDNPHasChain]>;
 
 def trap       : SDNode<"ISD::TRAP"       , SDTNone,
                         [SDNPHasChain, SDNPSideEffect]>;
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 8450553743074c..14351e98c567d7 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -2154,8 +2154,10 @@ void SelectionDAGBuilder::visitCleanupRet(const CleanupReturnInst &I) {
   FuncInfo.MBB->normalizeSuccProbs();
 
   // Create the terminator node.
-  SDValue Ret =
-      DAG.getNode(ISD::CLEANUPRET, getCurSDLoc(), MVT::Other, getControlRoot());
+  MachineBasicBlock *CleanupPadMBB =
+      FuncInfo.getMBB(I.getCleanupPad()->getParent());
+  SDValue Ret = DAG.getNode(ISD::CLEANUPRET, getCurSDLoc(), MVT::Other,
+                            getControlRoot(), DAG.getBasicBlock(CleanupPadMBB));
   DAG.setRoot(Ret);
 }
 
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
index a5f73fabca3542..59a47f9f4626bf 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
@@ -93,9 +93,8 @@ class WebAssemblyCFGStackify final : public MachineFunctionPass {
                           const MachineBasicBlock *MBB);
   unsigned getDelegateDepth(const SmallVectorImpl<EndMarkerInfo> &Stack,
                             const MachineBasicBlock *MBB);
-  unsigned
-  getRethrowDepth(const SmallVectorImpl<EndMarkerInfo> &Stack,
-                  const SmallVectorImpl<const MachineBasicBlock *> &EHPadStack);
+  unsigned getRethrowDepth(const SmallVectorImpl<EndMarkerInfo> &Stack,
+                           const MachineBasicBlock *EHPadToRethrow);
   void rewriteDepthImmediates(MachineFunction &MF);
   void fixEndsAtEndOfFunction(MachineFunction &MF);
   void cleanupFunctionData(MachineFunction &MF);
@@ -1875,34 +1874,13 @@ unsigned WebAssemblyCFGStackify::getDelegateDepth(
 
 unsigned WebAssemblyCFGStackify::getRethrowDepth(
     const SmallVectorImpl<EndMarkerInfo> &Stack,
-    const SmallVectorImpl<const MachineBasicBlock *> &EHPadStack) {
+    const MachineBasicBlock *EHPadToRethrow) {
   unsigned Depth = 0;
-  // In our current implementation, rethrows always rethrow the exception caught
-  // by the innermost enclosing catch. This means while traversing Stack in the
-  // reverse direction, when we encounter END_TRY, we should check if the
-  // END_TRY corresponds to the current innermost EH pad. For example:
-  // try
-  //   ...
-  // catch         ;; (a)
-  //   try
-  //     rethrow 1 ;; (b)
-  //   catch       ;; (c)
-  //     rethrow 0 ;; (d)
-  //   end         ;; (e)
-  // end           ;; (f)
-  //
-  // When we are at 'rethrow' (d), while reversely traversing Stack the first
-  // 'end' we encounter is the 'end' (e), which corresponds to the 'catch' (c).
-  // And 'rethrow' (d) rethrows the exception caught by 'catch' (c), so we stop
-  // there and the depth should be 0. But when we are at 'rethrow' (b), it
-  // rethrows the exception caught by 'catch' (a), so when traversing Stack
-  // reversely, we should skip the 'end' (e) and choose 'end' (f), which
-  // corresponds to 'catch' (a).
   for (auto X : reverse(Stack)) {
     const MachineInstr *End = X.second;
     if (End->getOpcode() == WebAssembly::END_TRY) {
       auto *EHPad = TryToEHPad[EndToBegin[End]];
-      if (EHPadStack.back() == EHPad)
+      if (EHPadToRethrow == EHPad)
         break;
     }
     ++Depth;
@@ -1914,7 +1892,6 @@ unsigned WebAssemblyCFGStackify::getRethrowDepth(
 void WebAssemblyCFGStackify::rewriteDepthImmediates(MachineFunction &MF) {
   // Now rewrite references to basic blocks to be depth immediates.
   SmallVector<EndMarkerInfo, 8> Stack;
-  SmallVector<const MachineBasicBlock *, 8> EHPadStack;
 
   auto RewriteOperands = [&](MachineInstr &MI) {
     // Rewrite MBB operands to be depth immediates.
@@ -1925,6 +1902,8 @@ void WebAssemblyCFGStackify::rewriteDepthImmediates(MachineFunction &MF) {
       if (MO.isMBB()) {
         if (MI.getOpcode() == WebAssembly::DELEGATE)
           MO = MachineOperand::CreateImm(getDelegateDepth(Stack, MO.getMBB()));
+        else if (MI.getOpcode() == WebAssembly::RETHROW)
+          MO = MachineOperand::CreateImm(getRethrowDepth(Stack, MO.getMBB()));
         else
           MO = MachineOperand::CreateImm(getBranchDepth(Stack, MO.getMBB()));
       }
@@ -1951,12 +1930,8 @@ void WebAssemblyCFGStackify::rewriteDepthImmediates(MachineFunction &MF) {
         Stack.pop_back();
         break;
 
-      case WebAssembly::END_TRY: {
-        auto *EHPad = TryToEHPad[EndToBegin[&MI]];
-        EHPadStack.push_back(EHPad);
-        [[fallthrough]];
-      }
       case WebAssembly::END_BLOCK:
+      case WebAssembly::END_TRY:
       case WebAssembly::END_TRY_TABLE:
         Stack.push_back(std::make_pair(&MBB, &MI));
         break;
@@ -1965,15 +1940,6 @@ void WebAssemblyCFGStackify::rewriteDepthImmediates(MachineFunction &MF) {
         Stack.push_back(std::make_pair(EndToBegin[&MI]->getParent(), &MI));
         break;
 
-      case WebAssembly::CATCH_LEGACY:
-      case WebAssembly::CATCH_ALL_LEGACY:
-        EHPadStack.pop_back();
-        break;
-
-      case WebAssembly::RETHROW:
-        MI.getOperand(0).setImm(getRethrowDepth(Stack, EHPadStack));
-        break;
-
       case WebAssembly::DELEGATE:
         RewriteOperands(MI);
         Stack.push_back(std::make_pair(&MBB, &MI));
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
index b5b9cbeacfa18f..630fcfe879196c 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
@@ -248,6 +248,19 @@ void WebAssemblyDAGToDAGISel::Select(SDNode *Node) {
       ReplaceNode(Node, Throw);
       return;
     }
+    case Intrinsic::wasm_rethrow: {
+      // RETHROW's BB argument will be populated in LateEHPrepare. Just use a
+      // '0' as a placeholder for now.
+      MachineSDNode *Rethrow = CurDAG->getMachineNode(
+          WebAssembly::RETHROW, DL,
+          MVT::Other, // outchain type
+          {
+              CurDAG->getConstant(0, DL, MVT::i32), // placeholder
+              Node->getOperand(0)                   // inchain
+          });
+      ReplaceNode(Node, Rethrow);
+      return;
+    }
     }
     break;
   }
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td
index 97ff6d77f54b15..3fac140f5606b8 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td
@@ -168,7 +168,8 @@ defm CATCH_ALL_REF : I<(outs EXNREF:$dst), (ins), (outs), (ins), []>;
 // Pseudo instructions: cleanupret / catchret
 let isTerminator = 1, hasSideEffects = 1, isBarrier = 1, hasCtrlDep = 1,
     isPseudo = 1, isEHScopeReturn = 1 in {
-  defm CLEANUPRET : NRI<(outs), (ins), [(cleanupret)], "cleanupret", 0>;
+  defm CLEANUPRET : NRI<(outs), (ins bb_op:$ehpad), [(cleanupret bb:$ehpad)],
+                        "cleanupret", 0>;
   defm CATCHRET : NRI<(outs), (ins bb_op:$dst, bb_op:$from),
                       [(catchret bb:$dst, bb:$from)], "catchret", 0>;
 } // isTerminator = 1, hasSideEffects = 1, isBarrier = 1, hasCtrlDep = 1,
@@ -180,11 +181,9 @@ let isTerminator = 1, hasSideEffects = 1, isBarrier = 1, hasCtrlDep = 1,
 // Rethrowing an exception: rethrow
 // The new exnref proposal also uses this instruction as an interim pseudo
 // instruction before we convert it to a THROW_REF.
+// $ehpad is the EH pad where the exception the rethrow has been caught.
 let isTerminator = 1, hasCtrlDep = 1, isBarrier = 1 in
-defm RETHROW : NRI<(outs), (ins i32imm:$depth), [], "rethrow \t$depth", 0x09>;
-// The depth argument will be computed in CFGStackify. We set it to 0 here for
-// now.
-def : Pat<(int_wasm_rethrow), (RETHROW 0)>;
+defm RETHROW : NRI<(outs), (ins bb_op:$ehpad), [], "rethrow \t$ehpad", 0x09>;
 
 // Region within which an exception is caught: try / end_try
 let Uses = [VALUE_STACK], Defs = [VALUE_STACK] in {
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyLateEHPrepare.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyLateEHPrepare.cpp
index 70b406b6552bf0..f5eec0e3337cbb 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyLateEHPrepare.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyLateEHPrepare.cpp
@@ -251,15 +251,40 @@ bool WebAssemblyLateEHPrepare::replaceFuncletReturns(MachineFunction &MF) {
       Changed = true;
       break;
     }
+    case WebAssembly::RETHROW:
+      // These RETHROWs here were lowered from llvm.wasm.rethrow() intrinsics,
+      // generated in Clang for when an exception is not caught by the given
+      // type (e.g. catch (int)).
+      //
+      // RETHROW's BB argument is the EH pad the exception to rethrow has been
+      // caught. (Until this point, RETHROW has just a '0' as a placeholder
+      // argument.) For these llvm.wasm.rethrow()s, we can safely assume the
+      // exception comes from the nearest dominating EH pad, because catch.start
+      // EH pad is structured like this:
+      //
+      // catch.start:
+      //   catchpad ...
+      //   %matches = compare ehselector with typeid
+      //   br i1 %matches, label %catch, label %rethrow
+      //
+      // rethrow:
+      //   ;; rethrows the exception caught in 'catch.start'
+      //   call @llvm.wasm.rethrow()
+      while (TI->getNumOperands() > 0)
+        TI->removeOperand(TI->getNumOperands() - 1);
+      TI->addOperand(MachineOperand::CreateMBB(getMatchingEHPad(TI)));
+      Changed = true;
+      break;
     case WebAssembly::CLEANUPRET: {
-      // Replace a cleanupret with a rethrow. For C++ support, currently
-      // rethrow's immediate argument is always 0 (= the latest exception).
+      // CLEANUPRETs have the EH pad BB the exception to rethrow has been caught
+      // as an argument. Use it and change the instruction opcode to 'RETHROW'
+      // to make rethrowing instructions consistent.
       //
-      // Even when -wasm-enable-exnref is true, we use a RETHROW here for the
-      // moment. This will be converted to a THROW_REF in
-      // addCatchRefsAndThrowRefs.
+      // This is because we cannot safely assume that it is always the nearest
+      // dominating EH pad, in case there are code transformations such as
+      // inlining.
       BuildMI(MBB, TI, TI->getDebugLoc(), TII.get(WebAssembly::RETHROW))
-          .addImm(0);
+          .addMBB(TI->getOperand(0).getMBB());
       TI->eraseFromParent();
       Changed = true;
       break;
@@ -272,21 +297,17 @@ bool WebAssemblyLateEHPrepare::replaceFuncletReturns(MachineFunction &MF) {
 // Add CATCH_REF and CATCH_ALL_REF pseudo instructions to EH pads, and convert
 // RETHROWs to THROW_REFs.
 bool WebAssemblyLateEHPrepare::addCatchRefsAndThrowRefs(MachineFunction &MF) {
-  bool Changed = false;
   const auto &TII = *MF.getSubtarget<WebAssemblySubtarget>().getInstrInfo();
   auto &MRI = MF.getRegInfo();
   DenseMap<MachineBasicBlock *, SmallVector<MachineInstr *, 2>> EHPadToRethrows;
 
   // Create a map of <EH pad, a vector of RETHROWs rethrowing its exception>
-  for (auto &MBB : MF) {
-    for (auto &MI : MBB) {
-      if (MI.getOpcode() == WebAssembly::RETHROW) {
-        Changed = true;
-        auto *EHPad = getMatchingEHPad(&MI);
-        EHPadToRethrows[EHPad].push_back(&MI);
-      }
-    }
-  }
+  for (auto &MBB : MF)
+    for (auto &MI : MBB)
+      if (MI.getOpcode() == WebAssembly::RETHROW)
+        EHPadToRethrows[MI.getOperand(0).getMBB()].push_back(&MI);
+  if (EHPadToRethrows.empty())
+    return false;
 
   // Convert CATCH into CATCH_REF and CATCH_ALL into CATCH_ALL_REF, when the
   // caught exception is rethrown. And convert RETHROWs to THROW_REFs.
@@ -325,7 +346,7 @@ bool WebAssemblyLateEHPrepare::addCatchRefsAndThrowRefs(MachineFunction &MF) {
     }
   }
 
-  return Changed;
+  return true;
 }
 
 // Remove unnecessary unreachables after a throw/rethrow/throw_ref.
diff --git a/llvm/lib/Target/X86/X86InstrCompiler.td b/llvm/lib/Target/X86/X86InstrCompiler.td
index a05c3f028442c0..880061d82036d5 100644
--- a/llvm/lib/Target/X86/X86InstrCompiler.td
+++ b/llvm/lib/Target/X86/X86InstrCompiler.td
@@ -195,7 +195,8 @@ def EH_RETURN64   : I<0xC3, RawFrm, (outs), (ins GR64:$addr),
 
 let isTerminator = 1, hasSideEffects = 1, isBarrier = 1, hasCtrlDep = 1,
     isCodeGenOnly = 1, isReturn = 1, isEHScopeReturn = 1 in {
-  def CLEANUPRET : I<0, Pseudo, (outs), (ins), "# CLEANUPRET", [(cleanupret)]>;
+  def CLEANUPRET : I<0, Pseudo, (outs), (ins), "# CLEANUPRET",
+                     [(cleanupret bb)]>;
 
   // CATCHRET needs a custom inserter for SEH.
   let usesCustomInserter = 1 in
diff --git a/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh-legacy.mir b/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh-legacy.mir
index c37a82fe80826d..2494ad1ad581d5 100644
--- a/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh-legacy.mir
+++ b/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh-legacy.mir
@@ -40,15 +40,16 @@ body: |
     ; CHECK: RETHROW 1
     EH_LABEL <mcsymbol .Ltmp2>
     %0:i32 = CATCH_LEGACY &__cpp_exception, implicit-def dead $arguments
-    RETHROW 0, implicit-def dead $arguments
+    RETHROW %bb.1, implicit-def dead $arguments
 
   bb.2 (landing-pad):
+    successors: %bb.3
     ; CHECK: bb.2 (landing-pad):
     ; CHECK: CATCH_LEGACY
     ; CHECK: RETHROW 0
     EH_LABEL <mcsymbol .Ltmp3>
     %1:i32 = CATCH_LEGACY &__cpp_exception, implicit-def dead $arguments
-    RETHROW 0, implicit-def dead $arguments
+    RETHROW %bb.2, implicit-def dead $arguments
 
   bb.3:
     ; CHECK: bb.3:
@@ -105,12 +106,14 @@ body: |
     RETURN %0:i32, implicit-def dead $arguments
 
   bb.3 (landing-pad):
+    successors:
     EH_LABEL <mcsymbol .Ltmp4>
     %0:i32 = CATCH_LEGACY &__cpp_exception, implicit-def dead $arguments
-    RETHROW 0, implicit-def dead $arguments
+    RETHROW %bb.3, implicit-def dead $arguments
 
   bb.4 (landing-pad):
+    successors:
     EH_LABEL <mcsymbol .Ltmp5>
     %1:i32 = CATCH_LEGACY &__cpp_exception, implicit-def dead $arguments
-    RETHROW 0, implicit-def dead $arguments
+    RETHROW %bb.4, implicit-def dead $arguments
 ...
diff --git a/llvm/test/CodeGen/WebAssembly/exception-legacy.ll b/llvm/test/CodeGen/WebAssembly/exception-legacy.ll
index 3327d8be894f0c..3aac61aee340c2 100644
--- a/llvm/test/CodeGen/WebAssembly/exception-legacy.ll
+++ b/llvm/test/CodeGen/WebAssembly/exception-legacy.ll
@@ -400,6 +400,110 @@ unreachable:                                      ; preds = %rethrow
   unreachable
 }
 
+; The bitcode below is generated when the code below is compiled and
+; Temp::~Temp() is inlined into inlined_cleanupret():
+;
+; void inlined_cleanupret() {
+; try {
+;   Temp t;
+;   throw 2;
+; } catch (...)
+; }
+;
+; Temp::~Temp() {
+;   try {
+;     throw 1;
+;   } catch (...) {
+;   }
+; }
+;
+; ~Temp() generates cleanupret, which is lowered to a 'rethrow' later. That
+; rethrow's immediate argument should correctly target the top-level cleanuppad
+; (catch_all). This is a regression test for the bug where we did not compute
+; rethrow's argument correctly.
+
+; CHECK-LABEL: inlined_cleanupret:
+; CHECK: try
+; CHECK:   call  __cxa_throw
+; CHECK: catch_all
+; CHECK:   try
+; CHECK:     try
+; CHECK:       call  __cxa_throw
+; CHECK:       catch
+; CHECK:       call  __cxa_end_catch
+; CHECK:       try
+; CHECK:         try
+; Note that this rethrow targets the top-level catch_all
+; CHECK:           rethrow   4
+; CHECK:         catch
+; CHECK:           try
+; CHECK:             call  __cxa_end_catch
+; CHECK:           delegate    5
+; CHECK:           return
+; CHECK:         end_try
+; CHECK:       delegate    3
+; CHECK:     end_try
+; CHECK:     unreachable
+; CHECK:   catch_all
+; CHECK:     call  _ZSt9terminatev
+; CHECK:     unreachable
+; CHECK:   end_try
+; CHECK: end_try
+; CHECK: unreachable
+define void @inlined_cleanupret() personality ptr @__gxx_wasm_personality_v0 {
+entry:
+  %exception = tail call ptr @__cxa_allocate_exception(i32 4)
+  store i32 2, ptr %exception, align 16
+  invoke void @__cxa_throw(ptr nonnull %exception, ptr nonnull @_ZTIi, ptr null)
+          to label %unreachable unwind label %ehcleanup
+
+ehcleanup:                                        ; preds = %entry
+  %0 = cleanuppad within none []
+  %exception.i = call ptr @__cxa_allocate_exception(i32 4) [ "funclet"(token %0) ]
+  store i32 1, ptr %exception.i, align 16
+  invoke void @__cxa_throw(ptr nonnull %exception.i, ptr nonnull @_ZTIi, ptr null) [ "funclet"(token %0) ]
+          to label %unreachable unwind label %catch.dispatch.i
+
+catch.dispatch.i:                                 ; preds = %ehcleanup
+  %1 = catchswitch within %0 [label %catch.start.i] unwind label %terminate.i
+
+catch.start.i:                                    ; preds = %catch.dispatch.i
+  %2 = catchpad within %1 [ptr null]
+  %3 = tail call ptr @llvm.wasm.get.exception(token %2)
+  %4 = tail call i32 @llvm.wasm.get.ehselector(token %2)
+  %5 = call ptr @__cxa_begin_catch(ptr %3) [ "funclet"(token %2) ]
+  invoke void @__cxa_end_catch() [ "funclet"(token %2) ]
+          to label %invoke.cont.i unwind label %terminate.i
+
+invoke.cont.i:                                    ; preds = %catch.start.i
+  catchret from %2 to label %_ZN4TempD2Ev.exit
+
+terminate.i:                                      ; preds = %catch.start.i, %catch.dispatch.i
+  %6 = cleanuppad within %0 []
+  call void @_ZSt9terminatev() [ "funclet"(token %6) ]
+  unreachable
+
+_ZN4TempD2Ev.exit:                                ; preds = %invoke.cont.i
+  cleanupret from %0 unwind label %catch.dispatch
+
+catch.dispatch:                                   ; preds = %_ZN4TempD2Ev.exit
+  %7 = catchswitch within none [label %catch.start] unwind to caller
+
+catch.start:                                      ; preds = %catch.dispatch
+  %8 = catchpad within %7 [ptr null]
+  %9 = tail call ptr @llvm.wasm.get.exception(token %8)
+  %10 = tail call i32 @llvm.wasm.get.ehselector(token %8)
+  %11 = call ptr @__cxa_begin_catch(ptr %9) #8 [ "funclet"(token %8) ]
+  call void @__cxa_end_catch() [ "funclet"(token %8) ]
+  catchret from %8 to label %try.cont
+
+try.cont:                                         ; preds = %catch.start
+  ret void
+
+unreachable:                                      ; preds = %entry
+  unreachable
+}
+
 
 declare void @foo()
 declare void @bar(ptr)
@@ -415,8 +519,12 @@ declare i32 @llvm.wasm.get.ehselector(token) #0
 declare void @llvm.wasm.rethrow() #1
 ; Function Attrs: nounwind
 declare i32 @llvm.eh.typeid.for(ptr) #0
+; Function Attrs: nounwind
+declare ptr @__cxa_allocate_exception(i32) #0
 declare ptr @__cxa_begin_catch(ptr)
 declare void @__cxa_end_catch()
+; Function Attrs: noreturn
+declare void @__cxa_throw(ptr, ptr, ptr) #1
 declare void @_ZSt9terminatev()
 declare ptr @_ZN4TempD2Ev(ptr returned)
 
diff --git a/llvm/test/CodeGen/WebAssembly/exception-legacy.mir b/llvm/test/CodeGen/WebAssembly/exception-legacy.mir
index f9eb40bb03cc7a..fbed7db1dcb110 100644
--- a/llvm/test/CodeGen/WebAssembly/exception-legacy.mir
+++ b/llvm/test/CodeGen/WebAssembly/exception-legacy.mir
@@ -105,7 +105,7 @@ body: |
 
   bb.2:
     ...
[truncated]

Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The selection DAG builder changes look good to me, it's pretty straightforward to pass through the information we're already modeling.

Since C++11, destructors are implicitly noexcept(true), so this has become a pretty obscure edge case, since if I understand the scenario right, an uhandled exception has to unwind out of an inlined destructor call.

@coolreader18
Copy link

Since C++11, destructors are implicitly noexcept(true), so this has become a pretty obscure edge case, since if I understand the scenario right, an uhandled exception has to unwind out of an inlined destructor call.

(I helped to discover the original issue in rust-lang/rust#132416) -- this bug actually can occur when raising and catching an exception while in the process of unwinding from another. It doesn't require an exception to unwind from a destructor, just for the try { throw; } catch () {} destructor to run while an exception is unwinding.

llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td Outdated Show resolved Hide resolved
llvm/lib/Target/WebAssembly/WebAssemblyLateEHPrepare.cpp Outdated Show resolved Hide resolved
// rethrow:
// ;; rethrows the exception caught in 'catch.start'
// call @llvm.wasm.rethrow()
while (TI->getNumOperands() > 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we expect there to be more than one operand on the end here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about the implicit operands (implicit-def $arguments below), and was wondering if I append a new operand, that might go after the implicit ones:

RETHROW 0, implicit-def dead $arguments

But turns out all explicit operands automatically go before implicit ones so I think I can just do TI->removeOperand(0). Fixed.

@aheejin aheejin merged commit 492812f into llvm:main Nov 6, 2024
8 checks passed
@aheejin aheejin deleted the rethrow_fix branch November 6, 2024 05:45
aheejin added a commit to aheejin/llvm-project that referenced this pull request Nov 12, 2024
aheejin added a commit to aheejin/llvm-project that referenced this pull request Nov 12, 2024
So far we have assumed that we only rethrow the exception caught in the
innermost EH pad. This is true in code we directly generate, but after
inlining this may not be the case. For example, consider this code:
```ll
ehcleanup:
  %0 = cleanuppad ...
  call @Destructor
  cleanupret from %0 unwind label %catch.dispatch
```

If `destructor` gets inlined into this function, the code can be like
```ll
ehcleanup:
  %0 = cleanuppad ...
  invoke @throwing_func
    to label %unreachale unwind label %catch.dispatch.i

catch.dispatch.i:
  catchswitch ... [ label %catch.start.i ]

catch.start.i:
  %1 = catchpad ...
  invoke @some_function
    to label %invoke.cont.i unwind label %terminate.i

invoke.cont.i:
  catchret from %1 to label %destructor.exit

destructor.exit:
  cleanupret from %0 unwind label %catch.dispatch
```

We lower a `cleanupret` into `rethrow`, which assumes it rethrows the
exception caught by the nearest dominating EH pad. But after the
inlining, the nearest dominating EH pad is not `ehcleanup` but
`catch.start.i`.

The problem exists in the same manner in the new (exnref) EH, because it
assumes the exception comes from the nearest EH pad and saves an exnref
from that EH pad and rethrows it (using `throw_ref`).

This problem can be fixed easily if `cleanupret` has the basic block
where its matching `cleanuppad` is. The bitcode instruction `cleanupret`
kind of has that info (it has a token from the `cleanuppad`), but that
info is lost when when we enter ISel, because `TargetSelectionDAG.td`'s
`cleanupret` node does not have any arguments:

https://github.com/llvm/llvm-project/blob/5091a359d9807db8f7d62375696f93fc34226969/llvm/include/llvm/Target/TargetSelectionDAG.td#L700
Note that `catchret` already has two basic block arguments, even though
neither of them means `catchpad`'s BB.

This PR adds the `cleanuppad`'s BB as an argument to `cleanupret` node
in ISel and uses it in the Wasm backend. Because this node is also used
in X86 backend we need to note its argument there too but nothing more
needs to change there as long as X86 doesn't need it.

---

- Details about changes in the Wasm backend:

After this PR, our pseudo `RETHROW` instruction takes a BB, which means
the EH pad whose exception it needs to rethrow. There are currently two
ways to generate a `RETHROW`: one is from `llvm.wasm.rethrow` intrinsic
and the other is from `CLEANUPRET` we discussed above. In case of
`llvm.wasm.rethrow`, we add a '0' as a placeholder argument when it is
lowered to a `RETHROW`, and change it to a BB in LateEHPrepare. As
written in the comments, this PR doesn't change how this BB is computed.
The BB argument will be converted to an immediate argument as with other
control flow instructions in CFGStackify.

In case of `CLEANUPRET`, it already has a BB argument pointing to an EH
pad, so it is just converted to a `RETHROW` with the same BB argument in
LateEHPrepare. This will also be lowered to an immediate in CFGStackify
with other control flow instructions.

---

Fixes llvm#114600.
tru pushed a commit to aheejin/llvm-project that referenced this pull request Nov 19, 2024
So far we have assumed that we only rethrow the exception caught in the
innermost EH pad. This is true in code we directly generate, but after
inlining this may not be the case. For example, consider this code:
```ll
ehcleanup:
  %0 = cleanuppad ...
  call @Destructor
  cleanupret from %0 unwind label %catch.dispatch
```

If `destructor` gets inlined into this function, the code can be like
```ll
ehcleanup:
  %0 = cleanuppad ...
  invoke @throwing_func
    to label %unreachale unwind label %catch.dispatch.i

catch.dispatch.i:
  catchswitch ... [ label %catch.start.i ]

catch.start.i:
  %1 = catchpad ...
  invoke @some_function
    to label %invoke.cont.i unwind label %terminate.i

invoke.cont.i:
  catchret from %1 to label %destructor.exit

destructor.exit:
  cleanupret from %0 unwind label %catch.dispatch
```

We lower a `cleanupret` into `rethrow`, which assumes it rethrows the
exception caught by the nearest dominating EH pad. But after the
inlining, the nearest dominating EH pad is not `ehcleanup` but
`catch.start.i`.

The problem exists in the same manner in the new (exnref) EH, because it
assumes the exception comes from the nearest EH pad and saves an exnref
from that EH pad and rethrows it (using `throw_ref`).

This problem can be fixed easily if `cleanupret` has the basic block
where its matching `cleanuppad` is. The bitcode instruction `cleanupret`
kind of has that info (it has a token from the `cleanuppad`), but that
info is lost when when we enter ISel, because `TargetSelectionDAG.td`'s
`cleanupret` node does not have any arguments:

https://github.com/llvm/llvm-project/blob/5091a359d9807db8f7d62375696f93fc34226969/llvm/include/llvm/Target/TargetSelectionDAG.td#L700
Note that `catchret` already has two basic block arguments, even though
neither of them means `catchpad`'s BB.

This PR adds the `cleanuppad`'s BB as an argument to `cleanupret` node
in ISel and uses it in the Wasm backend. Because this node is also used
in X86 backend we need to note its argument there too but nothing more
needs to change there as long as X86 doesn't need it.

---

- Details about changes in the Wasm backend:

After this PR, our pseudo `RETHROW` instruction takes a BB, which means
the EH pad whose exception it needs to rethrow. There are currently two
ways to generate a `RETHROW`: one is from `llvm.wasm.rethrow` intrinsic
and the other is from `CLEANUPRET` we discussed above. In case of
`llvm.wasm.rethrow`, we add a '0' as a placeholder argument when it is
lowered to a `RETHROW`, and change it to a BB in LateEHPrepare. As
written in the comments, this PR doesn't change how this BB is computed.
The BB argument will be converted to an immediate argument as with other
control flow instructions in CFGStackify.

In case of `CLEANUPRET`, it already has a BB argument pointing to an EH
pad, so it is just converted to a `RETHROW` with the same BB argument in
LateEHPrepare. This will also be lowered to an immediate in CFGStackify
with other control flow instructions.

---

Fixes llvm#114600.
nikic pushed a commit to rust-lang/llvm-project that referenced this pull request Nov 20, 2024
So far we have assumed that we only rethrow the exception caught in the
innermost EH pad. This is true in code we directly generate, but after
inlining this may not be the case. For example, consider this code:
```ll
ehcleanup:
  %0 = cleanuppad ...
  call @Destructor
  cleanupret from %0 unwind label %catch.dispatch
```

If `destructor` gets inlined into this function, the code can be like
```ll
ehcleanup:
  %0 = cleanuppad ...
  invoke @throwing_func
    to label %unreachale unwind label %catch.dispatch.i

catch.dispatch.i:
  catchswitch ... [ label %catch.start.i ]

catch.start.i:
  %1 = catchpad ...
  invoke @some_function
    to label %invoke.cont.i unwind label %terminate.i

invoke.cont.i:
  catchret from %1 to label %destructor.exit

destructor.exit:
  cleanupret from %0 unwind label %catch.dispatch
```

We lower a `cleanupret` into `rethrow`, which assumes it rethrows the
exception caught by the nearest dominating EH pad. But after the
inlining, the nearest dominating EH pad is not `ehcleanup` but
`catch.start.i`.

The problem exists in the same manner in the new (exnref) EH, because it
assumes the exception comes from the nearest EH pad and saves an exnref
from that EH pad and rethrows it (using `throw_ref`).

This problem can be fixed easily if `cleanupret` has the basic block
where its matching `cleanuppad` is. The bitcode instruction `cleanupret`
kind of has that info (it has a token from the `cleanuppad`), but that
info is lost when when we enter ISel, because `TargetSelectionDAG.td`'s
`cleanupret` node does not have any arguments:

https://github.com/llvm/llvm-project/blob/5091a359d9807db8f7d62375696f93fc34226969/llvm/include/llvm/Target/TargetSelectionDAG.td#L700
Note that `catchret` already has two basic block arguments, even though
neither of them means `catchpad`'s BB.

This PR adds the `cleanuppad`'s BB as an argument to `cleanupret` node
in ISel and uses it in the Wasm backend. Because this node is also used
in X86 backend we need to note its argument there too but nothing more
needs to change there as long as X86 doesn't need it.

---

- Details about changes in the Wasm backend:

After this PR, our pseudo `RETHROW` instruction takes a BB, which means
the EH pad whose exception it needs to rethrow. There are currently two
ways to generate a `RETHROW`: one is from `llvm.wasm.rethrow` intrinsic
and the other is from `CLEANUPRET` we discussed above. In case of
`llvm.wasm.rethrow`, we add a '0' as a placeholder argument when it is
lowered to a `RETHROW`, and change it to a BB in LateEHPrepare. As
written in the comments, this PR doesn't change how this BB is computed.
The BB argument will be converted to an immediate argument as with other
control flow instructions in CFGStackify.

In case of `CLEANUPRET`, it already has a BB argument pointing to an EH
pad, so it is just converted to a `RETHROW` with the same BB argument in
LateEHPrepare. This will also be lowered to an immediate in CFGStackify
with other control flow instructions.

---

Fixes llvm#114600.
sunfishcode added a commit to WebAssembly/wasi-sdk that referenced this pull request Dec 11, 2024
Update to LLVM 19.1.5, and wasi-libc 574b88da. This notably picks up:
 - [WebAssembly] Fix rethrow's index calculation (llvm/llvm-project#114693)
 - [WebAssembly] Fix feature coalescing (llvm/llvm-project#110647)
sunfishcode added a commit to WebAssembly/wasi-sdk that referenced this pull request Dec 12, 2024
Update to LLVM 19.1.5, and wasi-libc 574b88da. This notably picks up:
 - [WebAssembly] Fix rethrow's index calculation (llvm/llvm-project#114693)
 - [WebAssembly] Fix feature coalescing (llvm/llvm-project#110647)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WASM rethrow instruction is generated with a wrong index
5 participants