Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust arguments to transformed UnsafeAPI calls when offheap is enabled #20333

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

midronij
Copy link
Contributor

@midronij midronij commented Oct 11, 2024

During recognizedCallTransformer and unsafeFastPath, Unsafe.getAndAdd() or Unsafe.getAndSet() can be transformed into an atomicFetchAndAdd/atomicSwap helper call. For the UnsafeAPI methods, the destination address is passed in as two separate values: a base object address and an offset. When they are transformed into atomic helper calls, these values are added together and passed in as a single destination address value. However, this does not account for the differing shape of array objects under offheap/balanced GC policy (as opposed to the default, gencon).

Thus, this contribution includes the following changes

  • For both recognizedCallTransformer and unsafeFastPath, if the object is an array, the dataAddr pointer is loaded and used as the base address.
  • For recognizedCallTransformer, an array check is added to the series of runtime tests performed before calling the atomic helper, to ensure that dataAddr is only used if the object is indeed an array.

@midronij
Copy link
Contributor Author

@zl-wang here is the Unsafe.getAndAdd()/getAndSet() fix! Marked as a WIP since I am still finalizing the unsafeFastPath changes (the last two commits)

@midronij midronij force-pushed the offheap-unsafe-getAndAdd branch 2 times, most recently from 318add6 to 88cb0dc Compare October 15, 2024 18:14
@ianguo
Copy link

ianguo commented Oct 23, 2024

During recognizedCallTransformer and unsafeFastPath, Unsafe.getAndAdd() or Unsafe.getAndSet() can be transformed into an atomicFetchAndAdd/atomicSwap helper call. For the UnsafeAPI methods, the destination address is passed in as two separate values: a base object address and an offset. When they are transformed into atomic helper calls, these values are added together and passed in as a single destination address value. However, this does not account for the differing shape of array objects under offheap/balanced GC policy (as opposed to the default, gencon).

Thus, this contribution includes the following changes

  • For both recognizedCallTransformer and unsafeFastPath, if the object is an array, the dataAddr pointer is loaded and used as the base address.
  • For recognizedCallTransformer, an array check is added to the series of runtime tests performed before calling the atomic helper, to ensure that dataAddr is only used if the object is indeed an array.

During the running period of our application, from the thread stack, it has been "staying" at AtomicReferenceArray#getAndSet. A phenomenon similar to livelock has occurred. As a result, the thread is blocked and the application is seriously unrecoverable. I am not sure if it is related to your problem? @midronij @zl-wang

thread stack:
"pool-EventCacheMessageHandler-thread-1" Id=21859 RUNNABLE
at sun.misc.Unsafe.getAndSetObject(Unsafe.java:1165)
at java.util.concurrent.atomic.AtomicReferenceArray.getAndSet(AtomicReferenceArray.java:164)
at com.fasterxml.jackson.core.util.BufferRecycler.allocCharBuffer(BufferRecycler.java:159)
at com.fasterxml.jackson.core.io.IOContext.allocTokenBuffer(IOContext.java:270)
at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1164)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3629)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3597)

java version:
openjdk version "1.8.0_422"
IBM Semeru Runtime Open Edition (build 1.8.0_422-b05)
Eclipse OpenJ9 VM (build openj9-0.46.0, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20240802_1004 (JIT enabled, AOT enabled)
OpenJ9 - 1a6f612
OMR - 840a9adba
JCL - a75ff73ce5 based on jdk8u422-b05)

@midronij midronij force-pushed the offheap-unsafe-getAndAdd branch 2 times, most recently from 091e81f to 1fe41b4 Compare November 12, 2024 19:01
@midronij midronij changed the title WIP: Adjust arguments to transformed UnsafeAPI calls when offheap is enabled Adjust arguments to transformed UnsafeAPI calls when offheap is enabled Nov 12, 2024
@midronij midronij force-pushed the offheap-unsafe-getAndAdd branch 2 times, most recently from 6b74410 to d46717a Compare November 12, 2024 20:13
Copy link
Contributor

@zl-wang zl-wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of work remaining, it looks to me.

runtime/compiler/optimizer/J9RecognizedCallTransformer.cpp Outdated Show resolved Hide resolved
runtime/compiler/optimizer/J9RecognizedCallTransformer.cpp Outdated Show resolved Hide resolved
returnBlock->getEntry())));

//load dataAddr
TR::Node *objBaseAddrNode = addrToAccessNode->getChild(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you seemed just trusting with the tree shape ... having three depth-3 getChild(0). i am not so sure about it. look at the original objectNode = unsafeCall->getChild(1);. your code already accessed things one-level deeper and it is not getChild(0) at all.

Copy link
Contributor Author

@midronij midronij Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe at this point, we're no longer working with the Unsafe.getAndAdd() call. It has already been replaced by the atomicFetchAndAdd helper, and we're just making a copy and modifying it to have the necessary offheap adjustments for the case where the object is an array. The helper tree will look like this:

n92n  treetop
n93n    icall  <atomicFetchAndAdd>
n94n      aladd
n95n        aload  <temp slot 5>
n96n        lload  <temp slot 3>
n97n      iload  <temp slot 4>

(taken from here)

with arrayAccessTreeTop->getNode() will pointing to n92n. So I think that we should be able to use addrToAccessNode = arrayAccessTreeTop->getNode()->getChild(0)->getChild(0) to get the aladd node, and from there, we can modify its children, namely the object base address (which is pointed at by addrToAccessNode->getChild(0))

Copy link
Contributor Author

@midronij midronij Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zl-wang for a closer look at how Unsafe.getAndAdd() is handled, I created a very simple test case that calls the following method:

public static int foo(java.lang.Object a, long offset, int val)
{
    return unsafe.getAndAddInt(a, offset, val);
}

Here is the full trace log for that method: trace_UnsafeGetAndAdd.log

Copy link
Contributor Author

@midronij midronij Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After taking a closer look at the code in processUnsafeAtomicCall(), I think we can safely assume that the atomic helper tree will always take this shape. The arguments passed in to the original Unsafe.getAndAdd() call are stored to temps here:

TR::TransformUtil::createTempsForCall(this, treetop);

Which results in these IL trees being generated just before the runtime tests are performed:

n35n      astore  <temp slot 5>[#407  Auto] [flags 0x7 0x0 ]                                  [    0x7c10af0050a0] bci=[-1,6,33] rc=0 vc=0 vn=- li=- udi=- nc=1
n3n         ==>aload
n37n      astore  <temp slot 6>[#408  Auto] [flags 0x7 0x0 ]                                  [    0x7c10af005140] bci=[-1,6,33] rc=0 vc=0 vn=- li=- udi=- nc=1
n4n         aload  <parm 0 Ljava/lang/Object;>[#398  Parm] [flags 0x40000107 0x0 ]            [    0x7c10af0046f0] bci=[-1,3,33] rc=1 vc=0 vn=- li=- udi=- nc=0
n39n      lstore  <temp slot 7>[#409  Auto] [flags 0x4 0x0 ]                                  [    0x7c10af0051e0] bci=[-1,6,33] rc=0 vc=0 vn=- li=- udi=- nc=1
n5n         lload  <parm 1 J>[#399  Parm] [flags 0x40000104 0x0 ]                             [    0x7c10af004740] bci=[-1,4,33] rc=1 vc=0 vn=- li=- udi=- nc=0
n41n      istore  <temp slot 8>[#410  Auto] [flags 0x3 0x0 ]                                  [    0x7c10af005280] bci=[-1,6,33] rc=0 vc=0 vn=- li=- udi=- nc=1
n6n         iload  <parm 3 I>[#400  Parm] [flags 0x40000103 0x0 ]                             [    0x7c10af004790] bci=[-1,5,33] rc=1 vc=0 vn=- li=- udi=- nc=0

Note that createTempsForCall() will replace the original children of the call node with loads of the temps that the arguments are stored to:

// Replace the old child with a load of the new sym ref
TR::Node *value = TR::Node::createLoad(callNode, newSymbolReference);

Then, when generating the atomic fetchAndAdd helper tree, the address that is passed in is constructed as an aladd/aiadd node that takes the sum of the first and second children of the call node (i.e.: the base address and the offset):

address = comp()->target().is32Bit() ? TR::Node::create(TR::aiadd, 2, objectNode->duplicateTree(), TR::Node::create(TR::l2i, 1, offsetNode->duplicateTree())) :
TR::Node::create(TR::aladd, 2, objectNode->duplicateTree(), offsetNode->duplicateTree());
}
// Transmute the unsafe call to a call to atomic method helper
unsafeCall->getChild(0)->recursivelyDecReferenceCount(); // replace unsafe object with the address
unsafeCall->setAndIncChild(0, address);

So I believe that no matter what kind of node/tree is given to the original call node, the atomic fetchAndAdd helper tree will always take the form detailed above, meaning that we can safely assume that arrayAccessTreeTop->getNode()->getChild(0)->getChild(0)->getChild(0) will give us the base array address, and load dataAddr from it.

if (isUnsafeCallerAccessingArrayElement(callerMethod) && TR::Compiler->om.isOffHeapAllocationEnabled() && comp()->target().is64Bit())
{
TR::Node *object = node->getChild(1);
TR::Node *offset = node->getChild(2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you assume the incoming call tree looks like? are they child(1) and child(2), instead of child(0) and child(1)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that for Unsafe.compareAndSwap() calls, the first child is the Unsafe object, not the base object address. For example:

n467n       icall  jdk/internal/misc/Unsafe.compareAndSetObject(Ljava/lang/Object;JLjava/lang/Object;Ljava/lang/Object;)Z
n458n         aload  java/lang/invoke/VarHandle._unsafe Ljdk/internal/misc/Unsafe;
n248n         aload  <temp slot 4>
n463n         ==>lconst 16
n249n         ==>aload
n250n         ==>aload

So the object would be node->getChild(1) and the offset would be node->getChild(2)

runtime/compiler/optimizer/UnsafeFastPath.cpp Show resolved Hide resolved
runtime/compiler/optimizer/UnsafeFastPath.cpp Outdated Show resolved Hide resolved
@@ -279,6 +280,26 @@ bool TR_UnsafeFastPath::tryTransformUnsafeAtomicCallInVarHandleAccessMethod(TR::
if (isVarHandleOperationMethodOnNonStaticField(callerMethod) &&
performTransformation(comp(), "%s transforming Unsafe.CAS [" POINTER_PRINTF_FORMAT "] into codegen inlineable\n", optDetailString(), node))
{
#if defined(J9VM_GC_ENABLE_SPARSE_HEAP_ALLOCATION)
if (isUnsafeCallerAccessingArrayElement(callerMethod) && TR::Compiler->om.isOffHeapAllocationEnabled() && comp()->target().is64Bit())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this appears to assume: at compile time, it is definitely known if it is array access or not?

Copy link
Contributor Author

@midronij midronij Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, this code will only ever be executed if the calling java method being compiled belongs to a very specific set of methods. The following check is performed before calling tryTransformUnsafeAtomicCallInVarHandleAccessMethod() (which is where the code you highlighted above is located):

if (isVarHandleOperationMethod(caller) &&
(isTransformableUnsafeAtomic(comp(), callee) ||
symbol->getMethod()->isUnsafeCAS()))

And isVarHandleOperationMethod(caller) will only return true if caller is one of the following:

java_lang_invoke_ArrayVarHandle_ArrayVarHandleOperations_OpMethod:
java_lang_invoke_StaticFieldVarHandle_StaticFieldVarHandleOperations_OpMethod:
java_lang_invoke_InstanceFieldVarHandle_InstanceFieldVarHandleOperations_OpMethod:
java_lang_invoke_ByteArrayViewVarHandle_ByteArrayViewVarHandleOperations_OpMethod:

(taken from TR_J9MethodBase::isVarHandleOperationMethod(), which is called by isVarHandleOperationMethod())

I believe the rationale here is that since we know that the calling method must be part of this very specific set, we can use what we know about how these methods are implemented to determine whether the object that gets passed in to Unsafe.compareAndSwap() is an array or not at compile time. Similar logic is used to perform other checks that would normally have to be done at runtime (such as a static field check).

As an example: since all ArrayVarHandle methods that call Unsafe.CAS() are performed on array objects, we can be certain even at compile time that the object being passed in is an array.

When Unsafe.getAndAdd()/getAndSet() is called on an array and offheap
changes are enabled, adjust arguments so that dataAddr is passed in as
the base address of the object.

Signed-off-by: midronij <[email protected]>
intrinsics during unsafeFastPath

During unsafeFastPath, some UnsafeAPI calls (such as Unsafe.getAndAdd())
are converted to atomic intrinsics (such as atomicFetchAndAdd). When
this occurs with offheap allocation enabled, and the object is an array,
load dataAddr.

Signed-off-by: midronij <[email protected]>
Copy link
Contributor

@zl-wang zl-wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now ...

@zl-wang
Copy link
Contributor

zl-wang commented Nov 25, 2024

Jenkins test sanity.functional all jdk8,jdk23

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants