-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attachment implementation to return value from visitor methods #21
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick initial feedback.
} | ||
|
||
@VisibleForTesting(otherwise = PRIVATE) | ||
final Deque<Attachment<?>> state = new ArrayDeque<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove the Deque until we prove that it's actually needed. The design doc doesn't mention it, and it wasn't needed in the POC (which covered a good bit of the query API)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I overthought. The POC implementation is good enough and it covers everything already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expressing a few concerns without having reviewed the whole PR.
*/ | ||
public interface AttachmentKey<T> { | ||
AttachmentKey<String> COLUMN_NAME = new AttachmentKey<>() {}; | ||
AttachmentKey<String> COLLECTION_NAME = new AttachmentKey<>() {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toString
of such objects does not seem to be descriptive, yet it is used in Attachment.expect
(
throw new IllegalArgumentException(format("No value attached for attachment key: %s", attachmentKey))
). Let's improve that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Ideally the output not only includes key name, but also the type info.
Given the keys will be added incrementally, it would be great to make the addition as easy as possible (without manual labour overhead).
I come up with the following idea to output both generic type info and names using Java reflection:
default String render() {
return Arrays.stream(AttachmentKey.class.getDeclaredFields())
.filter(field -> {
try {
return field.get(null) == this;
} catch (IllegalAccessException e) {
throw new RuntimeException(e);
}
})
.findFirst()
.map(field -> format("%s: %s", field.getGenericType(), field.getName()))
.orElse(toString());
}
so for the following new key constant:
AttachmentKey<List<Map<String, Object>>> NO_NAME = new AttachmentKey<>() {};
the above method will output the following info:
AttachmentKey<java.util.List<java.util.Map<java.lang.String, java.lang.Object>>>: NO_NAME
Not sure whether it is over-engineering or not.
In normal visitor usage scenario, usually type matching is enough. It is rare to return multiple different result from the same visitxxx(..) method (by conditional branching) for if that is the case, a better design is to split it into multiple different visitor methods instead.
* @see AttachmentKey | ||
*/ | ||
@SuppressWarnings("unchecked") | ||
public final class Attachment { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I think, the "attachment" term is a misnomer here. It came from
RetryState.attachment
, but here the idea is very different, despite the implementation being similar.Attachment
exists to allow yielding a value from avoid
method in a situation when we can't make it non-void
. The fact that this is implemented by writing that value into the instance field ofAttachment
(which one may see as "attaching") is not of importance, certainly not on importance to the extent that it dictates the naming. - This mechanism, allowing us to yield values from
void
methods, is not coupled with the SQL AST walker, so I propose us to remove the story about the walker, which is essentially the whole class-level documentation ofAttachment
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I read the simulated stuff in Java driver (mostly written by you), I agree Attachment
is inappropriate here. It was used for storing retry state related common states. In our usage, we have different requirements, including:
- we store one single value for the visitor method returning void in question; attachment is barely feasible in the sense that we did store value somewhere
- A more subtle requirement to restore previous state after finishing the current visitor method to accommodate tree depth-first traversal, which is the salient feature and we should make it explicit in doc; it is based on a clever trick that we spread ancestor states within the various method calling stack trace corresponding to the visitor methods of its various ancestors.
import org.jspecify.annotations.Nullable; | ||
|
||
/** | ||
* Manages global state while translating SQL AST via {@link org.hibernate.sql.ast.SqlAstWalker}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Manages global state" - I don't think that is correct. This utility provides a way to yield values from void
methods. It manages only its own object-local state, just like any other stateful object, and this does not require any special mentions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I agree with this, given the existence of fields like AttachmentKey#COLUMN_NAME
. We could make it true though by moving those static AttachmentKey
instances into a separate class that is more closely tied to the AST walker.
*/ | ||
|
||
@NullMarked | ||
package com.mongodb.hibernate.translate.attachment; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this package part of our API? If not, then let's make it internal.
The same question is applicable to com.mongodb.hibernate.translate
.
* @param <T> the expected Java generic type | ||
*/ | ||
public <T> T expect(AttachmentKey<T> attachmentKey, Runnable attacher) { | ||
var previousAttachmentKey = this.attachmentKey; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it true that previousAttachmentKey
is always null
, provided that Attachment.expect
is not called concurrently with itself (note that a such a thing may happen even within the same thread of execution)?
I suspect, that all we need is to assert here that this.attachmentKey
is null
, and we don't need previousAttachmentKey
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, previouslyAttachmentKey could be anything. We use it to cover such common AST visiting pattern: one visitor method calls another one, which calls another one, ..... AST tree depth could be ad-hoc, sub query is a good example.
That is why I originally used Stack
to record the tree traversal (post-order for we attach value bottom-up), but later on I realize Jeff's implementation has covered the case of backtracking by reverting back previous states.
So suppose in a deeply nested visitor method calling, all the "previous" state will be stored in their corresponding statck trace local variables. This seems an elegant solution (though it takes some getting used to and deep understanding, which could confuse new coder invariably).
* {@code null} | ||
* @param <T> the expected Java generic type | ||
*/ | ||
public <T> T expect(AttachmentKey<T> attachmentKey, Runnable attacher) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that neither the method name nor the parameter names here express what they are. This method executes a Runnable
action, and allows that action to yield a value, which is returned by this method. None of this is expressed by the expect
name of the method, and attacher
does not express that it's the action we are executing.
Following is an example of an alternative naming we may consider:
SideChannel {
<T> T execute(ValueDescriptor<T> valueDescriptor, Runnable actionYieldingValue);
void yield(ValueDescriptor<T> valueDescriptor, T value)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the naming was by Jeff in POC project. It is challenging to come up with perfect naming, tbh. In tech doc you mentioned ReturnType
in lieu of AttachmentKey
.
@jyemin , I think his naming seems a little bit better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have zero attachment (pun intended) to the names used in the POC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me that this interface was introduced only to be mentioned in the story written at the class level of Attachment
. If we end up deleting that story as suggested in https://github.com/mongodb/mongo-hibernate/pull/21/files#r1870405938, we should also remove this interface from this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it could and should be deleted.
} | ||
|
||
/** | ||
* Detaches the attached value by invoking some visitor method which returns {@code void} but internally the method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "detaches" part (this.attachedValue = null
) is conceptually an implementation detail, it cannot in principle be part of the method contract because it is not an observable behavior, provided that the expect
methods is not called concurrently with itself. Thus, it should not be mentioned or even alluded to in the documentation.
} | ||
|
||
@VisibleForTesting(otherwise = VisibleForTesting.AccessModifier.PRIVATE) | ||
boolean isBlank() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of this method, even for tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A key feature of Attachment is to support embedded attaching (there is a unit testing case to demonstrate that). I used this method to verify that after usage, its initial blank state should be reverted back. That covers the case that some dangling data lingers, e.g.:
Runnalbe
throws an exception and ourAttachment
didn't restore previous state;- deeply nested attachting didn't work as expected due to defective "previous state restoration"
tbh, all these Attachment complexity is really difficult to understand and use. Without a perfect unit testing coverage, it is so error-prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh, all these Attachment complexity is really difficult to understand and use.
Do we have a better alternative though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have an obvious choice here. A common issue with common solution. E.g. see https://stackoverflow.com/questions/1711797/how-to-store-state-on-a-node-for-a-visitor-pattern
It is also explained in detail in one book I read about ANTLR library (https://pragprog.com/titles/tpantlr2/the-definitive-antlr-4-reference/). Ultimately the reason Hibernate uses such visitor pattern is simply due to ANTLR library (which generates visitor API as the first vistior API in Hibernate; Hibernate simulates it to accommodate its SQL translation step. Actually Hibernate uses such visitor walker more than once, including:
- SQM AST -> SQL AST (e.g. https://github.com/hibernate/hibernate-orm/blob/main/hibernate-core/src/main/java/org/hibernate/query/sqm/sql/BaseSqmToSqlAstConverter.java). Return type is used for the various visitor methods for Hibernate has internal SQL AST model. Below is the method to convert
SqmUpdateStatement
toSqmUpdateStatement
:
- SQL AST -> SQL: given the target is flat SQL string, so it is natural Hibernate opts for void visitor methods, which is not enough for us for we have intermediate mongo AST
I explored whether we could make the above second item to return type as in SQM AST -> SQL AST case, but it has long-term maintainability issue so we'd better adhere to Hibernate's existing walker API without customization.
The most natural way to solve the issue is as follows:
- maintain a global map from SQL AST tree node reference to its corresponding visitor method's returned value; ANTLR book even provides the detail to use
IdentityHashMap
instead ofHashMap
for opt. - when we visit an AST node in its visit method, we adopt post-order traversal by visiting its children nodes first, then we look up in the global map to get their values, then we finish computing the returned value for the current node and store it into global map as well.
There is still complexity like ensuring type safety but I want to say it enjoys the biggest benefit: easy understanding and easy maintaining. Suppose I am community contributor, I would feel easy to get the hang of the subtlety without firstly scratching my hairs to figure out the complicated Attachment
mechanism (including its subtle old state restoration detail, which relies on stacktrace local variables to play hand in hand with method calling recursion or depth-first tree travseral).
return (T) this.attachedValue; | ||
} finally { | ||
// revert back old state | ||
this.attachmentKey = previousAttachmentKey; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand, this is equivalent to this.attachmentKey = null
, and is needed only to allow us to use an instance of Attachment
for executing more than one Runnable
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as explained previously, it is used for visiting AST tree with ad-hoc depth. We need to restore the previous ancestor node state for we only have one single flat field to store the 'current state'.
} | ||
if (this.attachedValue != null) { | ||
throw new IllegalStateException(format( | ||
"Current attachment with key [%s] has attached value [%s] already when another new value is provided [%s]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Attachment
is not part of API, which I suspect is true, all the checks in Attachment
are assertions. So if this is not part of API, let's make them assertions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is definitively not part of API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking more about this API, I came up with the following proposal: YieldingRunnable
(the full commit is stIncMale@0bdc3ac).
The usage looks like this:
String result = YieldingRunnable.run(ValueDescriptors.COLLECTION_NAME, yieldableValue -> {
voidMethod();
yieldableValue.yield(ValueDescriptors.COLLECTION_NAME, "my string");
});
See tests for the description of the features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a look. Seems the ValueDescriptors
is pretty good. But the yieldableValue
might not be very useful for the yielding is supposed to be happening within the voidMethod()
or visitor method, that is the key challenge here. We have to yield value without returning from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could grab the ValueDescriptor
from the storage per se within the void visitor method, but that is not the design. The design here is to provide such AttachmentKey
when some value is attached for proactive key validation.
https://jira.mongodb.org/browse/HIBERNATE-32
try to simulate the similar stuff in POC project.