Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct Behavior of Persisted objects #935

Closed
xenoterracide opened this issue Mar 15, 2024 · 16 comments
Closed

Correct Behavior of Persisted objects #935

xenoterracide opened this issue Mar 15, 2024 · 16 comments

Comments

@xenoterracide
Copy link

xenoterracide commented Mar 15, 2024

Take 50,026. (using json5 for brevity)

Context

Preliminary context, presume the object may only be partially hydrated, so checking all fields is not viable.

Maybe this is better by answered what the behavior with sets should be in the case of a persisted set.

Things we know are true.

  • Objects may not have all of their properties loaded and we don't want to load them for this
  • 2 objects with the same id may not be equal if they are differing in type, so id comparison alone is not sufficient.
  • 2 objects of the same type, with the same, non null, id can not be equal.
    • Imagine a projection that uses a numeric based identifier, a "list of could in theory represent 2 different objects using the same projection. Projection might not be the only use case.
    [
      { id: 1, name: "foo", type: "Foo" },
      { id: 1, name: "bar", type: "Bar" }
    ]
  • An Entity must be the same instanceof another Entity to possibly be equivalent
  • An Entity must have the same Identity to be considered equivalent to another Entity
    • Identity can be a business key, a surrogate key or both.

Questions

Let's assume these are "order line items", although those aren't usually aggregates, but they are entities, we could also think of them as "orders". The Given array is being added to an empty set and the Then the result is.

Also, there's more than one way to do a lot of these, demo-ed includes auto incrementing generated integer vs uuid, and a nullable version like with a timestamp (bad? race condition...). There are many many ways to dirty check. I've excluded the possibility of checking to see if the proxy is initialized which can also be done, but maybe I should add that. Hibernate is not the only technology that would have these problems, so I'm trying to be technology agnostic.

  • Identity and version:

    They have the same Identity and are from the same persisted state, e.g. meaning our @Version optimistic lock hasn't been updated, or whatever else we're using for optimistic locking, hibernate for example also allows this to be a business key.

Identity is null on new

Given

[
  { id: null, name: "foo", version: null, eventsToBePublished: [] },
  { id: null, name: "foo", version: null, eventsToBePublished: [] },
  { id: null, name: "bar", version: null, eventsToBePublished: [] },
]
When: nothing should be checked
Then
[
  { id: null, name: "foo", version: null, eventsToBePublished: [] },
  { id: null, name: "foo", version: null, eventsToBePublished: [] },
  { id: null, name: "bar", version: null, eventsToBePublished: [] },
]
When: Properties should be checked
Then
[
  { id: null, name: "foo", version: null, eventsToBePublished: [] },
  { id: null, name: "bar", version: null, eventsToBePublished: [] },
]
When: Identity should be checked
Then
[
  { id: null, name: "foo", version: null, eventsToBePublished: [] },
]

Non Null Identity on new with nullable version

but version is null so it could be a timestamp. The problem is we have to determining it's new

Given

[
  { id: "UUID 1", name: "foo", version: null, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: 1, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: null, eventsToBePublished: [] },
  { id: "UUID 1" , name: "foo", version: null, eventsToBePublished: [{ name: "baz"}] },
  { id: "UUID 2", name: "bar", version: null, eventsToBePublished: [] },
]
When: Identity check
Then
[
  { id: "UUID 1", name: "foo", version: null, eventsToBePublished: [] },
  { id: "UUID 2", name: "bar", version: null, eventsToBePublished: [] },
]
When: Version, and Identity check
Then
[
  { id: "UUID 1", name: "foo", version: null, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: 1, eventsToBePublished: [] },
  { id: "UUID 2", name: "bar", version: null, eventsToBePublished: [] },
]
When: Dirty, Version, and Identity check
Then
[
  { id: "UUID 1", name: "foo", version: null, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: 1, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: null, eventsToBePublished: [{ name: "baz"}] },
  { id: "UUID 2", name: "bar", version: null, eventsToBePublished: [] },
]

Non Null Identity, and version

but version is null so it could be a timestamp. The problem is we have to determining it's new

Given

[
  { id: "UUID 1", name: "foo", version: 0, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: 1, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: 0, eventsToBePublished: [] },
  { id: "UUID 1" , name: "foo", version: 0, eventsToBePublished: [{ name: "baz"}] },
  { id: "UUID 2", name: "bar", version: 0, eventsToBePublished: [] },
]
When: Identity check
Then
[
  { id: "UUID 1", name: "foo", version: 0, eventsToBePublished: [] },
  { id: "UUID 2", name: "bar", version: 0, eventsToBePublished: [] },
]
When: Version, and Identity check
Then
[
  { id: "UUID 1", name: "foo", version: 0, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: 1, eventsToBePublished: [] },
  { id: "UUID 2", name: "bar", version: 0, eventsToBePublished: [] },
]
When: Dirty, Version, and Identity check
Then
[
  { id: "UUID 1", name: "foo", version: 0, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: 1, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: 0, eventsToBePublished: [{ name: "baz"}] },
  { id: "UUID 2", name: "bar", version: 0, eventsToBePublished: [] },
]

Persisted

If I have 3 Entities with the same Identity and add them to a set, 2 will never exist if and only if (I'm assuming first wins).

Given

[
  { id: 1, name: "foo", version: 1, eventsToBePublished: [] },
  { id: 1, name: "bar", version: 2, eventsToBePublished: [] },
  { id: 1, name: "bar", version: 2, eventsToBePublished: [{ name: "baz"}] },
]
When: Identity only check
Then
  [
    { id: 1, name: "foo" },
  ]
When: Version, Identity check
Then
[
  { id: 1, name: "foo", version: 1, eventsToBePublished: [] },
  { id: 1, name: "bar", version: 2, eventsToBePublished: [] },
]
When: Dirty, Version, Identity check
[
  { id: 1, name: "foo", version: 1, eventsToBePublished: [] },
  { id: 1, name: "bar", version: 2, eventsToBePublished: [] },
  { id: 1, name: "bar", version: 2, eventsToBePublished: [{ name: "baz"}] },
]

going to update more in a comment with what the actual behavior is, and then some conclusions.

@xenoterracide
Copy link
Author

ok resolution thought 1. equality should not be used to guess that 2 new entities are duplicates.

simply put an order line item with the exact same values may not be intentional on an order. Other business logic should be used than Java's Equality, including ways of debouncing and other API methods to prevent duplicate submission.

Identity is null on new

Given

[
  { id: null, name: "foo", version: null, eventsToBePublished: [] },
  { id: null, name: "foo", version: null, eventsToBePublished: [] },
  { id: null, name: "bar", version: null, eventsToBePublished: [] },
]
When: nothing should be checked
Then
[
  { id: null, name: "foo", version: null, eventsToBePublished: [] },
  { id: null, name: "foo", version: null, eventsToBePublished: [] },
  { id: null, name: "bar", version: null, eventsToBePublished: [] },
]

@xenoterracide
Copy link
Author

xenoterracide commented Mar 18, 2024

ok, couple of things. Set behavior, first in wins, meaning if you try to add an object with the same equality again, the set won't be updated with the new instance (at least Hibernates PersistedSet works this way). Also, if you try to save a PersistedSet with the same identity hibernate will throw an error. In DDD, I would say that neither thing is what I'd want.

org.springframework.dao.InvalidDataAccessApiUsageException: Multiple representations of the same entity [com.xenoterracide.jpa.BarEntity#018e53aa-fa7c-755b-8f66-2834f8df44f3] are being merged. Detached: [BarEntity@51eade15[018e53aa-fa7c-755b-8f66-2834f8df44f3,name='new0']]; Managed: [BarEntity@1c09921a[018e53aa-fa7c-755b-8f66-2834f8df44f3,name='new']]

Solution? well, the best situation for a developer is actually going to be the exception. Since attempting to add an existing entity to a set will silently fail. Based on that information I would suggest that all 3 things must be compared always. Meaning you need an identity, version, and some method of dirty tracking to be compared when adding. Note: Hibernate supports dirty tracking internally, but I think people should not rely on that, instead some kind of transient field. Although the aforementioned array can work, that shouldn't be required as generally only aggregates track events. So I would suggest that there be some kind of "dirty tracking fields" method. Although, dirty tracking could also be some kind of a non property method call.

Non Null Identity on new with nullable version

but version is null so it could be a timestamp. The problem is we have to determining it's new

Given

[
  { id: "UUID 1", name: "foo", version: null, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: 1, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: null, eventsToBePublished: [] },
  { id: "UUID 1" , name: "foo", version: null, eventsToBePublished: [{ name: "baz"}] },
  { id: "UUID 2", name: "bar", version: null, eventsToBePublished: [] },
]

Honestly, the best solution based on the fact that the last wins is to assume they are not equal

When: Dirty, Version, and Identity check
Then
[
  { id: "UUID 1", name: "foo", version: null, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: 1, eventsToBePublished: [] },
  { id: "UUID 1", name: "foo", version: null, eventsToBePublished: [{ name: "baz"}] },
  { id: "UUID 2", name: "bar", version: null, eventsToBePublished: [] },
]

@xenoterracide
Copy link
Author

Persisted

If I have 3 Entities with the same Identity and add them to a set, 2 will never exist if and only if (I'm assuming first wins).

Given

[
  { id: 1, name: "foo", version: 1, eventsToBePublished: [] },
  { id: 1, name: "bar", version: 2, eventsToBePublished: [] },
  { id: 1, name: "bar", version: 2, eventsToBePublished: [{ name: "baz"}] },
]
When: Dirty, Version, Identity check
[
  { id: 1, name: "foo", version: 1, eventsToBePublished: [] },
  { id: 1, name: "bar", version: 2, eventsToBePublished: [] },
  { id: 1, name: "bar", version: 2, eventsToBePublished: [{ name: "baz"}] },
]

@xenoterracide
Copy link
Author

@jqno I think I'm done here, thoughts?

@jqno
Copy link
Owner

jqno commented Mar 20, 2024

@jqno I think I'm done here, thoughts?

My first thought, to be perfectly honest, is that this quite a lot of text and I'm not sure where to find the time to read and fully understand it all 😅. Let alone to figure out what it actually is that you need here, and whether or not EqualsVerifier already provides in this need. Please keep in mind, I do have a day job that isn't EqualsVerifier, and also a family, so the time I get to spend on EqualsVerifier really is limited and I prefer spending it in the code, rather than the issue tracker, as I'm sure you'll understand...

But let me attempt to summarize; please correct me if I'm wrong. You've done quite a lot of work compiling a list of scenarios. Each scenario starts with a set of objects, and ends with the objects that you end up with if you were to put the given objects in a set, given some implementation of equals.

What, specifically, are you requesting though? Are you asking me to help find out what that equals method should look like, in order to fulfill these scenarios? Or do you already have one in mind, and are trying to check it with EqualsVerifier? If the latter, is it working, or are you missing something?

We've already discussed what should happen when the id field is null in #932, and also briefly touched upon Hibernate's @Version annotation and the concept of dirty tracking, in both #932 and #934. You have a section named Questions which, ironically, doesn't contain any questions. So I'm not really sure what you're asking.

Have you taken a look at suppressing Warning.IDENTICAL_COPY_FOR_VERSIONED_ENTITIY? It allows you to have a different flow through equals depending on whether the id is null. You can read more about it here.

@xenoterracide
Copy link
Author

xenoterracide commented Mar 20, 2024

Please keep in mind, I do have a day job that isn't EqualsVerifier, and also a family, so the time I get to spend on EqualsVerifier really is limited and I prefer spending it in the code, rather than the issue tracker, as I'm sure you'll understand...

yes, I do, and yes it is. The conclusions are in my comments... that will trim it down a bit.

What, specifically, are you requesting though? Are you asking me to help find out what that equals method should look like, in order to fulfill these scenarios? Or do you already have one in mind, and are trying to check it with EqualsVerifier? If the latter, is it working, or are you missing something?

Well, this is kind of me trying to figure out "what" equals should mean for an Entity or Aggregate per Domain Driven Design and based on how JPA and Set's actually work in practice for people to avoid mistakes. I'd love feedback on my assumptions/conclusions since I'm a big dumb dumb. This can then be used to drive other issues. Basically this is a record of what and why to not noise up multiple other issues.

Questions which, ironically, doesn't contain any questions.

hah, see what makes sense to my brain. Each of those scenarios ("given") could only have ONE of the "when/then" combinations be true, but I gave multiple "when/then" possibilities. Essentially the ticket is obnoxiously all the scenario combinations that I looked at to see what was a good idea. The comments are the conclusions I came to.

Have you taken a look at suppressing Warning.IDENTICAL_COPY_FOR_VERSIONED_ENTITIY? It allows you to have a different flow through equals depending on whether the id is null. You can read more about it here.

well, other ticket, it doesn't seem like I can override anything but that's not really supposed to be the focus of this ... pseudo issue. Which is "what should the behavior of a persisted entity be" If we have the answer to that (which I think I do) it makes it easier to see how equalsverifier should behave.

note: I ran through my experiments in this method, it more or less documents how I expect this to work https://github.com/xenoterracide/spring-app-commons/blob/76a672e68d1fc2cc246bab18f54ccc00bd9daa0a/module/jpa/src/test/java/com/xenoterracide/jpa/JpaAggregateTest.java#L43-L119

@jqno
Copy link
Owner

jqno commented Mar 20, 2024

OK, I see where you're coming from now. Still, this is a lot of context to load into my mind, especially since my mind is used to thinking in terms of EqualsVerifier itself, not so much in terms of sets -- it adds a lot of additional bandwidth for me.

So my question to you is, given all this, do you think everything is covered by the other tickets, or are we still missing a behavior somewhere?

@xenoterracide
Copy link
Author

xenoterracide commented Mar 20, 2024

if you want Entity/Aggregates to have a good comparison out of the box I think there's more needed than what's encapsulated out of the box. TBH, it's also the only real use case I ever have for implementing equals...

Really it's 3 things id, version, and dirty. realistically version and dirty are just a replacement for checking all the properties. We don't have a ticket for dirty at all, and there's no standard for that as it has to be handled by a transient property. This of course is all complexity based on whether you're using a Set, it "goes away" if the in memory collection allows "duplicates".

#934 is my first priority, as right now I've disabled equalsverifier due to it.

#932 isn't a priority for me personally (at this time) because I'm generating a UUID client side, thus it can be treated as never null, and for me it's fine then if null == null. However, that's a problem for anyone using JPA's id generation feature (or pick your object mapper where the id comes from the database, long term performance dragons if your id doesn't come from your database). I think of this ticket as id only

Do you want tickets for @Version and dirty (complicated maybe) separately, or the same? I feel that's a you managing thing. I just went through the work of testing and consequences. I think it might be worth rethinking some design for a new api; Less magic, more explicit.

@jqno
Copy link
Owner

jqno commented Mar 20, 2024

Let's not create more issues until we really need them 😅

I could add some behaviour for @Version, but given that there doesn't seem to be a standard on handling dirty states, I don't want to add any 'magic' behaviour that makes sense for your use case but might leave other users confused. Once #934 is fixed, it should be possible to do withOnlyTheseFields on your entity again, which should at least make your use case possible.

I agree about the API redesign, I'm thinking about it for a possible new major release, but I have some other things I want to do as well, so that might take a while.

@xenoterracide
Copy link
Author

xenoterracide commented Mar 20, 2024

well... "magic" I'm thinking of being less magical, but more explicit about the goal, problem.

EqualsVerifier.forLazyLoading(MyAggregate.class)
.dbGeneratedId("id")      // means it's null by default and thus null != null
.persistenceChangeFields("version") // optimistic locking, but this can be a so called natural key, see that document I linked, you can just read everything
.dirtyFields("dirty") // could be a boolean, array whatever
.dirtyMethods("isInitialized") // just going to be honest, you could call a method that checks the proxy, I did that in my test.
.verify();

I'm thinking given how many databases out there might support lazy loading proxies... it's also worth saying that with JPA it's a perfectly valid pattern to generate an id client side like I'm doing, and have your entire aggregate eager loaded and hydrated. That should receive no special treatment in equals.

Obviously, I think this is a bit more sketch... like something will bite this person but they're saying they're good... kind of the same as suppressing an optimistic lock and dirty check warning.

EqualsVerifier.forLazyLoading(MyAggregate.class)
.naturalKey("name") // or never null id
.noOptimisticLock()
.noDirtyChecking()
.verify();

@xenoterracide
Copy link
Author

In any sense... I hope this ticket provides you with enough information to inform future API decisions and documentation around dealing with the lazy loading persistence problem. I do appreciate your time and work btw ;) .

@jqno
Copy link
Owner

jqno commented Mar 22, 2024

Thanks! And thanks for sitting down and actually doing the research, too.

I just released version 3.16, which should fix most of the problems that caused this issue. As I mentioned in another ticket, improving the API is on my roadmap, but I don't know when I'll get to it. I'll bookmark this issue in my notes for when I do get to it. In the mean time, I'll close it, since the ticket isn't actionable since the new release.

@xenoterracide
Copy link
Author

Honest opinion, although I think I already said it. When you redo the API/next major release. I think you should remove special JPA handling. I suspect it creates more work for you and is actually a large portion of why I've created so many issues. That doesn't mean there doesn't need to be a surrogate key, and documentation around dealing with surrogate keys and lazy loading (I think this issue could be used to improve a how to on that). Simply that magically dealing with a "single database" framework is probably a bad idea and is ... error prone.

Essentially at this point I'd rather see documented advice on how to implement, and what considerations, and how to make equalsverifier pass said implementation.

@jqno
Copy link
Owner

jqno commented Mar 24, 2024

Maybe you're right, and I should never have implemented it in the first place. But that's moot, it's here, and I can't just take it away anymore, people are using this.

@xenoterracide
Copy link
Author

xenoterracide commented Mar 24, 2024

Fair enough, although one could provide an alternate way of doing the same thing. TBF, I thought it was a good idea at the time, too. It may have even been mine in the first place (I think it was :'( ). Just trying to make your life easier (long term), although I may have had other reasons I haven't encountered/don't remember yet.

I do appreciate all your hard work btw, You have already caught a bug of mine (someone forgot to update hashcode ;) )

@jqno
Copy link
Owner

jqno commented Mar 24, 2024

I did a quick search, the first version that added some specific behavior for classes with the @entity annotation was version 1.0 in 2011. Your first ticket is from 2015, so you're clear 😉

Happy to hear EqualsVerifier is helpful to you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants