-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prepare entity-level commit order computation in the UnitOfWork
#10651
Prepare entity-level commit order computation in the UnitOfWork
#10651
Conversation
UnitOfWork
e53681d
to
1b1e673
Compare
This is the second chunk to break doctrine#10547 into smaller PRs suitable for reviewing. It prepares the `UnitOfWork` to work with a commit order computed on the entity level, as opposed to a class-based ordering as before. #### Background doctrine#10531 and doctrine#10532 show that it is not always possible to run `UnitOfWork::commit()` with a commit order given in terms of entity _classes_. Instead it is necessary to process entities in an order computed on the _object_ level. That may include * a particular order for multiple entities of the _same_ class * a particular order for entities of _different_ classes, possibly even going back and forth (entity `$a1` of class `A`, then `$b` of class `B`, then `$a2` of class `A` – revisiting that class). This PR is about preparing the `UnitOfWork` so that its methods will be able to perform inserts and deletions on that level of granularity. Subsequent PRs will deal with implementing that particular order computation. #### Suggested change Change the private `executeInserts` and `executeDeletions` methods so that they do not take a `ClassMetadata` identifying the class of entities that shall be processed, but pass them the list of entities directly. The lists of entities are built in two dedicated methods. That happens basically as previously, by iterating over `$this->entityInsertions` or `$this->entityDeletions` and filtering those by class. #### Potential (BC breaking?) changes that need review scrutiny * `\Doctrine\ORM\Persisters\Entity\EntityPersister::addInsert()` was previously called multiple times, before the insert would be performed by a call to `\Doctrine\ORM\Persisters\Entity\EntityPersister::executeInserts()`. With the changes here, this batching effectively no longer takes place. `executeInserts()` will always see one entity/insert only, and there may be multiple `executeInserts()` calls during a single `UoW::commit()` phase. * The caching in `\Doctrine\ORM\Cache\Persister\Entity\AbstractEntityPersister` previously would cache entities from the last executed insert batch only. Now it will collect entities across multiple batches. I don't know if that poses a problem. * Overhead in `\Doctrine\ORM\Persisters\Entity\BasicEntityPersister::executeInserts` is incurred multiple times; that may, however, only be about SQL statement preparation and might be salvageable. * The `postPersist` event previously was not dispatched before _all_ instances of a particular entity class had been written to the database. Now, it will be dispatched immediately after every single entity that has been inserted.
1b1e673
to
b42cf99
Compare
@FabioBatSilva you were the last one to make relevant API changes to the persister classes in #808. Maybe you still remember those days and have any thoughts on this? |
Wow.. Can't believe that i'ts been almost 10y since that PR 😱 I'ts been a while since i looked into that code, |
@FabioBatSilva thanks a lot for your feedback. Yes, time flies... In particular, with regard to 2nd level caching, do you see any problem with calling |
Is This bit here might be problematic : orm/lib/Doctrine/ORM/Cache/Persister/Entity/AbstractEntityPersister.php Lines 317 to 318 in 1142a39
orm/lib/Doctrine/ORM/Cache/Persister/Entity/NonStrictReadWriteCachedEntityPersister.php Line 23 in 1142a39
IIRC it assumes the all inserts are present when |
I think I’ve got this covered with the changes to lib/Doctrine/ORM/Cache/Persister/Entity/AbstractEntityPersister.php in this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is entity-level-commit-order
going to be merged into the 3.0.x branch? If the impact of the changes isn't graspable, do we miss tests for the commit order?
@SenseException Thank your for taking the time to review this!
Reasons (in my opinion) not to aim at 3.0:
That being said, yes - there always remains a risk that changes break things for users, and I try hard to come up with creative scenarios to imagine how that might happen. But, on the other hand, we also need to make clear where users overstressed our promises given. One example is the semantics of the In these cases, we'll need to discuss/decide what the reasonable assumptions are that one might have made, and where users (unknowingly?) relied on Hm, did that answer your question 🤔 ? Regarding test coverage: The commit order is very central for everything that writes entities to the DB, so almost every functional test we have depends on it in one way or another. For the tricky edge cases that currently do not work right (reported by many users over the years), I have created test cases in various other PRs, but all of them are consolidated in #10547. |
Okay, I was afraid that this would introduce changes that would only fit into a major release. Sounds like it is not.
Yes, it did. Thank you. |
Thanks @mpdude ! |
This is the third step to break doctrine#10547 into smaller PRs suitable for reviewing. It uses the new topological sort implementation from doctrine#10592 and the refactoring from doctrine#10651 to compute the UoW's commit order for entity insertions not on the entity class level, but for single entities and their actual dependencies instead. #### Current situation `UnitOfWork::getCommitOrder()` would compute the entity sequence on the class level with the following code: https://github.com/doctrine/orm/blob/70477d81e96c0044ad6fd8c13c37b2270d082792/lib/Doctrine/ORM/UnitOfWork.php#L1310-L1325 #### Suggested change * Instead of considering the classes of all entities that need to be inserted, updated or deleted, consider the new (inserted) entities only. We only need to find a sequence in situations where there are foreign key relationships between two _new_ entities. * In the dependency graph, add edges for all to-one association target entities. * Make edges "optional" when the association is nullable. #### Test changes I have not tried to fully understand the few changes necessary to fix the tests. My guess is that those are edge cases where the insert order changed and we need to consider this during clean-up. Keep in mind that many of the functional tests we have assume that entities have IDs assigned in the order that they were added to the EntityManager. That does not change – so the order of entities is generally stable, equal to the previous implementation.
This is the third step to break doctrine#10547 into smaller PRs suitable for reviewing. It uses the new topological sort implementation from doctrine#10592 and the refactoring from doctrine#10651 to compute the UoW's commit order for entity insertions not on the entity class level, but for single entities and their actual dependencies instead. #### Current situation `UnitOfWork::getCommitOrder()` would compute the entity sequence on the class level with the following code: https://github.com/doctrine/orm/blob/70477d81e96c0044ad6fd8c13c37b2270d082792/lib/Doctrine/ORM/UnitOfWork.php#L1310-L1325 #### Suggested change * Instead of considering the classes of all entities that need to be inserted, updated or deleted, consider the new (inserted) entities only. We only need to find a sequence in situations where there are foreign key relationships between two _new_ entities. * In the dependency graph, add edges for all to-one association target entities. * Make edges "optional" when the association is nullable. #### Test changes I have not tried to fully understand the few changes necessary to fix the tests. My guess is that those are edge cases where the insert order changed and we need to consider this during clean-up. Keep in mind that many of the functional tests we have assume that entities have IDs assigned in the order that they were added to the EntityManager. That does not change – so the order of entities is generally stable, equal to the previous implementation.
This is the third step to break doctrine#10547 into smaller PRs suitable for reviewing. It uses the new topological sort implementation from doctrine#10592 and the refactoring from doctrine#10651 to compute the UoW's commit order for entity insertions not on the entity class level, but for single entities and their actual dependencies instead. #### Current situation `UnitOfWork::getCommitOrder()` would compute the entity sequence on the class level with the following code: https://github.com/doctrine/orm/blob/70477d81e96c0044ad6fd8c13c37b2270d082792/lib/Doctrine/ORM/UnitOfWork.php#L1310-L1325 #### Suggested change * Instead of considering the classes of all entities that need to be inserted, updated or deleted, consider the new (inserted) entities only. We only need to find a sequence in situations where there are foreign key relationships between two _new_ entities. * In the dependency graph, add edges for all to-one association target entities. * Make edges "optional" when the association is nullable. #### Test changes I have not tried to fully understand the few changes necessary to fix the tests. My guess is that those are edge cases where the insert order changed and we need to consider this during clean-up. Keep in mind that many of the functional tests we have assume that entities have IDs assigned in the order that they were added to the EntityManager. That does not change – so the order of entities is generally stable, equal to the previous implementation.
This is the third step to break doctrine#10547 into smaller PRs suitable for reviewing. It uses the new topological sort implementation from doctrine#10592 and the refactoring from doctrine#10651 to compute the UoW's commit order for entity insertions not on the entity class level, but for single entities and their actual dependencies instead. #### Current situation `UnitOfWork::getCommitOrder()` would compute the entity sequence on the class level with the following code: https://github.com/doctrine/orm/blob/70477d81e96c0044ad6fd8c13c37b2270d082792/lib/Doctrine/ORM/UnitOfWork.php#L1310-L1325 #### Suggested change * Instead of considering the classes of all entities that need to be inserted, updated or deleted, consider the new (inserted) entities only. We only need to find a sequence in situations where there are foreign key relationships between two _new_ entities. * In the dependency graph, add edges for all to-one association target entities. * Make edges "optional" when the association is nullable. #### Test changes I have not tried to fully understand the few changes necessary to fix the tests. My guess is that those are edge cases where the insert order changed and we need to consider this during clean-up. Keep in mind that many of the functional tests we have assume that entities have IDs assigned in the order that they were added to the EntityManager. That does not change – so the order of entities is generally stable, equal to the previous implementation.
This is the third step to break doctrine#10547 into smaller PRs suitable for reviewing. It uses the new topological sort implementation from doctrine#10592 and the refactoring from doctrine#10651 to compute the UoW's commit order for entity insertions not on the entity class level, but for single entities and their actual dependencies instead. #### Current situation `UnitOfWork::getCommitOrder()` would compute the entity sequence on the class level with the following code: https://github.com/doctrine/orm/blob/70477d81e96c0044ad6fd8c13c37b2270d082792/lib/Doctrine/ORM/UnitOfWork.php#L1310-L1325 #### Suggested change * Instead of considering the classes of all entities that need to be inserted, updated or deleted, consider the new (inserted) entities only. We only need to find a sequence in situations where there are foreign key relationships between two _new_ entities. * In the dependency graph, add edges for all to-one association target entities. * Make edges "optional" when the association is nullable. #### Test changes I have not tried to fully understand the few changes necessary to fix the tests. My guess is that those are edge cases where the insert order changed and we need to consider this during clean-up. Keep in mind that many of the functional tests we have assume that entities have IDs assigned in the order that they were added to the EntityManager. That does not change – so the order of entities is generally stable, equal to the previous implementation.
This is the second chunk to break #10547 into smaller PRs suitable for reviewing. It prepares the
UnitOfWork
to work with a commit order computed on the entity level, as opposed to a class-based ordering as before.Background
#10531 and #10532 show that it is not always possible to run
UnitOfWork::commit()
with a commit order given in terms of entity classes. Instead it is necessary to process entities in an order computed on the object level. That may include$a1
of classA
, then$b
of classB
, then$a2
of classA
– revisiting that class).This PR is about preparing the
UnitOfWork
so that its methods will be able to perform inserts and deletions on that level of granularity. Subsequent PRs will deal with implementing that particular order computation.Suggested change
Change the private
executeInserts
andexecuteDeletions
methods so that they do not take aClassMetadata
identifying the class of entities that shall be processed, but pass them the list of entities directly.The lists of entities are built in two dedicated methods. That happens basically as previously, by iterating over
$this->entityInsertions
or$this->entityDeletions
and filtering those by class.Potential (BC breaking?) changes that need review scrutiny
\Doctrine\ORM\Persisters\Entity\EntityPersister::addInsert()
was previously called multiple times, before the insert would be performed by a call to\Doctrine\ORM\Persisters\Entity\EntityPersister::executeInserts()
. With the changes here, this batching effectively no longer takes place.executeInserts()
will always see one entity/insert only, and there may be multipleexecuteInserts()
calls during a singleUoW::commit()
phase.\Doctrine\ORM\Cache\Persister\Entity\AbstractEntityPersister
previously would cache entities from the last executed insert batch only. Now it will collect entities across multiple batches. I don't know if that poses a problem.\Doctrine\ORM\Persisters\Entity\BasicEntityPersister::executeInserts
is incurred multiple times; that may, however, only be about SQL statement preparation and might be salvageable.postPersist
event previously was not dispatched before all instances of a particular entity class had been written to the database. Now, it will be dispatched immediately after every single entity that has been inserted.postPersist
event changes – meaningful?Documentation on the
postPersist
event is here: https://www.doctrine-project.org/projects/doctrine-orm/en/2.14/reference/events.html#postupdate-postremove-postpersist"It will be invoked after the database insert operations" is a bit imprecise. It previously was "after the database insert operations for that entity class, but before operations for other classes". With this PR, it would be "after the database insert operations for this entity instance".
"Generated primary key values are available in the postPersist event" should better be written as "a generated primary key value for the entity is available in the postPersist event".
Alternatively, we could defer
postPersist
events until all new entities have been inserted. Either way, it's a slight change from the previous implementation where the event would run after all entities of a particular class had been processed.Extra bonus
This is what the DALL·E AI thinks it looks like when an entity persister will have to process entities one-by-one and is no longer able to batch the inserts.