-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a safeguard against multiple objects competing for the same identity map entry #10785
Add a safeguard against multiple objects competing for the same identity map entry #10785
Conversation
c4e2e84
to
1221add
Compare
@@ -56,7 +56,7 @@ public function testIssueProxyClear(): void | |||
|
|||
$user2 = $this->_em->getReference(DDC1238User::class, $userId); | |||
|
|||
$user->__load(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@beberlei Do you still remember what you intended to do here? Why should it be possible or allowed to load an entity through a reference when the entity manager has been clear()
ed in the meantime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this tests that the proxy generation has a shortcut for getId
methods that returns the value passed to getReference(.., $id);
I am not sure this needs the calls to __load anymore, but really don't know anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it should be possible to initialize a proxy after the EM it was obtained from has been cleared?
I don’t think so, the EM has forgotten about that proxy instance.
72bba01
to
8cd6a40
Compare
just a quick note: I tried running this PR against some of my code, and I can hit the exception by roughly doing the following:
If I skip step 4 (specifically, the I know this is not really a great analysis, but maybe it'll give someone an idea what goes wrong here. I find this proxy stuff really hard to debug, but it looks like it's causing this issue |
@flack thank you for the observation! Regarding 2) – what kind of association does B have to A, and does it cascade-persist A as well? |
@mpdude it looks like this in
|
…ing merge() This PR tries to improve the situation/problem explained in doctrine#3037: Under certain conditions – there may be multiple and not all are known/well-understood – we may get inconsistencies between the `\Doctrine\ORM\UnitOfWork::$entityIdentifiers` and `\Doctrine\ORM\UnitOfWork::$identityMap` arrays. Since the `::$identityMap` is a plain array holding object references, objects contained in it cannot be garbage-collected. `::$entityIdentifiers`, however, is indexed by `spl_object_id` values. When those objects are destructed and/or garbage-collected, the OID may be reused and reassigned to other objects later on. When the OID re-assignment happens to be for another entity, the UoW may assume incorrect entity states and, for example, miss INSERT or UPDATE operations. One cause for such inconsistencies is _replacing_ identity map entries with other object instances: This makes it possible that the old object becomes GC'd, while its OID is not cleaned up. Since that is not a use case we need to support (IMHO), doctrine#10785 is about adding a safeguard against it. In this test shown here, the `merge()` operation is currently too eager in creating a proxy object for another referred-to entity. This proxy represents an entity already present in the identity map at that time, potentially leading to this problem later on.
…ing merge() This PR tries to improve the situation/problem explained in doctrine#3037: Under certain conditions – there may be multiple and not all are known/well-understood – we may get inconsistencies between the `\Doctrine\ORM\UnitOfWork::$entityIdentifiers` and `\Doctrine\ORM\UnitOfWork::$identityMap` arrays. Since the `::$identityMap` is a plain array holding object references, objects contained in it cannot be garbage-collected. `::$entityIdentifiers`, however, is indexed by `spl_object_id` values. When those objects are destructed and/or garbage-collected, the OID may be reused and reassigned to other objects later on. When the OID re-assignment happens to be for another entity, the UoW may assume incorrect entity states and, for example, miss INSERT or UPDATE operations. One cause for such inconsistencies is _replacing_ identity map entries with other object instances: This makes it possible that the old object becomes GC'd, while its OID is not cleaned up. Since that is not a use case we need to support (IMHO), doctrine#10785 is about adding a safeguard against it. In this test shown here, the `merge()` operation is currently too eager in creating a proxy object for another referred-to entity. This proxy represents an entity already present in the identity map at that time, potentially leading to this problem later on.
…ing merge() This PR tries to improve the situation/problem explained in doctrine#3037: Under certain conditions – there may be multiple and not all are known/well-understood – we may get inconsistencies between the `\Doctrine\ORM\UnitOfWork::$entityIdentifiers` and `\Doctrine\ORM\UnitOfWork::$identityMap` arrays. Since the `::$identityMap` is a plain array holding object references, objects contained in it cannot be garbage-collected. `::$entityIdentifiers`, however, is indexed by `spl_object_id` values. When those objects are destructed and/or garbage-collected, the OID may be reused and reassigned to other objects later on. When the OID re-assignment happens to be for another entity, the UoW may assume incorrect entity states and, for example, miss INSERT or UPDATE operations. One cause for such inconsistencies is _replacing_ identity map entries with other object instances: This makes it possible that the old object becomes GC'd, while its OID is not cleaned up. Since that is not a use case we need to support (IMHO), doctrine#10785 is about adding a safeguard against it. In this test shown here, the `merge()` operation is currently too eager in creating a proxy object for another referred-to entity. This proxy represents an entity already present in the identity map at that time, potentially leading to this problem later on.
c453b74
to
9f40d74
Compare
9f40d74
to
36240b6
Compare
2b4d1a8
to
0d56f09
Compare
0d56f09
to
0c8cb84
Compare
Haven’t run the code, but my guess is: The EM now notices the ID collision already when you pass the second entity to @derrabus Maybe it would make sense to use something more specific than |
This is indeed a behavior change that might be unexpected for existing apps that trusted that the database would reject a colliding ID. We might need to add a feature flag for that. 😕 |
But that means with the flag disabled, we risk the corruption again, right? Because we accept to have two different object instances representing the same thing. |
The database still will reject a colliding ID, it's just that now in most cases, you don't even get that far, because the ORM layer rejects it first. So now your code would need to catch both exceptions. Ideally, this duplication of functionality should be avoided, and as the DB backend is kind of the final source of truth the check should ideally be there, but then, the identyMap logic would need to be different to still be able to avoid corruption |
This is a temporary workaround for the changes made in Doctrine ORM 2.16, discussed in doctrine/orm#10785 When that discussion is resolved, we should remove the check for `RuntimeException`, because it might be thrown by other errors that have nothing to do with duplicate IDs, which would lead to our `PaymentOverrideException` hinting at the wrong thing. Also remove the PHPStan rule that had an exception for this check. When the Exception comes back, the method should have an annotation that it's thrown. Otherwise we'll have to put the PHPStan rule back or try out phpstan-doctrine where this has been solved: phpstan/phpstan-doctrine#315
That was my point, to change |
Yes, but apps could live with this kind of corruption previously. If the flush resulted in a I believe, we should do both:
|
Feature flag – that needs to go into a 2.17 release? |
No, let's treat this as a bugfix. |
Out of curiosity: Is there a design document somewhere that outlines the rationale for the entity graph/identity map? Because if the face of concurrent requests it seems to me that the graph should always be treated as a slightly out-of-date snapshot of the actual DB. |
In doctrine#10785, a check was added that prevents entity instances from getting into the identity map when another object for the same ID is already being tracked. This caused regressions for users that work with application-provided IDs and expect this condition to fail with `UniqueConstraintViolationExceptions` when flushing to the database. Thus, this PR turns the exception into a deprecation notice. Users can opt-in to the new behavior. In 3.0, the exception will be used. Implements doctrine#10871.
In doctrine#10785, a check was added that prevents entity instances from getting into the identity map when another object for the same ID is already being tracked. This caused regressions for users that work with application-provided IDs and expect this condition to fail with `UniqueConstraintViolationExceptions` when flushing to the database. Thus, this PR turns the exception into a deprecation notice. Users can opt-in to the new behavior. In 3.0, the exception will be used. Implements doctrine#10871.
…0878) In #10785, a check was added that prevents entity instances from getting into the identity map when another object for the same ID is already being tracked. This caused regressions for users that work with application-provided IDs and expect this condition to fail with `UniqueConstraintViolationExceptions` when flushing to the database. Thus, this PR turns the exception into a deprecation notice. Users can opt-in to the new behavior. In 3.0, the exception will be used. Implements #10871.
This adds a dedicated exception for the case that objects with colliding identities are to be put into the identity map. Implements doctrine#10872.
This adds a dedicated exception for the case that objects with colliding identities are to be put into the identity map. Implements doctrine#10872.
…ies loaded through fetch=EAGER + using inheritance doctrine#10880 reports a case where the changes from doctrine#10785 cause entity updates to be missed. Upon closer inspection, this change seems to be causing it: https://github.com/doctrine/orm/pull/10785/files#diff-55a900494fc8033ab498c53929716caf0aa39d6bdd7058e7d256787a24412ee4L2990-L3003 The code was changed to use `registerManaged()` instead, which basically does the same things, but (since doctrine#10785) also includes an additional check against duplicate entity instances. But, one detail slipped through tests and reviews: `registerManaged()` also updates `\Doctrine\ORM\UnitOfWork::$originalEntityData`, which is used to compute entity changesets. An empty array `[]` was passed for $data here. This will make the changeset computation assume that a partial object was loaded and effectively ignore all field updates here: https://github.com/doctrine/orm/blob/a616914887ea160db4158d2c67752e99624f7c8a/lib/Doctrine/ORM/UnitOfWork.php#L762-L764 I think that, effectively, it is sufficient to call `registerManaged()` only in the two cases where a proxy was created. Calling `registerManaged()` with `[]` as data for a proxy object is consistent with e. g. `\Doctrine\ORM\EntityManager::getReference()`. In the case that a full entity has to be loaded, we need not call `registerManaged()` at all, since that will already happen inside `EntityManager::find()` (or, more specifically, `UnitOfWork::createEntity()` called inside it). Note that the test case has to make some provisions so that we actually reach this case: * Load an entity that uses `fetch="EAGER"` on a to-one association * That association being against a class that uses inheritance (why's that?)
…ies loaded through fetch=EAGER + using inheritance doctrine#10880 reports a case where the changes from doctrine#10785 cause entity updates to be missed. Upon closer inspection, this change seems to be causing it: https://github.com/doctrine/orm/pull/10785/files#diff-55a900494fc8033ab498c53929716caf0aa39d6bdd7058e7d256787a24412ee4L2990-L3003 The code was changed to use `registerManaged()` instead, which basically does the same things, but (since doctrine#10785) also includes an additional check against duplicate entity instances. But, one detail slipped through tests and reviews: `registerManaged()` also updates `\Doctrine\ORM\UnitOfWork::$originalEntityData`, which is used to compute entity changesets. An empty array `[]` was passed for $data here. This will make the changeset computation assume that a partial object was loaded and effectively ignore all field updates here: https://github.com/doctrine/orm/blob/a616914887ea160db4158d2c67752e99624f7c8a/lib/Doctrine/ORM/UnitOfWork.php#L762-L764 I think that, effectively, it is sufficient to call `registerManaged()` only in the two cases where a proxy was created. Calling `registerManaged()` with `[]` as data for a proxy object is consistent with e. g. `\Doctrine\ORM\EntityManager::getReference()`. In the case that a full entity has to be loaded, we need not call `registerManaged()` at all, since that will already happen inside `EntityManager::find()` (or, more specifically, `UnitOfWork::createEntity()` called inside it). Note that the test case has to make some provisions so that we actually reach this case: * Load an entity that uses `fetch="EAGER"` on a to-one association * That association being against a class that uses inheritance (why's that?)
…ctrine#10878) In doctrine#10785, a check was added that prevents entity instances from getting into the identity map when another object for the same ID is already being tracked. This caused regressions for users that work with application-provided IDs and expect this condition to fail with `UniqueConstraintViolationExceptions` when flushing to the database. Thus, this PR turns the exception into a deprecation notice. Users can opt-in to the new behavior. In 3.0, the exception will be used. Implements doctrine#10871.
…ies loaded through fetch=EAGER + using inheritance doctrine#10880 reports a case where the changes from doctrine#10785 cause entity updates to be missed. Upon closer inspection, this change seems to be causing it: https://github.com/doctrine/orm/pull/10785/files#diff-55a900494fc8033ab498c53929716caf0aa39d6bdd7058e7d256787a24412ee4L2990-L3003 The code was changed to use `registerManaged()` instead, which basically does the same things, but (since doctrine#10785) also includes an additional check against duplicate entity instances. But, one detail slipped through tests and reviews: `registerManaged()` also updates `\Doctrine\ORM\UnitOfWork::$originalEntityData`, which is used to compute entity changesets. An empty array `[]` was passed for $data here. This will make the changeset computation assume that a partial object was loaded and effectively ignore all field updates here: https://github.com/doctrine/orm/blob/a616914887ea160db4158d2c67752e99624f7c8a/lib/Doctrine/ORM/UnitOfWork.php#L762-L764 I think that, effectively, it is sufficient to call `registerManaged()` only in the two cases where a proxy was created. Calling `registerManaged()` with `[]` as data for a proxy object is consistent with e. g. `\Doctrine\ORM\EntityManager::getReference()`. In the case that a full entity has to be loaded, we need not call `registerManaged()` at all, since that will already happen inside `EntityManager::find()` (or, more specifically, `UnitOfWork::createEntity()` called inside it). Note that the test case has to make some provisions so that we actually reach this case: * Load an entity that uses `fetch="EAGER"` on a to-one association * That association being against a class that uses inheritance (why's that?)
…GER + using inheritance (#10884) #10880 reports a case where the changes from #10785 cause entity updates to be missed. Upon closer inspection, this change seems to be causing it: https://github.com/doctrine/orm/pull/10785/files#diff-55a900494fc8033ab498c53929716caf0aa39d6bdd7058e7d256787a24412ee4L2990-L3003 The code was changed to use `registerManaged()` instead, which basically does the same things, but (since #10785) also includes an additional check against duplicate entity instances. But, one detail slipped through tests and reviews: `registerManaged()` also updates `\Doctrine\ORM\UnitOfWork::$originalEntityData`, which is used to compute entity changesets. An empty array `[]` was passed for $data here. This will make the changeset computation assume that a partial object was loaded and effectively ignore all field updates here: https://github.com/doctrine/orm/blob/a616914887ea160db4158d2c67752e99624f7c8a/lib/Doctrine/ORM/UnitOfWork.php#L762-L764 I think that, effectively, it is sufficient to call `registerManaged()` only in the two cases where a proxy was created. Calling `registerManaged()` with `[]` as data for a proxy object is consistent with e. g. `\Doctrine\ORM\EntityManager::getReference()`. In the case that a full entity has to be loaded, we need not call `registerManaged()` at all, since that will already happen inside `EntityManager::find()` (or, more specifically, `UnitOfWork::createEntity()` called inside it). Note that the test case has to make some provisions so that we actually reach this case: * Load an entity that uses `fetch="EAGER"` on a to-one association * That association being against a class that uses inheritance (why's that?)
I don't mean to drag up an old issue but we battled this for a while. While processing thousands of the same entity types we'd have issues with regenerating the same ID. I'm not remembering perfectly and I'll go back to try and resurrect the exact issue flow that would cause this. What was happening was the UOW was either not finding the entity because it had a persistent record of the OID having been seen and the newly asked for entity generated the same OID. I believe we're releasing the entity objects which based on the PHP definition of the function can lead to a re-use of the ID and we were running into this. From php.net:
Does this address that case? We're currently testing the latest version to see if it resolves all of the bugs that we have internally resolved but looking at the commit for this ticket it doesn't appear to cover the issue we were seeing. Given that the OID could conflict naturally (there's no guarantee of unique once objects are getting destroyed) throwing an exception I'm unsure was the right idea here. |
While trying to understand #3037, I found that it may happen that we have more entries in
\Doctrine\ORM\UnitOfWork::$entityIdentifiers
than in\Doctrine\ORM\UnitOfWork::$identityMap
.The former is a mapping from
spl_object_id
values to ID hashes, the latter an array first of entity class names and then from ID hash to entity object instances.(Basically, "ID hash" is a concatenation of all field values making up the
@Id
for a given entity.)This means that at some point, we must have different objects representing the same entity, or at least over time different objects are used for the same entity without the UoW properly updating its
::$entityIdentifiers
structure.I don't think it makes sense to overwrite an entity in the identity map, since that means a currently
MANAGED
entity is replaced with something else.If it makes sense at all to replace an entity, that should happen through dedicated management methods to first detach the old entity before persisting, merging or otherwise adding the new one. This way we could make sure the internal structures remain consistent.