Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documents are not inserted when spl_object_hash() values collide with previously removed documents #2730

Open
ilminaev opened this issue Feb 17, 2025 · 6 comments

Comments

@ilminaev
Copy link

Bug Report

Q A
BC Break no
Version 2.10.1

Summary

When trying to insert new documents after removing other documents (not necessarily from the same collection), Doctrine skips them if there is an spl_object_hash() collision.

Current behavior

If the spl_object_hash() value of the new document is the same as of a removed one, Doctrine mixes them up and doesn't insert the new document.

How to reproduce

#[ODM\Document]
class TestDocument
{
  #[ODM\Id]
  public ?string $id = null;

  #[ODM\Field(type: 'string')]
  public string $name;

  public function __construct(string $name)
  {
    $this->name = $name;
  }
}

At first, we populate the collection with documents that will be removed later:

foreach (range(1, 3) as $i) {
  $document = new TestDocument("document_{$i}");
  $dm->persist($document);
}
$dm->flush();

Then we remove all the existing documents and create new ones. Object hash collisions should happen here; if they don't, increasing count of the documents might help.

$documents = $dm->getRepository(TestDocument::class)->findAll();
$oldObjectHashes = array_map(spl_object_hash(...), $documents);

// Removing all old documents
foreach ($documents as $document) {
  $dm->remove($document);
}
$dm->flush();
unset($documents);

// Inserting new documents
foreach (range(1, 3) as $i) {
  $document = new TestDocument("document_{$i}");
  $dm->persist($document);
  if (in_array(spl_object_hash($document), $oldObjectHashes, true)) {
    echo "Object hash collision found.\n";
  }
}
$dm->flush();

After executing the script the collection is empty, i.e. no new documents have been inserted.

Expected behavior

All new documents are present in the database.

@GromNaN
Copy link
Member

GromNaN commented Feb 17, 2025

Hello ilminaev, thank you for this complete bug report.

This usage of spl_object_hash() was added in #2586 for transactions.

Would you like to open a PR to replace the spl_object_hash() with a WeakMap? This PR must target the branch 2.10.x and add a test (your reproducer is perfect as test).

@GromNaN
Copy link
Member

GromNaN commented Feb 17, 2025

spl_object_id() could also be used instead of spl_object_hash() (similar to doctrine/orm#8837), but I feel WeakMap is better because the references are removed from the map when the documents objects are removed from the memory. It's worth a performance benchmark.

@ilminaev
Copy link
Author

@GromNaN Thank you for your quick response.

Do you mean I should replace each array in UnitOfWork that is indexed by spl_object_hash() with a WeakMap? I see 5 such arrays.

@GromNaN
Copy link
Member

GromNaN commented Feb 17, 2025

Indeed, I looked too quickly. Using WeakMap would impact a lot of variables (all private to the class). Fixing the bug using spl_object_id() is simpler.

@ilminaev
Copy link
Author

I'm not sure how spl_object_id() might help, because its values are reused as well:

var_dump(spl_object_id(new stdClass) === spl_object_id(new stdClass)); // bool(true)

@GromNaN
Copy link
Member

GromNaN commented Feb 17, 2025

Oh, you're right, and this will not change php/php-src#7862.
So WeakMap is the way to go.

All the UoW properties need to become WeakMap:

  • embeddedDocumentsRegistry
  • parentAssociations
  • documentChangeSets
  • originalDocumentData
  • documentStates
  • visitedCollections

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants