Skip to content

Commit

Permalink
Improve topological sort result order
Browse files Browse the repository at this point in the history
This PR changes a detail in the commit order computation for depended-upon entities.

We have a parent-child relationship between two entity classes. The association is parent one-to-many children, with the child entities containing the (owning side) back-reference.

Cascade-persist is not used, so all entities have to be passed to `EntityManager::persist()`.

Before v2.16.0, two child entities C1 and C2 will be inserted in the same order in which they are passed to `persist()`, and that is regardless of whether the parent entity was passed to `persist()` before or after the child entities.

As of v2.16.0, passing the parent entity to `persist()` _after_ the child entities will lead to an insert order that is _reversed_ compared to the order of `persist()` calls.

This PR makes the order consistent in both cases, as it was before v2.16.0.

 #### Cause

When the parent is passed to `persist()` after the children, commit order computation has to re-arrange the entities. The parent must be inserted first since it is referred to by the children.

The implementation of the topological sort from doctrine#10547 processed entities in reverse `persist()` order and unshifted finished nodes to an array to obtain the final result. That leads to dependencies (parent → before c1, parent → before c2) showing up in the result in the reverse order of which they were added.

This PR changes the topological sort to produce a result in the opposite order ("all edges pointing left"), which helps to avoid the duplicate array order reversal.

 #### Discussion

* This PR _does not_ change semantics of the `persist()` so that entities would (under all ciscumstances) be inserted in the order of `persist()` calls.
* It fixes an unnecessary inconsistency between versions before 2.16.0 and after. In particular, it may be surprising that the insert order for the child entities depends on whether another referred-to entity (the parent) was added before or after them.
* _Both_ orders (c1 before or after c2) are technically and logically correct with regard to the agreement that `commit()` is free to arrange entities in a way that allows for efficient insertion into the database.

Fixes doctrine#11058.
  • Loading branch information
mpdude committed Dec 21, 2023
1 parent 393679a commit 108fa30
Show file tree
Hide file tree
Showing 4 changed files with 223 additions and 29 deletions.
16 changes: 5 additions & 11 deletions lib/Doctrine/ORM/Internal/TopologicalSort.php
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,6 @@
use Doctrine\ORM\Internal\TopologicalSort\CycleDetectedException;

use function array_keys;
use function array_reverse;
use function array_unshift;
use function spl_object_id;

/**
Expand Down Expand Up @@ -93,18 +91,14 @@ public function addEdge($from, $to, bool $optional): void

/**
* Returns a topological sort of all nodes. When we have an edge A->B between two nodes
* A and B, then A will be listed before B in the result.
* A and B, then B will be listed before A in the result. Visually speaking, when ordering
* the nodes in the result order from left to right, all edges point to the left.
*
* @return list<object>
*/
public function sort(): array
{
/*
* When possible, keep objects in the result in the same order in which they were added as nodes.
* Since nodes are unshifted into $this->>sortResult (see the visit() method), that means we
* need to work them in array_reverse order here.
*/
foreach (array_reverse(array_keys($this->nodes)) as $oid) {
foreach (array_keys($this->nodes) as $oid) {
if ($this->states[$oid] === self::NOT_VISITED) {
$this->visit($oid);
}
Expand Down Expand Up @@ -147,7 +141,7 @@ private function visit(int $oid): void
}

// We have found a cycle and cannot break it at $edge. Best we can do
// is to retreat from the current vertex, hoping that somewhere up the
// is to backtrack from the current vertex, hoping that somewhere up the
// stack this can be salvaged.
$this->states[$oid] = self::NOT_VISITED;
$exception->addToCycle($this->nodes[$oid]);
Expand All @@ -160,6 +154,6 @@ private function visit(int $oid): void
// So we're done with this vertex as well.

$this->states[$oid] = self::VISITED;
array_unshift($this->sortResult, $this->nodes[$oid]);
$this->sortResult[] = $this->nodes[$oid];
}
}
14 changes: 8 additions & 6 deletions lib/Doctrine/ORM/UnitOfWork.php
Original file line number Diff line number Diff line change
Expand Up @@ -1378,9 +1378,10 @@ private function computeInsertExecutionOrder(): array
$joinColumns = reset($assoc['joinColumns']);
$isNullable = ! isset($joinColumns['nullable']) || $joinColumns['nullable'];

// Add dependency. The dependency direction implies that "$targetEntity has to go before $entity",
// so we can work through the topo sort result from left to right (with all edges pointing right).
$sort->addEdge($targetEntity, $entity, $isNullable);
// Add dependency. The dependency direction implies that "$entity depends on $targetEntity". The
// topological sort result will output the depended-upon nodes first, which means we can insert
// entities in that order.
$sort->addEdge($entity, $targetEntity, $isNullable);
}
}

Expand Down Expand Up @@ -1432,9 +1433,10 @@ private function computeDeleteExecutionOrder(): array
continue;
}

// Add dependency. The dependency direction implies that "$entity has to be removed before $targetEntity",
// so we can work through the topo sort result from left to right (with all edges pointing right).
$sort->addEdge($entity, $targetEntity, false);
// Add dependency. The dependency direction implies that "$targetEntity depends on $entity
// being deleted first". The topological sort will output the depended-upon nodes first,
// so we can work through the result in the returned order.
$sort->addEdge($targetEntity, $entity, false);
}
}

Expand Down
146 changes: 146 additions & 0 deletions tests/Doctrine/Tests/ORM/Functional/Ticket/GH11058Test.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,146 @@
<?php

declare(strict_types=1);

namespace Doctrine\Tests\ORM\Functional\Ticket;

use Doctrine\Common\Collections\ArrayCollection;
use Doctrine\Common\Collections\Collection;
use Doctrine\ORM\Mapping as ORM;
use Doctrine\Tests\OrmFunctionalTestCase;

class GH11058Test extends OrmFunctionalTestCase
{
protected function setUp(): void
{
parent::setUp();

$this->setUpEntitySchema([
GH11058Parent::class,
GH11058Child::class,
]);
}

public function testChildrenInsertedInOrderOfPersistCalls1WhenParentPersistedLast(): void
{
[$parent, $child1, $child2] = $this->createParentWithTwoChildEntities();

$this->_em->persist($child1);
$this->_em->persist($child2);
$this->_em->persist($parent);
$this->_em->flush();

self::assertTrue($child1->id < $child2->id);
}

public function testChildrenInsertedInOrderOfPersistCalls2WhenParentPersistedLast(): void
{
[$parent, $child1, $child2] = $this->createParentWithTwoChildEntities();

$this->_em->persist($child2);
$this->_em->persist($child1);
$this->_em->persist($parent);
$this->_em->flush();

self::assertTrue($child2->id < $child1->id);
}

public function testChildrenInsertedInOrderOfPersistCalls1WhenParentPersistedFirst(): void
{
[$parent, $child1, $child2] = $this->createParentWithTwoChildEntities();

$this->_em->persist($parent);
$this->_em->persist($child1);
$this->_em->persist($child2);
$this->_em->flush();

self::assertTrue($child1->id < $child2->id);
}

public function testChildrenInsertedInOrderOfPersistCalls2WhenParentPersistedFirst(): void
{
[$parent, $child1, $child2] = $this->createParentWithTwoChildEntities();

$this->_em->persist($parent);
$this->_em->persist($child2);
$this->_em->persist($child1);
$this->_em->flush();

self::assertTrue($child2->id < $child1->id);
}

private function createParentWithTwoChildEntities(): array
{
$parent = new GH11058Parent();
$child1 = new GH11058Child();
$child2 = new GH11058Child();

$parent->addChild($child1);
$parent->addChild($child2);

return [$parent, $child1, $child2];
}
}

/**
* @ORM\Entity()
*/
class GH11058Parent
{
/**
* @ORM\Id
* @ORM\Column(type="integer")
* @ORM\GeneratedValue
*
* @var int
*/
public $id;

/**
* @ORM\OneToMany(targetEntity="GH11058Child", mappedBy="parent")
*
* @var Collection<int, GH11058Child>
*/
public $children;

public function __construct()
{
$this->children = new ArrayCollection();
}

public function addChild(GH11058Child $child): void
{
if (! $this->children->contains($child)) {
$this->children->add($child);
$child->setParent($this);
}
}
}

/**
* @ORM\Entity()
*/
class GH11058Child
{
/**
* @ORM\Id
* @ORM\Column(type="integer")
* @ORM\GeneratedValue
*
* @var int
*/
public $id;

/**
* @ORM\ManyToOne(targetEntity="GH11058Parent", inversedBy="children")
*
* @var GH11058Parent
*/
public $parent;

public function setParent(GH11058Parent $parent): void
{
$this->parent = $parent;
$parent->addChild($this);
}
}
76 changes: 64 additions & 12 deletions tests/Doctrine/Tests/ORM/Internal/TopologicalSortTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ public function testSimpleOrdering(): void
$this->addEdge('E', 'A');

// There is only 1 valid ordering for this constellation
self::assertSame(['E', 'A', 'B', 'C'], $this->computeResult());
self::assertSame(['C', 'B', 'A', 'E'], $this->computeResult());
}

public function testSkipOptionalEdgeToBreakCycle(): void
Expand All @@ -44,7 +44,7 @@ public function testSkipOptionalEdgeToBreakCycle(): void
$this->addEdge('A', 'B', true);
$this->addEdge('B', 'A', false);

self::assertSame(['B', 'A'], $this->computeResult());
self::assertSame(['A', 'B'], $this->computeResult());
}

public function testBreakCycleByBacktracking(): void
Expand All @@ -57,7 +57,7 @@ public function testBreakCycleByBacktracking(): void
$this->addEdge('D', 'A'); // closes the cycle

// We can only break B -> C, so the result must be C -> D -> A -> B
self::assertSame(['C', 'D', 'A', 'B'], $this->computeResult());
self::assertSame(['B', 'A', 'D', 'C'], $this->computeResult());
}

public function testCycleRemovedByEliminatingLastOptionalEdge(): void
Expand All @@ -75,7 +75,7 @@ public function testCycleRemovedByEliminatingLastOptionalEdge(): void
$this->addEdge('B', 'D', true);
$this->addEdge('D', 'A');

self::assertSame(['C', 'D', 'A', 'B'], $this->computeResult());
self::assertSame(['B', 'A', 'C', 'D'], $this->computeResult());
}

public function testGH7180Example(): void
Expand All @@ -89,7 +89,7 @@ public function testGH7180Example(): void
$this->addEdge('F', 'E');
$this->addEdge('E', 'D');

self::assertSame(['F', 'E', 'D', 'G'], $this->computeResult());
self::assertSame(['G', 'D', 'E', 'F'], $this->computeResult());
}

public function testCommitOrderingFromGH7259Test(): void
Expand All @@ -106,9 +106,9 @@ public function testCommitOrderingFromGH7259Test(): void
// the D -> A -> B ordering is important to break the cycle
// on the nullable link.
$correctOrders = [
['D', 'A', 'B', 'C'],
['D', 'A', 'C', 'B'],
['D', 'C', 'A', 'B'],
['C', 'B', 'A', 'D'],
['B', 'C', 'A', 'D'],
['B', 'A', 'C', 'D'],
];

self::assertContains($this->computeResult(), $correctOrders);
Expand All @@ -124,12 +124,12 @@ public function testCommitOrderingFromGH8349Case1Test(): void
$this->addEdge('B', 'C', true);
$this->addEdge('C', 'D', true);

// Many orderings are possible here, but the bottom line is D must be before A (it's the only hard requirement).
// Many orderings are possible here, but the bottom line is A must be before D (it's the only hard requirement).
$result = $this->computeResult();

$indexA = array_search('A', $result, true);
$indexD = array_search('D', $result, true);
self::assertTrue($indexD < $indexA);
self::assertTrue($indexD > $indexA);
}

public function testCommitOrderingFromGH8349Case2Test(): void
Expand All @@ -141,7 +141,7 @@ public function testCommitOrderingFromGH8349Case2Test(): void
$this->addEdge('A', 'B', true);

// The B -> A requirement determines the result here
self::assertSame(['B', 'A'], $this->computeResult());
self::assertSame(['A', 'B'], $this->computeResult());
}

public function testNodesMaintainOrderWhenNoDepencency(): void
Expand All @@ -153,6 +153,58 @@ public function testNodesMaintainOrderWhenNoDepencency(): void
self::assertSame(['A', 'B', 'C'], $this->computeResult());
}

public function testNodesReturnedInDepthFirstOrder(): void
{
$this->addNodes('A', 'B', 'C');
$this->addEdge('A', 'B');
$this->addEdge('A', 'C');

// We start on A and find that it has two dependencies on B and C,
// added (as dependencies) in that order.
// So, first we continue the DFS on B, because that edge was added first.
// This gives the result order B, C, A.
self::assertSame(['B', 'C', 'A'], $this->computeResult());
}

public function testNodesReturnedInDepthFirstOrderWithEdgesInDifferentOrderThanNodes(): void
{
$this->addNodes('A', 'B', 'C');
$this->addEdge('A', 'C');
$this->addEdge('A', 'B');

// This is like testNodesReturnedInDepthFirstOrder, but it shows that for the two
// nodes B and C that A depends upon, the result will follow the order in which
// the edges were added.
self::assertSame(['C', 'B', 'A'], $this->computeResult());
}

public function testNodesReturnedInDepthFirstOrderWithDependingNodeLast(): void
{
$this->addNodes('B', 'C', 'A');
$this->addEdge('A', 'B');
$this->addEdge('A', 'C');

// This again is like testNodesReturnedInDepthFirstOrder, but this
// time the node A that depends on B and C is added as the last node.
// That means processing can go over B and C in the order they were given.
// The order in which edges are added is not relevant (!), since at the time
// the edges are evaluated, the nodes they point to have already been finished.
self::assertSame(['B', 'C', 'A'], $this->computeResult());
}

public function testNodesReturnedInDepthFirstOrderWithDependingNodeLastAndEdgeOrderInversed(): void
{
$this->addNodes('B', 'C', 'A');
$this->addEdge('A', 'C');
$this->addEdge('A', 'B');

// This again is like testNodesReturnedInDepthFirstOrderWithDependingNodeLast, but adds
// the edges in the opposing order. Still, the result order is the same (!).
// This may be surprising when comparing with testNodesReturnedInDepthFirstOrderWithEdgesInDifferentOrderThanNodes,
// where the result order depends upon the _edge_ order.
self::assertSame(['B', 'C', 'A'], $this->computeResult());
}

public function testDetectSmallCycle(): void
{
$this->addNodes('A', 'B');
Expand Down Expand Up @@ -205,7 +257,7 @@ public function testDetectLargerCycleNotIncludingStartNode(): void
$this->computeResult();
} catch (CycleDetectedException $exception) {
self::assertEquals(
[$this->nodes['D'], $this->nodes['B'], $this->nodes['C'], $this->nodes['D']],
[$this->nodes['B'], $this->nodes['C'], $this->nodes['D'], $this->nodes['B']],
$exception->getCycle()
);
}
Expand Down

0 comments on commit 108fa30

Please sign in to comment.