Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch Processing ,out of memory #6525

Closed
imsheng opened this issue Jun 25, 2017 · 23 comments
Closed

Batch Processing ,out of memory #6525

imsheng opened this issue Jun 25, 2017 · 23 comments

Comments

@imsheng
Copy link

imsheng commented Jun 25, 2017

http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/reference/batch-processing.html#iterating-large-results-for-data-processing

$q = $this->_em->createQuery('select u from MyProject\Model\User u');
$iterableResult = $q->iterate();
foreach ($iterableResult as $row) {
    //nothing
    $this->_em->detach($row[0]);
}
PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to all
ocate 1425408 bytes) in C:\mmood\vendor\doctrine\dbal\lib\Doctrine\DBAL\Driver\P
DOConnection.php on line 104


  [Symfony\Component\Debug\Exception\OutOfMemoryException]
  Error: Allowed memory size of 134217728 bytes exhausted (tried to allocate
  1425408 bytes)

Total: 4000,000 rows
I tried the code and found that was out of memory

@Ocramius
Copy link
Member

Ocramius commented Jun 25, 2017

@imsheng this needs more details, such as:

@imsheng
Copy link
Author

imsheng commented Jun 25, 2017

in symfony console.
doctrine/orm version 2.5
my code is just that:

    protected function execute(InputInterface $input, OutputInterface $output)
    {
        set_time_limit(0);
        $this->em = $this->getContainer()->get('doctrine.orm.default_entity_manager');
        $xxxxDql = $this->em->createQuery('select w from AppBundle:xxxx w');
        $xxxxResult = $xxxxDql ->iterate();
		foreach ($xxxxResult as $row) {
			// do stuff with the data in the row, $row[0] is always the object

			// detach from Doctrine, so that it can be Garbage-Collected immediately
			$this->_em->detach($row[0]);
		}
    }
PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to all
ocate 1425408 bytes) in C:\mmood\vendor\doctrine\dbal\lib\Doctrine\DBAL\Driver\P
DOConnection.php on line 104

nothing else

@Ocramius
Copy link
Member

Run it with -vvv

@imsheng
Copy link
Author

imsheng commented Jun 25, 2017

C:\mmood>php bin/console xxxxxxx -e=prod -vvv
PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to all
ocate 1429504 bytes) in C:\mmood\vendor\doctrine\dbal\lib\Doctrine\DBAL\Driver\P
DOConnection.php on line 104


  [Symfony\Component\Debug\Exception\OutOfMemoryException]
  Error: Allowed memory size of 134217728 bytes exhausted (tried to allocate
  1429504 bytes)


Exception trace:
 () at C:\mmood\vendor\doctrine\dbal\lib\Doctrine\DBAL\Driver\PDOConnection.php:
104

@Ocramius
Copy link
Member

That still shows no trace, but the fact that it is crashing within the connection may indicate that the resultset is too big.

@imsheng
Copy link
Author

imsheng commented Jun 25, 2017

what should I do? I need to deal with tens of millions of data

@Ocramius
Copy link
Member

Sadly, PDO does not have cursor-alike operations: mysqli does, for example.

Just a hint though: if you are off doing million records operations, then the ORM is likely the wrong tool for the job...

@Ocramius
Copy link
Member

Ocramius commented Jun 25, 2017

Ref: doctrine/dbal#2718

@imsheng
Copy link
Author

imsheng commented Jun 25, 2017

I tried this method, but still out of memory:

$count = $this->em->getRepository('AppBundle:xxxxxx')->createQueryBuilder('xxxxxxx')
            ->select('count(xxxxxxx) c')
            ->getQuery()
            ->getSingleScalarResult();
//count:4000,000
        for ($i = 0; $i <= $count - 1; $i++) {
            $xxx= $this->em->getRepository('AppBundle:xxx')->createQueryBuilder('xx')
                ->setFirstResult($i)
                ->setMaxResults(1)
                ->getQuery()
                ->getSingleResult();
           //nothing
        }

@Ocramius
Copy link
Member

Ocramius commented Jun 25, 2017 via email

@imsheng
Copy link
Author

imsheng commented Jun 25, 2017

without update or insert.
The same is true with add this code.

$this->em->detach();
$this->em->clear();
unset();

what I do is a crawler
the php needs to run for a long time.
Thanks for your help :)

@Ocramius
Copy link
Member

Then there's a memleak somewhere else - this needs debugging either in your codebase, or be reproduced in an isolated test case within this library: I can't help further without any of those.

@trickeyone
Copy link

Setting the variable to NULL is recommended in place of using unset. It tends to work better with GC (refCount issues). You should also look at if you have any relations set to EAGER. If you do, it would probably be better to use LAZY or EXTRA_LAZY to keep from having too many objects loaded into memory.

@Ocramius
Copy link
Member

Ocramius commented Jul 15, 2017

It tends to work better with GC (refCount issues).

It doesn't: refcount works in the exact same way. If it doesn't, report a php-src bug.

@elnur
Copy link

elnur commented Sep 10, 2017

Running into this problem, too. I can't even get to iteration itself. It fails with OOM right here:

$this->entityManager
    ->getRepository(UploadView::class)
    ->createQueryBuilder('v')
    ->orderBy('v.id', 'asc')
    ->getQuery()
    ->iterate();

Need to process 12M rows.

I expected it to work the same way DBAL does because I don't have a problem iterating over a huge result set with DBAL. This time I need model objects tho.

@elnur
Copy link

elnur commented Sep 10, 2017

As a workaround, I ended up iterating over 10K rows at a time:

$lastId = 0;

while (true) {
    $iterable = $this->entityManager
        ->getRepository(UploadView::class)
        ->createQueryBuilder('v')
        ->where('v.id > :id')
        ->orderBy('v.id', 'asc')
        ->setMaxResults(10000)
        ->getQuery()
        ->iterate([
            'id' => $lastId,
        ]);

    $iterated = false;
    foreach ($iterable as $row) {
        $iterated = true;
        /** @var UploadView $view */
        $view = $row[0];
        $lastId = $view->getId();
        // do actual processing
        $this->entityManager->detach($view);
    }

    $this->entityManager->clear();

    if (!$iterated) {
        break;
    }
}

@Ocramius
Copy link
Member

Ah yes, fairly sure that the ORM doesn't lazily iterate over DBAL resultsets: that is a big issue that needs work.

@lcobucci
Copy link
Member

Ah yes, fairly sure that the ORM doesn't lazily iterate over DBAL resultsets: that is a big issue that needs work.

@Ocramius it would possibly be done for ORM v3.x, right?

@Ocramius
Copy link
Member

Ocramius commented Oct 29, 2017 via email

@imsheng imsheng closed this as completed Dec 6, 2017
@stouch
Copy link

stouch commented Nov 24, 2019

$em->getConnection()->getConfiguration()->setSQLLogger(null);

worked for me

@Cirrusware
Copy link

Try this, works for me

$conn = $emd->getConnection();
$conn->getWrappedConnection()
->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
$conn->getConfiguration()->setSQLLogger(null);

@Cirrusware
Copy link

If you wish persist an entity, then you have this error
SQLSTATE[HY000]: General error: 2014 Cannot execute queries while other unbuffered queries are active. Consider using PDOStatement::fetchAll(). Alternatively, if your code is only ever going to run against mysql, you may enable query buffering by setting the PDO::MYSQL_ATTR_USE_BUFFERED_QUERY attribute.

To avoid this, create another Entity Manager with PDO::MYSQL_ATTR_USE_BUFFERED_QUERY to true (by default) and use merge.

$emd2->merge($entity);

@rafalrudnicki
Copy link

$em->getConnection()->getConfiguration()->setSQLLogger(null);

worked for me

It works perfectly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants