Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DDC-3646: Do not select unused columns in inner queries of Paginator #4474

Open
doctrinebot opened this issue Mar 31, 2015 · 8 comments
Open
Assignees

Comments

@doctrinebot
Copy link

Jira issue originally created by user MalteWunsch:

When a Paginator count()s, it fires a query like this:

SELECT COUNT(*) AS dctrn*count FROM (SELECT DISTINCT id0 FROM (%originalQuery%) dctrn_result) dctrn*table

an getIterator() does something similar for joined collections to get the ids in the first place:

SELECT DISTINCT id0, lastmod11 FROM (%originalQuery%) dctrn_result ORDER BY lastmod11 DESC LIMIT 10 OFFSET 0

My problem is that both the inner queries can be too heavy for a regular temporary table.

At first sight I noticed several LEFT JOIN s that could be stripped from the inner query without changing the result of the outer one. But that might have been a specially easy case, a general optimisation of the JOIN clauses might be harder. I think http://www.doctrine-project.org/jira/browse/[DDC-2381](http://www.doctrine-project.org/jira/browse/DDC-2381) deals with that.

On second thought we noticed some ugly BLOB columns as part of the projection, which unhealthily bloated the temporary table. Ultimately, the BLOBs are needed for the paginated view, but they are neither needed for counting nor for getting the paginated ids.

Hence, I propose to remove unused columns from the projection in the inner queries. "Unused" means not being part of the SELECT or ORDER BY clause in the outer query (and maybe not part of GROUP BY, HAVING... - haven't checked on this yet). The key could be the $innerSql in the LimitSubqueryOutputWalker->walkSelectStatement(), but before investigating further, I'd like to hear from you what you think.

@doctrinebot
Copy link
Author

Comment created by @Ocramius:

Please review #1353, as it might solve this issue

@doctrinebot
Copy link
Author

Comment created by MalteWunsch:

Thanks for your quick reply!

To be honest, I don't fully get the PR. At least it's goals look very different to me. Anyway, in LimitSubqueryOutputWalker the $innerSQL is now being generated in getInnerSQL(). In there, some hiddenAliasResultVariable are set to false (and are later restored) so that the parent SQLWalker can reference selected fields better. This might be a requirement for implementing my idea, but as far as I understand, there are no columns removed from the select clause so far?

@doctrinebot
Copy link
Author

Comment created by @doctrinebot:

A related Github Pull-Request [GH-1353] was closed:
#1353

@doctrinebot
Copy link
Author

Comment created by @Ocramius:

[~MalteWunsch] the PR is actually related as it affects the selected fields.

Consider checking again against 2.5.0, as we indeed fixed many paginator issues, and this issue needs to be re-validated.

@doctrinebot
Copy link
Author

Comment created by MalteWunsch:

Great! I'll be glad to check. It might take some time, as we currently only have PHP 5.3 at our disposal (please don't tell anybody), and Doctrine 2 ORM 2.5.0 requires at least PHP 5.4. Fortunately, this is a very good trigger to put some pressure on the PHP upgrade process. I'll keep my eye on it.

@doctrinebot
Copy link
Author

Comment created by MalteWunsch:

[~ocramius] Unfortunately, Doctrine 2.5 had no effect on my inner original query. It still looks like this:

SELECT 
  COUNT(*) AS dctrn_count 
FROM 
  (
    SELECT 
      DISTINCT id0 
    FROM 
      (
        SELECT 
          [all fields from t0_],
          [all fields from t1_],
          [all fields from t2_],
          [all fields from t3_]
        FROM 
          table1 t0_ 
          LEFT JOINs...
        WHERE 
          ...
        ORDER BY 
          ...
      ) dctrn_result
  ) dctrn_table

Just to be clear, my point is with the most inner query. If I'm correct it is generated in LimitSubqueryOutputWalker->getInnerSql(). In the special case with the query above, one could simplify it by dropping the order clause, removing the left joins and selecting only the id0 field. I have no plan for inner joins and other more complex optimisations, but I guess that removing the unused fields from the select clause might be in reach.

@doctrinebot
Copy link
Author

Comment created by mpdude:

When removing fields from the SELECT clause, we need to keep those that are referenced from GROUP BY or HAVING clauses (when ORDER BY is dropped).

@doctrinebot
Copy link
Author

Comment created by mpdude:

[http://www.doctrine-project.org/jira/browse/DDC-2381] seems to be related or identical.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants