-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid unnecessary information in query hints to improve query cache hit ratio #8797
Conversation
Technically yes, a possible BC break. I don't know if it is considered relevant, though. (Every change breaks someone's workflow, I know.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please sum up or reuse your PR message body as a commit message body, it's very clear and would be a great addition to the git log.
@greg0ire Requested changes made. Should we rename the If this constant is considered an internal implementation detail (and this fix here not a relevant BC break), then renaming it might make the change even more obvious for people relying on the details. |
Renaming should be fine indeed: https://github.com/search?q=HINT_PAGINATOR_ID_COUNT&type=code |
@greg0ire You mean "fine indeed" like "There's about 21,927 code results, but I've checked them all and its not an issue"? |
Strange search result when you exclude files named |
I mean: the first that come up are repos that have doctrine/orm versioned in them, so poorly coded repos, unlikely to be starred or forked by anyone, and if there was any useful package with a mention of
Indeed, I think you found a bug in Github! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me despite the fact that we technically have a BC-break here 👍
My only doubt now is on the target branch. Not sure if this should be considered a bug. |
It doesn't look like a bugfix to me. It's more of a feature that replaces an older one for improvement. Maybe 3.0.x would be a more fitting target. |
Or 2.10.x? Since the BC-break is small… |
Or, let's just avoid the BC-break entirely: /**
* @deprecated use self::HINT_PAGINATOR_HAS_IDS instead
*
* ID Count hint name.
*/
public const HINT_PAGINATOR_ID_COUNT = 'doctrine.id.count';
/**
* ID existence hint name.
*/
public const HINT_PAGINATOR_HAS_IDS = 'doctrine.paginator_has_ids'; (later, check both hints, but only set one ourselves) |
I think the BC break is not so much the renaming, but the fact that the query hint no longer contains a count but a boolean instead. Renaming the constant only makes this more and directly obvious to people relying on the value, since their code should (hopefully) fail. So, under the assumption that nobody should be using/referring to this hint or constant in the first place, I'd rather suggest a "hard" cutover. |
If you agree that So |
I agree, let's target 2.10, and revert in 2.10.1 in the unlikely event people complain. |
Does it make sense to mark parts of this as |
@SenseException does Doctrine have something like a policy document describing e. g. when to use |
None that I know of, but some BC questions can be handled when something is marked as internal. |
Its a bugfix to me, because it hurts system scalability and performance and uses caches wrong |
8e67588
to
cc84117
Compare
Rebased onto |
Please take a look at the PHPCS/Psalm failures. The build has to be green for a merge. Also, would it be able to write a test for this issue? I'd like to avoid a regression here. |
@derrabus I am not really sure what a reasonable test might look like...? |
Probably some kind of functional test with two queries having caching enabled, where the second query should hit the cache but doesn't without your changes. I mean, you somehow must've tested your change to see if it solves your problem. If you can create an automated test from that, you would help us very much. |
84d9747
to
3b44157
Compare
Rebased onto 2.14.x |
@derrabus I added the requested functional test, and also verified that the test fails without the changes being suggested here. |
@lcobucci I saw you worked a lot on the Paginator code a while back – maybe you could review this? |
I'm on holidays in Brazil and won't be able to look at it. I'm back in the second week of February and can review. |
…it ratio I've noticed that over time my query caches fill up with redundant queries, i. e. different cache entries for the DQL -> SQL translation that are exactly the same. For me, it's an issue because the cache entries fill up precious OPcache memory. Further investigation revealed that the queries themselves do not differ, but only the query hints – that are part of the computed cache key – do. In particular, only the value for the `WhereInWalker::HINT_PAGINATOR_ID_COUNT` query hint are different. Since `WhereInWalker` only needs to know _if_ there are matching IDs but not _how many_, we could avoid such cache misses by using just a boolean value as cache hint.
9cc9a7d
to
660197e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@greg0ire @beberlei @mpdude I believe that converting from int
to bool
won't have any impact on the lib 👍
We might be affecting users, though, as we don't know how query hints are used in the wild. AbstractQuery#getHint()
, AbstractQuery#getHints()
, and AbstractQuery#hasHint()
are public methods that could be used for code instrumentation, for example.
It feels like we're abusing query hints a bit to communicate state between the Paginator
and the WhereInWalker
buuuut it's been a feature/bug/trait/peculiarity for a long time (also used in other places) 😄
@mpdude I see you're improving the caching capabilities on the ORM... it might be good to test the combination of the cacheable queries ( |
51bb0b1
to
e7a3b19
Compare
... so we can be sure that in fact the second result has a different size. Co-authored-by: Luís Cobucci <[email protected]>
e7a3b19
to
29bc6cc
Compare
@lcobucci Thank you for the suggestion to make the test more specific! In fact, that change uncovered that both result sets still had the same size, since the IDs were autogenerated and did not start from 1 when multiple tests were executed together. So, I had to tweak it a bit to take the actual IDs into account, but now it works. Please re-approve if you don't mind. |
Thanks @mpdude |
I've noticed that over time my query caches fill up with redundant queries, i. e. different cache entries for the DQL -> SQL translation that are exactly the same. For me, it's an issue because the cache entries fill up precious OPcache memory.
Further investigation revealed that the queries themselves do not differ, but only the query hints – that are part of the computed cache key – do.
In particular, only the value for the
WhereInWalker::HINT_PAGINATOR_ID_COUNT
query hint are different. SinceWhereInWalker
only needs to know if there are matching IDs but not how many, we could avoid such cache misses by using just a boolean value as cache hint.