Joins `typeck` is exponential in number of joins #3223
Maybe related: rust-lang/rust#20304
It could also be related to rust-lang/rust#99188, which also has exponential behavior (if associated types and big wrapped items are involved, which I'm not sure is the case here; I haven't had the time to look at the repo yet).
First, it's easier if the reproduction code is minimal. That is, if it doesn't depend on diesel itself, but extracts some of the code that triggers the issue, and then reduces that to a very small size, so that any other piece of code that is not related to the issue doesn't interfere with the measurement or fixing process.

With that done (it could be done independently, of course), one could also easily check whether this is a new issue or an old one, i.e. whether it's a regression, and locate the PR that caused it if that's the case. Notifying the PR author and reviewer would then be good, if it is a regression.

In parallel, the easiest first step would be to narrow down the rough area of rustc where the issue happens. One would do that by looking at the `-Zself-profile` data.

These 2 steps would also allow looking for rust issues that could already track the problem; maybe there are workarounds there, or a lack of data that prevents people from working on it. Though, if it's a stress test that doesn't look like real code people use, it may not be seen as high priority: the scalability behavior at higher Ns, while annoying and a bug needing to be fixed, can be pretty rare to hit in practice (and a sign to "not do that and hopefully have a different design if at all possible").

There are many issues filed already, and sometimes they conflict with each other: a bunch of such regressions happened because of incorrect handling of incremental compilation, and slower compile times in rare cases are a compromise compared to having possible soundness issues. A lot have been fixed already (for example, #99188 was also impacted by some trait system caching removal in 1.56 to fix incremental compilation soundness issues; there was a 1000x slowdown in that stress test until some of that was improved in 1.59, and performance is now better than in 1.55) and more will be fixed in the future. But some of those are very hard to fix, and there are not many people with the time and knowledge to do so (and most of them are volunteer contributors with limited time).

All of the above will help gather the data needed to understand the issue, and help the people who could fix it.
Thanks a lot for your answer!
I looked into this yesterday as it seemed to be a very close issue (this here is indeed an issue when evaluating a trait bound on a large wrapped type), but I think it is not the exact same issue because:
(rust-lang/rust#99188 (comment)) Also, according to this writeup, the error from 99188 happens at the borrow-checking step, while this one happens at the typeck step.
So I had done that (screenshots are in the reproduction repository linked above), and…
Would that be going through https://github.com/rust-lang/rust/issues?q=is%3Aopen+label%3AI-compiletime+label%3AA-traits and looking for similarities, or is there another method? (Or is "one" somebody specifically? ^^) Thanks again!
I've tried to create a minimal reproducible example without diesel here, but I failed. The example compiles, but does not reproduce the exponential behaviour. I'm posting it here nevertheless; maybe it's helpful anyway.
I suggest:
I've tried to just reproduce @Ten0's results locally and I've failed to do that. So here is what I've done so far:
That means, at least for me, the check times are heavily dependent on what is cached by incremental compilation and what is not. (This happens with the latest nightly as well as with the referenced compiler version.)
Incremental compilation cache is actually pretty smart here: if you modify the value in the println, or the value of a number, etc., it will actually hit the cache and not recheck the whole crate. Have you done the comment/uncomment steps and added a new println for each measured run? (steps 2 and 6)
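To make the two kinds of edits being compared here concrete, a small hedged sketch (the function and names are made up, not taken from the repro repository):

```rust
// Hypothetical example of the edits discussed above (made-up names).
fn heavy_diesel_query() {
    // imagine a large diesel join chain here

    // Edit A: changing only the literal below (e.g. 7 -> 8) reportedly
    // hits the incremental cache, so the crate is not fully re-checked.
    println!("run {}", 7);

    // Edit B: adding a brand-new statement such as another println!
    // (or adding a new item above this function) is the kind of change
    // that invalidates the cache and triggers the slow re-check.
    // println!("new measurement marker");
}

fn main() {
    heavy_diesel_query();
}
```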
I should have been more precise about this. The following changes do not result in a large check time:
The following changes do result in large check times:
I assume that all of the latter changes just invalidate the incremental compilation cache. In my opinion that gives two issues to maybe report to the rustc team:
This is a known issue and is being worked on: adding things above a function currently changes the spans of all the items following it, invalidating them, since spans can be used to refer to other items' locations in diagnostics, for example. It could be interesting to try the WIP work that is supposed to fix some of these cases:
So it looks like you did succeed in reproducing my results. :)
That indeed doesn't change anything for this particular reproduction case.
It looks like I failed to correctly extract the repro from my original crate then:
Any clue as to why that could be?
Thanks for this information. It's good to hear that this is already being worked on. I've run some more profiling on this. Everything that follows is with 17 tables and was recorded with `-Zself-profile`.
It's not clear to me why exactly these bounds show up as being that expensive to evaluate, while other bounds seem to be fine even if they involve similar types.

- First `evaluate_obligation` in a block: this one takes 60ms (arg0: …)
- Second `evaluate_obligation` in a block: this one takes 320ms (arg0: …)
- Third `evaluate_obligation` in a block: this one takes 32ms (arg0: …)
- Fourth `evaluate_obligation` in a block: this one takes 412ms (arg0: …)
- Fifth `evaluate_obligation` in a block: this one takes 376ms (arg0: …)
Notably, about the fourth block: it looks like the fields of the table (even if unused afterwards) intervene a lot here. NB: that commit is based on an older version of Diesel, which would make it easier to try removing the QuerySource bound from building the join sequence to see if that works around the issue; this is hard in the current version because of the lifetime requirement in the query builder (#2931), which makes it necessary to have the bound on the…
It's interesting that it's not appending to the large tuple that is expensive, but rather checking whether a select clause is valid. The corresponding trait impl is rather simple: diesel/diesel/src/type_impls/tuples.rs line 193 in 5c0452e
Btw @Ten0: it would be great to land the postgres row-by-row PR soon, so that I can at least release a 2.0.0-rc.1 version.
I spent a bit more time looking into this. Commenting out the following trait bounds reduces compilation time drastically for me (from 5s to < 1s for 17 joins): diesel/diesel/src/type_impls/tuples.rs line 194 in 17c1bb4
diesel/diesel/src/type_impls/tuples.rs Line 200 in 17c1bb4
Interestingly, both impls need to be removed. Unfortunately, both impls are required for checking whether something is selectable for a given table or not. The following traits are involved there:

```rust
pub trait SelectableExpression<QS: ?Sized>: AppearsOnTable<QS> {}

pub trait AppearsOnTable<QS: ?Sized>: Expression {}

pub trait Expression {
    type SqlType: TypedExpressionType;
}

pub trait TypedExpressionType {}
```

They are all implemented for tuples of various sizes. I guess that rustc is doing a lot more work than required there, something like: it starts checking if…

I've tried to put that observation into a minimal example, but I haven't been able to reproduce it there yet. Likely I'm missing an important detail or just chose too small tuple sizes. @lqd, are there any easy ways to check that hypothesis through rustc logging/instrumentation?
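For readers who want to experiment with that hypothesis outside of diesel, here is a minimal, self-contained sketch of the shape of those tuple impls (simplified, and not Diesel's actual macro output; `SqlType` is stripped of its `TypedExpressionType` bound). The last `where` clause mirrors the kind of `(...): AppearsOnTable<QS>` bound that was commented out in the experiment above.

```rust
pub trait Expression {
    type SqlType;
}
pub trait AppearsOnTable<QS: ?Sized>: Expression {}
pub trait SelectableExpression<QS: ?Sized>: AppearsOnTable<QS> {}

// Blanket impls for a 2-tuple; the real macros generate these for many arities.
impl<A: Expression, B: Expression> Expression for (A, B) {
    type SqlType = (A::SqlType, B::SqlType);
}

impl<A, B, QS: ?Sized> AppearsOnTable<QS> for (A, B)
where
    A: AppearsOnTable<QS>,
    B: AppearsOnTable<QS>,
{
}

impl<A, B, QS: ?Sized> SelectableExpression<QS> for (A, B)
where
    A: SelectableExpression<QS>,
    B: SelectableExpression<QS>,
    // Analogous to the bound discussed above: proving it again forces rustc to
    // re-walk the whole nested tuple for every obligation it evaluates.
    (A, B): AppearsOnTable<QS>,
{
}
```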
So I started working on this again today. Findings/recap: we have 3 separate sub-issues:
I tackled part of 1:
Next steps to further improve on 1 enough to get to a reasonable compile time would be:

Pushing further on 1, 2, or 3 would probably make the dev flow reasonable (we aim for <~2s before we get typecheck feedback in the IDE). So my next steps will probably be to:
@Ten0 Just wanted to thank you for this investigation. I've been subscribed to this issue because I have a production codebase suffering from this typeck issue as well.
Thank you! :) |
@weiznich any strong preference for either of these two?
Sorry to say, but I would like to avoid both. In my opinion it is a really bad idea to make this behaviour configurable, as it makes it even harder to document what joins will do in diesel. That's already a large issue, as it's not obvious for many users. Adding further configurations there will really hurt and have non-local effects. It might be a better solution to just somehow skip the checks to…
How about just removing default select clauses altogether then?
That seems particularly acceptable now that we have |
…`DefaultSelection` type of joins. This commit introduces infrastructure that allows skipping type checking of the `DefaultSelection` of joins. diesel-rs#3223 discovered that this is responsible for a substantial amount of compile time for larger joins/codebases. We do not need to check this constraint there, as it is always true: each of the tables/sides of the join already has a valid `DefaultSelection`, therefore the resulting `DefaultSelection` must be valid for the join too (if there is no bug in diesel itself).
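For illustration, a diesel-agnostic sketch of the general idea behind that commit (all names here are made up; this is not Diesel's actual implementation): when a property is guaranteed by construction, it can be asserted with a cheap blanket impl instead of being re-proved through a deep structural bound.

```rust
// A join can only be built from two sides that are themselves valid.
#[allow(dead_code)]
struct Join<L, R> {
    left: L,
    right: R,
}

// The expensive route would be a bound along the lines of
// `(L::Selection, R::Selection): ValidSelectionFor<Join<L, R>>`,
// an obligation that grows with every extra join. Instead, the property
// is asserted once, unconditionally, for any join of two valid sides:
trait HasValidDefaultSelection {}

impl<L, R> HasValidDefaultSelection for Join<L, R>
where
    L: HasValidDefaultSelection,
    R: HasValidDefaultSelection,
{
}
```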
Removing the default select clause would be an option, and I agree that it might be desirable for the reasons listed. But I feel that's too large a breaking change for now. Basically all existing code (that does use a stable release from crates.io) relies on this, so just removing it now would unfortunately impact too much code in a non-trivial way. If we plan to do this someday, we should introduce a proper deprecation period for this feature and offer a complete migration story before finally removing it. I've put together weiznich@567bb1d, which skips the…
I've pushed to your…

```rust
let q = schema::some_table_1::table
    .inner_join(schema::some_table_2::table)
    .inner_join(schema::some_table_3::table.inner_join(schema::some_table_4::table));
```

With this additional change I was able to compile our main crate, and that took 25s (instead of 20s for the no-default-selection approach). (I suspect the difference is due to the still-present requirement of checking…)

However, this approach seems to start allowing invalid queries of this kind:

```rust
let q: (
    (i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32),
    (i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32),
    (
        (i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32),
        (i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32),
    ),
) = schema::some_table_1::table
    .inner_join(schema::some_table_4::table)
    .inner_join(schema::some_table_3::table.inner_join(schema::some_table_4::table))
    .get_result(db)
    .unwrap();
```

(notice how…)

In other words, the following statement seems to be invalid when several join chains come into play:

On the current master, this correctly fails to typecheck.

NB: another advantage of allowing the removal of default selections for tables would be that it's no longer required to increase diesel's compilation time a ton with the n-column-tables features, as long as you never select all of those columns at the same time (which is also the case for us).

NB2: no-default-selection seems to suffer from a similar issue, where the following also typechecks:

```rust
let q: i32 = schema::some_table_1::table
    .inner_join(schema::some_table_4::table)
    .inner_join(schema::some_table_3::table.inner_join(schema::some_table_4::table))
    .select(schema::some_table_1::id)
    .get_result(db)
    .unwrap();
```

My guess would be this means that…
@Ten0 I've pushed another update that addresses the issues you've found. I continue to see a speedup from ~5s down to < 1s for your example repository with an example join of 17 tables, for cases where incremental compilation does not cache the results. Rustc's self-profile reports that the longest typecheck query now takes less than 20ms (400ms overall for typecheck). I feel those are good results without doing major breaking changes (as outlined above, I feel that's not possible that close to a release anymore). The next step for future speedups on your code base would likely be to check the self-profile output again to see where the time is spent now. It's quite possible that there are other places that could use some optimization as well.
The current state of #3288 looks good to me for merge (almost), and the state of the investigation on this issue is such that I'm fine with a 2.0 release at this point :) (looks like further improvements are unlikely to be easy breaking changes).

On the topic of this issue still: I get 28s of compilation with this (vs 20s and 25s with the previous iterations, which weren't completely sound). The self-profiling output of my crate seems to show that the vast majority of the time is still spent checking Diesel trait bounds, so I don't think we can resolve this issue just yet (although it will probably require a rename ^^).

In other news, I've found #2931 to be responsible for some part of the compilation time there, as it introduces this kind of bounds checks:

All because of:
On the issue reproduction crate I get:
(#3288 performs much better than both.)

-> Attempting to decouple…

That however is a large enough amount of work (if it even works) that I'm fine with not delaying 2.0 further, esp. since I still want to investigate 3 before that. :)
Thanks for doing this investigation. Those links are broken for me, or at least they do not jump to any line. Maybe you can replace them with links to the corresponding trait bounds on the master branch?
The behavior for me on Firefox and Chromium is that they end up jumping to the correct place if you wait long enough (~15s ^^).
@Ten0 Same for me, 15s! haha |
Seems like I did not wait long enough. It's working now 👍 |
This comment was marked as off-topic.
@dancixx As long as you don't provide any new information, these kinds of posts are highly discouraged, as they are not helpful at all. At the very least you would need to provide a reproducible example here; otherwise, please stop posting to old issues.
Setup
Versions
`postgres`, `64-column-tables`
Problem Description
Joins `typeck` appears to be exponential in the number of joins, despite incremental compilation and not touching the function that contains the joins itself. That makes getting compiler feedback after touching a crate that uses a lot of Diesel queries take an unreasonable amount of time.
E.g. with just 7 joins:
=> With a few hundred such functions in a crate, getting the smallest amount of feedback from a cargo check when making changes easily takes several minutes.
Steps to reproduce
The following repository gives code and detailed step-by-step on how to observe this behavior:
https://github.com/Ten0/diesel_3223_repro
@lqd It seems you have been investigating a similar performance issue that notably affected Diesel earlier this year. Would you have any insight on what would be the process to follow in order to tackle this? Thanks a lot! 🥰