-
Notifications
You must be signed in to change notification settings - Fork 121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A lot of code compiled on a small change (local variable name change) #1396
Comments
Did you start seeing this behavior with some specific version sbt (if so could you identify the first version it does this)? Do you use macros, lots of implicits, or package objects? Normally Zinc won't invalidate if the change is limited to a local variable (a variable inside of a function or a method). |
Would you like to share the log? There's still a chance that we can extract something useful out of the log. Ideally also share the entire scala project. The log hopefully contains reason for invalidation somewhere. This is indeed a peculiar issue as changes such as “after I renamed a local variable from add to toAdd” should not have any impact at all. I still need to double check but I believe Zinc does not grant local variable in a function a name, so none of the name hashes should change. Also does the log always start with
If that is the case, chances are the 2 files are always the same 2 files. |
Yes, it always starts this way. The issue is 100% reproducible. For me the logs contain too much information, it is hard for me to digest it and to learn what is going on from it. I can share you the log privately, if you suggest a suitable means to do so. As for the project, I would be very reluctant to share it, unless there is some NDA about it. |
I see. For the log, if you use discord my discord ID is vertex1003. You can send a google drive link containing the log via discord DM, and after I downloaded the log I will reply to you and you can then immediately disable the google drive link. If you use other messaging app, tell me the app name and your ID and I will register an account and add you there. Btw may I share segment of the log relevant to the issue to other Scala tooling contributors for sake of debugging this issue? |
@OndrejSpanel Could you try: Compile / incOptions ~= { _.withTransitiveSteps(10) } I noticed that you ran out of transitive steps (default is 3 I think), which explains cycle 4 is super big. |
Going slightly offtopic but should we consider raising the default transitive step to something higher than 3? I was reading source code of Intellij Scala Plugin and noticed their Scala Language Server sets transitive step to 5. |
At the rate that it expanded from 2 to 36 files, I am still guessing it's something to do with macro invalidation, but if this fixes the problem maybe we could consider upping to 6. |
I have added
This is much better, but still quite a lot for a renamed local variable.
I have the logs prepared, discord friendship request sent (username ospanel). |
With { _.withTransitiveStep(30) } the logs are the same. For repro I rename the same variable between
As for macros, I use several libraries which use them quite extensively, namely borer, circe, quicklens and airframe.surface. |
With
This object uses airframe.surface, namely my "light" fork of it (https://github.com/OpenGrabeso/light-surface/). |
I am trying to create a repro using [debug] [diff] Detected API change because net.gamatron.json.Schema contains a macro definition. I am trying to include various constructs those classes use from Schema, but do not see the issue in my repro so far. Some things Schema includes, which the classes mentioned as "member ref dependencies" use: // macro declaration
def enumerateSubclassesImpl[T: Type](using Quotes): Expr[List[Surface]] = ???
inline def enumerateSubclasses[T]: List[Surface] = ${ enumerateSubclassesImpl[T] }
inline def listDerivedClasses[T]: List[TypeDesc] = enumerateSubclasses[T].map(TypeDesc.apply(_))
// annotation declaration
type hiddenType = hidden
// type declarations
case class TypeDesc(surface: Surface) What are "member ref dependencies" and when are they invalidated? I am not even sure why is Almost all classes listed in the first step as "by member reference" use |
Member ref dependencies are basically direct reference from a class to another. class A
class B {
val a = new A;
} Then B has a member ref dependency to A as B refers to A. As for why they are invalidated, there's lots of ways they can be invalidated. One common way is that if a class has macros (either contains a macro call site or macro implementation), then all of its member ref dependents are invalidated, which do appear like what is taking place judging by the appearance of
I am looking into it. The debug log should contain the reason but the log is a bit hard to read... Which I guess is an area Zinc can work on. |
If you think there is any experiment I could run on my project or any other information you want, just ask. I am very glad you care about the issue and work on solving it. Thanks a lot. |
I confirm |
One more observation: I do not even have to rename a local variable. Just editing a comment in (almost) any source is enough. I see very similar messages as before, only there is 1 Scala source compiled in the compilation cycle 1, not 2:
|
If the package object gets shaken, it's game over. So my suggestion would be to split the source code as much possible for anything that might invalidate the package object, or maybe just avoid package object altogether. |
Thanks for the tip. I will check what package objects could be involved, I certainly have some present. (Or if the logs tell more about what package object in particular is involved, it will certainly help me). And to be sure I understand: by "shaked" you mean "invalidated"? Still I do not understand why changing a comment or a local variable name shakes anything at all. |
When I replaced my most prominent
|
When changing a local variable name in a file contains macro (either implementation or call site), zinc performs very aggressive invalidation since macro expands at callsite. I still need to think more about what is going on. I believe that Zinc's macro invalidation strategy is that a when a source file containing macro is invalidated, all downstream files (all member ref dependents of the macro containing file) would be invalidated. But I would assume that
Shake means invalidated. |
Can you give me some examples of what to avoid?
Downstream / upstream - I am afraid I am not sure yet what it means. In this case In this repo I try to replicate how is it used in the real project - https://github.com/OndrejSpanel/incremental-repro - so far I was not able to reproduce the issue, though. I will continue to add more code, but any hints on what might be the problematic construct would be helpful. |
Upstream means the class is relatively higher in the dependency tree. Like if class A depends on class B, then B is upstream relative to A.
Thanks for the info! This is really important information that helps to narrow down possibilities. |
I will try to see if I can talk to Scala 3 compiler devs. I don't have any idea on what might be the problematic construct but they might have an idea on what constructs may cause the issue. |
I have created a standalone repro - https://github.com/OndrejSpanel/IncRepro. It is about 15 scala source files, referencing each other in various ways, including opaque types and exports, sometimes using unused imports, with some package objects present. To see the issue:
The output I see after the second compile:
(I hope it will be useful, I spent more than 10 hours reducing it to this shape.) |
If there is any progress with this issue, please let me know. Even if it is not resolved, just knowing what kinds of constructs trigger this would immensely help me, as I could refactor my sources accordingly. The long compile times on trivial changes are killing me. |
Sorry for lack of communication in the last few days. I plan to spend this entire weekend looking into this issue and will share any intel found as comment. |
Didn't find the root cause yet but did some narrowing down the issue and found out that zinc/internal/zinc-core/src/main/scala/sbt/internal/inc/IncrementalCommon.scala Lines 398 to 408 in 59505ce
has incorrect result. Just by changing a comment, More specifically It is also very weird that despite I am also using https://github.com/OndrejSpanel/IncRepro, I am somehow getting different logs than #1396 (comment). For me the log is
|
ok. Reminder to everyone that
If working on Zinc is fun, please by all means keep doing what you're doing, but please do not feel obligated on anything. |
I see this when I modify the project using IntelliJ IDE with Compiler based error highlighting. In this mode the IDE interacts with the build system and skews results very much. I have learned hard way that to learn anything useful about the SBT build process, the IDE should not be open, or at least be in Built-in highlighting mode. When I use notepad to edit the files, I consistently get the results I have shown: PS C:\Dev\inc-repro> sbt compile I running this on Win11 x64, Java 21. |
Thanks Eugene for reminding me the facts. While Ondrej is being extremely helpful in terms of investigating the issue and being very nice in general, indeed it does not mean I have an obligation to resolve this ticket by doing a weekend crunch. I will keep investigate this ticket, but I shall prioritize playing with my cat during the long weekend instead of directing all my energy towards this issue. |
Thanks. Oh my there are a lot of holes. One is that Compiler based error highlighting ruins class file reproducibility. I used an online file hash generator and found out the class file hashes indeed are different across different runs... I shall try using |
For information: I have create a post about this on scala-users. |
I had figured out what caused
Basically Zinc has a mechanism called "third order invalidation" introduced by Dale in #1282. Zinc tracks list of all (transitive) classes for which a macro (in this case @OndrejSpanel I don't have access to your entire project, but if possible, breaking any part of the chain should ease the issue. In long term, we need a better macro invalidation (due to upstream change) mechanism. Bytecode based call graph analysis is one promising candidate. The issue with bytecode based analysis though is that it takes significant effort to implement it. Despite its obvious benefits (much refined third order invalidation & potentially better Btw welcomes any other idea that third order invalidation can be improved. The best case scenario is if we can come up with some macro invalidation strategy that is almost as good as bytecode analysis but takes significantly less effort to implement. Although no one had ever came up with such strategy, few people were looking into this issue anyways. |
And it really does. When I move the macro Thanks for the analysis—I really appreciate it! (I hope you had a great time with your cat.) I'd like to express my gratitude in a more tangible way. I've sent you an invite, so you'll be able to find my contact information if you're interested. |
Can you describe how can I analyze logs so that I am able to find such invalidation chain on my own? While breaking this chain helps in the repro, there must be some other quite different type of chain in the real project, as even without any macros at all I get following behaviour on trivial code changes:
Even without a macro I can see in the real project that even trivial changes (changing comment, renaming local variable) cause compilation of at least two Scala source files. Now I could spend another day creating another repro, but perhaps understanding logs and being able to identify the invalidation chains would be more productive? What makes understanding logs hard for me is that there is a lot of information dumped by sbt / zinc about files / classes which I assume were checked for invalidation, but not invalidated eventually. I consider creating some tool which would parse the logs and show just the invalidation information from them. |
When
Eugene suggested package object might be causing mass invalidation. Here's the relevant log
Try to remove
I do feel like relying on user to know library internal & logs that are tightly coupled with library internal indicates some deeper problem. Fundamentally there's this common sentiment that Scala has poor tooling, and the particular problem we are investigating is merely a special case of it. While I currently have no intention of doing any bounty & receiving any form of monetary compensation, I do feel like some form of incentive structure is the proper way forward. Ideally Scala Center/Lightbend can hire more people to work on Scala tooling full time (but that's easier said than done). Li Haoyi style bounty also seemed to be quite effective. |
Part of the issue in the real project is #867 (comment):
My Java files are annotations only (because of Once I disabled pipelining, the behavior is much more reasonable, I usually get only 2 compilation cycles, not 3 or more. (Even 2 are too many on a trivial change, I am working on a reproducer, but the progress is slow). I am not sure why any of the Java files was invalidated in the first place, and so far I am unable to examine the invalidation chains, therefore I cannot really tell. |
I believe this is due to #630
Yeah unfortunately pipelining can have some surprising interaction with incremental compilation. zinc/internal/zinc-core/src/main/scala/sbt/internal/inc/IncrementalCommon.scala Lines 120 to 140 in 63e2ade
The best thing we can do is to come up with a minimal reproduction right now. Even if we can examine the chain, it can be likely that the chain does not reveal very much. Frequently Zinc's invalidation is not quite as orthodox as explicitly following a chain due to name hashing optimization & various special cases to override name hashing. |
steps
I am afraid I need help to reproduce the issue in a smaller project. My project is SBT based, Scala 3, several modules, JS + JVM, with about 500 Scala source files.
problem
When I make a small change in my project, like renaming a local variable in a function, a lot of files (>400) is compiled.
expectation
On any change which is internal to a single function and does not affect its interface (parameters or return value) only one file should be compiled.
notes
It is possible something in my sources is contributing to the problem, but given current tools, it is quite hard to learn what exactly is the issue. The complete build log obtained using
Level.Debug
has over 50 000 lines.I attach an excerpt obtained using
ThisBuild / logLevel := Level.Debug
from my sbt build which was triggered after I renamed a local variable fromadd
totoAdd
:The text was updated successfully, but these errors were encountered: