Make CheckUnused not slow. #20321

sjrd · 2024-05-02T12:31:34Z

It doesn't mean that it's fast yet, but it is already a significant step in that direction. In particular, this goes in the direction of addressing #19671.

The most important commit is "Simplify the logic for checking unused imports.", whose commit message follows:

Instead of dealing with entire `tpd.Import`s at the end of the scope, we eagerly flatten them into individual `ImportSelector`s. We store them along with some data, including a mutable flag for whether a selector has been used.

This allows us to dramatically simplify `isInImport`, and to cache the resolution of selectors more aggressively. We also get rid of the `IdentityHashMap`.

The algorithm is still O(n*m) where n is the number of imports in a scope, and m the number of references found in that scope. It is not entirely clear to me whether the previous logic was already O(n*m) or worse (it may have included an additional p factor for the number of possible selections from a given qualifier).

Regardless, it is already quite a bit faster than before, thanks to smaller constant factors.
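To make the flattening idea concrete, here is a minimal sketch in plain Scala 3. It is not the compiler's code: `Selector`, `SelectorData`, `Scope`, `registerImport` and `registerUse` are hypothetical stand-ins that model qualifiers and names as strings. Each import is split into one entry per selector, each entry carries a mutable `used` flag, and every reference scans the selectors of the scope, which is where the O(n*m) cost comes from.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical, simplified model: qualifiers and names are plain strings.
final case class Selector(qualifier: String, name: String, isWildcard: Boolean = false)

// One entry per flattened selector, with a mutable "used" flag.
final class SelectorData(val selector: Selector):
  var used: Boolean = false

final class Scope:
  private val selectors = ListBuffer.empty[SelectorData]

  // Eagerly flatten `import q.{a, b, _}` into one SelectorData per selector.
  def registerImport(qualifier: String, names: List[String]): Unit =
    for n <- names do
      selectors += SelectorData(Selector(qualifier, n, isWildcard = n == "_"))

  // Each reference scans the selectors of the scope: O(n) per reference,
  // hence O(n*m) for m references. Named selectors win over wildcards.
  def registerUse(qualifier: String, name: String): Unit =
    val named = selectors.find(sd =>
      sd.selector.qualifier == qualifier && !sd.selector.isWildcard && sd.selector.name == name)
    val hit = named.orElse(selectors.find(sd =>
      sd.selector.qualifier == qualifier && sd.selector.isWildcard))
    hit.foreach(sd => sd.used = true)

  // Selectors whose flag was never set are reported as unused.
  def unused: List[Selector] =
    selectors.iterator.filterNot(_.used).map(_.selector).toList
```

With imports of `scala.collection.mutable.{ListBuffer, Map}` and `scala.util._`, followed by uses of `ListBuffer` and `scala.util.Try`, only the `Map` selector stays unflagged.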


sjrd · 2024-05-06T12:26:37Z

Rebased on top of #20163. @noti0na1 perhaps you could review this, since you were the last to change `isInImport` before me?

noti0na1 · 2024-05-06T12:36:02Z

Sure, I will review this today.

KacperFKorban

The refactor looks good to me. I added some minor suggestions that might make the code easier to read.

Sorry for taking so long with the review, the end of last week was a public holiday in Poland.

KacperFKorban · 2024-05-06T21:32:58Z

compiler/src/dotty/tools/dotc/transform/CheckUnused.scala

+        val selector = selData.selector
+
+        if !selector.isWildcard then
+          if !altName.forall(explicitName => selector.rename == explicitName.toTermName) then


Can this be changed to `altName.exists(explicitName => selector.rename != explicitName.toTermName)`? (It would then read like the comment.)
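The suggestion relies on the usual duality between `forall` and `exists` on `Option`: `!opt.forall(p)` is the same as `opt.exists(x => !p(x))`, including for `None`. A small check, using plain strings as hypothetical stand-ins for `altName` and `selector.rename`:

```scala
// Hypothetical stand-ins: strings instead of compiler names.
def originalForm(altName: Option[String], rename: String): Boolean =
  !altName.forall(explicitName => rename == explicitName)

def suggestedForm(altName: Option[String], rename: String): Boolean =
  altName.exists(explicitName => rename != explicitName)

// The three interesting cases: no explicit name, a matching one, a different one.
val cases: List[(Option[String], String)] =
  List((None, "a"), (Some("a"), "a"), (Some("b"), "a"))

// Both forms agree on every case.
val equivalent: Boolean =
  cases.forall((alt, r) => originalForm(alt, r) == suggestedForm(alt, r))
```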

KacperFKorban · 2024-05-06T21:37:53Z

compiler/src/dotty/tools/dotc/transform/CheckUnused.scala

+          // If the symbol is accessible in this scope without an import, do not register it for unused import analysis
+          val notForImport1 =
+            notForImport
+              || (!name.exists(_.toTermName != sym.name.toTermName) && sym.isAccessibleAsIdent)


Similarly to `isInImport`: can this first part be `name.forall(_.toTermName == sym.name.toTermName)`? (Less negation should make it simpler.)

KacperFKorban · 2024-05-06T21:52:47Z

compiler/src/dotty/tools/dotc/transform/CheckUnused.scala

-      if !isConstructorOfSynth(sym) && !doNotRegister(sym) then
-        if sym.isConstructor && sym.exists then
-          registerUsed(sym.owner, None) // constructor are "implicitly" imported with the class
+    def registerUsed(sym: Symbol, name: Option[Name], notForImport: Boolean = false, isDerived: Boolean = false)(using Context): Unit =


I find the logic harder to follow because of the `not` in `notForImport`. Can we flip the value of this flag to make it mean "possibly from an import"?


That test does not rely on any information dependent on the import selectors. It only relies on information at the use site. Eagerly checking it means we do not put as many symbols into the `usedInScope` set, which is good because it is one of the complexity factors of the unused-imports analysis.

noti0na1

LGTM, glad to see the logic can be simplified significantly.

kyri-petrou · 2024-05-09T07:29:40Z

@sjrd thank you very much for your work on this. You're going to make a lot of people happy with this!

Out of curiosity, do you happen to know which release these changes will be included in?

sjrd · 2024-05-09T07:31:50Z

3.5.0-RC1


sjrd force-pushed the make-check-unused-not-slow branch 5 times, most recently from 933d8fa to 9ad7463 Compare May 3, 2024 15:43

sjrd changed the title ~~WiP Make CheckUnused not slow.~~ Make CheckUnused not slow. May 3, 2024

sjrd marked this pull request as ready for review May 3, 2024 15:51

sjrd requested a review from KacperFKorban May 3, 2024 15:53

sjrd assigned KacperFKorban May 3, 2024

som-snytt mentioned this pull request May 3, 2024

false-positive "unused import" warning when importing givens defined in common trait for several objects #19657

Open

sjrd added 8 commits May 6, 2024 13:51

Make unusedDataApply inline so that no closure is allocated.

69664f7

It is used for every single tree in `CheckUnused`, so this is worth it.
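As an illustration of the general technique, and not the actual `unusedDataApply` signature, the sketch below uses a Scala 3 `inline` method with an `inline` function parameter. The names `Counter`, `Registry`, `withCounter` and `visit` are hypothetical. Because both the method body and the lambda are expanded at the call site, the compiler can normally reduce the lambda application directly instead of allocating a function object on each call.

```scala
// Hypothetical per-run state, standing in for the data the phase carries around.
final class Counter:
  var hits: Int = 0

object Registry:
  // Optional state, looked up on every call (assumption for this sketch).
  var current: Option[Counter] = Some(new Counter)

  // `inline` on the method and on `f`: the match and the lambda body are
  // expanded at each call site, so no closure needs to be created for `f`.
  inline def withCounter(inline f: Counter => Unit): Unit =
    current match
      case Some(c) => f(c)
      case None    => ()

// A call site that would run once per visited tree.
def visit(): Unit =
  Registry.withCounter(c => c.hits += 1)
```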

Do not use LazyList in CheckUnused.

0bd4b15

It is not efficient when the results are always used exactly once.

Do not mangle names only to test whether it starts with a given string.

6d79caa

Remove dead code newCtx in CheckUnused.

3188177

Do not use Sets of Trees in CheckUnused.

a55ee4d

`Tree`s have structural equality. Even if `==` should be able to exit quickly either because of `eq` or an early difference, sets systematically call `hashCode`, which is going to recurse into the entire structure.
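The cost can be seen on a toy model. The sketch below is not compiler code: `Node`, `HashCounter` and the helper functions are hypothetical. `Node` has a structural `hashCode` that recurses into its children and counts invocations. Inserting one deep node into a hash set walks the whole structure, while an identity check with `eq` never calls `hashCode`.

```scala
import scala.collection.mutable

// Counts how many times the structural hashCode below is invoked.
object HashCounter:
  var calls: Int = 0

// Hypothetical tree node with structural equality and hashing.
final class Node(val label: String, val children: List[Node]):
  override def hashCode(): Int =
    HashCounter.calls += 1
    children.foldLeft(label.hashCode)((h, c) => 31 * h + c.hashCode())
  override def equals(that: Any): Boolean = that match
    case n: Node => (this eq n) || (label == n.label && children == n.children)
    case _       => false

// Builds a chain of `depth` nodes on top of a leaf (depth + 1 nodes in total).
def deepChain(depth: Int): Node =
  (1 to depth).foldLeft(Node("leaf", Nil))((child, i) => Node(s"n$i", List(child)))

// Number of hashCode calls triggered by one insertion into a hash set.
def hashCallsForSetInsert(depth: Int): Int =
  val root = deepChain(depth)
  HashCounter.calls = 0
  val set = mutable.HashSet.empty[Node]
  set += root
  HashCounter.calls

// Number of hashCode calls triggered by a reference-equality lookup in a list.
def hashCallsForIdentityLookup(depth: Int): Int =
  val root = deepChain(depth)
  HashCounter.calls = 0
  val found = List(root).exists(_ eq root)
  require(found)
  HashCounter.calls
```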

Remove a useless sort and otherwise sort by offset, not line.

701d69f

It is pointless to sort a list before converting it into a Set.

Refactor unused imports to try and make sense of it.

803dff7

Fix some indentation.

0c1f090

sjrd force-pushed the make-check-unused-not-slow branch from 9ad7463 to 6587ab4 Compare May 6, 2024 12:24

sjrd requested a review from noti0na1 May 6, 2024 12:25

sjrd assigned noti0na1 May 6, 2024

KacperFKorban approved these changes May 6, 2024


sjrd added 2 commits May 7, 2024 09:52

sjrd force-pushed the make-check-unused-not-slow branch from 6587ab4 to 8553bfc Compare May 7, 2024 07:56

sjrd enabled auto-merge May 7, 2024 07:56

noti0na1 approved these changes May 7, 2024


sjrd merged commit 360d473 into scala:main May 7, 2024
19 checks passed

sjrd deleted the make-check-unused-not-slow branch May 7, 2024 11:11

Kordyjan added this to the 3.5.0 milestone May 10, 2024

Gedochao mentioned this pull request Jun 10, 2024

-Wunused:all is slow to compile #19671

Closed

som-snytt mentioned this pull request Jun 28, 2024

3.5.0-RC2 regression: false positive unused warning on given import with wildcard #20860

Closed

WojciechMazur mentioned this pull request Jul 6, 2024

Backport "Make CheckUnused not slow." to LTS #21101

Merged

WojciechMazur added a commit that referenced this pull request Jul 6, 2024

Backport "Make CheckUnused not slow." to LTS (#21101)

95f012b

Backports #20321 to the LTS branch. PR submitted by the release tooling. [skip ci]
