-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/tools/go/analysis/checker: an importable analysis driver based on go/packages #61324
Comments
Change https://go.dev/cl/411907 mentions this issue: |
I like the proposal and have two questions about this.
|
The go/analysis/checker driver uses defer + recover around Analyzer.Run, so if the analyzer is single-threaded (as ~all are), it will catch the panic and treat it as a failure of that Action. Like compilation, all downstream actions will not be executed, but independent actions will continue as usual. All this information can be gleaned from the action graph returned by checker.Analyze.
The Fact data structures are part of the output: see the
If you want to distribute units of analysis, you should be using unitchecker, or something like it (it's not a lot of code), as it assumes that intermediate results between units are serialized. Perhaps I should add an Example in go/analysis (but not |
I see. I think I missed this part:
Thanks for clarifying (I should learn to read comment 😄 ) |
I think Since this is a non-main package, I don't think it should log anything. Consider instead providing that functionality as callbacks. For instance, replace |
Good point. On closer inspection, it only logs one message, a constant, at the very beginning. We can simply get rid of Verbose. Thanks for pointing this out. |
This proposal doesn't include a "convenience" API for flags, but it does allow you to control all the relevant pieces of the driver and the analyzers from the outside:
for _, a := range analyzers {
a.Flags.VisitAll(func(f *flag.Flag) {
flag.Var(f.Value, a.Name + "." + f.Name, f.Usage) // --analyzer.flag=...
})
}
flag.Parse() We can always add more convenience functions later as clear needs arise, but the primary goal here is to make the tricky logic of the driver available in a form other than a complete application |
Thanks @Antonboom, I think you are saying that the proposal addresses the four issues you listed and that you are in favor of accepting it. I hope we can get it approved soon. (Please do correct me if I have misunderstood or if you think we should change the proposed API in some way.) |
No, no. I'm waiting this change a lot. I attached this example just with the hope that maybe it will push you to new thoughts. For example:
|
Great, thanks. Yes, in all three cases the new API makes these problems the client responsibility:
The amount of code in the load function is very small, and client applications are likely going to want fine-grained control over it, so I'd rather not introduce a helper function just yet. (The proposed API does state its requirements for the Mode flag, but that's all it needs to say.)
This logic is unrelated to analysis; it's there because singlechecker.Main is a main function, and any significant application has logic like this in its main function.
We may want to add some helper functions for flags, but it's surprisingly fiddly to come up with a good API for it, so I thought I would leave it out for now and see what needs arise. The important thing is that clients are unstuck, even if they have to copy a little code. |
This proposal has been added to the active column of the proposals project |
I am in favor of this proposal. However, I think we should consider having this package operate on an fs.FS, rather than the I suppose we can add In general, I'm not sure about adding all the helpers in this package. They seem more related to a particular command-line interface than they do to the code of the analysis algorithm. But I don't feel strongly. |
I assume you are primarily referring to ApplyFixes. It does seem reasonable to parameterize it over the file system in some way. An FS would be suitable for reading, but we still need a means to write the results. (There's no writable equivalent of FS, and proposal #45757 was declined.) I'll have a think. As for Analyze, the client is expected to have already obtained the packages.Packages, so it's too late to affect how files are read. That said, some analyzers do call os.OpenFile and I recently filed a proposal #62292 to add an OpenFile operation to analysis.Pass, in which case this interface would want an FS field so that reads done within the analyzer are consistent with whatever overlays were passed to go/packages.
The three Print functions aren't fundamental. They could be left for a later proposal, but I've included them to make it easy to build something like unitchecker without having to copy tons of code. The PrintDiagnostics function could also be usefully parameterized over an FS since it prints lines of context which it must read from somewhere. |
FWIW go/packages accepts overlays -- that's how gopls uses it. |
Understood, but my point is that the client does the packages.Load call before the call to Analyze, so Analyze has no need for the file system (at least until #62292 is accepted). |
Ack, sorry I misinterpreted your statement as saying that go/packages would have necessarily read from the OS. In that case, yes it does appear that this is only relevant to ApplyFixes. Should we accept a ReadFile/WriteFile interface? |
I'm slightly concerned about the proposed In Staticcheck's analysis runner, we instead construct the package graph from packages loaded with
and then compute ASTs and type information on demand when a package is ready to be analyzed, i.e. all of its dependencies have been analyzed. Once we're done, we discard the AST and type information. This results in overall much lower memory usage. One downside of our approach is that we load type information for dependencies from export data repeatedly, leading to more garbage and more CPU time spent on parsing export data and GC. (We do the same discarding and reloading for facts, but that part probably contributes much less to overall memory usage.) |
Also, are there any plans to add callbacks to support caching? That is, avoid recomputing facts and diagnostics for packages that haven't changed. |
Hi @dominikh.
You're right that it isn't the most asymptotically efficient solution. It would be more efficient to have Analyze produce (and then evict) type-annotated syntax as it goes, and for {single,multi}checker to take advantage of this optimization too. I suppose nothing stops us from retrofitting this optimization into the API proposed here: if syntax and/or types are missing, then Analyze could parse and type-check as it goes, using a parallel structure to avoid mutating the go/packages graph. But the motivation for this new API is for users that want to load and analyze a large program, re-using the tricky bits of {single,multi}checker.Main while inserting their own logic in between package.Load and Analyze and between Analyze and Exit, which the existing API makes impossible. So I expect that for them, the cost of holding everything in memory at once is quite acceptable. As always, asymptotically efficient separate analysis with persistent memoization of subproblems is best done using the unitchecker approach ( |
Are we sure we want to keep the "and if IsRoot" part? Keeping it does encourage compatibility with unitchecker and facts, but it will make the If we have analyzers Why is |
Correct.
Because Results are potentially very large data structures, such as the SSA representation of all the function bodies, and we want to evict them from memory. By contrast, Diagnostics are typically small and few in number. |
No change in consensus, so accepted. 🎉
|
The {single,multi}checker packages provide the main function for a complete application, as a black box. Many users want the ability to customize the analyzer behavior with additional logic, as described in the attached issues. This change creates a new package, go/analysis/checker, that exposes an Analyze pure function---one that avoids global flags, os.Exit, logging, profiling, and other side effects---that runs a set of analyzers on a set of packages loaded (by the client) using go/packages, and presents the graph of results in a form that allows postprocessing. package checker func Analyze([]*analysis.Analyzer, []*packages.Package, *Options) (*Graph, error) type Graph struct { Roots []*Action } type Action struct { ... } // information about each step func (*Graph) WriteJSONDiagnostics(io.Writer) error func (*Graph) WriteTextDiagnostics(io.Writer, contextLines int) error func (*Graph) All() iter.Seq[*Action] // (was Visit in the proposal) See the example_test.go file for typical API usage. API feedback welcome. This change should have no effect on the behavior of existing programs. Logic changes have been kept to a minimum, so the large diffs are mostly code motion. Auxiliary functions for printing, flags, fixes, and so on has been kept to a minimum so that we can make progress on this change. They will be dealt with in follow-up changes. Detailed list of changes to ease review: analysistest: - Run: we report Load errors to t.Log up front (unless RunDespiteErrors); this was previously in loadPackages. - Run: the Validate call is now done by checker.Analyze. - internal/checker.TestAnalyzer has melted away - the logic to construct the "legacy" Result slice is new. - Result: this type used to be an alias for internal/checker.checker.TestAnalyzerResult (regrettably exposing more than intended); now it is a plain public struct. - check: the logic to check fact expectations is new. - The buildtag and directive analyzers both used a similar hack w.r.t. IgnoredFiles; this hack is now in analysistest. Better ideas welcome. checker: - checker.go is mostly moved from internal/checker/checker.go. - print.go is mostly split from old printDiagnostics in internal/checker/checker.go. - Action.execOnce: the pass.Result/Err logic was simplified using an immediately-applied lambda. - inheritFacts: the comments on package-fact handling are new. - Graph.Visit is "born deprecated". Clients using go1.23 should use Graph.All (iter.Seq). - Example: this is new code. internal/checker: - applyFixes: now accepts only the actions to apply, without calling visitAll over the action graph. The previous code was wrong, a latent bug. Some files outside go/analysis were updated, not out of necessity (it's not a breaking change) but to modernize them (e.g. avoiding analysistest.Result.Pass). Updates golang/go#61324 (new API proposal) Fixes golang/go#53215 (feature proposal) Fixes golang/go#31897 Fixes golang/go#50265 Fixes golang/go#53336 Fixes golang/go#66745 Updates golang/go#30231 Updates golang/go#30219 Updates golang/go#31007 Updates golang/go#66745 Change-Id: I745d319a587dca506564a4624b52a7f1eb5f4751 Reviewed-on: https://go-review.googlesource.com/c/tools/+/411907 Reviewed-by: Robert Findley <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
For the record, the API that was committed differs from the accepted proposal in one minor detail: - ShowFacts bool // log each exported fact
+ FactLog io.Writer // if non-nil, log each exported fact to it |
We propose to create a new package, golang.org/x/tools/go/analysis/checker, to provide a function, Analyze, that applies a set of analyzers to a set of packages and returns a data structure providing access to the complete results of the operation.
Background: the go/analysis framework defines an interface for static checkers that inspect a parsed, type-checked Go package and report mistakes or suggest fixes. The interface allows checkers (e.g. printf, nilness) to be written independent of the driver, and drivers (e.g. go vet) to be written independent of the checkers. The x/tools repo provides two drivers. The first, unitchecker, used by go vet, performs "separate analysis", of one package at a time, using serialized types and facts for dependencies. The second, internal/checker, uses go/packages to load packages and all their dependencies from source. Currently it has two public interfaces, singlechecker and multichecker, that differ only trivially: a singlechecker has only one analyzer, whereas a multichecker contains many and has a slightly more complicated command-line interface. However, both of these public interfaces consist only of a Main function that does the entire job and then calls os.Exit.
Users reported that this was not a very useful building block, and that they wanted to incorporate analysis into larger applications, or add some extra steps after package loading but before analysis, or some extra postprocessing of the analysis results. See:
We propose to create a new package with the following public API.
The primary entry point is Analyze; it accepts a set of packages and analyzers and runs a unit of analysis (an "Action") for each element of their cross product, plus all necessary horizontal and vertical dependencies. The result is exposed as a graph, only the roots of which are provided in the Result struct; this approach is similar to the one used in
go/packages.Load
.To reduce churn and skew, the above API code is just the declarations but not all of the comments. You can see the complete API and implementation in https://go.dev/cl/411907.
Questions:
checker
, so as not to suggest commonality with unitchecker? "srcchecker", perhaps? (If we could start over, I would use "driver" for {single,multi,unit}checker and this package, and use "checker" for analysis.Analyzer.)The text was updated successfully, but these errors were encountered: