all: reduce init-time CPU & memory usage #26775
Change https://golang.org/cl/127655 mentions this issue: |
Defer call to registerBasics until the first invocation of NewDecoder() or NewEncoder().

This saves ~13.5kB of allocation in init().

Updates golang#26775
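The deferral described in this change can be sketched with sync.Once. This is a minimal illustration rather than the encoding/gob code: the registry map, its contents, and the trimmed-down Encoder type are hypothetical stand-ins, and only the names registerBasics and NewEncoder are taken from the message above.

```go
package main

import (
	"fmt"
	"sync"
)

// registry stands in for a package-level table that is costly to build.
// Its contents here are made up for illustration.
var (
	registerOnce sync.Once
	registry     map[string]int
)

// registerBasics is a hypothetical stand-in for the deferred setup work.
func registerBasics() {
	registry = map[string]int{"bool": 1, "int": 2, "string": 3}
}

// Encoder is a hypothetical, empty placeholder type.
type Encoder struct{}

// NewEncoder runs registerBasics at most once, on first use instead of in init.
func NewEncoder() *Encoder {
	registerOnce.Do(registerBasics)
	return &Encoder{}
}

func main() {
	fmt.Println(len(registry)) // table not built yet
	NewEncoder()
	NewEncoder() // second call does not rebuild the table
	fmt.Println(len(registry))
}
```

Programs that import the package but never construct an encoder or decoder would then skip the allocation entirely.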
Change https://golang.org/cl/127660 mentions this issue: |
Change https://golang.org/cl/127661 mentions this issue: |
Change https://golang.org/cl/127664 mentions this issue: |
Change https://golang.org/cl/127735 mentions this issue: |
Change https://golang.org/cl/127736 mentions this issue: |
Wow, that was picked up real fast by the others!
Change https://golang.org/cl/127875 mentions this issue: |
See also my last comment in #2559 about using go generate for some of this.
Change https://golang.org/cl/119715 mentions this issue: |
…cating Updates #26775 Change-Id: I83c9eeda59769d2f35e0cc98f3a8579861d5978b Reviewed-on: https://go-review.googlesource.com/119715 Reviewed-by: Brad Fitzpatrick <[email protected]> Run-TryBot: Brad Fitzpatrick <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>
Updates golang/go#26775 Change-Id: Iea95ea07bb0fed42410efb4e8420d8e9a17704fe Reviewed-on: https://go-review.googlesource.com/127664 Reviewed-by: Ian Lance Taylor <[email protected]>
Saves 22KB of memory in stdlib packages. Updates #26775 Change-Id: Ia19fe7aff61f6e2ddd83cd35969d7ff94526591f Reviewed-on: https://go-review.googlesource.com/127661 Run-TryBot: Brad Fitzpatrick <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]>
Compile go/doc's 4 regexps lazily, on demand. Also, add a test for the one that had no test coverage.

This reduces init-time CPU as well as heap by ~20KB when they're not used, which seems to be common enough. As an example, cmd/doc only seems to use 1 of them. (as noted by temporary print statements)

Updates #26775

Change-Id: I85df89b836327a53fb8e1ace3f92480374270368
Reviewed-on: https://go-review.googlesource.com/127875
Run-TryBot: Brad Fitzpatrick <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
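One way to compile a regexp on demand is a small wrapper around sync.Once. The sketch below is an assumption about the general shape of such a wrapper, not the code from the CL; the lazyRegexp type, the headingRx variable, and the pattern are all hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// lazyRegexp (hypothetical) compiles its pattern on first use
// instead of at package initialization.
type lazyRegexp struct {
	pattern string
	once    sync.Once
	re      *regexp.Regexp
}

// get compiles the pattern the first time it is called and reuses it afterwards.
func (l *lazyRegexp) get() *regexp.Regexp {
	l.once.Do(func() { l.re = regexp.MustCompile(l.pattern) })
	return l.re
}

// MatchString mirrors (*regexp.Regexp).MatchString.
func (l *lazyRegexp) MatchString(s string) bool {
	return l.get().MatchString(s)
}

// headingRx costs almost nothing at init: only the pattern string is stored.
var headingRx = &lazyRegexp{pattern: `^[A-Z][a-z]+$`}

func main() {
	fmt.Println(headingRx.MatchString("Overview"))
	fmt.Println(headingRx.MatchString("not a heading"))
}
```

The trade-off is that an invalid pattern now panics on first use rather than at program start, which is why the message above mentions adding test coverage for the previously untested regexp.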
Don't open files or do sysctls in init. Updates #26775 Change-Id: I017bed6c24ef1e4bc30040120349fb779f203225 Reviewed-on: https://go-review.googlesource.com/127655 Reviewed-by: Ian Lance Taylor <[email protected]>
Saves 36KB of memory in stdlib packages. Updates #26775 Change-Id: I0f9d7b17d9768f6fb980d5fbba7c45920215a5fc Reviewed-on: https://go-review.googlesource.com/127735 Run-TryBot: Brad Fitzpatrick <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]>
Using all.go that Brad posted in the OP, and using
|
Those top two |
Saves 6KB of memory in stdlib packages. Updates #26775 Change-Id: I1a6184cefa78e9a3c034fa84506fdfe0fec27add Reviewed-on: https://go-review.googlesource.com/127736 Reviewed-by: Brad Fitzpatrick <[email protected]>
Change https://golang.org/cl/166459 mentions this issue: |
The first biggest offender was crypto/des.init at ~1%. It's cryptographically broken and the init function is relatively expensive, which is unfortunate as both crypto/tls and crypto/x509 (and by extension, cmd/go) import it. Hide the work behind sync.Once.

The second biggest offender was flag.sortFlags at just under 1%, used by the Visit flagset methods. It allocated two slices, which made a difference as cmd/go iterates over multiple flagsets during init. Use a single slice with a direct sort.Interface implementation.

Another big offender is initializing global maps. Reducing this work in cmd/go/internal/imports and net/textproto gives us close to another whole 1% in saved work. The former can use map literals, and the latter can hide the work behind sync.Once.

Finally, compress/flate used newHuffmanBitWriter as part of init, which allocates many objects and slices. Yet it only used one of the slice fields. Allocating just that slice saves a surprising ~0.3%, since we generated a lot of unnecessary garbage.

All in all, these little pieces amount to just over 3% saved CPU time.

name         old time/op  new time/op  delta
ExecGoEnv-8  3.61ms ± 1%  3.50ms ± 0%  -3.02%  (p=0.000 n=10+10)

Updates #26775.
Updates #29382.

Change-Id: I915416e88a874c63235ba512617c8aef35c0ca8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/166459
Run-TryBot: Daniel Martí <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Brad Fitzpatrick <[email protected]>
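The "single slice with a direct sort.Interface implementation" idea can be illustrated roughly as follows. This is a sketch under assumptions, not the flag package's code: the Flag struct is trimmed down, and the byName and sortedFlags names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Flag is a hypothetical, trimmed-down stand-in for a registered flag.
type Flag struct {
	Name  string
	Value string
}

// byName implements sort.Interface directly on the slice of flags.
type byName []*Flag

func (s byName) Len() int           { return len(s) }
func (s byName) Less(i, j int) bool { return s[i].Name < s[j].Name }
func (s byName) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// sortedFlags copies the map values into one slice and sorts it in place,
// instead of sorting a slice of names and then filling a second slice.
func sortedFlags(flags map[string]*Flag) []*Flag {
	result := make([]*Flag, 0, len(flags))
	for _, f := range flags {
		result = append(result, f)
	}
	sort.Sort(byName(result))
	return result
}

func main() {
	flags := map[string]*Flag{
		"v":   {Name: "v", Value: "false"},
		"o":   {Name: "o", Value: "out"},
		"mod": {Name: "mod", Value: "readonly"},
	}
	for _, f := range sortedFlags(flags) {
		fmt.Println(f.Name, f.Value)
	}
}
```

Each call allocates one slice rather than two, which adds up when many flag sets are visited during startup.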
Change https://golang.org/cl/170317 mentions this issue: |
Change https://golang.org/cl/210284 mentions this issue: |
…nation Fixes #36021 Updates #2559 Updates #26775 Change-Id: I2e6708691311035b63866f25d5b4b3977a118290 Reviewed-on: https://go-review.googlesource.com/c/go/+/210284 Run-TryBot: Brad Fitzpatrick <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]> Reviewed-by: Rob Pike <[email protected]>
Isn't this resolved?

❯ go version
go version go1.17.2 darwin/amd64
❯ go build all.go
❯ GODEBUG=memprofilerate=1 ./all
289408
❯ go tool pprof /tmp/all.mem.prof
Type: inuse_space
Time: Dec 5, 2021 at 12:35pm (CET)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top 50
Showing nodes accounting for 9751.47kB, 100% of 9751.47kB total
flat flat% sum% cum cum%
3585.42kB 36.77% 36.77% 3585.42kB 36.77% runtime.malg
3092.09kB 31.71% 68.48% 3092.09kB 31.71% runtime.procresize
1537.69kB 15.77% 84.25% 4610.75kB 47.28% runtime.allocm
1024.25kB 10.50% 94.75% 2048.66kB 21.01% runtime.mcommoninit
512.02kB 5.25% 100% 512.02kB 5.25% text/template/parse.(*Tree).newList (inline)
0 0% 100% 512.02kB 5.25% html/template.(*Template).Parse
0 0% 100% 512.02kB 5.25% net/rpc.init
0 0% 100% 512.02kB 5.25% runtime.doInit
0 0% 100% 512.02kB 5.25% runtime.main
0 0% 100% 2049.09kB 21.01% runtime.main.func1
0 0% 100% 1024.41kB 10.51% runtime.mpreinit
0 0% 100% 3073.86kB 31.52% runtime.mstart
0 0% 100% 3073.86kB 31.52% runtime.mstart0
0 0% 100% 3073.86kB 31.52% runtime.mstart1
0 0% 100% 2561.30kB 26.27% runtime.mstartm0
0 0% 100% 2561.30kB 26.27% runtime.newextram
0 0% 100% 2561.66kB 26.27% runtime.newm
0 0% 100% 512.20kB 5.25% runtime.newproc
0 0% 100% 1024.41kB 10.51% runtime.newproc.func1
0 0% 100% 1024.41kB 10.51% runtime.newproc1
0 0% 100% 2561.30kB 26.27% runtime.oneNewExtraM
0 0% 100% 512.56kB 5.26% runtime.resetspinning
0 0% 100% 3604.29kB 36.96% runtime.rt0_go
0 0% 100% 3092.09kB 31.71% runtime.schedinit
0 0% 100% 512.56kB 5.26% runtime.schedule
0 0% 100% 512.56kB 5.26% runtime.startm
0 0% 100% 2561.30kB 26.27% runtime.systemstack
0 0% 100% 512.56kB 5.26% runtime.wakep
0 0% 100% 512.02kB 5.25% text/template.(*Template).Parse
0 0% 100% 512.02kB 5.25% text/template/parse.(*Tree).Parse
0 0% 100% 512.02kB 5.25% text/template/parse.(*Tree).action
0 0% 100% 512.02kB 5.25% text/template/parse.(*Tree).itemList
0 0% 100% 512.02kB 5.25% text/template/parse.(*Tree).parse
0 0% 100% 512.02kB 5.25% text/template/parse.(*Tree).parseControl
0 0% 100% 512.02kB 5.25% text/template/parse.(*Tree).rangeControl
0 0% 100% 512.02kB 5.25% text/template/parse.(*Tree).textOrAction
0 0% 100% 512.02kB 5.25% text/template/parse.Parse
(pprof)
Change https://go.dev/cl/460543 mentions this issue: |
@cristaloleg it is not resolved; I've found pprof to not show useful numbers for init funcs.
Change https://go.dev/cl/460544 mentions this issue: |
Change https://go.dev/cl/460545 mentions this issue: |
Per benchinit, this makes a big difference to init times:

name             old time/op    new time/op    delta
InternalProfile  185µs ± 1%     6µs ± 1%       -96.51%  (p=0.008 n=5+5)

name             old alloc/op   new alloc/op   delta
InternalProfile  101kB ± 0%     4kB ± 0%       -95.72%  (p=0.008 n=5+5)

name             old allocs/op  new allocs/op  delta
InternalProfile  758 ± 0%       25 ± 0%        -96.70%  (p=0.008 n=5+5)

The fixed 0.2ms init cost is saved for any importer of net/http/pprof, but also for cmd/compile, as it supports PGO now.

A Go program parsing profiles might not even need to compile these regular expressions at all, if it doesn't encounter any legacy files. I suspect this will be the case with most invocations of cmd/compile.

Updates #26775.

Change-Id: I8374dc64459f0b6bb09bbdf9d0b6c55d7ae1646e
Reviewed-on: https://go-review.googlesource.com/c/go/+/460545
Reviewed-by: Michael Pratt <[email protected]>
Run-TryBot: Daniel Martí <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Avoid unnecessary allocations when calling reflect.TypeOf; we can use nil pointers, which fit into an interface without allocating. This saves about 1% of CPU time.

The builtin types are limited to typeIds between 0 and firstUserId, and since firstUserId is 64, builtinIdToType does not need to be a map. We can simply use an array of length firstUserId, which is simpler. This saves about 1% of CPU time.

idToType is similar to builtinIdToType in that it is a map keyed by typeIds. The difference is that it can grow with the user's types. However, each added type gets the next available typeId, meaning that we can use a growing slice, similar to the case above. nextId then becomes the current length of the slice. This saves about 1% of CPU time.

typeInfoMap is stored globally as an atomic.Value, where each modification loads the map, makes a whole copy, adds the new element, and stores the modified copy. This is perfectly fine when the user registers types, as that can happen concurrently and at any point in the future. However, during init time, we sequentially register many types, and the overhead of copying maps adds up noticeably. During init time, use a regular global map instead, which gets replaced by the atomic.Value when our init work is done. This saves about 2% of CPU time.

Finally, avoid calling checkId in bootstrapType; we have just called setTypeId, whose logic for getting nextId is simple, so the extra check doesn't gain us much. This saves about 1% of CPU time.

Using benchinit, which transforms GODEBUG=inittrace=1 data into Go benchmark compatible output, results in a nice improvement:

name         old time/op    new time/op    delta
EncodingGob  175µs ± 0%     162µs ± 0%     -7.45%  (p=0.016 n=5+4)

name         old alloc/op   new alloc/op   delta
EncodingGob  39.0kB ± 0%    36.1kB ± 0%    -7.35%  (p=0.016 n=5+4)

name         old allocs/op  new allocs/op  delta
EncodingGob  588 ± 0%       558 ± 0%       -5.10%  (p=0.000 n=5+4)

Updates #26775.

Change-Id: I28618e8b96ef440480e666ef2cd5c4a9a332ef21
Reviewed-on: https://go-review.googlesource.com/c/go/+/460543
Reviewed-by: Carlos Amedee <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
Reviewed-by: Rob Pike <[email protected]>
Run-TryBot: Rob Pike <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
With benchinit, we see a noticeable improvement in init times:

name     old time/op    new time/op    delta
GoTypes  83.4µs ± 0%    43.7µs ± 1%    -47.57%  (p=0.029 n=4+4)

name     old alloc/op   new alloc/op   delta
GoTypes  26.5kB ± 0%    18.8kB ± 0%    -29.15%  (p=0.029 n=4+4)

name     old allocs/op  new allocs/op  delta
GoTypes  238 ± 0%       154 ± 0%       -35.29%  (p=0.029 n=4+4)

Port the same change to cmd/compile/internal/types and types2.

Updates #26775.

Change-Id: Ia1f7c4a4ce9a22d66e2aa9c9b9c341036993adca
Reviewed-on: https://go-review.googlesource.com/c/go/+/460544
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Robert Findley <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
Reviewed-by: Robert Griesemer <[email protected]>
Change https://go.dev/cl/455455 mentions this issue: |
Small cleanup to remove a couple of needless global variables. Instead of relying on two instances of emptyCtx having different addresses, we use different types. For #26775 Change-Id: I0bc4813e94226f7b3f52bf4b1b3c3a3bbbebcc9e Reviewed-on: https://go-review.googlesource.com/c/go/+/455455 Reviewed-by: Damien Neil <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Run-TryBot: Ian Lance Taylor <[email protected]> Reviewed-by: Sameer Ajmani <[email protected]>
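The idea of telling two empty values apart by type rather than by address might look like the sketch below. It is an illustration under assumptions, not the context package's code: the backgroundCtx and todoCtx names are hypothetical, and fmt.Stringer stands in for the real Context interface to keep the example short.

```go
package main

import "fmt"

// emptyCtx is a zero-size base type shared by both variants.
type emptyCtx struct{}

// backgroundCtx and todoCtx (hypothetical names) differ only in type.
type backgroundCtx struct{ emptyCtx }
type todoCtx struct{ emptyCtx }

func (backgroundCtx) String() string { return "context.Background" }
func (todoCtx) String() string       { return "context.TODO" }

// Background and TODO return values of distinct types, so no package-level
// variables with distinct addresses are needed to distinguish them.
func Background() fmt.Stringer { return backgroundCtx{} }
func TODO() fmt.Stringer       { return todoCtx{} }

func main() {
	fmt.Println(Background(), TODO())
	// Interface values with different dynamic types never compare equal.
	fmt.Println(Background() == TODO())
}
```

Because both types are zero-size structs, returning them by value needs no allocation and no global state to initialize.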
Tracking bug to reduce init-time CPU & memory usage.
Previously:
https://go-review.googlesource.com/c/go/+/127075 - html: lazily populate Unescape tables
https://go-review.googlesource.com/c/net/+/127275 - http2/hpack: lazily build huffman table on first use
Open:
#26752 - x/text/unicode/norm: reduce init-time memory usage
The program https://play.golang.org/p/9ervXCWzV_z is useful to find offenders:
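The playground program itself is not reproduced in this thread. A minimal sketch in the same spirit, assuming the approach is to blank-import packages and inspect the heap before main does any other work, could look like this. The short import list, the heapAlloc helper, and the output path are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
	"runtime/pprof"

	// Blank imports run each package's init work. This short list is an
	// arbitrary example; extend it with the packages under investigation.
	_ "encoding/gob"
	_ "go/doc"
	_ "html/template"
)

// heapAlloc (hypothetical helper) reports the bytes of live heap objects.
func heapAlloc() uint64 {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.HeapAlloc
}

func main() {
	// main has done nothing else yet, so the live heap reflects
	// runtime startup plus init-time allocation.
	fmt.Println(heapAlloc())

	// Hypothetical output location for the heap profile.
	path := filepath.Join(os.TempDir(), "all.mem.prof")
	f, err := os.Create(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()
	if err := pprof.WriteHeapProfile(f); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

As in the session quoted earlier in the thread, such a binary can be run with GODEBUG=memprofilerate=1 and the resulting profile opened with go tool pprof, although a later comment notes that pprof may not show useful numbers for init funcs.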