Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: Devirtualize calls when concrete type behind interface is statically known #19361

Closed
navytux opened this issue Mar 2, 2017 · 12 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Performance
Milestone

Comments

@navytux
Copy link
Contributor

navytux commented Mar 2, 2017

Please answer these questions before submitting your issue. Thanks!

What did you do?

Please consider the following program:

package xxx

type Iface interface {
        DoSomething()
}

type myStruct struct {
}

func (ms *myStruct) DoSomething() {
        println("mystruct·DoSomething")
}

func test() {
        var i Iface = &myStruct{}
        i.DoSomething()
}

(https://play.golang.org/p/1SSSkjvvcy)

Note in test(): type behind i is statically known to be *myStruct

What did you expect to see?

Code generated for test() directly calls myStruct.DoSomething

What did you see instead?

"".test t=1 size=80 args=0x0 locals=0x18
        0x0000 00000 (devirt.go:14)     TEXT    "".test(SB), $24-0
        0x0000 00000 (devirt.go:14)     MOVQ    (TLS), CX
        0x0009 00009 (devirt.go:14)     CMPQ    SP, 16(CX)
        0x000d 00013 (devirt.go:14)     JLS     73
        0x000f 00015 (devirt.go:14)     SUBQ    $24, SP
        0x0013 00019 (devirt.go:14)     MOVQ    BP, 16(SP)
        0x0018 00024 (devirt.go:14)     LEAQ    16(SP), BP
        0x001d 00029 (devirt.go:14)     FUNCDATA        $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x001d 00029 (devirt.go:14)     FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x001d 00029 (devirt.go:15)     LEAQ    type."".myStruct(SB), AX
        0x0024 00036 (devirt.go:15)     MOVQ    AX, (SP)
        0x0028 00040 (devirt.go:15)     PCDATA  $0, $0
        0x0028 00040 (devirt.go:15)     CALL    runtime.newobject(SB)
        0x002d 00045 (devirt.go:15)     MOVQ    8(SP), AX
        0x0032 00050 (devirt.go:16)     MOVQ    go.itab.*"".myStruct,"".Iface+32(SB), CX
        0x0039 00057 (devirt.go:16)     MOVQ    AX, (SP)
        0x003d 00061 (devirt.go:16)     PCDATA  $0, $0
        0x003d 00061 (devirt.go:16)     CALL    CX
        0x003f 00063 (devirt.go:17)     MOVQ    16(SP), BP
        0x0044 00068 (devirt.go:17)     ADDQ    $24, SP
        0x0048 00072 (devirt.go:17)     RET
        0x0049 00073 (devirt.go:17)     NOP
        0x0049 00073 (devirt.go:14)     PCDATA  $0, $-1
        0x0049 00073 (devirt.go:14)     CALL    runtime.morestack_noctxt(SB)
        0x004e 00078 (devirt.go:14)     JMP     0

Note the indirect call in offsets 00050 - 00061.

Does this issue reproduce with the latest release (go1.8)?

Yes

Context

This issue originally came in the context of using reflect.Typeof(v).Comparable() to detect whether a value can be used as key in a map:

kisielk/og-rek#30 (comment)

Current code for reflect.Typeof() is just a pointer cast to reflect/runtime.rtype + nil check and then the result is returned as reflect.Type interface:

https://github.com/golang/go/blob/f072283b/src/reflect/type.go#L1405
https://github.com/golang/go/blob/f072283b/src/reflect/type.go#L3025

So since result of reflect.Typeof(v) is statically known to be either:

  • *reflect.rtype, or
  • nil

code generated for reflect.Typeof(v).Comparable() should ideally be:

  • cast v to rtype
  • check for nil - if yes panic
  • directly call rtype·Comparable()

but currently it does the call indirectly - similar to my above original example for myStruct:

...
        0x001d 00029 (x.go:6)   MOVQ    "".v+48(FP), AX
        0x0022 00034 (x.go:6)   MOVQ    AX, reflect.i·2+16(SP)
        0x0027 00039 (x.go:6)   MOVQ    "".v+56(FP), AX
        0x002c 00044 (x.go:6)   MOVQ    AX, reflect.i·2+24(SP)
        0x0031 00049 (x.go:6)   MOVQ    reflect.i·2+16(SP), AX
        0x0036 00054 (x.go:6)   TESTQ   AX, AX
        0x0039 00057 (x.go:6)   JEQ     95
        0x003b 00059 (x.go:6)   LEAQ    go.itab.*reflect.rtype,reflect.Type(SB), CX
        0x0042 00066 (x.go:6)   MOVQ    64(CX), CX
        0x0046 00070 (x.go:6)   MOVQ    AX, (SP)
        0x004a 00074 (x.go:6)   PCDATA  $0, $1
        0x004a 00074 (x.go:6)   CALL    CX
...

So implementing this kind of local devirtualization should imho help in many cases where reflect is used.

Possibly related issues:

#19165
#16869

System details

go version devel +f072283bce Thu Mar 2 06:08:42 2017 +0000 linux/amd64
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/kirr/go"
GORACE=""
GOROOT="/home/kirr/src/tools/go/go"
GOTOOLDIR="/home/kirr/src/tools/go/go/pkg/tool/linux_amd64"
GCCGO="/usr/bin/gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build669846645=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOROOT/bin/go version: go version devel +f072283bce Thu Mar 2 06:08:42 2017 +0000 linux/amd64
GOROOT/bin/go tool compile -V: compile version devel +f072283bce Thu Mar 2 06:08:42 2017 +0000 X:framepointer
uname -sr: Linux 4.9.0-1-amd64
Distributor ID:	Debian
Description:	Debian GNU/Linux 9.0 (stretch)
Release:	9.0
Codename:	stretch
/lib/x86_64-linux-gnu/libc.so.6: GNU C Library (Debian GLIBC 2.24-9) stable release version 2.24, by Roland McGrath et al.
gdb --version: GNU gdb (Debian 7.12-6) 7.12.0.20161007-git

Thanks beforehand,
Kirill

/cc @randall77, @dr2chase

@navytux
Copy link
Contributor Author

navytux commented Mar 2, 2017

I'd like to add: this can also help to reduce pressure on garbage collector.

Reasoning: currently when something is called via interface, both receiver and arguments are automatically escaped to heap, irregardless of what actually happens inside the call, because in general case when we do not know what code gets called, the worst needs to be assumed.

On the other hand when we could statically know the "underlying" type behind an interface, it is possible to use the information and ask escape analysis to do its work. As a consiquence in not so few cases things will stay on stack instead of being moved to heap.

Some examples:

  1 package xxx
  2 
  3 import "reflect"
  4 
  5 func comparable(v interface{}) bool {
  6         return reflect.TypeOf(v).Comparable()
  7 }       
devirt2.go:6: inlining call to reflect.TypeOf
devirt2.go:6: inlining call to reflect.toType
devirt2.go:6: reflect.Type(reflect.t·2) escapes to heap
devirt2.go:5: leaking param: v                                         <-- (at least this)
devirt2.go:6: comparable &reflect.i·2 does not escape

  1 package xxx
  2 
  3 import (
  4         "os"
  5         "io"
  6 )
  7 
  8 func readpipe() {
  9         r, w, err := os.Pipe()
 10         if err != nil {
 11                 panic(err)
 12         }
 13         _ = w
 14 
 15         rl := io.LimitedReader{R: r, N: 64}
 16         buf := [128]byte{}
 17 
 18         _, err = rl.Read(buf[:])
 19 }
devirt2.go:15: r escapes to heap        <--
devirt2.go:18: buf escapes to heap      <--
devirt2.go:16: moved to heap: buf       <--
devirt2.go:18: readpipe rl does not escape

etc.

This might be related to e.g. #18822 but not only on mutable/immutable but also on escapes/not escapes and similar.

So since garbage collection is known to have relatively high cost and statically tracking underlying interface types could reduce pressure to garbage collector, may I suggest to treat this issue as not low-priority?

Thanks again,
Kirill

@philhofer
Copy link
Contributor

philhofer commented Mar 2, 2017

Part of the difficulty in covering many of the cases you've listed is that they would require the compiler to track values across call sites, and the compiler only performs intra-procedural optimizations today (perhaps with the exception of escape analysis). We'd have to make more aggressive inlining decisions to expose many of these opportunities. There's some work being done on the front, but it's a prerequisite to making this sort of optimization generally effective.

With that being said, in the very elementary case where you have

var r io.Reader
r = bytes.NewBuffer(nil)
r.Read(buf)

it should be possible to devirtualize the call to Read.

@navytux
Copy link
Contributor Author

navytux commented Mar 2, 2017

Yes, I understand. As you say we can start from something simple - your above example + cases where called something is already being inlined. E.g. reflect.TypeOf() is already being inlined so even this simple way would help imho.

@philhofer
Copy link
Contributor

So, de-virtualizing the calls themselves is relatively straightforward: https://go-review.googlesource.com/c/37751/

We'll have to wait until inlining is performed post-SSA before this has an effect on some of your examples, but at least it gets the trivial ones.

philhofer added a commit to philhofer/go that referenced this issue Mar 4, 2017
DO NOT REVIEW

Passes all.bash locally; needs more tests.

After this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the concrete type of
'h' is known statically.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

Common patterns that get de-virtualized:
 - Calls on hash.Hash created with sha1.New(), md5.New(), etc.
 - Uses of base64.NewEncoder/base64.NewDecoder
 - Uses of hex.Dumper
 - Calls on sync.Locker created with (*sync.RWMutex).RLocker()

Updates golang#19361

Change-Id: Ia9d30afdd5f6b4d38d38b14b88f308acae8ce7ed
@navytux
Copy link
Contributor Author

navytux commented Mar 4, 2017

Thanks a lot for starting this.

You mention there is some work being done also on inlining front. Any pointers? I'm not a compiler guy, but it is always interesting for me to look around and learn something.

Thanks beforehand,
Kirill

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/37751 mentions this issue.

philhofer added a commit to philhofer/go that referenced this issue Mar 4, 2017
With this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

The most common pattern identified by the compiler
is a constructor like

    func New() Interface { return &impl{...} }

where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.

Some existing benchmarks that change on darwin/amd64:

Crc64/ISO4KB-8        2.67µs ± 1%    2.66µs ± 0%  -0.36%  (p=0.015 n=10+10)
Crc64/ISO1KB-8         694ns ± 0%     690ns ± 1%  -0.59%  (p=0.001 n=10+10)
Adler32KB-8            473ns ± 1%     471ns ± 0%  -0.39%  (p=0.010 n=10+9)

On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.

Updates golang#19361

Change-Id: Ia9d30afdd5f6b4d38d38b14b88f308acae8ce7ed
philhofer added a commit to philhofer/go that referenced this issue Mar 7, 2017
With this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

The most common pattern identified by the compiler
is a constructor like

    func New() Interface { return &impl{...} }

where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.

Some existing benchmarks that change on darwin/amd64:

Crc64/ISO4KB-8        2.67µs ± 1%    2.66µs ± 0%  -0.36%  (p=0.015 n=10+10)
Crc64/ISO1KB-8         694ns ± 0%     690ns ± 1%  -0.59%  (p=0.001 n=10+10)
Adler32KB-8            473ns ± 1%     471ns ± 0%  -0.39%  (p=0.010 n=10+9)

On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.

Updates golang#19361

Change-Id: Ia9d30afdd5f6b4d38d38b14b88f308acae8ce7ed
navytux added a commit to navytux/og-rek that referenced this issue Mar 7, 2017
Go specification requires that only comparable types could be used as
map keys:

    https://golang.org/ref/spec#Map_types

For map[interface{}]... this is checked at runtime, and if concrete
value used as a key is not comparable it results in runtime panic, e.g.:

    panic: runtime error: hash of unhashable type ogórek.Call

    goroutine 1 [running]:
    github.com/kisielk/og-rek.(*Decoder).loadDict(0xc420084360, 0x64, 0x0)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/ogorek.go:655 +0x18c
    github.com/kisielk/og-rek.Decoder.Decode(0xc42001c3c0, 0x5a9300, 0x0, 0x0, 0xc4200164b0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/ogorek.go:187 +0x172b
    github.com/kisielk/og-rek.Fuzz(0x7ff901592000, 0xa, 0x200000, 0x3)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/fuzz.go:12 +0x108
    go-fuzz-dep.Main(0x50d830)
            /tmp/go-fuzz-build561441550/goroot/src/go-fuzz-dep/main.go:49 +0xd9
    main.main()
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/go.fuzz.main/main.go:10 +0x2d
    exit status 2

so when decoding dict and friends - all places where maps are populated - we
have to check whether an object on stack we are going to use as key is suitable.

Issue kisielk#30 contains comparison of 2 ways to do such check - catch
runtime panic in exception style manner or use
reflect.TypeOf(v).Comparable(). Since reflect-way turns out to be faster

kisielk#30 (comment)

and likely will become more faster in the future:

golang/go#19361

it was decided to go the reflect way (which is also a canonical way in
go land).

So audit all places where map items are set and add appropriate checks
before them.

I've verified that if we remove any of the added checks, via so far found
crash vectors, at least one crash case will reappear in tests. This
means that all added checks are actually test covered.

Updates: kisielk#30
philhofer added a commit to philhofer/go that referenced this issue Mar 7, 2017
With this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

The most common pattern identified by the compiler
is a constructor like

    func New() Interface { return &impl{...} }

where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.

Some existing benchmarks that change on darwin/amd64:

Crc64/ISO4KB-8        2.67µs ± 1%    2.66µs ± 0%  -0.36%  (p=0.015 n=10+10)
Crc64/ISO1KB-8         694ns ± 0%     690ns ± 1%  -0.59%  (p=0.001 n=10+10)
Adler32KB-8            473ns ± 1%     471ns ± 0%  -0.39%  (p=0.010 n=10+9)

On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.

Updates golang#19361

Change-Id: Ia9d30afdd5f6b4d38d38b14b88f308acae8ce7ed
navytux added a commit to navytux/og-rek that referenced this issue Mar 7, 2017
Go specification requires that only comparable types could be used as
map keys:

    https://golang.org/ref/spec#Map_types

For map[interface{}]... this is checked at runtime, and if concrete
value used as a key is not comparable it results in runtime panic, e.g.:

    panic: runtime error: hash of unhashable type ogórek.Call

    goroutine 1 [running]:
    github.com/kisielk/og-rek.(*Decoder).loadDict(0xc420084360, 0x64, 0x0)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/ogorek.go:655 +0x18c
    github.com/kisielk/og-rek.Decoder.Decode(0xc42001c3c0, 0x5a9300, 0x0, 0x0, 0xc4200164b0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/ogorek.go:187 +0x172b
    github.com/kisielk/og-rek.Fuzz(0x7ff901592000, 0xa, 0x200000, 0x3)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/fuzz.go:12 +0x108
    go-fuzz-dep.Main(0x50d830)
            /tmp/go-fuzz-build561441550/goroot/src/go-fuzz-dep/main.go:49 +0xd9
    main.main()
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/go.fuzz.main/main.go:10 +0x2d
    exit status 2

so when decoding dict and friends - all places where maps are populated - we
have to check whether an object on stack we are going to use as key is suitable.

Issue kisielk#30 contains comparison of 2 ways to do such check - catch
runtime panic in exception style manner or use
reflect.TypeOf(v).Comparable(). Since reflect-way turns out to be faster

kisielk#30 (comment)

and likely will become more faster in the future:

golang/go#19361

it was decided to go the reflect way (which is also a canonical way in
go land).

So audit all places where map items are set and add appropriate checks
before them.

I've verified that if we remove any of the added checks, via so far found
crash vectors, at least one crash case will reappear in tests. This
means that all added checks are actually test covered.

Updates: kisielk#30
philhofer added a commit to philhofer/go that referenced this issue Mar 7, 2017
With this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

The most common pattern identified by the compiler
is a constructor like

    func New() Interface { return &impl{...} }

where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.

Some existing benchmarks that change on darwin/amd64:

Crc64/ISO4KB-8        2.67µs ± 1%    2.66µs ± 0%  -0.36%  (p=0.015 n=10+10)
Crc64/ISO1KB-8         694ns ± 0%     690ns ± 1%  -0.59%  (p=0.001 n=10+10)
Adler32KB-8            473ns ± 1%     471ns ± 0%  -0.39%  (p=0.010 n=10+9)

On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.

Updates golang#19361

Change-Id: Ia9d30afdd5f6b4d38d38b14b88f308acae8ce7ed
kisielk pushed a commit to kisielk/og-rek that referenced this issue Mar 8, 2017
* decoder: Don't forget to check memo for key not there on read access

If we do not do reading from memo will return memo's zero value (= nil),
which is

a) not correct - many memo keys could be read this way, and
b) nil there on stack breaks invariants of stack containing only good
   values.

Furthermore, it can even lead to crashes on e.g. calling
reflect.TypeOf(stack.top()) in the next patch.

Anyway getting something from memo must be checked for whether it there
or not for correctness.

Noticed while working on fix for #30.

* decoder: Fix "panic: runtime error: hash of unhashable type ..."

Go specification requires that only comparable types could be used as
map keys:

    https://golang.org/ref/spec#Map_types

For map[interface{}]... this is checked at runtime, and if concrete
value used as a key is not comparable it results in runtime panic, e.g.:

    panic: runtime error: hash of unhashable type ogórek.Call

    goroutine 1 [running]:
    github.com/kisielk/og-rek.(*Decoder).loadDict(0xc420084360, 0x64, 0x0)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/ogorek.go:655 +0x18c
    github.com/kisielk/og-rek.Decoder.Decode(0xc42001c3c0, 0x5a9300, 0x0, 0x0, 0xc4200164b0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/ogorek.go:187 +0x172b
    github.com/kisielk/og-rek.Fuzz(0x7ff901592000, 0xa, 0x200000, 0x3)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/fuzz.go:12 +0x108
    go-fuzz-dep.Main(0x50d830)
            /tmp/go-fuzz-build561441550/goroot/src/go-fuzz-dep/main.go:49 +0xd9
    main.main()
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/go.fuzz.main/main.go:10 +0x2d
    exit status 2

so when decoding dict and friends - all places where maps are populated - we
have to check whether an object on stack we are going to use as key is suitable.

Issue #30 contains comparison of 2 ways to do such check - catch
runtime panic in exception style manner or use
reflect.TypeOf(v).Comparable(). Since reflect-way turns out to be faster

#30 (comment)

and likely will become more faster in the future:

golang/go#19361

it was decided to go the reflect way (which is also a canonical way in
go land).

So audit all places where map items are set and add appropriate checks
before them.

I've verified that if we remove any of the added checks, via so far found
crash vectors, at least one crash case will reappear in tests. This
means that all added checks are actually test covered.

Updates: #30

* decoder: Don't forget to check for odd #elements in loadDict & friends

e.g. for Dict opcode the expected stack state is

	MARK key1 obj1 key2 obj2 ... keyN objN DICT

so if in between MARK and DICT there is odd number of elements this is
an error in input data. We also used to crash on such cases, e.g.:

        "(\x88d"

        panic: runtime error: index out of range

        goroutine 1 [running]:
        github.com/kisielk/og-rek.(*Decoder).loadDict(0xc420082990, 0xc42000e464, 0x0)
                /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/ogorek.go:652 +0x21d
        github.com/kisielk/og-rek.Decoder.Decode(0xc42001c7e0, 0x5a9320, 0x0, 0x0, 0xc4200167b0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
                /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/ogorek.go:188 +0x172d
        github.com/kisielk/og-rek.Fuzz(0x7f6d1f310000, 0x3, 0x200000, 0x3)
                /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/fuzz.go:12 +0x108
        go-fuzz-dep.Main(0x50d798)
                /tmp/go-fuzz-build403415384/goroot/src/go-fuzz-dep/main.go:49 +0xd9
        main.main()
                /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/go.fuzz.main/main.go:10 +0x2d

I've audited whole decoder and regarding odd #(elements) there are only 2
places to care: loadDict and loadSetItems and crashers for all of them were
already found by fuzz testing.

Fixes #30 (for all known cases so far).
philhofer added a commit to philhofer/go that referenced this issue Mar 8, 2017
With this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

The most common pattern identified by the compiler
is a constructor like

    func New() Interface { return &impl{...} }

where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.

Some existing benchmarks that change on darwin/amd64:

Crc64/ISO4KB-8        2.67µs ± 1%    2.66µs ± 0%  -0.36%  (p=0.015 n=10+10)
Crc64/ISO1KB-8         694ns ± 0%     690ns ± 1%  -0.59%  (p=0.001 n=10+10)
Adler32KB-8            473ns ± 1%     471ns ± 0%  -0.39%  (p=0.010 n=10+9)

On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.

Updates golang#19361

Change-Id: Ia9d30afdd5f6b4d38d38b14b88f308acae8ce7ed
@philhofer
Copy link
Contributor

@navytux
Copy link
Contributor Author

navytux commented Mar 9, 2017

Thanks for the link.

gopherbot pushed a commit that referenced this issue Mar 13, 2017
With this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

The most common pattern identified by the compiler
is a constructor like

    func New() Interface { return &impl{...} }

where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.

Some existing benchmarks that change on darwin/amd64:

Crc64/ISO4KB-8        2.67µs ± 1%    2.66µs ± 0%  -0.36%  (p=0.015 n=10+10)
Crc64/ISO1KB-8         694ns ± 0%     690ns ± 1%  -0.59%  (p=0.001 n=10+10)
Adler32KB-8            473ns ± 1%     471ns ± 0%  -0.39%  (p=0.010 n=10+9)

On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.

Updates #19361

Change-Id: Ia9d30afdd5f6b4d38d38b14b88f308acae8ce7ed
Reviewed-on: https://go-review.googlesource.com/37751
Run-TryBot: Philip Hofer <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Keith Randall <[email protected]>
philhofer added a commit to philhofer/go that referenced this issue Mar 13, 2017
With this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

The most common pattern identified by the compiler
is a constructor like

    func New() Interface { return &impl{...} }

where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.

Some existing benchmarks that change on darwin/amd64:

Crc64/ISO4KB-8        2.67µs ± 1%    2.66µs ± 0%  -0.36%  (p=0.015 n=10+10)
Crc64/ISO1KB-8         694ns ± 0%     690ns ± 1%  -0.59%  (p=0.001 n=10+10)
Adler32KB-8            473ns ± 1%     471ns ± 0%  -0.39%  (p=0.010 n=10+9)

On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.

Updates golang#19361

Change-Id: I57d4dc21ef40a05ec0fbd55a9bb0eb74cdc67a3d
@gopherbot
Copy link
Contributor

CL https://golang.org/cl/38139 mentions this issue.

gopherbot pushed a commit that referenced this issue Mar 14, 2017
With this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

The most common pattern identified by the compiler
is a constructor like

    func New() Interface { return &impl{...} }

where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.

Some existing benchmarks that change on darwin/amd64:

Crc64/ISO4KB-8        2.67µs ± 1%    2.66µs ± 0%  -0.36%  (p=0.015 n=10+10)
Crc64/ISO1KB-8         694ns ± 0%     690ns ± 1%  -0.59%  (p=0.001 n=10+10)
Adler32KB-8            473ns ± 1%     471ns ± 0%  -0.39%  (p=0.010 n=10+9)

On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.

Updates #19361

Change-Id: I57d4dc21ef40a05ec0fbd55a9bb0eb74cdc67a3d
Reviewed-on: https://go-review.googlesource.com/38139
Run-TryBot: Philip Hofer <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: David Chase <[email protected]>
@navytux
Copy link
Contributor Author

navytux commented Mar 16, 2017

@philhofer thanks for getting the first patch ready and merged. Looking forward for follow-up where situations with convT2I could be handled too. Hopefully reflect.Typeof() will be devertualized then.

Thanks beforehand,
Kirill

@bradfitz bradfitz added this to the Go1.9Maybe milestone Mar 21, 2017
lomik pushed a commit to lomik/og-rek that referenced this issue Apr 11, 2017
* decoder: Don't forget to check memo for key not there on read access

If we do not do reading from memo will return memo's zero value (= nil),
which is

a) not correct - many memo keys could be read this way, and
b) nil there on stack breaks invariants of stack containing only good
   values.

Furthermore, it can even lead to crashes on e.g. calling
reflect.TypeOf(stack.top()) in the next patch.

Anyway getting something from memo must be checked for whether it there
or not for correctness.

Noticed while working on fix for kisielk#30.

* decoder: Fix "panic: runtime error: hash of unhashable type ..."

Go specification requires that only comparable types could be used as
map keys:

    https://golang.org/ref/spec#Map_types

For map[interface{}]... this is checked at runtime, and if concrete
value used as a key is not comparable it results in runtime panic, e.g.:

    panic: runtime error: hash of unhashable type ogórek.Call

    goroutine 1 [running]:
    github.com/kisielk/og-rek.(*Decoder).loadDict(0xc420084360, 0x64, 0x0)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/ogorek.go:655 +0x18c
    github.com/kisielk/og-rek.Decoder.Decode(0xc42001c3c0, 0x5a9300, 0x0, 0x0, 0xc4200164b0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/ogorek.go:187 +0x172b
    github.com/kisielk/og-rek.Fuzz(0x7ff901592000, 0xa, 0x200000, 0x3)
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/fuzz.go:12 +0x108
    go-fuzz-dep.Main(0x50d830)
            /tmp/go-fuzz-build561441550/goroot/src/go-fuzz-dep/main.go:49 +0xd9
    main.main()
            /tmp/go-fuzz-build561441550/gopath/src/github.com/kisielk/og-rek/go.fuzz.main/main.go:10 +0x2d
    exit status 2

so when decoding dict and friends - all places where maps are populated - we
have to check whether an object on stack we are going to use as key is suitable.

Issue kisielk#30 contains comparison of 2 ways to do such check - catch
runtime panic in exception style manner or use
reflect.TypeOf(v).Comparable(). Since reflect-way turns out to be faster

kisielk#30 (comment)

and likely will become more faster in the future:

golang/go#19361

it was decided to go the reflect way (which is also a canonical way in
go land).

So audit all places where map items are set and add appropriate checks
before them.

I've verified that if we remove any of the added checks, via so far found
crash vectors, at least one crash case will reappear in tests. This
means that all added checks are actually test covered.

Updates: kisielk#30

* decoder: Don't forget to check for odd #elements in loadDict & friends

e.g. for Dict opcode the expected stack state is

	MARK key1 obj1 key2 obj2 ... keyN objN DICT

so if in between MARK and DICT there is odd number of elements this is
an error in input data. We also used to crash on such cases, e.g.:

        "(\x88d"

        panic: runtime error: index out of range

        goroutine 1 [running]:
        github.com/kisielk/og-rek.(*Decoder).loadDict(0xc420082990, 0xc42000e464, 0x0)
                /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/ogorek.go:652 +0x21d
        github.com/kisielk/og-rek.Decoder.Decode(0xc42001c7e0, 0x5a9320, 0x0, 0x0, 0xc4200167b0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
                /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/ogorek.go:188 +0x172d
        github.com/kisielk/og-rek.Fuzz(0x7f6d1f310000, 0x3, 0x200000, 0x3)
                /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/fuzz.go:12 +0x108
        go-fuzz-dep.Main(0x50d798)
                /tmp/go-fuzz-build403415384/goroot/src/go-fuzz-dep/main.go:49 +0xd9
        main.main()
                /tmp/go-fuzz-build403415384/gopath/src/github.com/kisielk/og-rek/go.fuzz.main/main.go:10 +0x2d

I've audited whole decoder and regarding odd #(elements) there are only 2
places to care: loadDict and loadSetItems and crashers for all of them were
already found by fuzz testing.

Fixes kisielk#30 (for all known cases so far).
@randall77
Copy link
Contributor

This is now all fixed at tip. We now devirtualize calls like this:

        var i Iface = &myStruct{}
        i.DoSomething()

and

	var r io.Reader
	r = bytes.NewBuffer(nil)
	r.Read(buf)

I'm sure there are more complicated cases we miss, but we get the examples in this issue.

@navytux
Copy link
Contributor Author

navytux commented Apr 9, 2019

@randall77, thanks for the update.

@golang golang locked and limited conversation to collaborators Apr 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Performance
Projects
None yet
Development

No branches or pull requests

5 participants