I took the random data out of one benchmark function and the speed drastically improved (~40 ns/op faster), despite the random data being behind StopTimer calls.
I had put the random data in to prevent the compiler from optimizing with constant data and biasing the results (these are really simple functions). But then I made the following simple change to Vec4Sub:
// Package-level vars initialize in dependency order, so the seeded source
// is built before the vectors that use it. (An init() func would run too
// late for this: it executes after package-level vars are initialized.)
var (
    r  = rand.New(rand.NewSource(time.Now().UnixNano()))
    v1 = Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
    v2 = Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
)

func BenchmarkVec4Sub(b *testing.B) {
    for i := 0; i < b.N; i++ {
        v1.Sub(v2)
    }
}
The compiler shouldn't be able to do any constant folding there, since the values depend on a seed chosen at runtime. This got ~12.6 ns/op, which is about as good as a SIMD package I benchmarked (~11.9 ns/op). It's down from previous benchmarks of the form:
func BenchmarkVec4Sub(b *testing.B) {
    b.StopTimer()
    r := rand.New(rand.NewSource(int64(time.Now().Nanosecond())))
    for i := 0; i < b.N; i++ {
        b.StopTimer()
        v1 := Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
        v2 := Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
        b.StartTimer()
        v1.Sub(v2)
    }
}
Those were clocking ~50.9 ns/op, and also took an age to run. Further, introducing random data in that same per-iteration form into the SIMD package's benchmarks catapulted it up to the ~40-50 ns/op range, implying there's almost certainly no magical Go compiler trickery biasing our benchmarks to an unfairly low number.
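For extra insurance against the opposite kind of compiler help, dead-code elimination of the unused result, a common defensive pattern (not something these benchmarks do yet, and assuming Sub returns its result by value) is to assign into a package-level sink:

// Hypothetical sink variable: writing the result to package-level state
// keeps the compiler from discarding the Sub call as dead code.
var sink Vec4

func BenchmarkVec4SubSink(b *testing.B) {
    for i := 0; i < b.N; i++ {
        sink = v1.Sub(v2)
    }
}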
This is actually very good news. I'm filing this because I don't have time for the next little bit, in case someone else wants to do it: I think all Matrix, Vector, and Quaternion benchmarks should use random data in this form now. It's not a horrifically difficult thing to change, just tedious; see the sketch below.
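The change is mechanical. Here's the same shape applied to a quaternion benchmark (Quat, its field layout, and Mul are stand-ins here, not confirmed names from the package):

// Same pattern: package-level operands seeded at runtime, tight loop
// with only the operation under test inside it.
var q1 = Quat{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
var q2 = Quat{r.Float32(), r.Float32(), r.Float32(), r.Float32()}

func BenchmarkQuatMul(b *testing.B) {
    for i := 0; i < b.N; i++ {
        q1.Mul(q2)
    }
}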
On the other hand, I'm not sure why b.StopTimer is being so finicky, but it may just be meant for benchmarks whose ns/op isn't this low (~40 ns/op of overhead is nothing next to something like a graph search).
Yeah, I'm not surprised that doesn't work as intended. StopTimer() and StartTimer() are likely too coarse-grained and meant to be used before/after the tight b.N loop, not inside it.
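For reference, a sketch of that shape (the function name is made up; this is just the setup-then-ResetTimer pattern from the standard testing package, not code from this thread):

// Do all setup once, discard its cost with ResetTimer, then keep the
// measured loop body down to the single operation under test.
func BenchmarkVec4SubHoisted(b *testing.B) {
    r := rand.New(rand.NewSource(time.Now().UnixNano()))
    v1 := Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
    v2 := Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
    b.ResetTimer() // excludes the setup above from the measurement
    for i := 0; i < b.N; i++ {
        v1.Sub(v2)
    }
}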