Performance regression introduced by commit 760b93a #14939

SergioBenitez · 2014-06-16T19:12:39Z

Commits 6a58537 / 760b93a (Thu May 29 19:03:06) introduced a performance regression in std::collections::HashMap insertion. The regression appears to be isolated to instances where the key is a string, but I have only tested with string and integer keys.

I have set up a test benchmark at https://github.com/SergioBenitez/rust-perf-regression that showcases the issue. It inserts 100 string/int key/value pairs into a HashMap. good.sh runs the benchmark using the last good compiler build, 3ec321f, and bad.sh runs the first bad compiler build, 760b93a, assuming rustc from 3ec321f is in /usr/local/good/bin/rustc and rustc from 760b93a is in the current $PATH.
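For readers without access to that repository, here is a minimal sketch of the kind of workload described, written in present-day Rust rather than the 2014 dialect. It is not the original test.rs: the key format, the `insert_n` helper, and the hand-rolled timing loop (in place of the `--test` bench harness) are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Build a map by inserting `n` string keys with integer values.
/// The "key{i}" format is an assumption; the original benchmark may differ.
fn insert_n(n: usize) -> HashMap<String, usize> {
    let mut map = HashMap::new();
    for i in 0..n {
        map.insert(format!("key{}", i), i);
    }
    map
}

fn main() {
    const ITERS: u32 = 10_000;
    let start = Instant::now();
    for _ in 0..ITERS {
        // black_box keeps the optimizer from discarding the work.
        std::hint::black_box(insert_n(100));
    }
    let per_iter = start.elapsed() / ITERS;
    println!("insert_100: {:?} per iteration", per_iter);
}
```

Building this with two different compilers at the same optimization level and comparing the printed time gives a rough equivalent of the good.sh / bad.sh comparison.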

Here are the results on my machine (Core i7-920, 24GB RAM, OS X 10.9.2):

$ ./good.sh (--opt-level=3 test.rs --cfg=good --test -o test.out)
test insert_100 ... bench:      7770 ns/iter (+/- 1294)

$ ./bad.sh (--opt-level=3 test.rs --cfg=bad --test -o test.out)
test insert_100 ... bench:      9443 ns/iter (+/- 1742)

@cmr on IRC suggested adding -Z lto. Here are the results with that flag:

$ ./good.sh (-Z lto --opt-level=3 test.rs --cfg=good --test -o test.out)
test insert_100 ... bench:      7148 ns/iter (+/- 1912)

$ ./bad.sh (-Z lto --opt-level=3 test.rs --cfg=bad --test -o test.out)
test insert_100 ... bench:      9066 ns/iter (+/- 1771)

As can be seen, the binary built from the compiler at 760b93a is about 22% slower than the binary from the compiler at 3ec321f without LTO (9443 vs. 7770 ns/iter) and about 27% slower with -Z lto (9066 vs. 7148 ns/iter).


alexcrichton · 2014-06-16T19:31:29Z

This is likely explained in my description of #14538:

> Due to using a custom trait, the SipHasher implementation has lost its
> specialized methods for writing integers. These can be re-added
> backwards-compatibly in the future via default methods if necessary, but the
> FNV hashing should satisfy much of the need for speedier hashing.
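To illustrate what re-adding integer methods "via default methods" could look like, here is a rough sketch. `StreamHasher`, `Generic`, and `Specialized` are hypothetical names, not the trait from #14538, and the mixing step is a toy rather than SipHash. The idea is that the trait supplies a default `write_u64` that falls back to the byte-slice path, so existing implementations keep compiling, while an implementation that cares can override it with a faster path.

```rust
/// Hypothetical streaming-hasher trait, for illustration only.
trait StreamHasher {
    fn write(&mut self, bytes: &[u8]);
    fn finish(&self) -> u64;

    /// Default method: existing implementors get this for free,
    /// routed through the generic byte-slice path.
    fn write_u64(&mut self, n: u64) {
        self.write(&n.to_le_bytes());
    }
}

/// Toy hasher that relies on the default `write_u64`.
struct Generic(u64);

impl StreamHasher for Generic {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = self.0.wrapping_mul(31).wrapping_add(u64::from(b));
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

/// Toy hasher that overrides `write_u64` with a single mixing step.
struct Specialized(u64);

impl StreamHasher for Specialized {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = self.0.wrapping_mul(31).wrapping_add(u64::from(b));
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
    fn write_u64(&mut self, n: u64) {
        // Specialized path: mix the whole word at once, no per-byte loop.
        self.0 = self.0.wrapping_mul(31).wrapping_add(n);
    }
}

fn main() {
    let mut g = Generic(0);
    g.write_u64(7);
    let mut s = Specialized(0);
    s.write_u64(7);
    println!("generic: {:x}, specialized: {:x}", g.finish(), s.finish());
}
```

Today's `std::hash::Hasher` follows this shape: `write` and `finish` are required, and `write_u64` and its siblings are provided methods that an implementation may override.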

pnkfelix · 2014-06-16T19:38:31Z

cc me

thestinger · 2014-06-16T19:42:16Z

The ability to use single-pass hashing instead of state machines is important and seems to be missing or at least not leveraged. At the moment, Rust's SipHash implementation is far slower than a naive C implementation, and that's without getting into the large gains you can get by using SIMD.

> but the FNV hashing should satisfy much of the need for speedier hashing.

I don't think that's a satisfactory workaround. In addition to not providing protection against pathological data (DoS attacks), FNV is also a very weak hash and suffers from many collisions even with random inputs. It's also not the default hash, and the out-of-the-box performance is what matters most. AFAIK, a proper SipHash implementation is significantly faster in terms of cycles per byte than FNV, and should be winning on large keys where the initial cost becomes less significant.
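To make the "not the default hash" point concrete, here is a sketch in present-day Rust of what opting into FNV involves. The `Fnv1a` type and `FnvMap` alias are illustrative names written for this example, not part of the standard library; the constants are the standard 64-bit FNV-1a offset basis and prime.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// 64-bit FNV-1a, written out by hand for illustration.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

/// A map that uses FNV-1a must name the hasher in its type.
type FnvMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

fn main() {
    // Opt-in: the hasher is part of the type.
    let mut fnv_map: FnvMap<String, i32> = FnvMap::default();
    fnv_map.insert("a".to_string(), 1);

    // Out of the box: HashMap::new() uses the default hasher.
    let mut default_map: HashMap<String, i32> = HashMap::new();
    default_map.insert("a".to_string(), 1);

    println!("{:?} {:?}", fnv_map.get("a"), default_map.get("a"));
}
```

Code that simply calls `HashMap::new()` never sees the faster hasher, which is why out-of-the-box performance depends on the default.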

SergioBenitez · 2014-06-20T23:10:38Z

Just so it's clear: is the current consensus that the issue could be that the default hashing algorithm was switched from SipHash to FNV? If so, what's preventing the switch back?

pcwalton · 2014-06-20T23:13:12Z

@thestinger What should our hashing API be? I guess it needs to be redesigned?

brson · 2014-06-20T23:21:05Z

@SergioBenitez no, the default hashing algorithm has not changed.

If there are problems with the SipHash implementation, please file them as separate issues and let's fix them.

thestinger · 2014-08-15T08:54:16Z

@pcwalton: Ideally, HashMap wouldn't require hashes to be implemented as a state machine. It can be useful to have that capability, but in the common case a single-pass implementation would be nicer. I haven't fully thought out how this needs to be done via traits, but I think it's doable.
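As a rough illustration of the distinction (not a proposed API), the sketch below puts the two shapes side by side. FNV-1a stands in for the hash purely because it is short; `hash_one_shot` and `Streaming` are hypothetical names. The single-pass form sees the whole key in one call and keeps no state afterwards, while the state-machine form has to carry state across an arbitrary number of `write` calls.

```rust
use std::hash::Hasher;

const OFFSET: u64 = 0xcbf29ce484222325;
const PRIME: u64 = 0x100000001b3;

/// Single-pass form: the whole key is visible in one call.
fn hash_one_shot(key: &[u8]) -> u64 {
    key.iter()
        .fold(OFFSET, |h, &b| (h ^ u64::from(b)).wrapping_mul(PRIME))
}

/// State-machine form: state persists between `write` calls,
/// and the hasher cannot know how many more calls will follow.
struct Streaming {
    state: u64,
}

impl Hasher for Streaming {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.state = (self.state ^ u64::from(b)).wrapping_mul(PRIME);
        }
    }
    fn finish(&self) -> u64 {
        self.state
    }
}

fn main() {
    // Feeding the key in pieces gives the same result as one pass.
    let mut s = Streaming { state: OFFSET };
    s.write(b"hello ");
    s.write(b"world");
    assert_eq!(s.finish(), hash_one_shot(b"hello world"));
    println!("{:016x}", hash_one_shot(b"hello world"));
}
```

For a byte-at-a-time hash like FNV the two forms cost about the same. The difference matters more for block-oriented hashes such as SipHash, where a streaming implementation has to buffer partial blocks between calls.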

brson · 2015-05-28T16:07:09Z

I'm untagging all pre-1.0 regressions to repurpose the tag for stable regressions.

alexcrichton · 2015-08-13T20:44:28Z

A pretty huge amount of time has passed since this was originally filed, so I'm going to close this as "probably stale", although if numbers still show a regression we should definitely fix it!


thestinger added the I-slow label Jun 16, 2014

emberian added the regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. label Nov 12, 2014

brson removed the regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. label May 28, 2015

alexcrichton closed this as completed Aug 13, 2015
