POC: AVX2 sha256 #32
Before:
```
test bench1_10    ... bench:          39 ns/iter (+/- 2) = 256 MB/s
test bench2_100   ... bench:         349 ns/iter (+/- 23) = 286 MB/s
test bench3_1000  ... bench:       3,412 ns/iter (+/- 11) = 293 MB/s
test bench4_10000 ... bench:      34,084 ns/iter (+/- 3,183) = 293 MB/s
```
After:
```
test bench1_10    ... bench:          27 ns/iter (+/- 1) = 370 MB/s
test bench2_100   ... bench:         232 ns/iter (+/- 5) = 431 MB/s
test bench3_1000  ... bench:       2,082 ns/iter (+/- 85) = 480 MB/s
test bench4_10000 ... bench:      20,543 ns/iter (+/- 2,686) = 486 MB/s
```
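For reading the numbers above: the MB/s column follows from the input size and the ns/iter figure. The sketch below recomputes it and derives the speedup, assuming the numeric suffix of each benchmark name (`bench1_10`, `bench2_100`, ...) is the input size in bytes; `throughput_mb_s` is a hypothetical helper written for this illustration, not part of the crate.

```rust
/// Throughput in MB/s (10^6 bytes per second) for `bytes` processed in `ns` nanoseconds.
/// Hypothetical helper for illustration only.
fn throughput_mb_s(bytes: u64, ns: u64) -> u64 {
    // bytes per nanosecond is GB/s, so scale by 1000 to get MB/s.
    // Integer division truncates, which matches the reported figures.
    bytes * 1000 / ns
}

fn main() {
    // (assumed input size in bytes, ns/iter before, ns/iter after),
    // copied from the benchmark output quoted in this PR.
    let runs: [(u64, u64, u64); 4] = [
        (10, 39, 27),
        (100, 349, 232),
        (1000, 3412, 2082),
        (10000, 34084, 20543),
    ];

    for (bytes, before, after) in runs {
        println!(
            "{:>5} bytes: {} MB/s -> {} MB/s (speedup {:.2}x)",
            bytes,
            throughput_mb_s(bytes, before),
            throughput_mb_s(bytes, after),
            before as f64 / after as f64
        );
    }
}
```

On these figures the AVX2 version is roughly 1.4x to 1.7x faster, with the gain growing as the input gets larger.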
# This software is available to you under a choice of one of two
# licenses. You may choose to be licensed under the terms of the GNU
# General Public License (GPL) Version 2, available from the file
# COPYING in the main directory of this source tree, or the
# OpenIB.org BSD license below:
Licensing-wise, all of our crates are MIT+Apache 2.0, which is the conventional license choice for Rust projects.
I know... this is why I was asking. It was just easy to copy from the Linux kernel.
This would be an alternative: https://github.com/openssl/openssl/blob/13a574d8bb2523181f8150de49bc041c9841f59d/crypto/sha/asm/sha256-mb-x86_64.pl. It is under the Apache License 2.0 and is even faster in my benchmarks.
That might work. We could potentially contact the author to ask about dual licensing under MIT.
@tarcieri The version from openssl (in latest git) is effectively triple-licensed under 3-clause BSD, Apache 2.0, and GPL. I think the combination of 3-clause BSD and Apache 2.0 is sufficiently equivalent that it'd be fine to just document that as the license, rather than requesting what amounts to a relicense from BSD to MIT (both permissive licenses that are nearly equivalent).
@joshtriplett per @newpavlov's comment below, we can just change the license of the crate to be whatever license the original ASM uses.
Thanks for opening this PR! It'd be great to have AVX2 support for SHA-256. Unfortunately, I think the licensing is going to pose a problem for us. Offhand I'm not sure where to get an implementation which is license-compatible (I just checked the public domain sources in the eBACS SUPERCOP repo, but there doesn't appear to be one).
Also note that where possible we like to have Not sure how interested you are in this particular problem but if you'd really like to dig into it, that'd be great.
I am sorry, it was just a POC... I don't have the time to dig into that further.
The licensing issue is not critical (e.g. the currently used assembly files are licensed under MIT only), but we certainly would like to keep licensing of crates simple if possible. Also, I agree with @tarcieri about using intrinsics; we have to use asm files for ARM only because the relevant intrinsics are currently unstable.
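To make the intrinsics suggestion concrete, here is a minimal sketch of the usual pattern on x86: an AVX2 path built from `std::arch` intrinsics, selected at runtime with `is_x86_feature_detected!`, plus a portable fallback. It is not a SHA-256 implementation and not code from this PR. It only adds two sets of eight 32-bit words with wrapping arithmetic, and the names `add_words`, `add_words_avx2` and `add_words_portable` are hypothetical.

```rust
// AVX2 path (hypothetical example function): adds eight 32-bit lanes at once.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn add_words_avx2(a: &[u32; 8], b: &[u32; 8]) -> [u32; 8] {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    // Unaligned loads, since the arrays are only guaranteed 4-byte alignment.
    let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
    let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
    // Lane-wise 32-bit addition; overflow wraps around.
    let sum = _mm256_add_epi32(va, vb);

    let mut out = [0u32; 8];
    _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, sum);
    out
}

// Portable fallback used on other architectures or CPUs without AVX2.
fn add_words_portable(a: &[u32; 8], b: &[u32; 8]) -> [u32; 8] {
    let mut out = [0u32; 8];
    for i in 0..8 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}

/// Safe entry point (hypothetical): picks the AVX2 path only when the CPU reports support.
pub fn add_words(a: &[u32; 8], b: &[u32; 8]) -> [u32; 8] {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // Sound because AVX2 availability was just checked at runtime.
            return unsafe { add_words_avx2(a, b) };
        }
    }
    add_words_portable(a, b)
}

fn main() {
    let a = [1u32, 2, 3, 4, 5, 6, 7, u32::MAX];
    let b = [1u32; 8];
    println!("{:?}", add_words(&a, &b));
}
```

Compared with shipping a separate asm file, this keeps everything in Rust source under the crate's own license and lets the same binary run on CPUs without AVX2.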
Here is a quick POC for sha256 with SIMD... let me know if you accept code with the `OpenIB.org BSD` license.