-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] SIMD vectorization for distance metrics #2084
[ENH] SIMD vectorization for distance metrics #2084
Conversation
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
[[bench]] | ||
name = "distance_metrics" | ||
path = "src/benches/distance_metrics.rs" | ||
harness = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
learning: what does this do (harness)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
harness disables libtest benchmarking since I am using the criterion
crate
@@ -47,6 +52,7 @@ proptest = "1.4.0" | |||
proptest-state-machine = "0.1.0" | |||
"rand" = "0.8.5" | |||
rayon = "1.8.0" | |||
criterion = "0.3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of curiosity what alternatives did we evaluate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. By default, we have libtest
using which we can do basic benchmarking. My main motive behind using criterion
was to enable sophisticated benchmarking for the future. In particular criterion
provides:
- Statistics: Statistical analysis detects if, and by how much, performance has changed since the last benchmark run
- Charts: Uses gnuplot to generate detailed graphs of benchmark results
These would be useful in future to run these benchmarks in our CI/CD pipeline or nightly/weekly (some cadence) to detect perf regressions for e.g.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome thank you
all(target_feature = "avx", target_feature = "fma") | ||
))] | ||
{ | ||
if std::arch::is_x86_feature_detected!("avx") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add tests to validate that the simd impls (whatever can run on the target machine) match the base impl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a test already in types.rs
- test_distance_function_l2sqr()
which I was relying on. If that regresses then there is some bug in the SIMD impls. That test validates inner product and l2 norm. Since, cosine is the same as inner product for us, I am guessing that is not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok great good point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is super cool, wonder if it'd be worth using an existing vectorized implementation like this one instead of implementing it ourselves?
@codetheweb the main issue with using a third-party is that our distance functions have slightly different definition. For e.g. cosine similarity assumes the vectors are normalized |
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
Verified that there is about 10x perf improvement for both arm and x86_64. Going to merge this unless anyone else has any objections. |
Description of changes
Adds SIMD vectorization for euclidean, cosine and inner product for x86, x86_64 and arm. Instruction sets whose support has been added are SSE, AVX and NEON.
Test plan
pytest
for python,yarn test
for js,cargo test
for rustDocumentation Changes
No