perf(vector): updated marshalling of vector #9109
Harshil goel seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
@@ -335,7 +326,7 @@ func populateEdgeDataFromKeyWithCacheType(
 	if data == nil {
 		return false, nil
 	}
-	err = json.Unmarshal(data.([]byte), &edgeData)
+	err = decodeUint64MatrixUnsafe(data.([]byte), edgeData)
Does this change the disk format? Why can't we use protobuf here like we do for everything we write to disk?
We would have to iterate over the list to build the protobuf message, so marshalling and unmarshalling both become linear-time operations. That takes too much time.
BenchmarkEncodeDecodeUint64Matrix/JSON_Encoding/Decoding-8 91692 12608 ns/op
BenchmarkEncodeDecodeUint64Matrix/PB_Encoding/Decoding-8 467174 2221 ns/op
BenchmarkEncodeDecodeUint64Matrix/Unsafe_Encoding/Decoding-8 1609965 748.9 ns/op
Previously we unmarshalled bytes into []float64 by iterating over each element and reading it as little-endian. We now reinterpret the byte slice using unsafe pointers instead, which reduces the conversion time from O(len(bytes)) to roughly O(1).
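The difference between the two approaches can be sketched as follows. This is a minimal illustration, not the code from this PR: the function names are hypothetical, and it assumes Go 1.17+ (for unsafe.Slice), a little-endian host, and an 8-byte-aligned input buffer.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
	"unsafe"
)

// bytesToFloat64sLoop (hypothetical name) decodes little-endian float64
// values one element at a time, so its cost grows with len(b).
func bytesToFloat64sLoop(b []byte) []float64 {
	out := make([]float64, len(b)/8)
	for i := range out {
		out[i] = math.Float64frombits(binary.LittleEndian.Uint64(b[i*8:]))
	}
	return out
}

// bytesToFloat64sUnsafe (hypothetical name) reinterprets b as []float64
// without copying. Assumptions: the host is little-endian, b is 8-byte
// aligned, and b is not modified while the result is still in use.
func bytesToFloat64sUnsafe(b []byte) []float64 {
	if len(b) < 8 {
		return nil
	}
	return unsafe.Slice((*float64)(unsafe.Pointer(&b[0])), len(b)/8)
}

// float64sToBytesUnsafe (hypothetical name) is the reverse view: it exposes
// the memory backing v as a byte slice, again without copying.
func float64sToBytesUnsafe(v []float64) []byte {
	if len(v) == 0 {
		return nil
	}
	return unsafe.Slice((*byte)(unsafe.Pointer(&v[0])), len(v)*8)
}

func main() {
	vec := []float64{1.5, -2.25, 3}
	raw := float64sToBytesUnsafe(vec)
	fmt.Println(len(raw), bytesToFloat64sLoop(raw), bytesToFloat64sUnsafe(raw))
}
```

The unsafe version returns a slice that aliases the input buffer, so the bytes must stay alive and unchanged for as long as the result is used. The resulting byte layout also follows the host's byte order, which is worth keeping in mind when the bytes are written to disk.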
Benchmark stats:
Indexing 500k vectors now takes about 5 minutes (previously more than 5 hours).