perf(query): Update CompressedBin IntersectionAlgo #9000

Merged
harshil-goel merged 10 commits into main from harshil-goel/algo-bin on Oct 13, 2023

Conversation

harshil-goel
Contributor

@harshil-goel harshil-goel commented Sep 15, 2023

Updated the intersection algorithm. Getting up to 175% improvement overall.

goos: linux
goarch: amd64
pkg: github.com/dgraph-io/dgraph/algo
cpu: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
                                                                                          │     in1      │                 main_in                 │
                                                                                          │    sec/op    │    sec/op      vs base                  │
ListIntersectCompressBin/compressed:IntersectWith:ratio=0.01:size=100:overlap=0.01:-8       91.21n ± ∞ ¹   387.70n ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=0.1:size=100:overlap=0.01:-8        336.0n ± ∞ ¹   1733.0n ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=1:size=100:overlap=0.01:-8          1.093µ ± ∞ ¹    3.481µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=10:size=100:overlap=0.01:-8         4.719µ ± ∞ ¹    8.600µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=100:size=100:overlap=0.01:-8        19.76µ ± ∞ ¹    44.32µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=0.01:size=1000:overlap=0.01:-8      481.7n ± ∞ ¹    784.2n ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=0.1:size=1000:overlap=0.01:-8       1.502µ ± ∞ ¹    3.667µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=1:size=1000:overlap=0.01:-8         5.454µ ± ∞ ¹   36.977µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=10:size=1000:overlap=0.01:-8        33.14µ ± ∞ ¹    70.73µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=100:size=1000:overlap=0.01:-8       179.8µ ± ∞ ¹    485.1µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=0.01:size=10000:overlap=0.01:-8     3.214µ ± ∞ ¹    4.459µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=0.1:size=10000:overlap=0.01:-8      12.00µ ± ∞ ¹    32.78µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=1:size=10000:overlap=0.01:-8        76.90µ ± ∞ ¹   356.05µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=10:size=10000:overlap=0.01:-8       371.6µ ± ∞ ¹    923.1µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=100:size=10000:overlap=0.01:-8      1.894m ± ∞ ¹    5.141m ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=0.01:size=100000:overlap=0.01:-8    48.13µ ± ∞ ¹    60.59µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=0.1:size=100000:overlap=0.01:-8     146.7µ ± ∞ ¹    611.9µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=1:size=100000:overlap=0.01:-8       943.7µ ± ∞ ¹   3359.6µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=10:size=100000:overlap=0.01:-8      3.926m ± ∞ ¹    8.849m ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=100:size=100000:overlap=0.01:-8     20.78m ± ∞ ¹
ListIntersectCompressBin/compressed:IntersectWith:ratio=0.01:size=1000000:overlap=0.01:-8   862.4µ ± ∞ ¹   1142.5µ ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=0.1:size=1000000:overlap=0.01:-8    1.599m ± ∞ ¹    7.878m ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=1:size=1000000:overlap=0.01:-8      9.791m ± ∞ ¹   33.871m ± ∞ ¹         ~ (p=1.000 n=1) ²
ListIntersectCompressBin/compressed:IntersectWith:ratio=10:size=1000000:overlap=0.01:-8     44.40m ± ∞ ¹
geomean                                                                                     66.19µ          104.6µ        +175.93%               ³
                    

@dgraph-bot added the area/testing (Testing related issues) and go (Pull requests that update Go code) labels Sep 15, 2023
algo/uidlist.go Outdated
@@ -60,7 +60,7 @@ func IntersectCompressedWith(pack *pb.UidPack, afterUID uint64, v, o *pb.List) {

 // Select appropriate function based on heuristics.
 ratio := float64(m) / float64(n)
-if ratio < 500 {
+if ratio < 10 {
Member

Set the hardcoded values to a const.

algo/uidlist.go Outdated
return q[idx] >= lastUid
})
if qidx >= len(q) {
if ld*10 < len(q) {
Member

Why 10? If it's based on a heuristic, then you can just use the const.

algo/uidlist.go Outdated
if len(uids) == 0 || u > uids[len(uids)-1] {
uids = dec.Seek(u, codec.SeekStart)
if lq*10 < ld {
Member

See the above comments about 10.

jairad26
jairad26 previously approved these changes Sep 29, 2023
Member

@jairad26 jairad26 left a comment

lgtm

codec/codec.go Outdated
prevBlockIdx = 0
}

pack := d.Pack
Member

remove this assignment

codec/codec.go Outdated
@@ -223,6 +223,64 @@ func (d *Decoder) ApproxLen() int {

type searchFunc func(int) bool

// SeekToBlock will find the nearest block, and unpack it. Unlike Seek, it doesn't
Member

we can improve this explanation

if len(uids) == 0 || u > uids[len(uids)-1] {
uids = dec.Seek(u, codec.SeekStart)
if lq*linVsBinRatio < ld {
Member

fix this condition

Contributor Author

I tried updating the condition, but it made performance much worse. I'm guessing it's because the decision to use binary or linear search really depends on the total size of the arrays. If the array ratio is too high, then the numbers would be far apart.

Member

@jairad26 jairad26 left a comment

lgtm

@harshil-goel harshil-goel merged commit 0b84c95 into main Oct 13, 2023
9 checks passed
@harshil-goel harshil-goel deleted the harshil-goel/algo-bin branch October 13, 2023 14:28
shivaji-kharse pushed a commit that referenced this pull request Mar 12, 2024