perf(core): Fix performance issue in type filter #9065
Conversation
b31cea7 to 8375b71
posting/list.go (Outdated)
numNormalPostingsRead := 0
defer func() {
	if numNormalPostingsRead < numDeletePostingsRead {
		glog.V(3).Infof("During iterate on posting list, we read %d set postings, %d delete postings"+
Can we clarify this message in some way to be more useful to someone who is not familiar with the internal workings? Also include which posting list, if that is available.
I think this means that the badger values representing a posting list (in 256KB chunks, IIRC) contained many deleted structures, and that more than 50% of the data movement is for non-useful deleted data.
If so, something like: "High proportion of deleted data observed for posting list {l.key}: total = {numNormal + numDeleted}, percent deleted = {numDeleted / (numNormal + numDeleted) * 100}%".
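For reference, one possible shape for that message as a fragment of the deferred function shown in the diff above (a sketch only; it assumes `l.key` is the posting list's byte key and formats reasonably with `%x`):

```go
defer func() {
	total := numNormalPostingsRead + numDeletePostingsRead
	if total > 0 && numNormalPostingsRead < numDeletePostingsRead {
		glog.V(3).Infof("High proportion of deleted data observed for posting list %x: "+
			"total = %d, percent deleted = %.2f%%",
			l.key, total, float64(numDeletePostingsRead)/float64(total)*100)
	}
}()
```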
@@ -48,7 +48,7 @@ const (
 	`client_key=; sasl-mechanism=PLAIN; tls=false;`
 	LimitDefaults = `mutations=allow; query-edge=1000000; normalize-node=10000; ` +
 		`mutations-nquad=1000000; disallow-drop=false; query-timeout=0ms; txn-abort-after=5m; ` +
-		` max-retries=10;max-pending-queries=10000;shared-instance=false`
+		` max-retries=10;max-pending-queries=10000;shared-instance=false;type-filter-uid-limit=10`
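For illustration, `LimitDefaults` above is a semicolon-separated `key=value` string, so the new `type-filter-uid-limit=10` entry becomes the default for that option. Below is a minimal, self-contained sketch of reading the value out of such a string; this is not Dgraph's actual superflag parser, just a demonstration of the format:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSuperFlag splits a "k1=v1; k2=v2" style string into a map,
// trimming the incidental whitespace seen in the defaults above.
func parseSuperFlag(s string) map[string]string {
	out := make(map[string]string)
	for _, kv := range strings.Split(s, ";") {
		kv = strings.TrimSpace(kv)
		if kv == "" {
			continue
		}
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) != 2 {
			continue
		}
		out[strings.TrimSpace(parts[0])] = strings.TrimSpace(parts[1])
	}
	return out
}

func main() {
	limitDefaults := "mutations=allow; query-edge=1000000; normalize-node=10000; " +
		"mutations-nquad=1000000; disallow-drop=false; query-timeout=0ms; txn-abort-after=5m; " +
		" max-retries=10;max-pending-queries=10000;shared-instance=false;type-filter-uid-limit=10"

	flags := parseSuperFlag(limitDefaults)
	limit, _ := strconv.Atoi(flags["type-filter-uid-limit"])
	fmt.Println("type-filter-uid-limit =", limit) // prints 10
}
```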
Thinking about this... If a customer is large enough to have performance concerns, my guess is the dgraph.type index is going to be very large (or include some very large index entries that drive the performance profile). In that case, perhaps we should optimize for it and consider what uid-limit balances performance for 1M or more UIDs; I suspect that will be more like 100 or higher.
Maybe not something to delay or retest for, but worth considering.
Agreed. The number 10 is too small for such optimization. But we can defer this until we see another case and get some validation of type-filter-uid-limit.
looks good
Currently, when we do queries like `func(uid: 0x1) @filter(type)`, we retrieve the entire type index. Sometimes, when the index is too big, fetching it is quite slow. We realised that if we only need to check whether a few `uids` have the given type, we can check those `uids` directly instead. Right now we are hard-coding the `uids` threshold. This could be improved with a more statistics-based model, where we figure out how many items the type index has and how many `uids` we need to check.
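To make the idea concrete, here is a minimal, self-contained sketch of the decision this PR describes. The helper names and the in-memory map standing in for the `dgraph.type` posting data are hypothetical; the real change lives in Dgraph's posting/query layer, but the shape is: below the `type-filter-uid-limit` threshold, check each uid's type directly instead of fetching the whole type index.

```go
package main

import "fmt"

// typeFilterUIDLimit mirrors the new default of type-filter-uid-limit=10.
const typeFilterUIDLimit = 10

// hasType stands in for a direct, per-uid lookup of the dgraph.type posting
// (hypothetical; the in-memory map replaces the real posting store).
func hasType(uid uint64, typeName string, types map[uint64]string) bool {
	return types[uid] == typeName
}

// typeIndexUIDs stands in for fetching the full type index posting list,
// which is the expensive step the PR avoids for small uid lists.
func typeIndexUIDs(typeName string, types map[uint64]string) map[uint64]bool {
	idx := make(map[uint64]bool)
	for uid, t := range types {
		if t == typeName {
			idx[uid] = true
		}
	}
	return idx
}

// filterByType keeps only the uids of the given type. For small uid lists it
// checks each uid directly; otherwise it intersects with the type index.
func filterByType(srcUIDs []uint64, typeName string, types map[uint64]string) []uint64 {
	out := make([]uint64, 0, len(srcUIDs))
	if len(srcUIDs) <= typeFilterUIDLimit {
		for _, uid := range srcUIDs {
			if hasType(uid, typeName, types) {
				out = append(out, uid)
			}
		}
		return out
	}
	idx := typeIndexUIDs(typeName, types)
	for _, uid := range srcUIDs {
		if idx[uid] {
			out = append(out, uid)
		}
	}
	return out
}

func main() {
	types := map[uint64]string{0x1: "Person", 0x2: "Film"}
	// Only two uids to check, so the direct per-uid path is taken.
	fmt.Println(filterByType([]uint64{0x1, 0x2}, "Person", types)) // [1]
}
```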