Dgraph: very high memory usage when serializing query results #982

Closed
Rader opened this issue May 25, 2017 · 6 comments
Rader commented May 25, 2017

See the query first:

query
{
	it (id:msg_1479709973679)  @normalize
	{
		~click_msg
		{
			~device_click
			{
				devName:name
				device_click
				{
					click_msg
					{
					   msgName:name
					   ~click_msg
					   {
						~device_click{
							dev2Name:name
						}
					   }
					}
				}
				
			}
		}
	}
}

It returns about 2 million paths of the form msg(msg_1479709973679) -> device -> msg -> device.
If I run the query, Dgraph (v0.7.6) uses over 10 GB of memory on my MacBook Pro with 16 GB, but still cannot return the result, because Dgraph spends all its CPU on GC. I mean this:

2017/05/25 10:56:42 lists.go:310: Memory usage over threshold. STW. Allocated MB: 4201
2017/05/25 10:56:42 lists.go:196: Aggressive evict, committing to RocksDB
2017/05/25 10:56:44 lists.go:211: Trying to free OS memory
2017/05/25 10:56:45 lists.go:217: EVICT DONE! Memory usage after calling GC. Allocated MB: 3307
2017/05/25 10:56:47 lists.go:310: Memory usage over threshold. STW. Allocated MB: 4176
2017/05/25 10:56:47 lists.go:196: Aggressive evict, committing to RocksDB
2017/05/25 10:56:47 lists.go:211: Trying to free OS memory
2017/05/25 10:56:48 lists.go:217: EVICT DONE! Memory usage after calling GC. Allocated MB: 3872
2017/05/25 10:56:52 lists.go:310: Memory usage over threshold. STW. Allocated MB: 5373
2017/05/25 10:56:52 lists.go:196: Aggressive evict, committing to RocksDB
2017/05/25 10:56:52 lists.go:211: Trying to free OS memory
2017/05/25 10:56:53 lists.go:217: EVICT DONE! Memory usage after calling GC. Allocated MB: 4846

Go pprof shows what the CPU is being used for.

rader-MacBook-Pro:Desktop rader$ go tool pprof http://localhost:8080/debug/pprof/profile
Fetching profile from http://localhost:8080/debug/pprof/profile
Please wait... (30s)
Saved profile in /Users/rader/pprof/pprof.localhost:8080.samples.cpu.003.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) top
71.18s of 82.59s total (86.18%)
Dropped 97 nodes (cum <= 0.41s)
Showing top 10 nodes out of 89 (cum >= 4.57s)
flat flat% sum% cum cum%
35.02s 42.40% 42.40% 58.40s 70.71% runtime.scanobject
12.20s 14.77% 57.17% 12.23s 14.81% runtime.greyobject
11.17s 13.52% 70.70% 11.17s 13.52% runtime.heapBitsForObject
2.55s 3.09% 73.79% 64.09s 77.60% runtime.gcDrain
2.24s 2.71% 76.50% 15.74s 19.06% github.com/dgraph-io/dgraph/query.(*SubGraph).preTraverse
2.06s 2.49% 78.99% 2.26s 2.74% runtime.(*gcWork).get
1.63s 1.97% 80.97% 1.63s 1.97% github.com/dgraph-io/dgraph/algo.IndexOf.func1
1.49s 1.80% 82.77% 1.49s 1.80% sync.(*Mutex).Lock
1.44s 1.74% 84.51% 1.44s 1.74% runtime.memmove
1.38s 1.67% 86.18% 4.57s 5.53% runtime.mallocgc

Most of the memory is used during serialization, e.g. converting the result to JSON, since I sent the request over HTTP. Total memory in use was 4.52 GB when I made this heap-profile request; after that, memory kept increasing to 10 GB.

rader-MacBook-Pro:dgraph0.7.6 rader$ go tool pprof dgraph http://localhost:8080/debug/pprof/heap
Fetching profile from http://localhost:8080/debug/pprof/heap
Saved profile in /Users/rader/pprof/pprof.dgraph.localhost:8080.alloc_objects.alloc_space.inuse_objects.inuse_space.003.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) top
4586.12MB of 4630.95MB total (99.03%)
Dropped 78 nodes (cum <= 23.15MB)
Showing top 10 nodes out of 25 (cum >= 44.24MB)
flat flat% sum% cum cum%
2617.24MB 56.52% 56.52% 2658.99MB 57.42% runtime.mapassign
903.04MB 19.50% 76.02% 903.04MB 19.50% runtime.makemap
271.42MB 5.86% 81.88% 1911.98MB 41.29% github.com/dgraph-io/dgraph/query.(*fastJsonNode).AddListChild
228.01MB 4.92% 86.80% 228.01MB 4.92% runtime.stringtoslicebyte
221.01MB 4.77% 91.57% 1423.21MB 30.73% github.com/dgraph-io/dgraph/query.(*fastJsonNode).AddValue
153.50MB 3.31% 94.89% 1056.54MB 22.81% github.com/dgraph-io/dgraph/query.(*fastJsonNode).New
124.65MB 2.69% 97.58% 219.39MB 4.74% github.com/dgraph-io/dgraph/worker.processTask
40.24MB 0.87% 98.45% 40.24MB 0.87% runtime.hashGrow
27MB 0.58% 99.03% 32MB 0.69% github.com/dgraph-io/dgraph/posting.(*List).ValueFor
0 0% 99.03% 44.24MB 0.96% github.com/dgraph-io/dgraph/posting.(*listMap).PutIfMissing
(pprof) list AddListChild
Total: 4.52GB
ROUTINE ======================== github.com/dgraph-io/dgraph/query.(*fastJsonNode).AddListChild in /Users/rader/projects/Go/src/github.com/dgraph-io/dgraph/query/outputnode.go
271.42MB 1.87GB (flat, cum) 41.29% of Total
. . 294:}
. . 295:
. . 296:func (fj *fastJsonNode) AddListChild(attr string, child outputNode) {
. . 297: children, found := fj.children[attr]
. . 298: if !found {
228.01MB 228.01MB 299: children = make([]*fastJsonNode, 0, 5)
. . 300: }
43.41MB 1.64GB 301: fj.children[attr] = append(children, child.(*fastJsonNode))
. . 302:}
. . 303:
. . 304:func (fj *fastJsonNode) New(attr string) outputNode {
. . 305: return &fastJsonNode{
. . 306: children: make(map[string][]*fastJsonNode),

If I stop building the fastJsonNode tree and encode the SubGraph object directly (see the line before return: json.NewEncoder(bufw).Encode(sg)), memory usage is only a little more than 1 GB, and CPU usage is about 100% instead of 700% on my Mac. By the way, the JSON result sent to the HTTP client is about 60 MB.

func (sg *SubGraph) ToFastJSON(l *Latency, w io.Writer, allocIds map[string]string, addLatency bool) error {
	var seedNode *fastJsonNode
	n := seedNode.New("_root_")
	// processNodeUids is disabled for this experiment, so the
	// fastJsonNode result tree is never built:
	// if sg.Attr == "__" {
	// 	for _, sg := range sg.Children {
	// 		err := processNodeUids(n.(*fastJsonNode), sg)
	// 		if err != nil {
	// 			return err
	// 		}
	// 	}
	// } else {
	// 	err := processNodeUids(n.(*fastJsonNode), sg)
	// 	if err != nil {
	// 		return err
	// 	}
	// }

	if sg.Params.isDebug || addLatency {
		sl := seedNode.New("serverLatency").(*fastJsonNode)
		for k, v := range l.ToMap() {
			val := types.ValueForType(types.StringID)
			val.Value = v
			sl.AddValue(k, val)
		}
		n.AddMapChild("server_latency", sl, false)
	}

	if allocIds != nil && len(allocIds) > 0 {
		sl := seedNode.New("uids").(*fastJsonNode)
		for k, v := range allocIds {
			val := types.ValueForType(types.StringID)
			val.Value = v
			sl.AddValue(k, val)
		}
		n.AddMapChild("uids", sl, false)
	}

	bufw := bufio.NewWriter(w)
	n.(*fastJsonNode).encode(bufw)
	// Experiment: encode the SubGraph edge tree directly instead.
	json.NewEncoder(bufw).Encode(sg)
	return bufw.Flush()
}
janardhan1993 (Contributor) commented May 25, 2017

@Rader Can you please share the mutations so that we can reproduce the issue and fix it? Thanks for reporting this; we will investigate.

Rader (Author) commented May 26, 2017

@janardhan1993 I did exactly what the data loader does: first I generate RDFs, then call rdf.Parse to get NQuads, and finally commit them through BatchMutation.

func (r *DevRepo) genClickMutation(pairs []DevMsgPair) string {
	var buf bytes.Buffer
	for _, pair := range pairs {
		clickID := uuid.NewV4().String()
		// properties
		buf.WriteString(fmt.Sprintf("<dev_%s> <name> \"%s\" .\n", pair.DevID, pair.DevID))
		buf.WriteString(fmt.Sprintf("<dev_%s> <tag> \"device\" .\n", pair.DevID))
		buf.WriteString(fmt.Sprintf("<%s> <tag> \"click\" .\n", clickID))
		buf.WriteString(fmt.Sprintf("<msg_%d> <name> \"%d\" .\n", pair.MsgID, pair.MsgID))
		buf.WriteString(fmt.Sprintf("<msg_%d> <tag> \"msg\" .\n", pair.MsgID))
		// predicates
		buf.WriteString(fmt.Sprintf("<dev_%s> <device_click> <%s> .\n", pair.DevID, clickID))
		buf.WriteString(fmt.Sprintf("<%s> <click_msg> <msg_%d> .\n", clickID, pair.MsgID))
	}
	return buf.String()
}
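
The driver side is roughly the following. This is only a minimal sketch: submitBatch is a placeholder for the actual rdf.Parse + BatchMutation calls, which I haven't included here.

package main

import "strings"

// submitBatch is a placeholder for the real pipeline: parse each RDF line
// with rdf.Parse into an NQuad, then commit the batch via BatchMutation.
func submitBatch(lines []string) error { return nil }

// commitRDF splits a generated RDF blob into lines and submits it in
// fixed-size batches, mirroring what the data loader does.
func commitRDF(rdfs string, batchSize int) error {
	lines := strings.Split(strings.TrimSpace(rdfs), "\n")
	for start := 0; start < len(lines); start += batchSize {
		end := start + batchSize
		if end > len(lines) {
			end = len(lines)
		}
		if err := submitBatch(lines[start:end]); err != nil {
			return err
		}
	}
	return nil
}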

manishrjain self-assigned this May 29, 2017
manishrjain added this to the v0.8 milestone May 29, 2017
manishrjain (Contributor) commented May 29, 2017

Thanks for filing this, @Rader. This is a really interesting issue. We'll optimize our JSON conversion.

Btw, note that directly encoding SubGraph would not give you the results you expect. We do edge-based sharding, and SubGraph is an edge-based tree. OTOH, the results the user expects are node-based.

That's the conversion going on here. I think what's happening is that there are too many nodes involved, each requiring its own heap allocation, which becomes a problem.
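
To get a feel for the scale, here is a toy sketch (not Dgraph's actual code): with roughly 2 million result nodes, a fastJsonNode-style tree allocates one map per node, so the GC has millions of pointer-rich objects to scan. That matches the runtime.makemap and runtime.mapassign entries dominating the heap profile above.

package main

import (
	"fmt"
	"runtime"
)

// node mimics the allocation pattern of a fastJsonNode-style output tree:
// one map per node, each individually allocated on the heap.
type node struct {
	children map[string][]*node
}

func main() {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	root := &node{children: make(map[string][]*node)}
	for i := 0; i < 2000000; i++ {
		child := &node{children: make(map[string][]*node)}
		root.children["it"] = append(root.children["it"], child)
	}

	runtime.GC()
	runtime.ReadMemStats(&after)
	fmt.Printf("heap grew by ~%d MB for 2M nodes\n",
		(after.HeapAlloc-before.HeapAlloc)/(1<<20))
	runtime.KeepAlive(root)
}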

ashwin95r modified the milestones: v0.8Maybe, v0.8 Jul 3, 2017
sebastianmacias commented

Check out https://github.com/json-iterator/go/; it's the fastest JSON encoder/decoder Go library I have seen so far.

manishrjain (Contributor) commented

The encoder that we have avoids creating a map in the first place, which would be expensive. We directly create the JSON string representation.
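
As a simplified illustration of that idea (not the actual outputnode.go code): attribute/value pairs can be appended straight into a byte buffer, so no intermediate map or interface{} value is ever allocated.

package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// writeAttr appends a single "key":"value" pair directly to the buffer,
// quoting via strconv so no intermediate map is built along the way.
func writeAttr(buf *bytes.Buffer, first bool, key, val string) {
	if !first {
		buf.WriteByte(',')
	}
	buf.WriteString(strconv.Quote(key))
	buf.WriteByte(':')
	buf.WriteString(strconv.Quote(val))
}

func main() {
	var buf bytes.Buffer
	buf.WriteByte('{')
	writeAttr(&buf, true, "devName", "dev_42")
	writeAttr(&buf, false, "msgName", "1479709973679")
	buf.WriteByte('}')
	fmt.Println(buf.String()) // {"devName":"dev_42","msgName":"1479709973679"}
}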

tzdybal (Contributor) commented Jul 17, 2017

This is related to #1138.
