Memory leak in REP socket handling #2567

dex6 · 2017-05-09T12:31:31Z

The problem occurs in a REP server when a REQ client disconnects before the response is sent. According to valgrind, about 500 B leaks per response on my test setup. I discovered the problem in ZMQ 4.2.1, then also confirmed it in master and 4.1.6.
The app I'm working on is a simple REQ/REP server running on Linux, using the tcp/ipc protocols. It occasionally needs a significant amount of time to process a request because of external hardware problems. During that time our client times out, leaving the server with a small but growing leak.

I've traced the problem to a partial message being left in a pipe:

1. When a request is received, rep_t::xrecv() calls router_t::xsend() to copy all the labels to the response message.
2. The rest of the message is passed to the user's app, which works hard to prepare the response.
3. In the meantime, the client goes away.
4. The response is sent using rep_t::xsend(), which calls router_t::xsend(), which correctly notices that the pipe has gone away and frees the response (thanks to the fix from "Memory leak on REP socket server when the REQ client disappears" #1313). However, the labels already put into the pipe in step 1 are left there and never freed.
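
To make the sequence above easier to follow, here is a toy model of it. This is only a sketch: `LeakyPipe` and `run_exchange` are hypothetical names invented for illustration, and the real libzmq pipe is considerably more involved. It shows only that frames written before the disconnect stay queued when the later send is dropped.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Toy stand-in for an outbound pipe (hypothetical, not libzmq's pipe_t):
// frames are written first and only handed to the peer by flush().
struct LeakyPipe {
    std::deque<std::string> pending;  // written but not yet flushed
    std::size_t delivered = 0;        // frames handed over by flush()
    bool peer_alive = true;

    void write(const std::string &frame) { pending.push_back(frame); }
    void flush() {
        delivered += pending.size();
        pending.clear();
    }
};

// Walks through the four steps and returns how many frames are left
// stranded in the pipe afterwards.
std::size_t run_exchange(bool client_leaves_early) {
    LeakyPipe pipe;

    // Step 1: receiving the request copies the labels into the pipe.
    pipe.write("routing-id");
    pipe.write("");  // empty delimiter frame

    // Steps 2-3: the app is busy; the client may disconnect meanwhile.
    if (client_leaves_early)
        pipe.peer_alive = false;

    // Step 4: the reply body is written and flushed only if the peer is
    // still there. Otherwise it is dropped, and nothing removes the labels.
    if (pipe.peer_alive) {
        pipe.write("reply");
        pipe.flush();
    }
    return pipe.pending.size();
}
```

In this model, `run_exchange(false)` leaves nothing behind, while `run_exchange(true)` leaves the two label frames queued, which corresponds to the per-request leak described above.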

I suppose session_base_t::clean_pipes() should remove them in step 3, but for some reason that does not happen. Unfortunately, I don't have enough time to get familiar with the libzmq code, so I cannot come up with a patch myself, but I would gratefully test one.

Please see the attached valgrind output and the minimal server + client test apps for reproducing the issue (my real server/client uses czmq 4 and pyzmq 14.x, but that should not matter much). The sample server is hardcoded with a "processing" time (just a sleep) of 1s, and the sample client has RCVTIMEO set to 0.5s. With these values I get the memleak on every request. When the receive timeout is raised enough that the client reads the response, the problem stops occurring (obviously).

libzmq_valgrind.txt
testapps.zip

Solution: roll back the pipe if writing messages other than the first fails in router::xsend. Also add test case that reproduces the memory leak when run with valgrind. Fixes zeromq#2567

bluca · 2017-05-10T22:49:43Z

This is a very good analysis, thank you. I've managed to create a simple, self-contained test case that reproduces the problem both locally and on Travis.

I think the solution is to call rollback () on current_out. Once the CI is green I'll open a PR.
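
As a rough illustration of the rollback idea, here is a toy sketch. `ToyPipe` and `send_last_frame` are hypothetical names made up for this example and are not the actual libzmq classes or the actual patch. The assumption is simply that a pipe tracks a commit point and can discard everything written since the last flush.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy pipe with an explicit commit point (hypothetical, not libzmq's pipe_t).
class ToyPipe {
public:
    // Queues a frame; fails once the pipe is terminating.
    bool write(const std::string &frame) {
        if (terminating_)
            return false;
        frames_.push_back(frame);
        return true;
    }
    // Commits everything written so far.
    void flush() { committed_ = frames_.size(); }
    // Discards every frame written since the last flush().
    void rollback() { frames_.resize(committed_); }
    void terminate() { terminating_ = true; }
    std::size_t unflushed() const { return frames_.size() - committed_; }

private:
    std::vector<std::string> frames_;
    std::size_t committed_ = 0;
    bool terminating_ = false;
};

// Sends the final frame of a reply. If the write fails, the labels that
// were queued earlier are rolled back instead of being left in the pipe.
bool send_last_frame(ToyPipe &out, const std::string &body) {
    if (!out.write(body)) {
        out.rollback();
        return false;
    }
    out.flush();
    return true;
}
```

With this shape, a failed send leaves `unflushed()` at zero, so the label frames queued while receiving the request no longer outlive the reply that was dropped.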

dex6 · 2017-05-11T09:16:12Z

@bluca Thanks for preparing the patch. I'll merge it into my company's libzmq fork and let it run over the next weekend. I'll keep you posted here.

BTW, any plans to release 4.2.3 in the near future?

bluca · 2017-05-11T09:45:07Z

Yes it's almost time, but there's a couple of things that need fixing first (cmake and ipv6 related)

dex6 · 2017-05-15T09:23:55Z

The patch seems to be working for me. You may close the issue once PR #2572 is merged.

Thanks for the help!

bluca · 2017-05-15T10:42:50Z

Great, thanks for confirming

Solution: roll back the pipe if writing messages other than the first fails in router::xsend. Roll it back also when the pipe is terminating. Also add test case that reproduces the memory leak when run with valgrind. Fixes zeromq#2567

bluca mentioned this issue May 10, 2017

Problem: REP leaves label msgs for dead REQ in pipe #2572

Merged

bjovke closed this as completed in #2572 May 16, 2017

bluca reopened this May 16, 2017

bluca mentioned this issue May 17, 2017

Problem: REP leaves label msgs for dead REQ in pipe #2577

Merged

somdoron closed this as completed in #2577 May 17, 2017

bluca mentioned this issue Jun 22, 2017

intermittent memory leak for req/rep send/recv. #2602

Closed
