Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incremental improvements to path finding memory usage #4099

Closed
wants to merge 0 commits into from

Conversation

ximinez
Copy link
Collaborator

@ximinez ximinez commented Feb 12, 2022

High Level Overview of Change

As the number of trust lines on the ledger has exploded, so has the amount of memory used by servers for path finding.

Some particularly gnarly requests have been observed to load over 20 million trust lines into memory. These changes build on #4080, which reduces the amount of memory used per trust line, so reviewers only need to review the commit(s) starting at "Incremental improvements to path finding memory usage".

Background path finding jobs can continue after a websocket client has created another request, or disconnected entirely. This can cause a lot of memory to be used unnecessarily while other requests are using even more memory. The main changes in this PR free up the relevant memory more quickly when the request is no longer relevant.

Type of Change

  • [X ] New feature (non-breaking change which adds functionality)
  • [X ] Refactor (non-breaking change that only restructures code)

Test Plan

Run an expensive path_find request. Wait until it returns a full_reply: true result, indicating that the background job has started. Invalidate the request using one of the following methods:

  • Disconnect the client.
  • Issue a { command: path_find, subcommand: close } to close the request.
  • Issue a new path_find request on the same connection.

With version of code before this PR, the path request can be observed to continue running by tracking process memory, or using the get_counts command to see the number of PathFindTrustLine continue to grow. (Note that the counter for PathFindTrustLine was only added in #4080.) With these changes, the memory usage will drop within about a few seconds to a minute, and the PathFindTrustLine count will drop to 0 for all cases but the last.

Future Tasks

Now that the trust line memory is getting freed up more quickly, I'm working on preventing as many trust lines and other objects from being loaded in the first place.

@ximinez ximinez changed the title Memory Incremental improvements to path finding memory usage Feb 12, 2022
@ximinez ximinez requested a review from mtrippled February 12, 2022 01:15
@ximinez ximinez requested a review from seelabs February 12, 2022 01:15
Copy link
Collaborator

@seelabs seelabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Left some minor comments, but nothing big.

<< "getLineCache creating new cache for " << lgrSeq;
// Assign to the local before the member, because the member is a
// weak_ptr, and will immediately discard it if there are no other
// references.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice comment. Thanks for that.

@@ -80,15 +90,28 @@ PathRequests::updateAll(

int processed = 0, removed = 0;

auto getSubscriber = [](PathRequest::pointer& request) -> InfoSub::pointer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Param could be const&

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Json::Value update = request->doUpdate(
cache, false, [&getSubscriber, &request]() {
return (bool)getSubscriber(request);
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious what that lambda is doing. Maybe make a local variable for the callback just for the purpose of nameing it? I.e.

auto conintueCallback = [&getSubscriber, &request]()...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

s << ", " << *iter;
}
return s;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of iterator comes up all the time - I wish someone would add it to boost or std. This is the best I could find:
https://en.cppreference.com/w/cpp/experimental/ostream_joiner
but that's experimental, I'd rather not use that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't help myself, so I wrote something to do it.

doUpdate(
std::shared_ptr<RippleLineCache> const&,
bool fast,
std::function<bool(void)> continueCallback = {});
Copy link
Collaborator

@seelabs seelabs Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider passing function by const& (here and other places). Here's a patch to do that if you want: seelabs@97ede1e

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Collaborator

@mtrippled mtrippled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 LGTM

Copy link
Collaborator

@seelabs seelabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good. And thanks for coding up the join utility. That's awesome!

I did have a concern about lifetime management of the delimiter. Take a look and see what you think.

{
public:
Collection const& collection;
Str const& delimiter;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lifetime management of delimiter gives me pause. In the common case - passing a statically allocated c string - we're fine. But this is a very general class and I could see delimiter pointing to some temporary that went out of scope. Since delimiters are going to be small (usually one or two characters) and std::string has a small-object optimization (no allocations for small strings), what do you think about using a std::string for the delimiter instead?

FWIW, I'm not concerned about the lifetime management of collection. The caller should expect this to keep a reference to the object.

I saw in your tests that you were using integers and uint256 as crazy delimiters. But I think just supporting strings is enough (if you have a crazy delimiter, convert it to a string first). Here's a patch if you want to see what this looks like: seelabs@2bf7647

Copy link
Collaborator

@seelabs seelabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'm approving this now, but I did leave a comment on forcing the delim to be an rvalue ref. I'd leave that as a value parameter to keep the interface easier to use.

@@ -44,7 +42,7 @@ class CollectionAndDelimiter
Collection const& collection;
std::string const delimiter;

explicit CollectionAndDelimiter(Collection const& c, std::string delim)
explicit CollectionAndDelimiter(Collection const& c, std::string&& delim)
Copy link
Collaborator

@seelabs seelabs Feb 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably leave delim as a value, not an rvalue (here and in cases below). The common use case of passing a cstring will act the same. It will act differently if someone already has a delim as a string that they want to pass in. Making this a rvalue might have a performance benefit - it forces the caller to decide if they need to copy the delimiter or if they can just move it. But it also make the interface a little harder to use. For a class like this I'd choose to make the interface easier to use.

I'm fine with it as-is too. But thought I'd make the note.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense. I'll change it back.

@ximinez ximinez force-pushed the memory branch 3 times, most recently from 911f994 to 42a3704 Compare March 15, 2022 22:13
@ximinez ximinez added the Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. label Mar 24, 2022
@ximinez
Copy link
Collaborator Author

ximinez commented Mar 24, 2022

Marked "passed" and squashed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants