Tune for higher transaction processing. #2294
Jenkins Build Summary: Built from this commit. Built at 20180117 - 17:11:31. Test Results
Codecov Report
@@ Coverage Diff @@
## develop #2294 +/- ##
==========================================
- Coverage 70.96% 70.9% -0.06%
==========================================
Files 691 691
Lines 51663 51409 -254
==========================================
- Hits 36661 36452 -209
+ Misses 15002 14957 -45
Continue to review full report at Codecov.
Changes look good to me. I left one thought for a potential change. I think it's a good idea, but it's not required.
I don't have any opinion on the constant value changes. I'll defer to @ximinez on those.
Thanks for chasing this down.
src/ripple/overlay/impl/PeerImp.cpp
Outdated
@@ -45,6 +45,9 @@ using namespace std::chrono_literals;

namespace ripple {

// The maximum number of transactions to have in the job queue.
int const max_transactions = 1000;
Since this constant is only used one place in the file, consider reducing its scope to just where it is needed. Here's an example of what I have in mind: scottschurr@99f3bfc
BTW, I like providing a name for the magic number. Good work with that.
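For readers without access to the linked commit, here is a minimal sketch of what reducing the constant's scope might look like. It is not the content of scottschurr@99f3bfc: the function name `canQueueTransaction`, its parameter, and the strict `<` comparison are assumptions made for illustration. Only the name `max_transactions` and the value 1000 come from the diff above.

```cpp
#include <cstddef>

// Hypothetical helper (not from the PR): decides whether one more
// transaction job may be added to the job queue.
bool canQueueTransaction(std::size_t jobsInQueue)
{
    // Function-local constant: the name is visible only where it is used,
    // instead of at namespace scope for the whole file.
    constexpr std::size_t max_transactions = 1000;

    // Assumption: the limit is exclusive. The real comparison may differ.
    return jobsInQueue < max_transactions;
}
```

With this shape, `canQueueTransaction(999)` is true and `canQueueTransaction(1000)` is false, and nothing else in the file can accidentally depend on the constant.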
Fixed
The constant value changes decrease the cost by 10x for peers to send transactions to one another, while leaving the relative cost for all other activities the same.
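To make the arithmetic concrete, here is a small sketch of how scaling every charge except one lowers that one charge's relative cost. The struct, the function, and the numbers in the usage note are hypothetical and are not the actual values in `Fees.cpp`.

```cpp
// Hypothetical two-entry fee schedule, for illustration only.
struct FeeSchedule
{
    int lightPeer;      // charge applied when a transaction is received
    int otherActivity;  // stands in for every other charge
};

// Multiply every charge except lightPeer by `factor`.
// The ratios among the "other" charges are unchanged, while lightPeer
// becomes `factor` times cheaper relative to each of them.
FeeSchedule scaleAllButLightPeer(FeeSchedule s, int factor)
{
    return FeeSchedule{s.lightPeer, s.otherActivity * factor};
}
```

For example, starting from a made-up schedule `{1, 300}` and scaling by 10 gives `{1, 3000}`: the light peer charge goes from 1/300 to 1/3000 of the other charge, a 10x relative decrease, without changing its absolute value.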
soon ...
👍
I left a couple of comments identifying unused `Charge` values.
I also suggest updating the commit message to something like
Tune for higher transaction processing:
* Decrease the relative cost for `feeLightPeer`, which is charged when a transaction is received, by increasing all the other charges.
If not, at least remove the `.`.
Otherwise, looks good, so I'll approve once those changes are made.
src/ripple/resource/impl/Fees.cpp
Outdated
Charge const feeHighBurdenRPC ( 3000, "heavy RPC" );

Charge const feeLightPeer ( 1, "trivial peer request" );
Charge const feeLowBurdenPeer ( 20, "simple peer request" );
`feeLowBurdenPeer` is unused. May as well remove it.
src/ripple/resource/impl/Fees.cpp
Outdated
Charge const feeReferenceRPC ( 20, "reference RPC" );
Charge const feeExceptionRPC ( 100, "exceptioned RPC" );
Charge const feeLightRPC ( 50, "light RPC" ); // DAVID: Check the cost
Charge const feeLowBurdenRPC ( 200, "low RPC" );
`feeLightRPC` and `feeLowBurdenRPC` are unused. May as well remove them.
src/ripple/resource/impl/Fees.cpp
Outdated
Charge const feeNewTrustedNote ( 100, "trusted note" );
Charge const feeNewValidTx ( 100, "valid tx" );
Charge const feeSatisfiedRequest ( 100, "needed data" );
`feeNewTrustedNote`, `feeNewValidTx`, and `feeSatisfiedRequest` are unused. May as well remove them.
2695b2f to 565ab5a
👍 pending CI.
We found an issue where transactions were being dispatched each time they were received from a peer. This caused transactions to be dropped due to the inflated backlog and servers to be dropped due to the inflated counts of extraneous transactions received from peers. We changed this to suppress dispatching a transaction received from a peer if that transaction was dispatched due to peer receipt within the past ten seconds.
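As a rough illustration of the suppression rule described above, here is a minimal sketch. It is not the `HashRouter` code from this PR. The class name `DispatchSuppressor`, the use of plain 64-bit integers as transaction keys, and the choice to pass `now` in explicitly are all assumptions made so the example is small and deterministic.

```cpp
#include <chrono>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for the router; real keys would be uint256 hashes.
class DispatchSuppressor
{
public:
    using Clock = std::chrono::steady_clock;

    explicit DispatchSuppressor(std::chrono::seconds interval)
        : interval_(interval)
    {
    }

    // Returns true if the transaction should be dispatched now and records
    // the dispatch time. Returns false if the same transaction was already
    // dispatched less than `interval_` ago.
    bool shouldProcess(std::uint64_t txId, Clock::time_point now)
    {
        auto const it = lastDispatched_.find(txId);
        if (it != lastDispatched_.end() && now - it->second < interval_)
            return false;  // seen recently: suppress the duplicate
        lastDispatched_[txId] = now;
        return true;
    }

private:
    std::chrono::seconds interval_;
    std::unordered_map<std::uint64_t, Clock::time_point> lastDispatched_;
};
```

With a ten-second interval, the first receipt of a transaction is dispatched, a second receipt five seconds later is suppressed, and a receipt ten or more seconds after the last dispatch is dispatched again. The exact boundary behavior here is an assumption, and this sketch never evicts old entries, which a real implementation would need to do.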
355981e to c73fc8e
@JoelKatz changes LGTM for the new suppressor
👍 Let's try to get this to the cluster soon to improve the performance there.
c73fc8e to b986447
@JoelKatz No transactions were dropped. However, each transaction was being distributed through the network multiple times to each peer. The duplicates caused congestion control limits to kick in, causing some peers to sever connections with other peers. The only peers who experienced issues with this were the public-facing client handlers and not the validators. No transactions were dropped.
~50
@ximinez @scottschurr @nbougalis @JoelKatz I've reopened this PR because it has been modified since we last visited it. This PR prior to commit e24689b was put into 0.80.2. The latest commit (e24689b) mainly adds counters to track some behaviors surrounding desync events.
In general, nice looking changes. Especially if they help us identify performance issues.
I left comments that you may want to address in a few places.
src/ripple/json/impl/json_value.cpp
Outdated
std::string str(std::to_string(value));
value_.string_ = valueAllocator ()->duplicateStringValue (
    str.c_str (), str.size() );
}
I'm a bit nervous about introducing these constructors. Once these constructors exist, if someone constructs a `Value` with an `int32_t`, the `Value` has `intValue`. But if they construct with an `int64_t`, the `Value` has `stringValue`. I think this will potentially lead to bugs where folks don't know what size of integer they are handling (`auto` anyone?) and sometimes produce strings instead of integers.
The change is convenient for your particular use case. But I think it has the potential to make bugs easier to write.
If you want to keep this change you'll need to convince me that it is safe for the code base in general, not just the code you're introducing today.
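The concern can be reproduced with a toy type. The sketch below is not `Json::Value`: `ToyValue` and `makeValue` are hypothetical names, and the class only mimics the behavior under discussion, where a 32-bit integer is stored as an integer and a 64-bit integer is silently stored as a string.

```cpp
#include <cstdint>
#include <string>

// Toy stand-in for a JSON value type; NOT the real Json::Value.
struct ToyValue
{
    enum class Type { intValue, stringValue };

    Type type;
    std::int32_t i = 0;
    std::string s;

    // 32-bit input is kept as an integer.
    ToyValue(std::int32_t v) : type(Type::intValue), i(v) {}

    // 64-bit input silently becomes a string: the behavior being questioned.
    ToyValue(std::int64_t v) : type(Type::stringValue), s(std::to_string(v)) {}
};

// Generic code (or `auto`) hides the integer width from the author,
// so the resulting Type depends on a detail nobody wrote down.
template <class T>
ToyValue makeValue(T v)
{
    return ToyValue(v);
}
```

Calling `makeValue(std::int32_t{7})` yields an `intValue`, while `makeValue(std::int64_t{7})` yields a `stringValue` holding `"7"`, even though both call sites look the same to the reader.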
Without those constructors, creating json from 64bit input won't compile (at least on gcc):
/home/mtravis/projects/rippled/src/ripple/app/misc/NetworkOPs.cpp:2392:34: error: conversion from ‘uint64_t {aka long unsigned int}’ to ‘Json::Value’ is ambiguous
info[jss::jq_trans_overflow] = app_.overlay().getJqTransOverflow();
Referring to NetworkOPs.cpp:2392:
info[jss::jq_trans_overflow] = app_.overlay().getJqTransOverflow();
No existing 64bit integers rendered as Json will be harmed, because none exist other than what I'm putting there.
In other places where we want to store 64-bit integers in JSON, we use `to_string` to make it clear that JSON treats them differently.
I agree with @scottschurr that doing this by default has the potential for confusing semantics going forward.
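A sketch of the explicit approach, using a plain `std::map` as a stand-in for a JSON object. The function `makeInfo` and the literal key `"jq_trans_overflow"` are assumptions for illustration. The real code uses `jss::jq_trans_overflow` and `Json::Value`.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Stand-in for a JSON object whose values are all strings.
using ToyObject = std::map<std::string, std::string>;

// The caller converts the 64-bit counter explicitly, so the call site
// shows that the value is stored as a string rather than as a number.
ToyObject makeInfo(std::uint64_t jqTransOverflow)
{
    ToyObject info;
    info["jq_trans_overflow"] = std::to_string(jqTransOverflow);
    return info;
}
```

The conversion costs a few extra characters at each call site, but the full 64-bit range survives as text and nobody is surprised by the stored type.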
How about this? ximinez@a51519b
I've reverted the json 64bit constructor to go with the team's consensus that 64bit integers are not to be handled automatically but instead make the coder remember to make them strings. The only json change I kept is initializing the map_ object and protecting against double-free on destruction.
}

void
incPeerDisconnect()
Should be marked `override`.
fixed
}

void
incPeerDisconnectCharges()
Should be marked `override`.
fixed
src/ripple/app/misc/HashRouter.cpp
Outdated
@@ -71,6 +71,18 @@ bool HashRouter::addSuppressionPeer (uint256 const& key, PeerShortID peer, int&
    return result.second;
}

bool HashRouter::shouldProcess (uint256 const& key, PeerShortID peer, int& flags,
    Stopwatch::time_point now, std::chrono::seconds interval)
Why make the caller responsible for `now` and `interval`? `shouldRelay` (which, yeah, I wrote) uses the clock from within the `suppressionMap_`.
Fixed, though using the clock from `suppressionMap_` fails the unit test, so I am using the C++ clock.
src/ripple/app/misc/HashRouter.cpp
Outdated
auto& s = result.first;
s.addPeer (peer);
flags = s.getFlags ();
return s.shouldProcess (now, interval);
This could use some unit tests.
fixed
src/ripple/app/misc/NetworkOPs.cpp
Outdated
@@ -120,7 +120,7 @@ class NetworkOPsImp final
{
struct Counters
{
std::uint64_t transitions = 0;
std::uint32_t transitions = 0;
Did you intend to change this back to `uint32_t`? If so, then I think you could skip the uses of `std::to_string()` below. But I don't believe that's your intent.
Yes, I intended to revert and I did revert the conversion for transitions. And to_string() was already in place for the dur member, so that's unchanged.
Thanks Mark. It looks like the added `std::to_string()` calls deal with your removal of the 64-bit JSON interface. Sorry I didn't catch that before.
👍 I'm good with these changes. I'll let @ximinez speak for the changes he suggested in `HashRouter.cpp`.
77c5f7a to 43457be
The test can work with the suppression map clock. I left a comment describing how.
src/test/app/HashRouter_test.cpp
Outdated
BEAST_EXPECT(router.shouldProcess(key, peer, flags, 1s));
BEAST_EXPECT(! router.shouldProcess(key, peer, flags, 1s));
std::this_thread::sleep_for(2s);
Replace this line with
++stopwatch;
++stopwatch;
Then change `HashRouter::shouldProcess` to use `suppressionMap_.clock().now()` instead of `steady_clock`. Or cherry-pick ximinez@d1f584b
The test should now pass, and as a bonus, the tests aren't unnecessarily slowed down.
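To show why a manually advanced clock helps here, the following is a self-contained sketch. It is not the beast clock or the real `HashRouter`: `ManualClock` and `ToyRouter` are hypothetical, `ToyRouter` tracks a single item only, and the one-second step of `operator++` is an assumption.

```cpp
#include <chrono>

// Hypothetical manual clock: time moves only when the test advances it.
class ManualClock
{
public:
    using time_point = std::chrono::steady_clock::time_point;

    time_point now() const { return now_; }

    // Assumption: each increment advances the clock by one second.
    ManualClock& operator++()
    {
        now_ += std::chrono::seconds(1);
        return *this;
    }

private:
    time_point now_{};
};

// Toy router tracking one item; it reads time from the injected clock
// rather than from std::chrono::steady_clock directly.
class ToyRouter
{
public:
    explicit ToyRouter(ManualClock const& clock) : clock_(clock) {}

    bool shouldProcess(std::chrono::seconds interval)
    {
        auto const now = clock_.now();
        if (processed_ && now - last_ < interval)
            return false;  // processed too recently
        processed_ = true;
        last_ = now;
        return true;
    }

private:
    ManualClock const& clock_;
    bool processed_ = false;
    ManualClock::time_point last_{};
};
```

A test can then call `shouldProcess` twice (true, then false), advance the clock with `++clock; ++clock;`, and call it again (true) without ever sleeping.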
@ximinez Fixed: it's almost as if stopwatch is something that can be started and stopped explicitly.
Exactly. Now that I think about it, `stopwatch` is a terrible name. It's more of a `manualClock`.
43457be to fbd5493
Nah, that manualClock sounds terrible
Good to go. 👍
no fun
Do not process a transaction received from a peer if it has been processed within the past ten seconds.

Increase the number of transaction handlers that can be in flight in the job queue and decrease the relative cost for peers to share transaction and ledger data.

Additionally, make better use of resources by adjusting the number of threads we initialize, by reverting commit 68b8ffd.

Performance counter modifications:
* Create and display counters to track:
  1) Pending transaction limit overruns.
  2) Total peer disconnections.
  3) Peer disconnections due to resource consumption.

Avoid a potential double-free in Json library.
fbd5493 to f0a4f90
Incorporated into 0.90.0-b4 as 76ad06e.