Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize calculation of close time to avoid impasse and minimize gratuitous proposal changes #4760

Merged
merged 8 commits into from
Nov 17, 2023
190 changes: 147 additions & 43 deletions src/ripple/consensus/Consensus.h
Original file line number Diff line number Diff line change
Expand Up @@ -1759,15 +1759,27 @@ Consensus<Adaptor>::updateOurPositions(bool const share)
else
{
int neededWeight;
// It's possible to be at a close time impasse (described below), so
// keep track of whether this round has taken a long time.
bool stuck = false;

if (convergePercent_ < parms.avMID_CONSENSUS_TIME)
{
neededWeight = parms.avINIT_CONSENSUS_PCT;
}
else if (convergePercent_ < parms.avLATE_CONSENSUS_TIME)
{
neededWeight = parms.avMID_CONSENSUS_PCT;
}
else if (convergePercent_ < parms.avSTUCK_CONSENSUS_TIME)
{
neededWeight = parms.avLATE_CONSENSUS_PCT;
}
else
{
neededWeight = parms.avSTUCK_CONSENSUS_PCT;
stuck = true;
}
mtrippled marked this conversation as resolved.
Show resolved Hide resolved

int participants = currPeerPositions_.size();
if (mode_.get() == ConsensusMode::proposing)
Expand All @@ -1787,57 +1799,149 @@ Consensus<Adaptor>::updateOurPositions(bool const share)
<< " nw:" << neededWeight << " thrV:" << threshVote
<< " thrC:" << threshConsensus;

// An impasse is possible unless a validator pretends to change
// its close time vote. Imagine 5 validators. 3 have positions
// for close time t1, and 2 with t2. That's an impasse because
// 75% will never be met. However, if one of the validators voting
// for t2 switches to t1, then that will be 80% and sufficient
// to break the impasse. It's also OK for those agreeing
// with the 3 to pretend to vote for the one with 2, because
// that will never exceed the threshold of 75%, even with as
// few as 3 validators. The most it can achieve is 2/3.
for (auto& [t, v] : closeTimeVotes)
// Choose a close time and decide whether there is consensus for it.
//
// Close time is chosen based on the threshVote threshold
// calculated above. If a close time has votes equal to or greater than
// that threshold, then that is the best close time. If two
// close times have an equal number of votes, then choose the greatest
mtrippled marked this conversation as resolved.
Show resolved Hide resolved
// of them. Ensure that our close time then matches that which meets
// the criteria. But if no close time meet the criteria, make no
// changes.
//
// This is implemented slightly differently for validators vs
// non-validators. For non-validators, it is sufficient to merely
// count the close times from all peer positions to determine
// the best. Validators, however, risk putting the network into an
// impasse unless they are able to change their own position without
// first having counted it towards the close time totals.
//
// Here's how the impasse could occur:
// Imagine 5 validators. 3 have close time t1, and 2 have t2.
// As consensus time increases, the threshVote threshold also increases.
// Once threshVote exceeds 60%, no members of either set of validators
// will change their close times.
//
// Avoiding the impasse means that validators should not necessarily
// count their own close time towards the total until they know
// the most popular among their peers, and then change their vote
// to that of the most popular. In this case, the validators in set
// t2 from the example would have switched to close time t1 and ended
// the network impasse.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the validators in t1 do not count their own close times, then they perceive two equal sets of 2 validators each for different close times. Are we sure they won't switch, even if t2 is greater? Perhaps the code handles this correctly (I have not yet analyzed it), but it seems ambiguous in at least this comment.

I'm a little worried that we're adding more and more special cases to the code instead of carefully designing and implementing a general algorithm. I'm disturbed by how long this function is even before this change. Is it possible to state the current consensus algorithm in simple terms? How far has it strayed from the description in the whitepapers? I don't think I'm concerned enough right now to try to get in the way of this change, but I'd like to hear opinions from others. How much confidence do we have in the convergence of consensus at this point?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I observed the impasse behavior in testing, and commented on how specifically it occurred:
// Here's how the impasse could occur:
// Imagine 5 validators. 3 have close time t1, and 2 have t2.
// As consensus time increases, the threshVote threshold also increases.
// Once threshVote exceeds 60%, no members of either set of validators
// will change their close times.

This is a point fix to a problem observed in testing that also potentially affects the live network. It is not a holistic refactor of consensus.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As duration increases, threshVote gets higher. threshVote is the criteria for whether or not to change one's close time. So if it's >60% and only 60% of peers have the other close time, then none will ever change their close time. That's the impasse which was observed in testing. The original fix led to gratuitous position changes, which were again observed in testing. This fixes both issues, and no problems are observed in testing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        // Here's how the impasse could occur:
        // Imagine 5 validators. 3 have close time t1, and 2 have t2.
        // As consensus time increases, the threshVote threshold also increases.
        // Once threshVote exceeds 60%, no members of either set of validators
        // will change their close times.

If t2 > t1, then:

  • the validators in t1 will see two times tied with two votes each, and switch to the greater of the two, which is t2
  • the validators in t2 will see t1 with 3 votes and t2 with 1 vote, and switch to the one with more votes, which is t1

After, t2 will have 3 votes and t1 will have 2 votes, and there will be one more flip as everyone converges on the time that is both greater and has more votes, which is t2. Correct? Can that be done in one step instead of two? Are there chances for more discord with 35 validators and more than two close times? Does this feel hacky to anyone else?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the logic behind the threshVote restriction anyway? Why is switching halted after all times fall below threshVote? Why not just remove that restriction? Could the fix be as simple as that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mtrippled How is the impasse for which #4505 addresses possible for any transaction volume? The protocol has been working well with the 1950ms being the min_duration for the establish phase and how the countdown starts, again, if we want to change how the close_time should be established, we need to test it under a more realistic env. and test case. I agree totally with the proactive testing, but the testing has to be valid in terms of how it can appear in production.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And simply making a test environment more like production doesn't necessarily cause the bug to occur, at least immediately. As in the description, the impasse potential is latent at any volume, having to do with decisions about when to change a close time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is fine, but this comment is wrong. Clearing the impasse takes one more round.

        // Avoiding the impasse means that validators should not necessarily
        // count their own close time towards the total until they know
        // the most popular among their peers, and then change their vote
        // to that of the most popular. In this case, the validators in set
        // t2 from the example would have switched to close time t1 and ended
        // the network impasse.

Assume all these validators are "stuck", i.e. they all perceive a long consensus time. Our starting condition is this:

validator voting for
A t1
B t1
C t1
D t2
E t2

Every validator looks at the count, but excludes their own vote. Here is how they see the votes:

validator votes for t1 votes for t2
A 2 2
B 2 2
C 2 2
D 3 1
E 3 1

The t1 validators A, B, C decide to switch to t2 because they perceive a tie between t1 and t2 and t2 is the later value. The t2 validators D, E decide to switch because they perceive a sole winning time of t1. All validators flip. Here is the next state:

validator voting for
A t2
B t2
C t2
D t1
E t1

The impasse remains. Now they count votes again, excluding their own vote. Here is what they see:

validator votes for t1 votes for t2
A 2 2
B 2 2
C 2 2
D 1 3
E 1 3

The validators A, B, C, which are now voting for t2, still perceive a tie between t1 and t2, but they do not switch their vote, because t2 is the later value. The validators D, E, which are now voting for t1, do switch their vote because they perceive a sole winning time of t2. Now the impasse is cleared. Here is the next state:

validator voting for
A t2
B t2
C t2
D t2
E t2

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified the comment to reflect the current behavior.

Regarding the table you made--the new behavior is different than described based on the problem you pointed out. No longer do validators revoke their vote in the first step after being "stuck". Instead, they all vote as they always have and tally for each. Now, however, they determine which is the best close time based on the criteria: most votes, then if tie, latest close time. In this case, t1 will have 3 and t2 start with 2 votes. Only the t2 set will change their position and vote to t1. The t1 set know they are the best, and will not change their votes. Impasse resolved next iteration.

In the case of the problem with t1 having 2 votes and t2 having 2 votes--that was possible with before my previous commit. What happens now is that each validator first counts their own close time towards the tallies of each possible close time. They they see that there is a tie between t1 and t2. However, t2 is greater. Therefore, the set in t1 will switch their close time to t2, and t2 will keep the same. Resolved in the next iteration.

//
// Another wrinkle, however, is that too many position changes
// from validators also has a destabilizing affect. Consensus can
// take longer if peers have to keep re-calculating positions,
// and mistakes can be made if peers settle on the wrong position.
// Therefore, avoiding an impasse should also minimize the likelihood
// of gratuitous change of position.
//
// The solution for validators is to first track whether it's
// possible that the network is at an impasse based on how much
// time this current consensus round has taken. This is reflected
// in the "stuck" boolean.
//
// The best close time is the close time with the highest number of
// votes. If multiple close times are tied, choose the latest one.
// Whether earlier or later does not matter. What matters is
// coming to agreement with the same criteria.
//
// For validators that are stuck and do not have the best close time,
// they change their close time position add their vote to the tally
mtrippled marked this conversation as resolved.
Show resolved Hide resolved
// for that close time. This solution avoids an impasse while
// minimizing gratuitous close time changes. The other way
// for a validator to change its position is if it exceeds threshVote
// as described above.
//
// Determining whether there is close time consensus is very simple
// in comparison: if votes for the best close time meet or exceed
// threshConsensus, then we have close time consensus. Otherwise, not.

// First, find the best close time with which to agree: first criteria
// is the close time with the most votes. If a tie, the latest
// close time of those tied for the maximum number of votes.
std::multimap<int, NetClock::time_point> votesByCloseTime;
std::stringstream ss;
ss << "Close time calculation for ledger sequence "
<< static_cast<std::uint32_t>(previousLedger_.seq()) + 1
<< " Close times and vote count are as follows: ";
bool first = true;
for (auto const& [closeTime, voteCount]: closeTimeVotes)
{
if (adaptor_.validating() &&
t != asCloseTime(result_->position.closeTime()))
{
JLOG(j_.debug()) << "Others have voted for a close time "
"different than ours. Adding our vote "
"to this one in case it is necessary "
"to break an impasse.";
++v;
}
JLOG(j_.debug())
<< "CCTime: seq "
<< static_cast<std::uint32_t>(previousLedger_.seq()) + 1 << ": "
<< t.time_since_epoch().count() << " has " << v << ", "
<< threshVote << " required";

if (v >= threshVote)
if (first)
first = false;
else
ss << ',';
votesByCloseTime.insert({voteCount, closeTime});
ss << closeTime.time_since_epoch().count() << ':' << voteCount;
}
std::optional<int> maxVote;
std::set<NetClock::time_point> maxCloseTimes;
// Highest vote getter is last. Track each close time that is tied
// with the highest.
for (auto rit = votesByCloseTime.crbegin();
rit != votesByCloseTime.crend(); ++rit)
{
int const& voteCount = rit->first;
mtrippled marked this conversation as resolved.
Show resolved Hide resolved
if (!maxVote.has_value())
maxVote = voteCount;
else if (voteCount < *maxVote)
break;
maxCloseTimes.insert(rit->second);
mtrippled marked this conversation as resolved.
Show resolved Hide resolved
}
// The best close time is the latest close time of those that have
// the maximum number of votes.
NetClock::time_point const bestCloseTime = *maxCloseTimes.crbegin();
ss << ". The best close time has the most votes. If there is a tie, "
"choose the latest. This is " << bestCloseTime.time_since_epoch().count() <<
"with " << *maxVote << " votes. ";

// If we are a validator potentially at an impasse and our own close
// time is not the best, change our close time to match it and
// tally another vote for it.
if (adaptor_.validating() && stuck &&
consensusCloseTime != bestCloseTime)
{
consensusCloseTime = bestCloseTime;
++*maxVote;
ss << " We are a validator. Consensus has taken "
<< result_->roundTime.read().count() << "ms. Previous round "
"took " << prevRoundTime_.count() << "ms. Now changing our "
"close time to " << bestCloseTime.time_since_epoch().count() << " that "
"now has " << *maxVote << " votes.";
}
// If the close time with the most votes also meets or exceeds the
// threshold to change our position, then change our position.
// Then check if we have met or exceeded the threshold for close
// time consensus.
//
// The 2nd check has been nested within the first historically.
// It's possible that this can be optimized by doing the
// 2nd check independently--this may make consensus happen faster in
// some cases. Then again, the trade-offs have not been modelled.
if (*maxVote >= threshVote)
{
consensusCloseTime = bestCloseTime;
ss << "Close time " << bestCloseTime.time_since_epoch().count() << " has " <<
*maxVote << " votes, which is >= the threshold (" << threshVote
<< " to make that our positions if it isn't already.";
mtrippled marked this conversation as resolved.
Show resolved Hide resolved
if (*maxVote >= threshConsensus)
{
// A close time has enough votes for us to try to agree
consensusCloseTime = t;
threshVote = v;

if (threshVote >= threshConsensus)
{
haveCloseTimeConsensus_ = true;
// Make sure that the winning close time is the one
// that propagates to the rest of the function.
break;
}
haveCloseTimeConsensus_ = true;
ss << " The maximum votes also >= the threshold (" << threshConsensus
<< ") for consensus.";
}
}

if (!haveCloseTimeConsensus_)
{
JLOG(j_.debug())
<< "No CT consensus:"
<< " Proposers:" << currPeerPositions_.size()
<< " Mode:" << to_string(mode_.get())
<< " Thresh:" << threshConsensus
<< " Pos:" << consensusCloseTime.time_since_epoch().count();
ss << " No CT consensus:"
<< " Proposers:" << currPeerPositions_.size()
<< " Mode:" << to_string(mode_.get())
<< " Thresh:" << threshConsensus
<< " Pos:" << consensusCloseTime.time_since_epoch().count();
}
JLOG(j_.debug()) << ss.str();
}

if (!ourNewSet &&
Expand Down
Loading