Make minimum quorum Byzantine fault tolerant (RIPD-1461) #2093
Codecov Report
@@ Coverage Diff @@
## develop #2093 +/- ##
===========================================
+ Coverage 69.48% 69.49% +<.01%
===========================================
Files 685 685
Lines 50520 50520
===========================================
+ Hits 35105 35108 +3
+ Misses 15415 15412 -3
Mostly interested in clarification on `f`.
The following is only a suggestion, but I'd find it useful to add more comments. I'm imagining coming back some months in the future and having to tease out what is going on.
src/ripple/app/misc/ValidatorList.h
Outdated
if (localPubKey_.size() && ! localKeyListed &&
    rankedKeys.size () > 1 && keyListings_.size () % 2 != 0)
    ++quorum;
// This minimum quorum of 2f + 1 guarantees safe overlap with the trusted
Can you define `f`?
I'd rather we don't mention `f` at all here:
// The minimum quorum guarantees safety with up to 1/3 of the listed
// validators being malicious.
src/ripple/app/misc/ValidatorList.h
Outdated
@@ -309,7 +309,8 @@ class ValidatorList
std::function<void(PublicKey const&, bool)> func) const;

static std::size_t
calculateQuorum (std::size_t nTrustedKeys);
calculateMinimumQuorum (
Consider adding doxygen comments since `calculateMinimumQuorum` is public.
OR... what do you think about `calculateMinimumQuorum` being private?
I'd remove `testCalculateMinimumQuorum` and move most of its checks down here:
6ef1f30#diff-03bfc5ce29ad8cf375088f420df2649dR780
Your call. If you are comfortable getting coverage via that other spot, seems fine to me.
// Use 80% for large values of n, but have special cases for small numbers.
constexpr std::array<std::size_t, 10> quorum{{ 0, 1, 2, 2, 3, 3, 4, 5, 6, 7 }};
// Use 2f + 1 for large values of n, but have special cases for small numbers.
constexpr std::array<std::size_t, 6> quorum{{ 0, 1, 2, 2, 3, 3 }};
Why does the number of special cases change from 10 to 6?
Mainly because the special cases stopped being special. For 7, 8, and 9 the special quorum was the same as the new minimum. 6's special quorum is one less than the minimum, and David suggested we be conservative with numbers >5.
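The arithmetic behind that answer can be checked with a small sketch. It assumes the new minimum is computed as `n * 2 / 3 + 1` with integer division, which is how a later comment in this thread writes it (`2/3 * i + 1`); the function name `assumedMinimumQuorum` and the name `oldQuorum` are hypothetical and not taken from the patch.

```cpp
#include <array>
#include <cstddef>

// Assumption: minimum quorum for n listed validators is n * 2 / 3 + 1
// (integer division). This mirrors the "2/3 * i + 1" remark in the
// thread; it is not copied from the actual patch.
constexpr std::size_t
assumedMinimumQuorum (std::size_t n)
{
    return n * 2 / 3 + 1;
}

// The old 10-entry special-case table, as shown in the diff.
constexpr std::array<std::size_t, 10> oldQuorum{{ 0, 1, 2, 2, 3, 3, 4, 5, 6, 7 }};

// For 7, 8 and 9 the old special quorum equals the assumed minimum.
static_assert (oldQuorum[7] == assumedMinimumQuorum (7), "n = 7");
static_assert (oldQuorum[8] == assumedMinimumQuorum (8), "n = 8");
static_assert (oldQuorum[9] == assumedMinimumQuorum (9), "n = 9");

// For 6 the old special quorum is one less than the assumed minimum.
static_assert (oldQuorum[6] + 1 == assumedMinimumQuorum (6), "n = 6");
```

Under that assumption, only the entries for 0 through 5 still differ from the general rule, which is consistent with the table shrinking to 6 entries.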
The comment is confusing... first, there's no `n`, and I see nothing about `2f+1` anywhere in the code, although I do see what looks like `2f/3 + 1`. Also, it says we have special cases, but doesn't explain why. But I digress...
There's no reason for a table even, since `n/2+1` gives the following (calc):
n | quorum[n] | n/2+1 | OK? |
---|---|---|---|
0 | 0 | 1 | ❌ |
1 | 1 | 1 | ✅ |
2 | 2 | 2 | ✅ |
3 | 2 | 2 | ✅ |
4 | 3 | 3 | ✅ |
5 | 3 | 3 | ✅ |
With the exception of `quorum[0]` this matches perfectly with our "special cases". And I am curious as to why we return `0`. That only seems possible in the case where `nListedKeys == 0 && unlistedLocal == false`, so what case is that? Standalone mode? Startup of the first validator of a fresh network?
That ideone calculation is doing `i/2 + 1` as opposed to `2/3 * i + 1`.
Are you suggesting that we calculate a 51% quorum for smaller values as opposed to using the `std::array`? I see how that would work.
I edited my comment; I meant to say we don't even need the table - `n/2+1` would work in the sense that it produces the same result as our existing "special case" table (except for the case of `n==0`) and makes it simpler to express what we're doing.
It looks like it's possible to run with an empty `[validators]` config section and have zero listed validators.
There also don't appear to be listed validators when running in standalone mode, so I'll keep setting the quorum to zero in that case.
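The `n/2+1` observation from this thread can be verified at compile time. The sketch below copies the 6-entry table from the diff; `simpleMajority` and `matchesTable` are hypothetical helpers introduced only for this check.

```cpp
#include <array>
#include <cstddef>

// The new 6-entry special-case table, as shown in the diff.
constexpr std::array<std::size_t, 6> quorum{{ 0, 1, 2, 2, 3, 3 }};

// Hypothetical helper: the simple-majority formula proposed in review.
constexpr std::size_t
simpleMajority (std::size_t n)
{
    return n / 2 + 1;
}

// Hypothetical helper: true if the table agrees with n / 2 + 1 for every
// n in [first, last]. The caller must keep last below quorum.size().
constexpr bool
matchesTable (std::size_t first, std::size_t last)
{
    for (std::size_t n = first; n <= last; ++n)
        if (quorum[n] != simpleMajority (n))
            return false;
    return true;
}

// The formula reproduces the table for 1 through 5 listed validators...
static_assert (matchesTable (1, 5), "n / 2 + 1 matches for n = 1..5");
// ...but not for 0, where the table keeps a quorum of zero.
static_assert (! matchesTable (0, 0), "n = 0 is the one exception");
```

If the table were replaced by the formula, the zero-validator case would still need its own branch to keep returning 0.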
src/ripple/app/misc/ValidatorList.h
Outdated
}
else
// Do not require 80% quorum for less than 10 trusted validators
if (size >= 10)
How does this `10` relate to the explicit sizes used for `6` listed keys in `calculateMinimumQuorum`?
I'd also suggest changing `if(size >=10)` to `if(rankedKeys.size() > 10)`. I know you set that in the line above, but if those lines move apart in future refactorings, it would be best to make it clear the minimum 10 is relative to the `rankedKeys` size rather than, say, the listed key size.
Previously, we just had the custom quorums (less than 80%) for <10 validators. Now there's also custom quorums (less than the normal minimum) for <=5 validators.
src/ripple/app/misc/ValidatorList.h
Outdated
if (publisherLists_.size() == 1)
{
// Try to raise the quorum to at least 80% of the trusted set
std::size_t const targetQuorum = size - size / 5;
If `size >= 10`, is it possible for `targetQuorum` to be less than the minimum quorum?
The declaration and `if` should be collapsed to:
quorum = std::max(quorum, size - size / 5);
Yes, that can happen if `rankedKeys.size()` (the number of potentially trusted validators) is much lower than `keyListings_.size()` (the number of listed validators).
The check below makes sure we won't use the `targetQuorum` if that's the case.
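A minimal sketch of the collapsed form suggested above, showing why the 80% target cannot pull the quorum below the minimum. The function `raiseQuorum` and its parameter names are hypothetical; only the expression `std::max(quorum, size - size / 5)` comes from the review comment.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical wrapper around the reviewer's one-liner.
//   minimumQuorum: quorum already derived from the listed validators
//   size:          number of ranked (potentially trusted) validators
inline std::size_t
raiseQuorum (std::size_t minimumQuorum, std::size_t size)
{
    // size - size / 5 is the 80% target from the diff. std::max keeps
    // the larger value, so a low target never replaces the minimum.
    return std::max (minimumQuorum, size - size / 5);
}
```

For example, with 10 ranked keys the target is 10 - 10 / 5 = 8. If the minimum derived from the listed keys is 14, `raiseQuorum (14, 10)` stays at 14; if the minimum is 7, `raiseQuorum (7, 10)` rises to 8.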
Some additional questions for when ranked keys is much lower than listed keys.
src/ripple/app/misc/ValidatorList.h
Outdated
// reduce the trusted set size so that the quorum represents
// at least 80%
size = quorum * 1.25;
// Use all eligible keys if there is only one trusted list
Can you remind me why the single trusted list gets special status?
It's more that for multiple trusted lists we are motivated to reduce the number of trusted validators so that we are only trusting those included on the most lists.
if (unlistedLocal)
    ++nListedKeys;

// Guarantee safety with up to 1/3 listed validators being malicious.
Is it worth documenting the reason for `1/3`?
{
// Reduce the trusted set size so that the quorum represents
// at least 80%
size = quorum * 1.25;
I don't think this is new with these changes, but can't `size` end up greater than `rankedKeys_.size()`?
For example, suppose there are
- More than 1 publisher list
- 20 listed keys
- 10 ranked keys
- This node has an unlisted key
Then isn't the minimum quorum 14, but the calculated `size = 14 * 1.25 = 17`? If so, I think that would make for an infinite loop down on line 450.
`size` can be greater than `rankedKeys_.size()`.
That loop should be fine in that case since it is iterating `rankedKeys`. The result is just that we end up with fewer trusted keys than `size`.
D'oh, misread that loop. Thanks!
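To illustrate the point settled in this thread, here is a hypothetical stand-in for the selection loop. It is not the actual rippled code; it only assumes, as the reply above states, that the loop iterates `rankedKeys` and stops once `size` keys are trusted. The name `selectTrusted` and the use of `std::string` for keys are placeholders.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: walk the ranked keys in order and trust them
// until `size` keys have been collected or the ranked keys run out.
inline std::vector<std::string>
selectTrusted (std::vector<std::string> const& rankedKeys, std::size_t size)
{
    std::vector<std::string> trusted;
    for (auto const& key : rankedKeys)
    {
        if (trusted.size () >= size)
            break;
        trusted.push_back (key);
    }
    // The loop is bounded by rankedKeys, so it terminates even when
    // size is larger than rankedKeys.size ().
    return trusted;
}
```

In the reviewer's example, `quorum * 1.25` is 17.5, which truncates to 17 when stored in a `std::size_t`. With 10 ranked keys and `size` of 17, this sketch returns all 10 keys and exits, matching the description of ending up with fewer trusted keys than `size`.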
// Reduce the trusted set size so that the quorum represents
// at least 80%
size = quorum * 1.25;
}
}

if (minimumQuorum_ && (seenValidators.empty() ||
Similar question to the above, but is setting `quorum = *minimumQuorum_` dangerous for the extreme revoked key case?
Suppose
- More than 1 publisher list
- 10 listed keys
- 7 ranked keys
- This node has an unlisted key
Calculated minimum quorum is 8, so `rankedKeys.size() < 8` would set quorum to `minimumQuorum_`, which might be 3?
Setting `quorum = *minimumQuorum_` removes the safety guarantees from `calculateMinimumQuorum`. The quorum will not be Byzantine fault tolerant and may simply be forkable (<51%).
The `quorum` command line option used to set `minimumQuorum_` is advertised as such:
https://github.com/ripple/rippled/blob/develop/src/ripple/app/main/Main.cpp#L231
We should consider logging at `warning` if `quorum > minimumQuorum_`. Two tangential questions: first, under what circumstances would we use `--quorum` and second, should the description of the command include a marker (e.g. "(EXPERT USE ONLY)")?
Now that we're not bumping the quorum as an unlisted validator if there are fewer than 5, we might not need `--quorum`.
https://github.com/ripple/rippled/pull/2093/files#diff-7f8c7a926e17debbef064bc5c2f572dbR420
Previously, it was impossible to start up a new set of validators (reset the altnet) without specifying `--quorum`.
Note that I think `--quorum` acted the same pre-dynamic UNL. Like `[validation_quorum]`, it was the absolute minimum quorum you were willing to allow, which could possibly be unsafe.
return quorum[nTrustedKeys];

return nTrustedKeys - nTrustedKeys / 5;
if (nListedKeys == 0)
I think @wilsonianb noted that standalone mode operates with an empty list. In that case, it would never fully validate a ledger. I'm not sure if that is a problem or not. If anything, that might be better configured via the "EXPERT ONLY" minimumQuorum, and leave this returning 1?
I tried standalone mode with a quorum of 1 (instead of 0), and it closed ledgers without any issue.
https://ripple.com/build/stand-alone-mode/#advancing-ledgers-in-stand-alone-mode
👍
👍
Squashed and rebased on 0.80.0-b1
In 0.80.0-b2
cool stuff
No description provided.