Harden validations, report server version & deprecate TTL fields #3291

nbougalis · 2020-03-09T06:58:45Z

This PR contains three separate commits:

Remove obsolete TTL fields

Proposals and validations include a TTL field. This was part of an earlier idea which never really materialized. The fields were never really used in practice (a TTL value is not set by default). This change simply makes this official: the fields are deprecated, should not be used and are not coming back.

Ripple engineers will be implementing the system outlined by @JoelKatz in this post.

Include version in validations

Currently, it is impossible to know what version of the software a given validator is running. This information is useful for server operators, for developing who are putting together dashboards, and the community at large.

This change adds a "packed" version in a validation message. The message will, by default, only be sent every 256 ledgers.

Harden validations

Several changes are included in this PR.

Track how many validations a given server publishes for a given ledger. This should be precisely one. When such validations are detected, the server will now log a message to notify server operators of this issue.
The commit adds a 64-bit "cookie" in every validation which can be used to attempt to decipher whether multiple validations are the result of accidental misconfiguration vs. outright byzantine behavior. It is important to note that a malicious validator operator can manipulate the cookie.
Adds the hash of the last ledger a validator considers to be fully validated in the validation message.

nbougalis · 2020-03-09T06:59:14Z

(an unofficial lookover of 330f35b by @bachase would be awesome!)

MarkusTeufelberger · 2020-03-09T12:23:32Z

When such validations are detected, the server will now log a message to notify server operators of this issue.

Could we optionally also get a full log (or sqlite database, if you feel fancy) of all validation messages received? As far as I know, this data is currently only available on WebSocket streams.

nbougalis · 2020-03-10T05:28:26Z

When such validations are detected, the server will now log a message to notify server operators of this issue.

Could we optionally also get a full log (or sqlite database, if you feel fancy) of all validation messages received? As far as I know, this data is currently only available on WebSocket streams.

We used to record validations in an SQL table, and we removed that functionality (see c5a95f1), because it was not really used by anyone/anything.

I think (and this is just my personal position) that the best option is to use the subscription and to index the received data separately. But would we, for example, report conflicting validations in the subscription stream?

Happy to discuss what to do here.

bachase

I only focused on 330f35b. I'm not convinced the new checks are reachable as implemented.

I'm generally supportive of them, although I'm not sure the best way to make them actionable within the current rippled reporting mechanisms. But that can always be something we build towards.

bachase · 2020-03-17T20:33:47Z

src/ripple/consensus/Validations.h

@@ -167,7 +167,11 @@ enum class ValStatus {
    /// Not current or was older than current from this node
    stale,
    /// A validation violates the increasing seq requirement
-    badSeq
+    badSeq,
+    /// Multiple validations for the same sequence for same from multiple validators


Typo, extra for same?

bachase · 2020-03-17T20:37:43Z

src/ripple/consensus/Validations.h

+
+                // If the ledger hash matches, it must be a validation for
+                // the same ledger:
+                bool const same = (it->second.ledgerID() == val.ledgerID());


I'm unclear how these conditions wouldn't already be blocked by the badSeq one above. That is, the enforcer requires a strictly increasing sequence number seen by a validator. So I think it already filters these alternate outcomes.

If you do want to check these, an option would be to move them above before badSeq returns.

Interesting comment here @bachase . I would argue that this is a valid check due to malicious/byzantine behavior. An attacker can increment the seq without changing the content. So this check can help us catch those byzantine behaviors is what I am thinking. I was thinking maybe the signature check would block the logic here but it seems to me that sfLedgerHash is just one field within the digest so a byzantine node can still sign the same validation with a different sequence and this logic here would catch it

@carlhua but doesn't this check presume the two validations being compared have the same sequence number? They both come from the bySequence_ map in Line 614.

@bachase you are right!! These checks would still guard against the scenarios depicted in the code right?

Same sequence, hashs match but cookies don't so we say its multiple

Same sequence, hashes don't match, we say its conflicting

I sorted this out already. I this I misfolded some of my work before pushing. See updated commit.

bachase · 2020-03-17T20:38:15Z

src/ripple/consensus/Validations.h

+                        std::max(it->second.signTime(), val.signTime()) -
+                        std::min(it->second.signTime(), val.signTime());
+
+                    if (diff > std::chrono::minutes{10} &&


Consider using one of the other parameterized durations over this hardcoded 10 minutes.

bachase · 2020-03-17T20:47:44Z

src/ripple/app/consensus/RCLConsensus.cpp

    auto validationTime = app_.timeKeeper().closeTime();
    if (validationTime <= lastValidationTime_)
        validationTime = lastValidationTime_ + 1s;
    lastValidationTime_ = validationTime;

-    STValidation::FeeSettings fees;
-    std::vector<uint256> amendments;
+    auto v = std::make_shared<STValidation>(


👍 Much cleaner

bachase · 2020-03-17T20:50:08Z

src/test/consensus/Validations_test.cpp

@@ -1080,6 +1080,14 @@ class Validations_test : public beast::unit_test::suite
        }
    }

+    void
+    testDifferentCookies()


I think testing this would be useful; I'm a little rusty on the test harness here but I think it should be feasible and can try to help.

Appreciate the offer. I'll work with someone else to try and get these tests so that more people learn about consensus instead of having all the information locked up in the heads of three or four people.

I agree that this is important. I'm not sure it needs to be addressed before this is merged (that's up to Brad and Nik), but it would be good to assign this task to someone before this is merged, just so we don't lose track of the task.

I will be happy to work on it, probably after the negative UNL network tests, if that is OK.

carlhua

Great PR, I am generally a huge fan of improving consensus. I left some comments down below. Aside from those comments, I think this PR will have API, documentation and Explorer impact so I will tag these to have downstream impacts via tags. @intelliot @mDuo13

carlhua · 2020-03-19T16:48:59Z

src/ripple/consensus/Validations.h

+
+                // If the ledger hash matches, it must be a validation for
+                // the same ledger:
+                bool const same = (it->second.ledgerID() == val.ledgerID());


Interesting comment here @bachase . I would argue that this is a valid check due to malicious/byzantine behavior. An attacker can increment the seq without changing the content. So this check can help us catch those byzantine behaviors is what I am thinking. I was thinking maybe the signature check would block the logic here but it seems to me that sfLedgerHash is just one field within the digest so a byzantine node can still sign the same validation with a different sequence and this logic here would catch it

carlhua · 2020-03-19T16:52:25Z

src/ripple/app/consensus/RCLValidations.cpp

        if(j.debug())
            dmp(j.debug(), to_string(outcome));

+        if (outcome == ValStatus::conflicting && j.fatal())


Couple thoughts here:

Consider this condition to be one of the criteria to be added to nUNL @pwang200 (future, but if we agree we should open an issue to track)

Consider this condition to be folded into peer selection/overlay enhancement - this is a byzantine behavior and we should consider disconnecting from this node or at least not connect to this node. Currently, this might not be a good idea but in the future when we are actively relaying validations from untrusted peers, we should consider keep records and potentially not connecting to these bad nodes. My argument here is that this allows us to eliminate bad nodes from the network.

carlhua · 2020-03-19T16:53:50Z

src/test/consensus/Validations_test.cpp

+    void
+    testDifferentCookies()
+    {
+        testcase("Multiple Validations");


I think it would be good to have more unit tests here. In addition, I would ask @manojsdoshi to consider to have automated tests for this PR to force the logic to execute on QE network. This should never happen naturally on the real network but it would be useful to see it in action in test.

mDuo13 · 2020-03-26T22:39:08Z

I don't see any documentation impacts beyond the new amendment. Am I missing anything? Are there changes to the fields provided in any of the API subscription streams for example?

bachase · 2020-04-03T19:48:51Z

src/ripple/app/consensus/RCLValidations.cpp

        if(j.debug())
            dmp(j.debug(), to_string(outcome));

-        if (outcome == ValStatus::badSeq && j.warn())
+        if (outcome == ValStatus::conflicting && j.fatal())


Note that this checks j.fatal(), but logs below with j.warn(). Same for line 208

src/ripple/consensus/Validations.h

bachase · 2020-04-03T19:54:28Z

src/ripple/consensus/Validations.h

+                // Two validations for the same sequence but for different
+                // ledgers. This could be the result of misconfiguration
+                // but it can also mean a Byzantine validator.
+                if (!same)


If they have different ledger IDs, what guarantees they still have the same sequence number at this spot in the code? Do you also have to check seqInserted?

I'll double-check this. Thanks.

gregtatcam · 2020-04-06T18:26:25Z

src/ripple/proto/ripple.proto

+    optional bool checkedSignature = 2 [deprecated = true];
+
+    // Number of hops traveled
+    optional uint32 hops = 3           [deprecated = true];


Could hops still be useful for the reduce relaying task? We can infer the preferred peer to receive TMValidation and TMPropose from as the ones with the smallest number of hops.

nbougalis · 2020-04-17T04:08:02Z

I think I've addressed all your concerns, except for improved unit test @bachase. I've avoided rebasing this to make reviewing easier; once you sign-off, I'll do it and force-push the branch.

In the meantime, I'm running a rebased version against the network.

bachase

Thanks @nbougalis -- logic looks good now.

carlhua

LGTM

nbougalis · 2020-04-28T23:39:46Z

I'm going to rebase this against b4 and squash.

This commit introduces the "HardenedValidations" amendment which, if enabled, allows validators to include additional information in their validations that can increase the robustness of consensus. Specifically, the commit introduces a new optional field that can be set in validation messages can be used to attest to the hash of the latest ledger that a validator considers to be fully validated. Additionally, the commit leverages the previously introduced "cookie" field to improve the robustness of the network by making it possible for servers to automatically detect accidental misconfiguration which results in two or more validators using the same validation key.

Currently there is no mechanism for a validator to report the version of the software it is currently running. Such reports can be useful for those who are developing network monitoring dashboards and server operators in general. This commit, if merged, defines an encoding scheme to encode a version string into a 64-bit unsigned integer and adds an additional optional field to validations. This commit piggybacks on "HardenedValidations" amendment to determine whether version information should be propagated or not. The general encoding scheme is: XXXXXXXX-XXXXXXXX-YYYYYYYY-YYYYYYYY-YYYYYYYY-YYYYYYYY-YYYYYYYY-YYYYYYYY X: 16 bits identifying the particular implementation Y: 48 bits of data specific to the implementation The rippled-specific format (implementation ID is: 0x18 0x3B) is: 00011000-00111011-MMMMMMMM-mmmmmmmm-pppppppp-TTNNNNNN-00000000-00000000 M: 8-bit major version (0-255) m: 8-bit minor version (0-255) p: 8-bit patch version (0-255) T: 11 if neither an RC nor a beta 10 if an RC 01 if a beta N: 6-bit rc/beta number (1-63)

nbougalis requested review from JoelKatz, gregtatcam, HowardHinnant and ximinez March 9, 2020 06:58

bachase reviewed Mar 17, 2020

View reviewed changes

carlhua reviewed Mar 19, 2020

View reviewed changes

carlhua added API Change Documentation README changes, code comments, etc. labels Mar 19, 2020

HowardHinnant previously approved these changes Mar 27, 2020

View reviewed changes

scottschurr assigned gregtatcam, HowardHinnant and ximinez Apr 1, 2020

gregtatcam previously approved these changes Apr 3, 2020

View reviewed changes

bachase reviewed Apr 3, 2020

View reviewed changes

gregtatcam reviewed Apr 6, 2020

View reviewed changes

nbougalis removed the API Change label Apr 10, 2020

nbougalis dismissed stale reviews from gregtatcam and HowardHinnant via 4ea4837 April 17, 2020 04:07

carlhua requested a review from bachase April 17, 2020 14:03

bachase previously approved these changes Apr 17, 2020

View reviewed changes

carlhua previously approved these changes Apr 21, 2020

View reviewed changes

carlhua added the Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. label Apr 21, 2020

carlhua removed the Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. label Apr 28, 2020

carlhua added Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. and removed Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. labels Apr 28, 2020

nbougalis dismissed stale reviews from carlhua and bachase via 117c724 April 29, 2020 04:33

nbougalis mentioned this pull request Apr 29, 2020

Cleanup #3326

Closed

Deprecate 'Time to Live' fields

567e42e

carlhua removed the Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. label May 1, 2020

nbougalis added 2 commits May 1, 2020 12:55

nbougalis mentioned this pull request May 3, 2020

Proposed 1.6.0-b5 #3388

Merged

nbougalis closed this in #3388 May 6, 2020

nbougalis merged commit 2827de4 into XRPLF:develop May 6, 2020

wilsonianb mentioned this pull request Mar 18, 2021

Include new validation fields in subscription stream #3797

Closed

nbougalis deleted the hardv branch October 16, 2023 06:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Harden validations, report server version & deprecate TTL fields #3291

Harden validations, report server version & deprecate TTL fields #3291

nbougalis commented Mar 9, 2020

nbougalis commented Mar 9, 2020 •

edited

Loading

MarkusTeufelberger commented Mar 9, 2020

nbougalis commented Mar 10, 2020

bachase left a comment

bachase Mar 17, 2020

bachase Mar 17, 2020

bachase Mar 17, 2020

carlhua Mar 19, 2020

bachase Mar 19, 2020

carlhua Mar 23, 2020

nbougalis Mar 27, 2020

bachase Mar 17, 2020

bachase Mar 17, 2020

bachase Mar 17, 2020

nbougalis Mar 27, 2020

seelabs Apr 17, 2020

pwang200 Apr 17, 2020

carlhua left a comment •

edited

Loading

carlhua Mar 19, 2020

carlhua Mar 19, 2020

carlhua Mar 19, 2020

mDuo13 commented Mar 26, 2020

bachase Apr 3, 2020

bachase Apr 3, 2020

nbougalis Apr 10, 2020

gregtatcam Apr 6, 2020

nbougalis commented Apr 17, 2020

bachase left a comment

carlhua left a comment

nbougalis commented Apr 28, 2020

Harden validations, report server version & deprecate TTL fields #3291

Harden validations, report server version & deprecate TTL fields #3291

Conversation

nbougalis commented Mar 9, 2020

Remove obsolete TTL fields

Include version in validations

Harden validations

nbougalis commented Mar 9, 2020 • edited Loading

MarkusTeufelberger commented Mar 9, 2020

nbougalis commented Mar 10, 2020

bachase left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

carlhua left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mDuo13 commented Mar 26, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nbougalis commented Apr 17, 2020

bachase left a comment

Choose a reason for hiding this comment

carlhua left a comment

Choose a reason for hiding this comment

nbougalis commented Apr 28, 2020

nbougalis commented Mar 9, 2020 •

edited

Loading

carlhua left a comment •

edited

Loading