Improve log scrubbing: #2358

nbougalis · 2018-01-26T09:11:19Z

Per issue #2354, when the log level of a server was configured at "trace", sensitive keying meterial generated by the wallet_propose command could be written to the server's log file, if one was configured.

This commit improves the log scrubbing code to account for the sensitive information generated by a wallet_propose.

Important security consideration

We still caution everyone against executing this command on a server that they do not control: a malicious server operator could intercept the generated keypair, or operate a modified server that returns keypairs that are not securely generated.

ripplelabs-jenkins · 2018-01-26T09:47:03Z

Jenkins Build Summary

Built from this commit

Built at 20180130 - 02:13:15

Test Results

Build Type	Result	Status
coverage	985 cases, 0 failed, t: 623s	PASS ✅
clang.debug.unity	985 cases, 0 failed, t: 392s	PASS ✅
clang.debug.nounity	983 cases, 0 failed, t: 322s	PASS ✅
gcc.debug.unity	985 cases, 0 failed, t: 441s	PASS ✅
clang.release.unity	984 cases, 0 failed, t: 480s	PASS ✅
gcc.debug.nounity	983 cases, 0 failed, t: 787s	PASS ✅
gcc.release.unity	984 cases, 0 failed, t: 504s	PASS ✅

HowardHinnant · 2018-01-26T16:18:03Z

src/ripple/basics/impl/Log.cpp

+
+            s.replace (first, last - first, last - first, '*');
+        }
+    };


On the find('\"',...) there is no need to limit the starting search position to the size of the string. The find implementation does this check already.

I recommend searching for the second " instead of the , after the first " because the sensitive value may be the last key/value pair in that level of the jason.

Here is a minor tweak which takes these suggestions into account:

auto scrubber = [&s](char const* token) { auto first = s.find(token); // If we have found the specified token, then attempt to isolate the // sensitive data (it's enclosed by double quotes) // and mask it: if (first != std::string::npos) { first = s.find ('\"', first + std::strlen(token)); if (first != std::string::npos) { auto last = s.find('\"', first+1); if (last == std::string::npos) last = s.size(); else ++last; s.replace (first, last - first, last - first, '*'); } } };

Unlikely to happen but worth mentioning that subsequent occurrences of the token do not get masked.

Since this is called for every write to the log, and as written it does seven passes of the string, how about using a regex to do one pass. Something like:

std::string scrub (std::string s) { static std::regex const r{ R"regex("(seed|seed_hex|secret|master_key|master_seed|master_seed_hex|passphrase)")regex", std::regex_constants::optimize}; std::smatch m; std::smatch::difference_type p = 0; while (1) { if (!std::regex_search(s.cbegin() + p, s.cend(), m, r)) return s; p = m.position() + p; p = s.find('\"', p + m.length()); if (p == std::string::npos) return s; auto last = s.find(',', p); if (last == std::string::npos) last = s.size(); s.replace (p, last - p, last - p, '*'); p = last; } return s; }

Data point:

I took my implementation and tested it against the input string in the bug report that is prefixed by "2018-Jan-25 15:02:38 Server:DBG Reply:". Then I took Scott's implementation, changed it to search for the second " instead of ,, and tested it against the same string. Compiled with clang/libc++ at -O3. Mine took about 50us and Scott's about 230us. Other regex implementations might be much faster. I wrote the libc++ regex, and I know it sucks. :-\

@HowardHinnant Does that include the time to compile the regex?

@HowardHinnant Ah I see. I incorrectly assumed we only wanted to mask within the quotes to keep the json valid.

Personally I think "***" would look better, but I'm not going to loose any sleep either way. :-)

I agree. It would be, theoretically, marginally more secure (you don't leak the length). However, that would necessitate a lot more copying. I'm happy to tweak that if others think it's important.

I think masking only within the quotes is a good idea. The tweak I previously mentioned to @HowardHinnant may be all we need.

I wrote another version...

auto scrubber = [&s](char const* prefix, std::vector<char const*> const& endings) { auto first = s.find(prefix); if (first == std::string::npos) return; size_t const offset = first + std::strlen(prefix); for (auto const& e : endings) { auto const len = std::strlen(e); if ((s.compare(offset, len, e)) == 0) { first = s.find('\"', offset + len); if (first != std::string::npos) { auto last = s.find('\"', first + 1); if (last == std::string::npos) last = s.size(); else ++last; s.replace(first, last - first, last - first, '*'); } return; } } };

called as such...

static std::vector<std::pair<char const*, std::vector<char const*>>> const tokens { {{"\"seed"}, {"\"", "ed_hex\""}}, {{"\"secret"}, {"\""}}, {{"\"master_"}, {"key\"", "seed\"", "seed_hex\""}}, {{"\"passphrase"}, {"\""}} }; for (auto const& t : tokens) scrubber(t.first, t.second);

This version is about 80% faster than @HowardHinnant's version but probably not worth having the extra complexity.

HowardHinnant · 2018-01-26T16:22:05Z

src/ripple/basics/impl/Log.cpp

+    scrubber ("\"master_seed\"");
+    scrubber ("\"master_seed_hex\"");
+    scrubber ("\"passphrase\"");
+


As a minor exercise to increase our confidence this is a complete list, how about commenting the list of JSS in src/ripple/protocol/JsonFields.h with an attribute for each field that contains sensitive information. For example:

JSS ( passphrase ); // in [sensitive]: WalletPropose

As an example, the act of writing this comment caused me to wonder if the next JSS down from passphrase is also "sensitive" (password).

sublimator · 2018-01-26T17:11:00Z

Haha #bikeshedding

HowardHinnant · 2018-01-26T17:20:25Z

We have this thing about performance...

seelabs · 2018-01-29T19:04:38Z

One issue is we create an unnecessary copy of the message string when there are no secrets to be scrubbed (and the vast majority will have no secrets).

How about an interface that "appendsScrubbed" instead of "return scrubbed". This makes it easier to avoid the copy. Something like (note untested code, just look at the interface and how the copy is avoided). Another alternative would be remove the scrub function and inline the code into format: seelabs@ab1fc22

nbougalis · 2018-01-29T20:24:01Z

I've addressed all review comments and restructured the code to reduce the amount of copying by inlining the scrub function.

HowardHinnant

Looks good to me and tests out fine here.

miguelportilla · 2018-01-29T23:21:04Z

The code is passable but I will mention a few weaknesses.

Subsequent tokens do not get masked.
eg. "secret":"confidential","secret":"exposed"
If values are encountered similar to the tokens, the wrong thing gets masked.
eg. "some_key" : "secret"
The current masking includes the double quotation invalidating json objects.

Per issue #2354, when the log level of a server was configured at "trace", sensitive keying meterial generated by the `wallet_propose` command could be written to the server's log file, if one was configured. This commit improves the log scrubbing code to account for the sensitive information generated by a `wallet_propose`. ** Important security consideration ** We still caution everyone *against* executing this command on a server that they do not control: a malicious server operator could intercept the generated keypair, or operate a modified server that returns keypairs that are not securely generated.

mellery451

I'm good with the current state of this change. That said, I'm slightly apprehensive about the string matching for every log line, when in fact, we only care about certain messages that actually have json in them. To that end, I threw together a possibly alternative approach by modifying StyledStreamWriter: mellery451@592749a. This is incomplete and untested, but you get the idea. The main downside here is that you have to know to mod teh stream if you are streaming json to the log, but that should be a relatively small number of locations (?). In any event, I'm fine with the current approach and only threw this out to see if it's worth considering.

nbougalis · 2018-01-30T01:26:55Z

@miguelportilla: Thanks. I think that this isn't really meant to be a comprehensive solution. We can't possibly redact everything and if we could today, we couldn't guarantee that wouldn't change tomorrow.

To address your points more specifically:

There should never be subsequent tokens in the JSON we are filtering.
I guess someone could make it happen by crafting input that results in a value matching an expecetd token, but that only hurts the person doing that request and fixing it is not possible without extra logic to understand the JSON, which is probably not a good idea.
The current masking should not invalidate anything: it includes the opening and closing quotes and replaces only the content with *.

codecov-io · 2018-01-30T01:27:50Z

Codecov Report

Merging #2358 into develop will increase coverage by 0.01%.
The diff coverage is 100%.

@@             Coverage Diff             @@
##           develop    #2358      +/-   ##
===========================================
+ Coverage    70.04%   70.05%   +0.01%     
===========================================
  Files          704      703       -1     
  Lines        53342    53036     -306     
===========================================
- Hits         37361    37155     -206     
+ Misses       15981    15881     -100

Impacted Files	Coverage Δ
src/ripple/json/json_writer.h	`100% <ø> (ø)`	⬆️
src/ripple/rpc/impl/Handler.cpp	`98.96% <ø> (-0.02%)`	⬇️
src/ripple/net/impl/RPCCall.cpp	`61.8% <ø> (+0.24%)`	⬆️
src/ripple/rpc/impl/RPCHandler.cpp	`65.38% <100%> (+1.63%)`	⬆️
src/ripple/json/impl/json_writer.cpp	`79.71% <100%> (+0.64%)`	⬆️
src/ripple/basics/DecayingSample.h	`77.77% <0%> (-8.34%)`	⬇️
src/ripple/server/impl/BasePeer.h	`46.15% <0%> (-3.85%)`	⬇️
src/ripple/protocol/impl/STVar.cpp	`85.71% <0%> (-2.6%)`	⬇️
src/ripple/basics/impl/CheckLibraryVersions.cpp	`93.18% <0%> (-2.28%)`	⬇️
src/ripple/conditions/impl/error.cpp	`43.24% <0%> (-1.5%)`	⬇️
... and 107 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eaff9a0...e117578. Read the comment docs.

seelabs · 2018-01-30T02:53:20Z

In 0.90.0-b5

miguelportilla · 2018-01-30T14:23:49Z

We may want to use the jsonFields string values when defining the tokens.
eg. jss::seed vs "seed"

nbougalis requested review from HowardHinnant and mellery451 January 26, 2018 09:12

HowardHinnant reviewed Jan 26, 2018

View reviewed changes

scottschurr assigned mellery451 and HowardHinnant Jan 29, 2018

HowardHinnant approved these changes Jan 29, 2018

View reviewed changes

nbougalis added 2 commits January 29, 2018 17:21

Remove deprecated wallet_seed RPC endpoint

e117578

mellery451 approved these changes Jan 30, 2018

View reviewed changes

HowardHinnant added the Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. label Jan 30, 2018

seelabs closed this Jan 30, 2018

nbougalis deleted the scrub branch March 22, 2018 03:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve log scrubbing: #2358

Improve log scrubbing: #2358

nbougalis commented Jan 26, 2018 •

edited

Loading

ripplelabs-jenkins commented Jan 26, 2018 •

edited

Loading

HowardHinnant Jan 26, 2018

miguelportilla Jan 26, 2018 •

edited

Loading

seelabs Jan 26, 2018 •

edited

Loading

HowardHinnant Jan 26, 2018

seelabs Jan 26, 2018

miguelportilla Jan 29, 2018

HowardHinnant Jan 29, 2018

nbougalis Jan 29, 2018

miguelportilla Jan 29, 2018

miguelportilla Jan 29, 2018 •

edited

Loading

HowardHinnant Jan 26, 2018

sublimator commented Jan 26, 2018 via email

HowardHinnant commented Jan 26, 2018

seelabs commented Jan 29, 2018

nbougalis commented Jan 29, 2018 •

edited

Loading

HowardHinnant left a comment

miguelportilla commented Jan 29, 2018 •

edited

Loading

mellery451 left a comment •

edited

Loading

nbougalis commented Jan 30, 2018

codecov-io commented Jan 30, 2018

seelabs commented Jan 30, 2018

miguelportilla commented Jan 30, 2018

Improve log scrubbing: #2358

Improve log scrubbing: #2358

Conversation

nbougalis commented Jan 26, 2018 • edited Loading

Important security consideration

ripplelabs-jenkins commented Jan 26, 2018 • edited Loading

Jenkins Build Summary

Test Results

Choose a reason for hiding this comment

miguelportilla Jan 26, 2018 • edited Loading

Choose a reason for hiding this comment

seelabs Jan 26, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

miguelportilla Jan 29, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sublimator commented Jan 26, 2018 via email

HowardHinnant commented Jan 26, 2018

seelabs commented Jan 29, 2018

nbougalis commented Jan 29, 2018 • edited Loading

HowardHinnant left a comment

Choose a reason for hiding this comment

miguelportilla commented Jan 29, 2018 • edited Loading

mellery451 left a comment • edited Loading

Choose a reason for hiding this comment

nbougalis commented Jan 30, 2018

codecov-io commented Jan 30, 2018

Codecov Report

seelabs commented Jan 30, 2018

miguelportilla commented Jan 30, 2018

nbougalis commented Jan 26, 2018 •

edited

Loading

ripplelabs-jenkins commented Jan 26, 2018 •

edited

Loading

miguelportilla Jan 26, 2018 •

edited

Loading

seelabs Jan 26, 2018 •

edited

Loading

miguelportilla Jan 29, 2018 •

edited

Loading

nbougalis commented Jan 29, 2018 •

edited

Loading

miguelportilla commented Jan 29, 2018 •

edited

Loading

mellery451 left a comment •

edited

Loading