-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add shard download and import RPC #2561
Conversation
@miguelportilla, I'm honored by the review request. You should be aware that I'll be leaving for a C++ Standards Committee meeting on Friday and I won't be done with the meeting work until June 11th. So I'm unlikely to get started on the review work until June 11th. I'm fine with that, but if you want some review attention sooner than that you may want to select a different reviewer. Sorry for the inconvenience. |
Jenkins Build SummaryBuilt from this commit Built at 20180807 - 23:52:53 Test Results
|
@scottschurr Thanks for that, I'll see if I can get someone else to start before that. |
Are nodes that are being imported checked/validated for being actually part of the current chain? |
@MarkusTeufelberger Yes. By default, each imported shard is checked and validated as being part of the current chain. Because validating all nodes of the SHAmaps is expensive, the validation can be skipped ( |
Is there some deterministic way to generate a shard database? Both NuDB and RocksDB don't seem very deterministic to me regarding the storage format... it would be nice to have a way to generate/verify static historic files that contain shard contents. Probably out of scope for this PR though. Maybe a bit more in scope - would HTTP transport be possible too? I could imagine hosting some extensive shard databases on IPFS (https://ipfs.io) and while it probably would be possible to also host a local mirror with https, it would likely be easier and secure enough to do so with http. |
@MarkusTeufelberger Thanks for your feedback. HTTP transport can be supported. I have been investigating IPFS. Unfortunately, there isn't a c++ protocol implementation. There is, however, a c implementation that may be useful. |
It would be enough for now to have HTTP, since it is relatively easy to run a local instance that makes files available via HTTP (similar to maybe a FUSE filesystem) |
src/ripple/basics/impl/Archive.cpp
Outdated
boost::filesystem::path const& dst) | ||
{ | ||
auto ar {archive_read_new()}; | ||
archive_read_support_format_tar(ar); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typical of c-APIs, all of the methods return status codes. Although I'm not sure what failure mode could cause these unchecked methods to fail, I wonder if we should at least wrap them in asserts
to catch the totally unexpected? I just hate for any rc to be ignored if it's being returned...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, will address.
src/ripple/basics/impl/Archive.cpp
Outdated
archive_write_close(aw); | ||
archive_write_free(aw); | ||
|
||
if (!err.empty()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's an option to use unique_ptr
to close/free these resources and it cleans-up the code here a little bit : mellery451@07415db if you want
to consider that approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good approach, will use.
complete(); | ||
} | ||
|
||
}// ripple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this download class, but I'm wondering how it differs or overlaps with the functionality in HTTPClient.h/cpp
- would there be an opportunity to merge the two?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main difference is that this class specifically downloads files where HTTPClient can't. It also uses the latest boost beast whereas HTTPClient does not use beast. In a future PR, HTTPClient could be replaced with this class.
auto const shardIndex {seqToShardIndex(seq)}; | ||
std::unique_lock<std::mutex> l(m_); | ||
assert(init_); | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think this scope block is especially helpful now that the lock is moved out, but it's your call...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its to avoid renaming the pre shard iterator. I am not sure if renaming the pre iterator and breaking the name convention is an improvement. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that makes sense.
src/ripple/nodestore/impl/Shard.cpp
Outdated
auto fail = [&](std::string msg) | ||
{ | ||
if (!msg.empty()) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you have single-line conditionals with and without braces here...I'd prefer it consistent either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single line conditionals using the JLOG
macro make gcc unhappy. I'd prefer single line but the price is silencing the compiler on the warning.
return RPC::missing_field_error(jss::url); | ||
parsedURL url; | ||
if (!parseUrl(url, it[jss::url].asString()) || | ||
url.scheme != "https" || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for scheme in particular, it might be nice to return "only https supported" because I don't know if that would be obvious to me as a caller otherwise.
return RPC::make_param_error("invalid shard ids"); | ||
|
||
// Remove unsuitable shards | ||
for (auto it = dCtx->shards.begin(); it != dCtx->shards.end();) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What might cause a shard to be "unsuitable" - would it be if it's already in the shardstore and/or is in progress via peernet fetch? Just wondering if there is any reason to report this info back to the user...like "we already have shards x,y" or "we're already working on z".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, those cases make shard ids unsuitable. Keep in mind this is an admin command and the ids that will be downloaded are reported. I can add the omitted ids to the report but not sure if there is sufficient reason to. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at a minimum, I guess we should make a note to document the behavior - that is, if you ask for x,y,z only only get z back, then x and y were skipped for and see the logs for more info (?). I think the documentation approach is reasonable for this fairly specialized command.
dCtx->validate = context.params[jss::validate].asBool(); | ||
} | ||
if (dCtx->validate && | ||
context.app.getOPs().getOperatingMode() != NetworkOPs::omFULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since validation happens sometime later (after download, unpack, and move), is the omFULL
condition implicitly/explicitly checked again when the shard validation logic is called? To put it another way, what if the server falls out of sync between this point and when the shard is actually validated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The state is checked again when the validation occurs. If we've fallen out of sync at the time of validation, the import is skipped for the id. You bring up a good point. It's probably a good idea to defer each import/validation attempt if not synced. That would also remove the need to be synced when the RPC is received.
CMakeLists.txt
Outdated
@@ -156,6 +156,115 @@ use_protobuf() | |||
|
|||
setup_build_boilerplate() | |||
|
|||
include(ExternalProject) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out this EP doesn't work if you happen to set DESTDIR when to run make
, so I've made a few fixes in mellery451@f861f7c. That commit also stores the EP in a dir separate from the binary dir, so it's slightly more likely to be reused and not rebuilt each time. Please let me know if you have any issues with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few questions mostly - 👍 for me
fc18357
to
6b23a41
Compare
@mellery451 I pushed several commits to address your feedback. One primary change is that the free handler function that was in the downloadshard RPC was moved into its own class. It is now not necessary to be synced when the RPC is received and shard validation will be deferred if not synced. The RPC error reporting was improved and the archive function checks return codes. There is one exception to the return code checking and that is in the free call. I suppose if we passed in a log ref we would be able to log it but not sure if its worth it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes look good.
auto it {archives_.find(shardIndex)}; | ||
if (it != archives_.end()) | ||
{ | ||
return url.scheme == it->second.scheme && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might make sense to add operator ==
for parsedURL
, but not a big deal since this is the only comparison in our code at this point.
ShardArchiveHandler::add(std::uint32_t shardIndex, parsedURL&& url) | ||
{ | ||
assert(downloader_); | ||
auto it {archives_.find(shardIndex)}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super micro nit, but this can be const
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @JoeLoser.
CMakeLists.txt
Outdated
include(ExternalProject) | ||
set(RIPPLED_NIH_LIBRARIES) | ||
###################################################################### | ||
### NIH dep: lz4 ### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does NIH stand for? "Not Invented Here"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes - it's shorter than "third-party" or "external-dependency".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That works! I can safely swap out "National Institutes of Health" which made much less sense. :)
2f3202c
to
edb5c61
Compare
Codecov Report
@@ Coverage Diff @@
## develop #2561 +/- ##
===========================================
- Coverage 70.83% 70.15% -0.68%
===========================================
Files 698 703 +5
Lines 54367 54883 +516
===========================================
- Hits 38511 38505 -6
- Misses 15856 16378 +522
Continue to review full report at Codecov.
|
src/ripple/net/impl/RPCCall.cpp
Outdated
jvParams[0u].size() == 0) | ||
{ | ||
return rpcError(rpcINVALID_PARAMS); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried invoking download_shard
on the command line, and I think this check is wrong.
First of all, the commands[]
array on 1099 is already checking that there is at least one argument, so if I provide 0, it doesn't even get this far. Second, if there are arguments, I don't think jvParams[jss::shards]
will ever be set.
Examples:
eah@badger:~/dev/rippled-merge (miguelportilla--shard_transfer=)$ rippled download_shard
Loading: "/home/eah/.config/ripple/rippled.cfg"
2018-Jul-09 23:14:28.490922982 HTTPClient:NFO Connecting to 127.0.0.1:5015
{
"result" : {
"error" : "invalidParams",
"error_code" : 31,
"error_message" : "Missing field 'shards'.",
"request" : {
"command" : "download_shard"
},
"status" : "error"
}
}
eah@badger:~/dev/rippled-merge (miguelportilla--shard_transfer=)$ rippled download_shard 1
Loading: "/home/eah/.config/ripple/rippled.cfg"
2018-Jul-09 23:14:37.181363866 HTTPClient:NFO Connecting to 127.0.0.1:5015
{
"result" : {
"error" : "invalidParams",
"error_code" : 31,
"error_message" : "Missing field 'shards'.",
"request" : {
"command" : "download_shard",
"params" : [ "1" ]
},
"status" : "error"
}
}
eah@badger:~/dev/rippled-merge (miguelportilla--shard_transfer=)$ rippled download_shard 1 https://dl.dropboxusercontent.com/s/bwfu5q3l1301hmk/1.tar.lz4
Loading: "/home/eah/.config/ripple/rippled.cfg"
2018-Jul-09 23:15:19.958912395 HTTPClient:NFO Connecting to 127.0.0.1:5015
{
"result" : {
"error" : "invalidParams",
"error_code" : 31,
"error_message" : "Missing field 'shards'.",
"request" : {
"command" : "download_shard",
"params" : [ "1", "https://dl.dropboxusercontent.com/s/bwfu5q3l1301hmk/1.tar.lz4" ]
},
"status" : "error"
}
}
eah@badger:~/dev/rippled-merge (miguelportilla--shard_transfer=)$ rippled download_shard 1 https://dl.dropboxusercontent.com/s/bwfu5q3l1301hmk/1.tar.lz4 validate
Loading: "/home/eah/.config/ripple/rippled.cfg"
2018-Jul-09 23:15:24.651213680 HTTPClient:NFO Connecting to 127.0.0.1:5015
{
"result" : {
"error" : "invalidParams",
"error_code" : 31,
"error_message" : "Missing field 'shards'.",
"request" : {
"command" : "download_shard",
"params" : [
"1",
"https://dl.dropboxusercontent.com/s/bwfu5q3l1301hmk/1.tar.lz4",
"validate"
]
},
"status" : "error"
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ximinez thanks for testing and finding that, fixed in the latest commit.
3bd7db2
to
c65c93c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not done yet, but here are some more comments so you don't have to sit around waiting while I go through the whole thing.
Also, I noticed something weird when looking at my rippled shards directory. Somewhere in the process, file creation is not respecting my umask. I haven't spotted where it's happening yet, but I thought I'd alert you to it. 1
, 2
, and 5
were downloaded using download_shards
. The others were through the existing process.
$ ls -l ~/rippled/db/shards/
total 52
drwxrwxrwx 2 eah eah 4096 Jul 11 20:01 1
drwxr-xr-x 2 eah eah 4096 Jul 11 20:01 146
drwxrwxrwx 2 eah eah 4096 Jul 11 20:01 2
drwxr-xr-x 2 eah eah 4096 Jul 11 20:01 2002
drwxr-xr-x 2 eah eah 4096 Jul 11 20:01 232
drwxr-xr-x 2 eah eah 4096 Jul 11 20:01 402
drwxr-xr-x 2 eah eah 4096 Jul 11 20:01 487
drwxrwxrwx 2 eah eah 4096 Jul 11 20:01 5
drwxr-xr-x 2 eah eah 4096 Jul 11 20:01 538
drwxr-xr-x 2 eah eah 4096 Jul 11 20:01 587
drwxr-xr-x 2 eah eah 4096 Jul 11 20:01 661
drwxr-xr-x 2 eah eah 4096 Jul 11 20:01 791
drwxr-xr-x 2 eah eah 4096 Jul 11 20:01 834
$ umask
0022
src/ripple/net/impl/RPCCall.cpp
Outdated
|
||
// If odd number of params then 'validate' may have been specified | ||
if (sz & 1) | ||
jvResult[jss::validate] = jvParams[--sz].asBool(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't appear to do any input validation. It's just checking that a value was passed. I would suggest that you explicitly check that the value is validate
because if it's not, the user could have made an error (eg. provided another URL without an index), or passed in garbage. Either way, this may not be an optimal outcome. If it's not validate
, that might justify an rpcINVALID_PARAMS
.
} | ||
// Check if greater or equal to the current shard | ||
auto seqCheck = [&](std::uint32_t seq){ | ||
if (seq > earliestSeq() && shardIndex >= seqToShardIndex(seq)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something about this conditional smells to me. However unlikely it is, if the sequence is < earliestSeq
, then this lambda is going to indicated things are fine, and it's only checked here. I think that would allow the shard to be prepped even if rippled is not synced. Is that what you want? (And if it is, could you add a comment explaining the reasoning?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we try to validate the shard index using the current ledger but only if available (synced). I'll document.
src/ripple/nodestore/impl/Shard.cpp
Outdated
{ | ||
assert(!backend_); | ||
using namespace boost::filesystem; | ||
dir_ = dir / std::to_string(index_); | ||
auto const newShard {!is_directory(dir_) || is_empty(dir_)}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen if dir_
exists as a plain file? I believe that newShard
would be true, so backend_->open()
will try to create folders, which will fail, so open
will throw, that will be caught and fail
will be called. Since newShard
is true, fail
will delete the file. Now chances are the file doesn't belong there, and there's an argument it should be deleted, but my concern is that rippled will delete an arbitrary file that something else put there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The directories are created exclusively for the internal use of shards by rippled and should not be manually modified. In the scenario you described, a file or sym link created by an external event in the shard root directory bearing an index name would be deleted. Although in practice that should not happen, I agree with your concern. Thanks, I will address the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do agree that nobody should be in there, but people do weird things...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting there... One more suggestion from today.
std::string(jss::validate), "a bool"); | ||
} | ||
validate = context.params[jss::validate].asBool(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since validate
defaults to true
, there's no point in including it as an option for the command line, unless you change parseDownloadShard
to default jvResult[jss::validate]
to false if it's not specified. I do not endorse this.
In fact, what I'd suggest, is to change the command-line optional parameter to novalidate
, and reverse the logic in parseDownloadShard
. The justification is that if we want to validate by default, then that should be the default on the command line, too, and it should require extra action to disable validation. See also, the comment I left in parseDownloadShard
about validating the provided value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to respond to your earlier comment and mention validate
defaults to true. I like the novalidate
suggestion, will do.
e415893
to
c5a5ae3
Compare
@ximinez feedback addressed. Thanks for the deep and thorough look. The directory permissions issue is tricky. Apparently, boost has a known issue creating directories recursively with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty impressed by these new features. I left a couple more comments, but I'm approving the PR as a whole.
std::to_string(url.port.get_value_or(443)), | ||
url.path, | ||
11, | ||
dstDir / url.path.substr(url.path.find_last_of("/\\") + 1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably pretty unlikely, but this computed directory name could cause some shenanigans if the url
is poorly formed and not validated before being added. I'm thinking specifically of a case where new code calls add
without the checks that are currently in place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, fixed! Thanks.
++i; | ||
else if (!iequals(jvParams[--sz].asString(), "novalidate")) | ||
return rpcError(rpcINVALID_PARAMS); | ||
jvResult[jss::validate] = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's pretty clever. Allow just a little bit of room for the user not to do it perfectly. I like it.
df3ca8f
to
35ff3af
Compare
35ff3af
to
83006ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updates look good. Still 👍
83006ed
to
8ce82db
Compare
4b2c28d
to
384c6b8
Compare
384c6b8
to
dd2c3f2
Compare
In 1.1.0-rc1 |
Adds an RPC that allows admins to download and import shard archives from HTTPS web servers. A shard archive is a tar of a complete shard directory compressed with lz4. The libarchive library was added to rippled for its multi-format archive and compression functionality. The pull request consists of four commits. The first adds a free function that uses libarchive to extract the archives (tar.lz4). The second commit adds an HTTPS client designed to download files using the boost beast library and asynchronous coroutines. The third commit adds hot shard import capability to the shard store. The fourth adds the actual RPC command allowing rippled to download and import archives.
Admin privilege is necessary for the RPC.
The RPC takes the following form
For testing purposes I have created the following archives in NuDB and RocksDB formats.
NuDB format
RocksDB format
For testing purposes, the RPC can be issued using a browser and this site http://www.websocket.org/echo.html or from the command line using curl eg.
curl -X POST -d '{ "method" : "download_shard", "params" : [ { "shards" : [{"index": 1, "url": "https://dl.dropboxusercontent.com/s/bwfu5q3l1301hmk/1.tar.lz4"}] } ] }' http://localhost:5005