Persist routing table #254
Thoughts.

Persistence medium

Seeding strategy

We need to decide how we're going to seed the routing table on start, because it'll affect what metadata we persist about peers in the routing table, and hence the data model. Brainstorming options:
I think we should (a) keep the address information in the peer store, and (b) persist a single blob containing the routing table (peer IDs only). Really, we can probably just persist the entire routing table and reconnect to all of it on start; no need to score. Thoughts?
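A minimal sketch of what (a) + (b) could look like, assuming a go-datastore instance holds the blob; the key name and the newline-delimited base58 encoding are illustrative, not a proposed format:

```go
import (
	"strings"

	"github.com/ipfs/go-datastore"
	kb "github.com/libp2p/go-libp2p-kbucket"
	peer "github.com/libp2p/go-libp2p-peer"
)

// rtKey is an illustrative datastore key for the snapshot blob.
var rtKey = datastore.NewKey("/dht/routing-table")

// saveSnapshot persists only the peer IDs; addresses stay in the peerstore.
// (Uses the pre-context datastore API of the era.)
func saveSnapshot(ds datastore.Datastore, rt *kb.RoutingTable) error {
	var ids []string
	for _, p := range rt.ListPeers() {
		ids = append(ids, p.Pretty()) // base58 text form of the peer ID
	}
	return ds.Put(rtKey, []byte(strings.Join(ids, "\n")))
}

// loadSnapshot restores the peer IDs, skipping any corrupt entries.
func loadSnapshot(ds datastore.Datastore) ([]peer.ID, error) {
	blob, err := ds.Get(rtKey)
	if err != nil {
		return nil, err
	}
	var out []peer.ID
	for _, s := range strings.Split(string(blob), "\n") {
		if id, err := peer.IDB58Decode(s); err == nil {
			out = append(out, id)
		}
	}
	return out, nil
}
```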
go-ethereum seeds with a randomised sample instead of rematerializing the table verbatim, as a way to recover from borked tables. I think it's a sensible strategy, it's more secure, and it works.
Randomized sample from the routing table? Connecting to a subset shouldn't be any more secure than connecting to all, unless I'm missing something.
In a scenario where your routing table is borked (e.g. all nodes keep their TCP connections alive but they ignore queries) and you restart, if you materialise the same table you will be stuck again. That's why it's more secure to seed a subset, and perform a random walk from those, to fill empty slots with fresh nodes.
Won't that subset just ignore queries?
@Stebalien you're right. I don't know, it feels wrong not to introduce a randomisation component over restarts, maybe mixing in well-known/reputable nodes as well during seeding (for now, our bootstrappers; later, nodes we've calculated a high reputation for over time). Randomisation is a good starting point to defend against some attacks in a Byzantine scenario.
Basically, with a mix of randomised past nodes + a few bootstrappers (to keep it simple), we should be able to recover from a borked table after a restart, albeit slowly, but without manual intervention.
Mixing in some bootstrappers sounds like a good idea. |
From looking at BitTorrent client implementations, the only criterion for storing a node for future use is this: you've successfully established an outgoing connection to it (i.e. it's reachable). On the next startup, a number of nodes are selected at random (e.g. 10). If any of them is unreachable, replace it with another randomly selected node; if you run out of nodes, fall back to the bootstrapping nodes. If you want to be smart about it, you can try selecting some nodes that have a short prefix length from your own public IP, and some nodes from a diverse set of subnets. However, I do believe blindly choosing nodes at random is a perfectly valid strategy: nodes are expected to leave and join the network all the time, so it's probably just a waste of time to try to carefully select the perfect set of nodes.
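A hedged sketch of that strategy in libp2p terms, where a successful outgoing connection is the only criterion; seedRandom and the target count are illustrative, and addresses are assumed to still live in the peerstore so Connect can dial by peer ID alone:

```go
import (
	"context"
	"math/rand"

	host "github.com/libp2p/go-libp2p-host"
	peer "github.com/libp2p/go-libp2p-peer"
	pstore "github.com/libp2p/go-libp2p-peerstore"
)

// seedRandom tries randomly ordered saved peers first, then falls back to
// the bootstrapping nodes once the saved set is exhausted.
func seedRandom(ctx context.Context, h host.Host, saved, bootstrap []peer.ID, target int) []peer.ID {
	rand.Shuffle(len(saved), func(i, j int) { saved[i], saved[j] = saved[j], saved[i] })
	var seeded []peer.ID
	for _, p := range append(saved, bootstrap...) {
		if len(seeded) == target {
			break
		}
		// reachability is the only criterion: attempt an outgoing connection,
		// relying on the peerstore for this peer's addresses
		if err := h.Connect(ctx, pstore.PeerInfo{ID: p}); err == nil {
			seeded = append(seeded, p)
		}
	}
	return seeded
}
```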
@Mikaela AFAIK, at the moment, one way to do that is for the application to talk to the cjdns/yggdrasil API directly to fetch nodes. But assuming you're not peering over clearnet and the nodes around you are cjdns/ygg native, you could discover them via IPv6 loopback multicast, and then switch to cjdns/ygg for data exchange once you discover the IPs these nodes are advertising (and save them, because ygg/cjdns IPs don't change when nodes move around).
Right on. You just store your routing table wholesale.
Again, essentially standard routing table treatment. Your point about falling back to bootstrap nodes is the missing link in this implementation. Any time the routing table becomes unviable (or a query stalls because of it), a callback that returns bootstrap nodes can be invoked to get things rolling again, like:
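A minimal sketch of such a hook, assuming kbucket's RoutingTable API; FallbackPeers, refillIfUnviable, and the viability threshold are hypothetical names:

```go
import (
	kb "github.com/libp2p/go-libp2p-kbucket"
	peer "github.com/libp2p/go-libp2p-peer"
)

// FallbackPeers returns peers (e.g. the static bootstrappers) to fall back
// on whenever the routing table becomes unviable.
type FallbackPeers func() []peer.ID

// refillIfUnviable invokes the callback when the table drops below a
// viability threshold, and feeds the returned peers back into the table.
func refillIfUnviable(rt *kb.RoutingTable, fallback FallbackPeers, minViable int) {
	if rt.Size() >= minViable {
		return // still viable; nothing to do
	}
	for _, p := range fallback() {
		rt.Update(p) // kbucket insert-or-refresh
	}
}
```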
I agree. It's sufficient to just use your routing table management policy at all times, both before and after persisting. By definition, the routing table contains the best available peers, appropriately distributed throughout the keyspace, at all times anyway.
Let's try not to hardcode any behaviour, and let's make these strategies modular. That's the ethos of libp2p. Unlike a specific BitTorrent client, etc., libp2p is a library for a multitude of use cases, and no one size fits all. I'm thinking of two interfaces you can inject via options:

```go
import (
	host "github.com/libp2p/go-libp2p-host"
	kb "github.com/libp2p/go-libp2p-kbucket"
	peer "github.com/libp2p/go-libp2p-peer"
	pstore "github.com/libp2p/go-libp2p-peerstore"
)

type Loader interface {
	// Load receives a snapshot of the members of the previous routing table
	// (possibly 0-length), and the peerstore (so it can verify whether we
	// still hold addresses for past members).
	//
	// It selects the candidates to add to the routing table, and it may use
	// any heuristic, such as past observed latencies, failure rate,
	// helpfulness score, etc. (recorded as metadata in the peerstore).
	Load(previous []peer.ID, pstore pstore.Peerstore) (candidates []peer.ID)
}

type Seeder interface {
	// Seed receives the routing table to populate, the candidates the Loader
	// picked, and the fallback peers (static bootstrap peers).
	//
	// Seed must ensure connectivity to any peers it adds, and in that process
	// it can use metrics such as current, real latency to filter peers.
	//
	// In the event that candidates are unviable, it can fall back to the
	// static bootstrap peers to salvage the situation.
	Seed(host host.Host, table *kb.RoutingTable, candidates, fallback []peer.ID)
}
```

We should have default strategies for when the user doesn't pass these options in the constructor, like the ones that have been cited. We can also define common strategies in exported vars, so that the user can combine them as if they were recipes.
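For instance, a hedged sketch of one such default: a Seeder implementing the randomised mix discussed earlier (shuffled past candidates plus the static bootstrappers, admitting only peers we can connect to). The type name, timeout, and target are illustrative:

```go
import (
	"context"
	"math/rand"
	"time"

	host "github.com/libp2p/go-libp2p-host"
	kb "github.com/libp2p/go-libp2p-kbucket"
	peer "github.com/libp2p/go-libp2p-peer"
	pstore "github.com/libp2p/go-libp2p-peerstore"
)

// randomSeeder is one possible default: randomised candidates mixed with
// the static bootstrap peers, admitting only peers we can connect to.
type randomSeeder struct {
	target int // how many peers to seed, e.g. 10
}

func (s *randomSeeder) Seed(h host.Host, table *kb.RoutingTable, candidates, fallback []peer.ID) {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	rand.Shuffle(len(candidates), func(i, j int) {
		candidates[i], candidates[j] = candidates[j], candidates[i]
	})

	added := 0
	// try randomised candidates first, then fall back to bootstrap peers
	for _, p := range append(candidates, fallback...) {
		if added == s.target {
			return
		}
		// Seed must ensure connectivity before adding an entry
		if err := h.Connect(ctx, pstore.PeerInfo{ID: p}); err != nil {
			continue
		}
		table.Update(p)
		added++
	}
}
```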
After creating a node, it should be possible to seed the routing table with entries. It's already possible to access the routing table of a node; we only need to determine the values required to reconstitute a peer's entry, and expose that. That would include the peer ID, and its known addresses if they're not handled externally. Adding nodes to the routing table directly via a method on the DHT should only be required if the routing table does not have sufficient context to receive the above type. Correspondingly, the DHT user should be able to retrieve a snapshot of the routing table before it is closed, or on closing, to persist as they see fit. Again, the only question here is what data is required from the DHT to persist a routing table entry and restore it later. Deferring address management and persistence to the peerstore, or whichever component is responsible for addresses, makes things a little less convenient, but possibly more flexible.
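Under those constraints, the consumer-side flow could look roughly like the sketch below; persistEntry and loadEntries are hypothetical application-level serializers, and RoutingTable() and Update() are assumed to be the DHT's accessor and insertion hook:

```go
import (
	"context"

	host "github.com/libp2p/go-libp2p-host"
	dht "github.com/libp2p/go-libp2p-kad-dht"
	peer "github.com/libp2p/go-libp2p-peer"
	pstore "github.com/libp2p/go-libp2p-peerstore"
	ma "github.com/multiformats/go-multiaddr"
)

// persistEntry/loadEntries are hypothetical application-level serializers.
func persistEntry(p peer.ID, addrs []ma.Multiaddr) { /* app-specific */ }
func loadEntries() []pstore.PeerInfo               { return nil /* app-specific */ }

// snapshot captures the table on shutdown; addresses come from the peerstore.
func snapshot(d *dht.IpfsDHT, h host.Host) {
	for _, p := range d.RoutingTable().ListPeers() {
		persistEntry(p, h.Peerstore().Addrs(p))
	}
}

// restore reconstitutes entries on start, reconnecting first to honour the
// invariant (discussed below) that all routing table entries are connected.
func restore(ctx context.Context, d *dht.IpfsDHT, h host.Host) {
	for _, pi := range loadEntries() {
		h.Peerstore().AddAddrs(pi.ID, pi.Addrs, pstore.AddressTTL)
		if err := h.Connect(ctx, pstore.PeerInfo{ID: pi.ID}); err == nil {
			d.Update(ctx, pi.ID) // assumed insertion hook on the DHT
		}
	}
}
```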
I think #283 should be done first, as it may affect what information needs to be persisted. |
#283 also makes reconstituting a routing table more robust. Currently, if we did this, we'd have to connect to each added peer to maintain the routing table invariant that all entries are connected. That also highlights how flawed the assumption is: if we can't connect to any peers while adding to the routing table, due to intermittent connectivity issues, we can't restore the routing table.
I'm suggesting, more or less:
All other behaviours at the whim of the consumer.
Note that the DHT may not be the only service that would benefit from remembering peers and reconnecting to them. I kind of see that as a separate, non-dht issue that should be handled by the libp2p host directly.
tl;dr: Don't make peerstore persistence a dht-only feature.
Juan brought up that simply persisting the routing table and reviving from that could be dangerous. If an attacker can ever "take over" a peer's routing table and control their view of the network, recovering from this situation would be impossible without using the centralized bootstrappers. The proposed solution is to look at historical data. That is, we should bootstrap off of peers that we've seen repeatedly over a long period of time. A possible solution (probably not optimal) is to keep bucketed counts of how frequently we've seen and used each peer over time:
We'd prefer peers that:
This solution is probably quite a bit more complicated than it needs to be, but I hope it illustrates what I think we need.
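A rough sketch of what such bucketed bookkeeping could look like; the bucket granularity and scoring weights are illustrative only:

```go
import peer "github.com/libp2p/go-libp2p-peer"

// peerHistory counts sightings/uses of a peer in coarse time buckets,
// e.g. index 0 = past day, 1 = past week, 2 = past month, 3 = older.
type peerHistory struct {
	seen [4]uint32
	used [4]uint32
}

type historyBook map[peer.ID]*peerHistory

// score prefers peers observed across many buckets (longevity) over peers
// observed many times within a single recent bucket (bursts).
func (hb historyBook) score(p peer.ID) (s int) {
	h, ok := hb[p]
	if !ok {
		return 0
	}
	for i := range h.seen {
		if h.seen[i] > 0 {
			s += i + 1 // older, coarser buckets weigh more
		}
		if h.used[i] > 0 {
			s += i + 1
		}
	}
	return s
}
```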
Temporary solution:
And I found that Network().Peerstore().PeersWithAddrs() contains more information about peers than kademlia.RoutingTable().ListPeers(), so it is better to use that.
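For illustration, a hedged snippet contrasting the two calls, assuming h is the libp2p host backing the DHT:

```go
import (
	"fmt"

	host "github.com/libp2p/go-libp2p-host"
	kb "github.com/libp2p/go-libp2p-kbucket"
)

// comparePeerSets shows why the peerstore view suits persistence better:
// it is typically a superset of the routing table, and it carries the
// addresses a restart actually needs to redial.
func comparePeerSets(h host.Host, rt *kb.RoutingTable) {
	fmt.Println("routing table peers:", len(rt.ListPeers()))
	for _, p := range h.Network().Peerstore().PeersWithAddrs() {
		fmt.Printf("%s -> %v\n", p.Pretty(), h.Peerstore().Addrs(p))
	}
}
```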
Standard DHT implementations don't require bootstrapping on every instance start-up. It should be possible to import a previous instance's routing table and avoid having to bootstrap from a set of bootstrap nodes. Note that regularly refreshing the routing table is still common (and may also be called bootstrapping).