Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[kad] Provide a targeted store operation. #1988

Merged
merged 8 commits into from
Mar 9, 2021

Conversation

romanb
Copy link
Contributor

@romanb romanb commented Mar 4, 2021

This PR adds Kademlia::put_record_to for storing a record at specific nodes, e.g. for write-back caching after a successful read. In that context, peers that were queried in a successful Kademlia::get_record operation but did not return a record are now returned in the GetRecordOk::no_record list of peer IDs as additional information which may be used to choose which of these peers to send (one of) the found records to.

The motivation for this PR originates in the discussion at #1577, i.e. the inability for libp2p-kad itself to do write-back caching of records after successful lookups when a quorum > 1 is used. It is deemed best and the most flexible to provide the means to do so on the API. There may be other use-cases for targeted store operations as well. That is to say, Kademlia::put_record_to is intended as to augment the use of Kademlia::put_record with occasional targeted store operations, not as a replacement. Though if someone wishes to only use Kademlia::put_record_to, that can obviously be done, the resulting DHT then just no longer has the properties attributed to Kademlia.

I'm not sure if it would be worthwhile to be able to configure (i.e. enable/disable) the default behaviour of write-back caching upon lookups with a quorum of 1. Maybe some users would like full control over that behaviour by disabling it and doing any such writes themselves via Kademlia::put_record_to. Feedback appreciated.

e.g. for write-back caching after a successful read. In that context,
peers that were queried in a successful `Kademlia::get_record` operation but
did not return a record are now returned in the `GetRecordOk::no_record`
list of peer IDs.

Closes libp2p#1577.
@romanb romanb changed the title [kad] Provider a targeted store operation. [kad] Provide a targeted store operation. Mar 4, 2021
if source_key.distance(&key) < cache_key.distance(&key) {
} else {
no_record.push(source);
if quorum.get() == 1 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it would be worthwhile to be able to configure (i.e. enable/disable) the default behaviour of write-back caching upon lookups with a quorum of 1. Maybe some users would like full control over that behaviour by disabling it and doing any such writes themselves via Kademlia::put_record_to. Feedback appreciated.

I am fine with either solution. That said I do tend towards not treating quorum of 1 as a special case. Instead I suggest to always require the user to call put_record_to in order for write back caching to close nodes without the record to happen. Not having a special case for quorum of 1 seems more consistent to me, despite the fact that it is not perfectly inline with the paper.

I would deem an additional note on the GetRecordOk structure, informing the user of their responsibility, as well as an entry in the changelog enough. We could as well enforce this at the type level, requiring the collection of peers returned in GetRecordOk::no_record to be used, just like std::result::Result.

What do you think @romanb?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit reluctant to not do it by default for quorum 1 lookups, because I think libp2p-kad should offer "standard Kademlia" behaviour out of the box where you just put_record and get_record with the expected behaviour. It should of course be documented clearly that when using get_record with a quorum > 1 (or put_record with a quorum < ALL for that matter), you are leaving the territory of basic Kademlia and then you should need to care about put_record_to.

In the most recent commit I tried a different path: I made the caching more configurable, but retaining the default behaviour. Moreover, configuring the caching and enabling it gives control over how many "closest nodes to the key that did not return a record" are tracked and returned on a successful lookup. These are always returned in GetRecordOk::cache_candidates if caching is enabled, ordered by distance, but only for quorum 1 lookups do these also get automatically sent the record. I tried to document this clearly. What do you think? If you feel strongly that we should never do the caching automatically, let me know.

protocols/kad/src/behaviour.rs Outdated Show resolved Hide resolved
romanb and others added 3 commits March 9, 2021 10:42
Rather than returning the peers that are cache candidates
in a `Vec` in an arbitrary order, use a `BTreeMap`. Furthermore,
make the caching configurable, being enabled by default with
`max_peers` of 1. By configuring it with `max_peers` > 1 it is
now also possible to control at how many nodes the record is cached
automatically after a lookup with quorum 1. When enabled,
successful lookups will always return `cache_candidates`, which
can be used explicitly with `Kademlia::put_record_to` after
lookups with a quorum > 1.
Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit reluctant to not do it by default for quorum 1 lookups, because I think libp2p-kad should offer "standard Kademlia" behaviour out of the box where you just put_record and get_record with the expected behaviour.

Agree that this is something worth striving for.

In the most recent commit I tried a different path: I made the caching more configurable, but retaining the default behaviour. Moreover, configuring the caching and enabling it gives control over how many "closest nodes to the key that did not return a record" are tracked and returned on a successful lookup. These are always returned in GetRecordOk::cache_candidates if caching is enabled, ordered by distance, but only for quorum 1 lookups do these also get automatically sent the record. I tried to document this clearly. What do you think? If you feel strongly that we should never do the caching automatically, let me know.

I don't feel strongly about this and I am in favor of your recent changes, namely the introduction of KademliaCaching. In case people are in need for configuring the quorum == 1 caching case as well, I would hope for them to raise an issue in the future.

While always the case, thank you for the detailed documentation!

protocols/kad/CHANGELOG.md Outdated Show resolved Hide resolved
protocols/kad/CHANGELOG.md Outdated Show resolved Hide resolved
@romanb romanb merged commit f48bb15 into libp2p:master Mar 9, 2021
@romanb romanb deleted the kad-put-record-to branch March 9, 2021 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

protocols/kad: Why run caching procedure for queries with Quorum::One only?
2 participants