Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GossipSub 1.2] IDONTWANT control message #548

Merged
merged 11 commits into from
May 15, 2024
106 changes: 106 additions & 0 deletions pubsub/gossipsub/gossipsub-v1.2.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,106 @@
# gossipsub v1.2: TODO

| Lifecycle Stage | Maturity | Status | Latest Revision |
|-----------------|---------------------------|--------|-----------------|
| 1A | Working Draft | Active | r1, 2023-07-14 |

Authors: [@Nashatyrev], [@Menduist]

Interest Group: [@vyzo], [@Nashatyrev], [@Menduist]

[@vyzo]: https://github.com/vyzo
[@Nashatyrev]: https://github.com/Nashatyrev
[@Menduist]: https://github.com/Menduist

See the [lifecycle document][lifecycle-spec] for context about maturity level and spec status.

[lifecycle-spec]: https://github.com/libp2p/specs/blob/master/00-framework-01-spec-lifecycle.md

# Overview

This document aims to provide a minimal extension to the [gossipsub
v1.1](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md)
protocol.

The proposed extensions are backwards-compatible and aim to enhance the
efficiency (minimize amplification/duplicates and decrease message latency) of
the gossip mesh networks for larger messages.

In more specific terms, a new control message is introduced: `IDONTWANT`. It's primarily
intended to notify mesh peers that the node already received a message and there is no
need to send its duplicate.

# Specification

## Protocol Id

Nodes that support this Gossipsub extension should additionally advertise the
version number `1.2.0`. Gossipsub nodes can advertise their own protocol-id
prefix, by default this is `meshsub` giving the default protocol id:
- `/meshsub/1.2.0`
Nashatyrev marked this conversation as resolved.
Show resolved Hide resolved

## Parameters

This section lists the configuration parameters that needs to agreed on across clients to avoid
peer penalizations

| Parameter | Description | Reasonable Default |
|--------------------------|------------------------------------------------------------------|--------------|
| `max_idontwant_messages` | The maximum number of `IDONTWANT` messages per heartbeat per peer | ??? |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this probably needs to be per-topic per heartbeat, rather than a total per heartbeat.

It seems it could be tied in with the scoring for mesh message delivery rate. I.e the more messages we are expecting per topic, the more IDONTWANT messages we would expect to receive.

One thought would be to add a behaviour penalty, similar to broken promises, if the number of IDONTWANT messages received from a peer exceeds the mesh message delivery rate.

We intend to implement this fairly soon. Perhaps we can leave the scoring penalty here for a future PR if we dont want to specify it now.

Copy link
Contributor Author

@Nashatyrev Nashatyrev Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracking IDONTWANT by topics would be perfect, however you don't know a message topic by its ID unless you already received this message.
IDONTWANT message is almost semantically equivalent to IHAVE message, so probably should have a similar anti-spam protection mechanism?
Probably it could make sense to set the 'reasonable default' to maxIHaveLength * maxIHaveMessages ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

however you don't know a message topic by its ID unless you already received this message.

Is it a good reason to include topics to IDONTWANT and not care what IHAVE and IWANT are doing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a good reason to include topics to IDONTWANT and not care what IHAVE and IWANT are doing?

Including a topic would increase IDONTWANT traffic 2-3 times (messageID is 20 bytes, topic could be up to around 40 bytes). Not sure if it's worth just to enable more advanced spam protection...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also consider that IHAVE is sent on heartbeat with a batch of messages which may be grouped by topics and the topic overhead here could not be that significant.
IDONTWANT on the other hand is intended to be flushed immediately and would most likely contain just a single messageID so the topic overhead would be more significant

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect it might be valuable to specify it - ie effectively, after mcache expires, the message should no longer exist and implementations should be able to rely on them being "resent" if they resurface after that time - this more faithfully keeps the protocol consistent in this aspect

Copy link
Contributor Author

@Nashatyrev Nashatyrev May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why using a weak hash function for unwanted is a problem.

@ppopth I was meaning this HashDoS attack.
The messageIDs in IDONTWANT are not validated: neither upon receive nor later (as for IHAVE). Thus an adversary may generate and send significant amount of messageIDs which yields the same hashCode in the context of unwanted hash set. That would result in a DoS on the receiver's side.
Actually this attack vector could be addressed on the implementation side pretty easily. I just wanted to add a 'warning note' for implementers into the spec

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nashatyrev Got it. Thank you so much.

Copy link
Contributor Author

@Nashatyrev Nashatyrev May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure it's been mentioned, but because the message id:s are not validated

@arnetheduck Good point!
I believe as far as the messageId function is application specific, the validation of a messageId is left for an application responsibility. I believe Ethereum clients should all have such validation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe Ethereum clients should all have such validation

I'm not convinced this is the case, ie at least in nimbus, we don't validate message id:s (only messages). There's a custom message id generation feature, but this again does not validate message id:s themselves.

This PR represents the first time we receive message id:s that we're expected to store / keep track of - all others are either generated from actual messages or ephemeral.



## IDONTWANT Message

### Basic scenario

When the peer receives the first message instance it immediately broadcasts
(not queue for later piggybacking) `IDONTWANT` with the `messageId` to all its mesh peers.
This could be performed prior to the message validation to further increase the effectiveness of the approach.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerns about spam attacks triggering amplified IDONTWANT spam?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look like a feasible attack vector to me:

  • IDONTWANT is primarily intended for larger messages, so the cumulative size of resulting IDONTWANT messages is expected to be significantly smaller than the original message
  • If an attacker is sending invalid messages to initiate IDONTWANT spamming it would be pretty quickly banned due to negative scoring
  • And we have max_idontwant_messages limit as the last resort


On the other side a node maintains per-peer `dont_send_message_ids` set. Upon receiving `IDONTWANT` from
a peer the `messageId` is added to the `dont_send_message_ids` set.
When later relaying the `messageId` message to the mesh the peers found in `dont_send_message_ids` MUST be skipped.

Old entries from `dont_send_message_ids` SHOULD be pruned during heartbeat processing.
The prune strategy is outside of the spec scope and can be decided by implementations.

`IDONTWANT` message is supposed to be _optional_ for both receiver and sender. I.e. the sender MAY NOT utilize
this message. The receiver in turn MAY ignore `IDONTWANT`: sending a message after the corresponding `IDONTWANT`
should not be penalized.

The `IDONTWANT` may have negative effect on small messages as it may increase the overall traffic and CPU load.
Thus it is better to utilize `IDONTWANT` for messages of a larger size.
The exact policy of `IDONTWANT` appliance is outside of the spec scope. Every implementation MAY choose whatever
is more appropriate for it. Possible options are either choose a message size threshold and broadcast `IDONTWANT`
on per message basis when the size is exceeded or just use `IDONTWANT` for all messages on selected topics.

To prevent DoS the number of `IDONTWANT` control messages is limited to `max_idontwant_messages` per heartbeat

### Cancelling `IWANT`

If a node requested a message via `IWANT` and then occasionally receives the message from other peer it MAY
try to cancel its `IWANT` requests with the corresponding `IDONTWANT` message. It may work in cases when a
peer delays/queues `IWANT` requests and the `IWANT` request SHOULD be removed from the queue if not processed yet

## Protobuf Extension

The protobuf messages are identical to those specified in the [gossipsub v1.0.0
specification](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md)
with the following control message modifications:

```protobuf
message RPC {
// ... see definition in the gossipsub specification
}

message ControlMessage {
// messages from v1.0
repeated ControlIDontWant iDontWant = 5;
Nashatyrev marked this conversation as resolved.
Show resolved Hide resolved
}

message ControlIDontWant {
required bytes messageID = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for discussion, what would be the pros/cons of also including the topic?
I think it's the first time that we reference messages only by their id on the wire, so needs some consideration

I guess the only con of include the topic is more bandwidth usage

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe an optional field?
Agreed about bandwidth usage, we should aim to keep this lean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IHAVE also has an optional topic field. But it looks like it is utilized by one implementation only (not sure which one exactly). I couldn't find any reasonable usage of topic for IDONTWANT tbh.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no one is opposed or has any ideas on usage scenarios I would keep it without optional topic field to be more explicit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no one is opposed or has any ideas on usage scenarios I would keep it without optional topic field to be more explicit

The usage can be #548 (comment)

Nashatyrev marked this conversation as resolved.
Show resolved Hide resolved
Nashatyrev marked this conversation as resolved.
Show resolved Hide resolved
}

```