Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gossipsub v1.1: introduce message signing policy #294

Merged
merged 9 commits into from
Sep 30, 2020

Conversation

protolambda
Copy link
Contributor

See libp2p/go-libp2p-pubsub#359 for motivation and related discussion.

In short: Eth2 uses content-addressing and does not use these fields. If not content-addressed, signature data should be verified more strictly.

Changes:

  • Pubsub readme:
    • Describe the optionality of singing in the relevant signing section of pubsub, and refer to new gossipsub v1.1 configuration option
    • Clarify that signatures are singing over all fields (incl unrecognized fields)
  • Gossipsub v1.1:
    • Briefly describe the general two cases when the fields are desired or not.
    • Document legacy "lax" behavior may be implemented as additional signing policies
    • Document two main (strict) signing policies
      • Feedback wanted: best way to describe the rejection of messages with bad signature or unexpected/missing fields?

Out of scope:

  • We had a related discussion about rejecting unrecognized protobuf fields altogether, to avoid any unexpected privacy issues or encoding edge cases. (will open an issue for this)

See ethereum/consensus-specs#1981 for Eth2 progress on this work, and clients implementing this functionality in gossipsub implementations.

PS. This is my first contribution directly to the libp2p specs, let me know if you have feedback to details such as formatting, I am happy to make changes.

cc @raulk, implemented the spec changes, review/edits welcome.


For signing purposes, the `signature` and `key` fields are used:
- The `signature` field contains the signature.
- The `key` field contains the signing key when it cannot be inlined in the source peer ID.
When present, it must match the peer ID.

The signature is computed over the marshalled message protobuf _excluding_ the key field.
This includes any fields that are not recognized, but still included in the marshalled data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is problematic because protobuf is non-deterministic in its marshalling - the way to write this specification would be to remove mentions of protobuf and instead explicitly outline the marshalling in terms of key-value pairs and field types - it would have to be done carefully such that the result is readable by a compliant protobuf parser - it might also happen by chance that some protobuf encoders produce a valid byte stream, but with unrecognised fields in particular, the likelihood is low.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's hoist the protobuf deterministic serialisation discussion to a new issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been recurrent enough that it needs to be a first-class discussion with a solid outcome.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should specify canonical encoding here to remove all ambiguity.

Copy link
Member

@raulk raulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few comments; I'm going to take the liberty to push to your branch to make a few editorial changes, if you don't mind.

pubsub/README.md Outdated
@@ -153,13 +153,18 @@ and

Messages can be optionally signed, and it is up to the peer whether to accept and forward
unsigned messages.
When the receiver expects unsigned content-based messages, and thus does not expect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind elaborating what does content-based messages means here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the two paragraphs that follow are better expressed as a bullet. Why don't we introduce the Lax modes here too? We anyway restrict all of this to v1.1, so we might as well introduce all options here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have called it content-addressed, or maybe you can think of better wording, to describe messages that identify themselves without the author, seqnoi and signature approach, like in eth2. These messages do not need to be signed, as for these it depends on the application layer, which checks contents etc.

Comment on lines 141 to 142
These fields may negatively affect privacy in content-addressed messaging,
and may need to be strictly enforced in author-addressed messaging.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the content-addressed and author-addressed word choice might be a bit confusing. How about "anonymous" and "origin-stamped"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would work as well, maybe worth a short definition somewhere, it may pop up in other places of the pubsub spec as well. E.g. the message-ID customization may refer to the difference.

Comment on lines 146 to 147
An implementation may choose to support the legacy v1.0 "lax" signing policy,
along with an explicit message authoring option.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"origin-stamping" option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this paragraph below the specification of both options. Otherwise we're speaking about v1.0, then v1.1, then v1.0 support in v1.1, back to v1.1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, can you include this in the changes you wanted to make? I'm including this line mostly for awareness of backwards compatibility options, but it does not need to be part of the spec otherwise.

Comment on lines 156 to 157
- Produces messages without the `signature`, `key`, `from` and `seqno` fields.
The corresponding protobuf key-value pairs are absent from the marshalled message, not just empty.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate these in two sub-bullets. "On the message producer side:" / "On the message receiver side:"

@arnetheduck
Copy link
Contributor

arnetheduck commented Sep 22, 2020

One important point to raise here is that one reason to use gossipsub and not a custom protocol is that it's a generic message transport - we can expect that the gossipsub networks will cross-pollinate - say between ipfs and eth2 for example - and it's plausible that one might wish to be a participant of both. In fact, it's one of the main advantages of using gossipsub: peer diversity is increased.

The strict policies here somewhat limit that benefit - it creates, in effect, two different networks. The issue could be mitigated by making sure that implementations support different policies per topic - the way to achieve this would be to move some of the validation up to the application layer which easier can handle the per-topic nuances and instead only have a lax mode in gossipsub where:

  • iff a field is present, it must be valid for its type
    • seqno must be 8 bytes
    • public keys and peerid's must parse as valid for their type
    • signatures, if present, must validate according to rules that will be lifted to a different PR
  • iff the field is not present, gossipsub doesn't care per se - the application might though.
  • this is congruent with the fields being optional in the protobuf description of the messages

on top of that policy, applications could then choose to restrict things further within the topic namespace they occupy - ie ETH2 might choose to not propagate messages with gossipsub signatures on eth2 topics (because we already do a mostly equivalent check / validation by verifying a payload signature).

Status is developing a similar protocol on top of gossipsub that also doesn't use the gossipsub signatures (because we negotiate a different scheme which is more appropriate for the status use case - the gossipsub signatures themselves live at the wrong level of abstraction which is why their usefulness is limited, often)

@djrtwo
Copy link

djrtwo commented Sep 22, 2020

In fact, it's one of the main advantages of using gossipsub: peer diversity is increased.

Is this one of the main benefits? That is a clear benefit of using a cross-pollinated DHT but nodes of different networks are not going to generally subscribe to (or be able to execute validations of) gossipsub topics on an alternate network.

@arnetheduck
Copy link
Contributor

well, I'd say so, yes - we're paying the protobuf tax, the "optional-seqno-and-signature-tax", we namespace the topics etc in order to achieve this interoperability where multiple applications can share the infrastructure - and it's not so much about validation specifically but alternatives that might have different needs but still be related (L2, a future ETC fork, client extensions etc).

If it was only ETH2, a much simpler thing to do would have been to take the research behind gossipsub, even forking it, and run it as a separate libp2p protocol (eth2sub/1.0.0) that would use SSZ and pure content-hashing - content-hashing is already a fairly significant departure from "standard" gossipsub - allowing per-topic optionality in the gossipsub fields in a way achieves the same goal but in a softer, more enabling way - ie it's still possible to create a restricted software where only a certain topic space with specific validations rules are applied, but it's also possible to create a mixed network with different rules applied to separate namespaces, without polluting the libp2p spec with application details.

@raulk
Copy link
Member

raulk commented Sep 22, 2020

I think @arnetheduck makes a valid point. If we make these signing policies configurable at the entire gossipsub stack level, it's no longer possible to participate in pubsub topics that adopt different signing policies. That is, you could never have a single node subscribed to both Filecoin (lax) and Eth2 (strict no sign) topics, which does harm interoperability and sabotages the feasibility of very desirable cross-chain applications and use cases.

I'm inclined to not merge this PR and iterate on the implementation submitted in libp2p/go-libp2p-pubsub#359 to push these options down to the topic level. That way, when you join to a topic, you specify the policy that applies to that topic.

@protolambda I know you are probably backlogged. Maybe someone in the libp2p or IPFS teams could take this on.

@protolambda
Copy link
Contributor Author

@raulk pushing these options down to a topic level would be great, sounds like a good solution. And yes, quite backlogged with work. I think it is fair to add a notice in this PR that the options can be implemented on a per-topic basis. Then expand on that once the go-libp2p implementation has decided on the best approach.

@raulk
Copy link
Member

raulk commented Sep 22, 2020

@protolambda makes sense. I'll rejig the text, and add an implementation margin note that we can remove once this is pushed down to the topic level in go-libp2p.

@arnetheduck
Copy link
Contributor

While working on this, we also noticed that this breaks the default message id generation, which relies on sequence number and peer id, is somewhat broken when the fields are not included - ie migrating fully to a content-based hash method when the anonymous mode is enabled is necessary.

@protolambda
Copy link
Contributor Author

@arnetheduck yes, good point. And so message ID should be per-topic as well (it's kind of already, if you switch based on the topic info within the message). Will update PR now to incorporate formatting + per-topic behavior recommendation.

@protolambda
Copy link
Contributor Author

@raulk update:

  • Implemented your PR suggestions (or at least my interpretation of them, I wanted to clarify and format some more details).
  • Added a section about message identification, and clarify the two common flavors, as well as moving the extra details from the fields section there.
  • Message signature policy now states that it can initially be a global configuration, but per-topic policies facilitate mixed protocols better
  • Optionality of fields is clarified, and states to depend on message ID and signature policy choices.

Again, feel free to make editorial choices, squash/rename commits, or any feedback. I would like to keep this PR moving, so we can continue with ethereum/consensus-specs#2060 before the next Eth2 specs release.

@raulk
Copy link
Member

raulk commented Sep 24, 2020

Apologies. Will get to this later today.

pubsub/README.md Outdated Show resolved Hide resolved
pubsub/README.md Outdated Show resolved Hide resolved
pubsub/README.md Outdated Show resolved Hide resolved
pubsub/README.md Outdated Show resolved Hide resolved
pubsub/README.md Outdated Show resolved Hide resolved
pubsub/gossipsub/gossipsub-v1.1.md Outdated Show resolved Hide resolved
pubsub/gossipsub/gossipsub-v1.1.md Outdated Show resolved Hide resolved
pubsub/gossipsub/gossipsub-v1.1.md Outdated Show resolved Hide resolved
pubsub/gossipsub/gossipsub-v1.1.md Outdated Show resolved Hide resolved
Comment on lines +144 to +148
In the default origin-stamped messaging, the fields need to be strictly enforced:
the `seqno` and `from` fields form the `message_id`, and should be verified to avoid `message_id` collisions.

In content-stamped messaging, the fields may negatively affect privacy:
revealing the relationship between `data` and `from`/`seqno`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the goal of this is. This might be better moved to after "Signature Policy Options", and put under a "Usage indications/notes" section.

I would reword to something like this:

  • Origin-stamped messaging: choose StrictSign, and use in combination with the default message_id_fn.
  • Content-addressed messaging: enable StrictNoSign, with a custom hash-based message_id_fn. Any signatures would now be part of the payload, and would need to be validated through a custom message validator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to give some context/intuition why the fields are handled like they are. It could be worded directly as message id approach -> signing policy directions, like you suggest. Either option works for us.

@raulk
Copy link
Member

raulk commented Sep 24, 2020

I'm going to commit my suggestions, and I'll do another pass tomorrow. Also requesting @vyzo's review here.

@raulk raulk requested a review from vyzo September 24, 2020 22:01
Copy link
Contributor

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this is a very welcome addition.

However, I think that the signature policy should be part of the pubsub spec and not the gossipsub spec.
There is nothing gossipsub specific about signature policies, and the policy applies to all routing algorithms as signing is handled outside the router -- at least in the reference implementation.

pubsub/README.md Outdated Show resolved Hide resolved

For signing purposes, the `signature` and `key` fields are used:
- The `signature` field contains the signature.
- The `key` field contains the signing key when it cannot be inlined in the source peer ID.
When present, it must match the peer ID.

The signature is computed over the marshalled message protobuf _excluding_ the key field.
This includes any fields that are not recognized, but still included in the marshalled data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should specify canonical encoding here to remove all ambiguity.

@@ -134,6 +136,50 @@ message PeerInfo {
}
```

### Signature Policy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that this should be in the gossipsub spec -- it is a generic pubsub policy that applies to all routing algorithms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will move it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was looking for a place to version it, but since it's not breaking anything necessarily, and applies to pubsub, moving it to the pubsub doc sounds good.

pubsub/README.md Outdated Show resolved Hide resolved
protolambda added a commit to ethereum/consensus-specs that referenced this pull request Sep 25, 2020
@raulk raulk merged commit e8b85d0 into libp2p:master Sep 30, 2020
djrtwo added a commit to ethereum/consensus-specs that referenced this pull request Sep 30, 2020
Use gossip signing policy in p2p spec, see libp2p/specs#294
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants