Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC3996: Encrypted mentions-only rooms. #3996

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
193 changes: 193 additions & 0 deletions proposals/3996-encrypted-mentions-only-rooms.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,193 @@
# MSC3996: Encrypted mentions-only rooms
Copy link
Member

@kegsay kegsay May 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this proposal overall, but it isn't clear that this is the right tradeoff in terms of privacy. Earlier proposals for putting cleartext user IDs were rejected, so what amount of metadata leakage is acceptable?

  • Cleartext user IDs. Rejected.
  • No metadata leakage at all ever.
  • This proposal -> Boolean flag to indicate some kind of @ mention. This leaks the same as cleartext user IDs in the case of a 1:1 DM.
  • Some kind of K-Anonymity ala haveibeenpwned.com which leaks to the server that a group of users should be notified. Leaks that one of that group (or potentially multiple) have been @ mentioned.
  • The receiver decrypts and then "bookmarks" the event if it is a direct @ mention. This may help to cut down the number of unnecessary operations as each device doesn't need to repeat work, though it remains O(n) where n=number of devices for that user. This leaks which messages are @ mentions to the user's server if this bookmarking is automatic and in the clear, whilst leaking no metadata to federated servers.
  • Something else entirely?

Until we get clarity on the table stakes for metadata leakage, I can't in good faith say this is the best proposal. I am personally of the opinion that we should be leaking metadata to provide a better UX, as the inability to reliably present @ mentions in encrypted rooms has significant effects on daily users e.g causing people to immediately action those messages just in case they get lost in scrollback and are then unfindable. This also has negative effects on users who have a 2nd+ account which they use infrequently and just want to check it occasionally to see who has @ mentioned them - the current UX is incomplete due to the inability to accurately go through E2EE rooms.

We already leak a lot of metadata today, but proposals like MSC4014: Pseudonymous Identities would go some way to help obscure this metadata by providing a layer of indirection on events (you would be @ mentioning epehemeral per-room per-user keys, not static user IDs).

Copy link
Member

@ara4n ara4n Jun 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something feels wrong here. As per #3952 (comment), i believe we have no choice but for clients to receive all E2EE events (and so get pushed for them). They need this to:

  • Accurately display unread state (given they don't know what the event type is of events until they've decrypted them, and so whether that event is displayable)
  • Check for keyword mentions
  • Normal mentions (modulo this MSC)
  • Generate FTS indexes.

The fact that current clients don't do this for E2EE rooms is very much a bug (and means that you can't trust unread count or highlight counts in E2EE rooms).

Yes, this means that bandwidth is O(N) with the number of E2EE messages, but this is acceptable because you can spider the rooms in the background to search for lost mentions (and index for FTS) - so it doesn't have to slow down the UX of app. Also, the volume of E2EE rooms is typically much lower and more bounded than large spammy unencrypted public chatrooms. Empirically, on a poweruser account like mine, being pushed in the bg for every E2EE message seems acceptable in terms of battery/bandwidth use - and can have the advantage of using the pushes to prepopulate the local storage cache on the client so up-to-date history appears instantly when you open the app.

It's also how Signal, WhatsApp(?) and Threema work - and it's very hard to justify why we would expose more metadata on the server for E2EE messages (either via this or MSC3952) given other apps give an acceptable UX without doing so.

So: I believe this MSC is premature optimisation and needlessly leaks metadata. Instead, we should push the clients for all E2EE (given they need it to operate anyway), and if that needs to be optimised, only then should we give the option of sacrificing metadata privacy (and keyword mentions) if that's unacceptable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This was mentioned in a private channel, but I'll mention it here for transparency since I did a poor job of giving context on why this MSC exists.)

This MSC was written to formalize the idea of whether it would make sense to expose any additional metadata (the bare minimum). The trade-offs might not make sense, but I wanted to remove that conversation from MSC3952 to avoid blocking that MSC further. Thanks for starting the conversation here! 👍

Currently I have no plans to implement this MSC, as you said it has downsides of more metadata leakage and doesn't necessarily give us all the features we want.

I was hoping those who were advocating for exposing all the mentions information in MSC3952 might be part of the conversation here, but so far this MSC has generated little interest.

In the meantime we're exploring other ways to properly push all E2EE messages while accounting for rooms that are muted or set to mentions & keywords only, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally got some time to actually read through this and understand what it's saying. I think part of why I at least haven't chimed in is that the MSC is somewhat confusingly named. I think that "Unencrypted Mention Indicator in Encrypted Rooms" or something would be more clear as to what this is about.

As for what Beeper is planning on doing to resolve the concerns we had with MSC3952, we are just putting the mentions in plaintext anyway. (We are using the presence of that field to determine whether or not the intentional mentions code path should be used to evaluate push rules.) Currently, only our bridges are utilizing this MSC to ensure that our users get notifications for mentions from other networks.

Example of what we've done in mautrix-go: https://github.com/mautrix/go/blob/3e840e962e24e90a439f91ce0cdf5e7251008413/crypto/encryptmegolm.go#LL133-L135. Whether or not to include the mentions in plaintext is configurable. We have that enabled for at least the whatsapp bridge right now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think part of why I at least haven't chimed in is that the MSC is somewhat confusingly named. I think that "Unencrypted Mention Indicator in Encrypted Rooms" or something would be more clear as to what this is about.

I suppose it should really be "Supporting mentions-only encrypted room" or something.


It is currently not possible for mobile clients to properly implement a mentions-only
room.

Currently, every new event in an encrypted room might be pushed to mobile clients
due to the [`.m.rule.encrypted` default underride rule](https://spec.matrix.org/v1.6/client-server-api/#default-underride-rules).

However, a room can be configured to be mentions-only by creating a
[room-specific push rule](https://spec.matrix.org/v1.6/client-server-api/#push-rules)
with the `id` property set to the room ID & `actions` set to do nothing.[^1] Since
this is evaluated *before* the `.m.rule.encrypted` rule (almost)
**no events get pushed for a mentions-only room**.

Additionally, a room can be configured to be muted by creating the earliest
[override push rule](https://spec.matrix.org/v1.6/client-server-api/#push-rules)
possible which matches the room ID & has `actions` set to do nothing[^2], e.g.:

```json
{
"rule_id" : "!abcdef:example.com",
"conditions" : [
{
"key" : "room_id",
"kind" : "event_match",
"pattern" : "!abcdef:example.com"
}
],
"default" : false,
"enabled" : true,
"actions" : []
}
```

## Proposal

A new top-level property (`m.has_mentions`) is defined which contains a boolean
value. It is sent in cleartext (i.e. it is not encrypted) and is set to `true` if
the event mentions another user or the room per
[MSC3952](https://github.com/matrix-org/matrix-spec-proposals/pull/3952).[^3]

A new push rule is added after the `.m.rule.is_room_mention` push rule to match
encrypted mentions:

```json
{
"rule_id": ".m.rule.is_encrypted_mention",
"default": true,
"enabled": true,
"conditions": [
{
"kind": "event_property_is",
"key": "content.m\\.has_mentions",
clokep marked this conversation as resolved.
Show resolved Hide resolved
"value": true
},
{
"kind": "event_property_is",
"key": "type",
"value": "m.room.encrypted"
}
],
"actions": ["notify"]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@giomfo had wondered if a "push, but don't increase unread count" action would be more appropriate here. I think that we don't need that here, but it could be useful?

I'd be curious of thoughts from other client developers!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erikjohnston said on MSC3952:

I think its worth noting that this means that servers (still) won't be able to differentiate between notifications and mentions. This mostly affects how often we push to devices, but also means that we can't make /notifications?only=highlight API be useful again

I think this applies to this conversation as well. If we added a highlight tweak to this rule then homeservers could push the event (and return it it as part of /notifications?only=highlight) and allow clients to reparse push rules and either downgrade the highlight -> notifications (thus ignoring the event).

This would potentially mean that /notifications?only=highlight would return a lot of not-useful data, but it could be interested to consider.

The other potential solution here is the ideas in element-hq/element-web#6874.

}
```

(Note: `\\.` would become a single logical backslash followed by a dot since the
above is in JSON-representation. See
[MSC3873](https://github.com/matrix-org/matrix-spec-proposals/pull/3873).)

When this push rule matches then it would push the event to all users, similar to
how the [`.m.rule.encrypted` default underride rule](https://spec.matrix.org/v1.6/client-server-api/#default-underride-rules)
push rule works, but with the added context that *some clients will probably care*.
Clients would [decrypt the event and re-run push rules](https://spec.matrix.org/unstable/client-server-api/#receiving-notifications)
as normal. This would result in:

1. No push notification for muted rooms as their push rule has a higher priority,
as mentioned [above](#encrypted-mentions-only-rooms).
2. A push notification which the client would discard if the room is set to
mentions-only and the user/room was not mentioned in the event.[^4]
3. No change in behavior if the user has no special rules for the room and is not
mentioned (i.e. the event would generate a push notification via the
[`.m.rule.encrypted` default underride rule](https://spec.matrix.org/v1.6/client-server-api/#default-underride-rules).
4. A push notification which the client would "upgrade" to a highlight notification
if the (decrypted) event mentions the user.

The overall tradeoff is that clients will receive extra pushes some of the time
(which won't matter), but help hide the metadata of who, in particular, is being
mentioned.

If the decrypted ciphertext contains a `m.has_mentions` property it should be ignored.

## Potential issues
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing a section about abuse potential -- what happens if someone just says every message has a mentions on it... wasting the bandwidth, power, and CPU of others in the room?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if clients could surface this field being set (to moderators/admins only?) so the abuse wouldn't be so invisible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field is unencrypted so yeah I agree -- this is mostly up to clients or moderation tooling exposing some aggregate information on this? I think I just need to add some text. 😄


If the mentioned user has muted the room then the above logic breaks down since the
event will get pushed to all *other* users unnecessarily. This could be considered
a good thing (see [security consideration](#security-considerations)) to protect
who was mentioned in the message.

## Alternatives

[MSC1796](https://github.com/matrix-org/matrix-spec-proposals/pull/1796) to keep
mentions information in cleartext was rejected.

The previously discusssed alternative[^5] is for clients to download all encrypted
messages, decrypt them locally and evaluate push rules. This is a costly process
in terms of bandwidth, CPU, and battery since the client must either receive every
message via push notifications or backpaginate every room fully via a polling loop,
in case of a gappy sync.

Both of the above solutions are sub-optimal however:

* Some platforms don't allow a polling loop,
[e.g. iOS](https://github.com/matrix-org/matrix-spec-proposals/pull/3952#discussion_r1065004790),
so Matrix homeservers are forced to push every (encrypted) message.
* Some platforms, e.g. Android without Firebase Cloud Messaging support, it is known to be
[expensive to run a polling loop](https://github.com/vector-im/element-android/issues/2055)
to download all messages and search them for notifications.

This MSC proposes a middle ground, which should help with platforms which support
push notifications. (Although it isn't as directly able to help with platforms
without push unless an easier API to find these events is created.)

It is also asserted[^5] that clients want to download all data for search (and
keyword matching) in encrypted rooms. It is unclear if there is actually a need
to do this on mobile devices (and it could use excessive storage and bandwidth,
as mentioned above).

[MSC3575: Sliding Sync](https://github.com/matrix-org/matrix-spec-proposals/blob/kegan/sync-v3/proposals/3575-sync.md#e2ee-handling)
discusses the trade-off of choosing which rooms to sync based on whether the rooms
are encrypted and whether they have notifications. This proposal may help slightly
to give more confidence that an event is notification or not, but does not help
solve the issue of which rooms to sync first.

## Security considerations

[MSC1796](https://github.com/matrix-org/matrix-spec-proposals/pull/1796) and an
[earlier version of MSC3952](https://github.com/matrix-org/matrix-spec-proposals/blob/86bf972c2c8ef04dc849ada5bbcb986ac990a7a3/proposals/3952-intentional-mentions.md)[^6]
proposed sending the mentioned users (or room) in cleartext to allow for more
fine grained control of which events are pushed, but this was deemed to be too
much of a metadata leak.

This proposal significantly reduces the metadata leak by only including *if* there
is a mentioned user or room. This helps reduce metadata leakage by:

1. Not including the mentioned users in cleartext. 🙃
2. Treating a room mention and an individual user mention identically (it is
likely that some or most users will discard the message on the client device
once decrypted).

This seems like a reasonable trade-off, but needs vetting.

## Future extensions

This proposal does not attempt to solve the issue of keyword notifications,
although it has been [suggested](https://github.com/matrix-org/matrix-spec-proposals/blob/matthew/msc1796/proposals/1796-e2e-notifications.md#better-handling-of-custom-keyword-notifications)
that custom keywords are uncommon and a sufficiently different problem. Regardless,
it is recommended that client authors explain this limitation to users or provide
a way for users to enable keywords in a subset of rooms.

## Unstable prefix

During development the new push rule shall use `org.matrix.mscxxxx.is_encrypted_mention`
instead of `.m.rule.is_encrypted_mention`.

## Dependencies

This proposal depends on the following MSCs which, at the time of writing, have
not yet been accepted into the Matrix spec:

* [MSC3952](https://github.com/matrix-org/matrix-spec-proposals/pull/3952): Intentional Mentions


[^1]: Via either an explicit `"dont_notify"` action or an empty array. These are
equivalent, see [issue #643](https://github.com/matrix-org/matrix-spec/issues/643).

[^2]: The [`.m.rule.master`](https://spec.matrix.org/v1.6/client-server-api/#default-override-rules)
is *always* first, so this rule gets created right after it.

[^3]: Strictly speaking this does not require MSC3952, but it simplifies the text
to assume it will be accepted.

[^4]: In the past it was not possible to discard notifications on iOS: if a push
notification was received it *had to be displayed*. This is [no longer the case](https://developer.apple.com/documentation/bundleresources/entitlements/com_apple_developer_usernotifications_filtering).

[^5]: [From MSC3952 comments](https://github.com/matrix-org/matrix-spec-proposals/pull/3952#discussion_r1113525021):
"we already are committed to downloading all contents in E2EE rooms in order to
calculate unread state (given the server doesn't know if a msg is visible until
it's decrypted), notifying for keyword mentions, updating full-text indexes for
E2EE content, etc."

[^6]: In particular see [this thread](https://github.com/matrix-org/matrix-spec-proposals/pull/3952#discussion_r1112154200)
on MSC3952 and the [subsequent](https://github.com/matrix-org/matrix-spec-proposals/commit/f0a1f6ad184788814c45d58370248b8052142171)
reversal of the cleartext mentions information.