Proposal: dom.createPort() #679

206 changes: 206 additions & 0 deletions proposals/dom_create_port.md
@@ -0,0 +1,206 @@
# Proposal: Inter (JS) World Communication

**Summary**

This introduces a special message port that can be passed and shared between different JS "worlds" (contexts) in the same document.

**Document Metadata**

**Author:** rdcronin

**Sponsoring Browser:** Chrome

**Contributors:** Rob--W, ...

**Created:** 2024-06-24

**Related Issues:** <TODO>

## Motivation

### Objective

This allows an extension to establish a communication channel between the
different JS worlds it has active in a document. This enables coordination
between e.g. a main content script and any scripts it may inject into the main
world, or between a content script and other user scripts.

#### Use Cases

The primary use case is coordinating work between different JS worlds in a way
that is more difficult for any other script in the main world to intercept.
This helps extensions avoid interacting with the existing page script, when
doing so is undesirable. This also allows executing code based on data within
a content script without leaking the details of that content script to the main
world. (Though we don't consider the isolated world boundary a security
boundary, it is useful isolation and can serve as a "first line of defense").

### Known Consumers

User script managers have expressed an interest in this API. More broadly,
this would help any extension that needs to communicate between the main world
of a web page and a more trusted script.

## Specification

### Schema

```
declare namespace dom {
  interface PortProperties {
    world: ExecutionWorld;
    worldId?: string;
Member

Both @carlosjeurissen and @xeenon had mentioned that this might be redundant, and I can't immediately see a strong need for this (it is only useful if an extension sends the port to multiple worlds, which seems unlikely - and we might be able to just make it so the port only works in the first world it is sent to). Any thoughts on that?

Contributor Author

This wasn't to support multiple worlds (this wouldn't do that, since this still only supports one), but rather to ensure that:
a) a developer is aware that a port is bound to a single world. This is distinct from the existing uses of ports, which are a many-to-one relationship between receivers and senders.
b) a developer can't accidentally send a port to an incorrect world.

Technically, this is redundant, since it could be derived from the first world it's sent to (and, since dom.executeScript() is proposed to be synchronous, there in theory aren't any TOCTOU issues here?), but that still seems a bit less obvious to me than binding the port at the start. (And makes browser implementation slightly more complex, though we can always work around that.)

Member

I'm not quite sure what you mean by "this might be redundant". Are you saying that worldId is not necessary, that world should not be required if worldId is present, or something else?

The Allow multiple user script worlds proposal introduced the concept of worldId (alongside world) as a property of RegisteredUserScript. Right now the presence of both properties on PortProperties follows that pattern. The main difference between how these properties are used is that this proposal defines world as a required property while in the other proposal it's optional. That allows developers to omit the world property when targeting a specific user script world or to omit worldId when targeting MAIN or USER_SCRIPTS worlds.

The only scenario I can think of where requiring world might be useful is if we anticipate allowing other types of script injection to occur in separate, isolated worlds. Or, at least want to reserve the ability to introduce such a concept in the future.

Member

> Are you saying that worldId is not necessary

This was the intended meaning of my original comment. Thinking about it some more (and as Devlin touched on below) the world property is also not strictly necessary.

I can see the argument for requiring these, but right now I'm not convinced the benefit is worth it. If a developer tries to send a port to a second world, we can easily throw a "Port cannot be sent to multiple worlds" error.

That seems like enough for developers to discover the right path without making the API more verbose.

  };

  interface MessagePort {
    sendMessage(args?: any): void
    onMessage(args?: any): void
Member

Just to confirm that you are intentionally using a new design here:

There is no precedent in extension APIs for using properties to control API behavior - API interaction is always through methods. Some APIs can accept function members on objects passed to functions (e.g. contextMenus.update(id, { onclick })), but there is no namespace whose behavior changes through properties.

If this new pattern is not intentional, setMessageListener( MessageListenerCallback? ); would fit the usual conventions (I don't mention getMessageListener because there is presumably no need to change the listener later).

Contributor Author

Sorry, I meant to update this earlier. This should be an event, not a function. In our yet-to-be-finalized new syntax, this would be something like:

callback OnMessageListener = void (any message);

interface OnMessageEvent : ExtensionEvent {
  static void addListener(OnMessageListener listener);
  static void removeListener(OnMessageListener listener);
  static boolean hasListener(OnMessageListener listener);
}

interface MessagePort {
  static attribute OnMessageEvent onMessage;
  ...
}

Not sure if we want to use that here, or a more typescript-centric approach; lemme know if you have a preference

Comment on lines +56 to +57
Member

Suggested change
- sendMessage(args?: any): void
- onMessage(args?: any): void
+ sendMessage(message?: any): void
+ onMessage(message?: any): void

To avoid confusion with args in dom.execute, suggest renaming to message.

  };

  export function createPort(
    properties: PortProperties
  ): MessagePort;
}
```

### Behavior

#### Creating a Port

An extension can create a new message port using `dom.createPort()`. This will
create and return a new message port according to the specified `properties`
(see below).

#### Port Properties and World ID

A message port can only be used to communicate with a single other world. This:
* Avoids the "many-to-one" opener-listener behavior that existing message ports
(created by `runtime.connect()`) in extensions have. This behavior has caused
increased complexity and developer confusion. A given port will have only one
channel.
* Protects against accidentally sharing the port with another world.

The caller indicates which world the port should be associated with by setting
the `world` property to the appropriate type of execution world and, if
applicable, the `worldId` (e.g. for specific user script worlds).
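
The binding rule described here can be modelled outside the browser. The following is a minimal sketch, not the proposed implementation: `createBoundPort` and `deliverPort` are hypothetical stand-ins for `dom.createPort()` and for the hand-off that `dom.execute()` would perform, the `'USER_SCRIPT'` world name is assumed, and the error messages are illustrative only.

```javascript
// Hypothetical model of a port that is bound to exactly one world.
// Nothing here is a real extension API; all names are for illustration.
function createBoundPort({ world, worldId }) {
  if (typeof world !== 'string') {
    throw new TypeError('world is required');
  }
  const port = Object.create(null);
  port.world = world;
  port.worldId = worldId;
  port.delivered = false;
  return port;
}

// Stand-in for the hand-off that dom.execute() would perform.
function deliverPort(port, target) {
  if (port.delivered) {
    throw new Error('Port cannot be sent to multiple worlds');
  }
  if (port.world !== target.world || port.worldId !== target.worldId) {
    throw new Error('Port is bound to a different world');
  }
  port.delivered = true;
}

const port = createBoundPort({ world: 'MAIN' });
deliverPort(port, { world: 'MAIN' }); // OK: matches the binding.

let secondSendError;
try {
  deliverPort(port, { world: 'MAIN' }); // A second hand-off is rejected.
} catch (e) {
  secondSendError = e.message;
}

let wrongWorldError;
try {
  const userScriptPort = createBoundPort({ world: 'USER_SCRIPT', worldId: 'a' });
  deliverPort(userScriptPort, { world: 'MAIN' }); // Mismatched world is rejected.
} catch (e) {
  wrongWorldError = e.message;
}
```

Binding at creation time lets both failure modes surface as explicit errors at the hand-off, which is the trade-off against the less verbose "bind on first send" alternative raised in review.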

#### Passing a Message Port

A message port can be passed to another world (in order to establish a
connection) by leveraging the new `dom.execute()` API. `dom.execute()` will be
expanded to have special serialization logic for the MessagePort type.
Member

Suggested change
- expanded to have special serialization logic for the MessagePort type.
+ expanded to have special serialization logic for the MessagePort type.
+ Once a message port has been passed to a `dom.execute` call, it cannot be passed
+ again - an error will be thrown because it is not structurally cloneable.

I'm explicitly noting that the port cannot be reused. And also emphasizing that the port is not structurally cloneable, because making the type cloneable could be a trivial way to make it fit in the dom.execute requirement of args being structurally cloneable (which would trigger a bunch of new issues, such as the "clone" potentially inverting roles without dom.execute, e.g. when calling structuredClone(port)).

Contributor Author

Agreed; will incorporate


Member

Here is a more explicit description of the message dispatch semantics. This defines synchronicity, error handling and non-errors.

Suggested change
### Message dispatch
Upon calling `sendMessage`, `onMessage` is synchronously called before
`sendMessage` returns. `sendMessage` returns successfully if `onMessage` has
been called, even if the function itself threw an error.
`sendMessage` may throw if the input message cannot be serialized.
`sendMessage` may throw if the return value cannot be serialized.
`sendMessage` may throw when the recipient is unavailable. For example, if the
recipient did not assign an `onMessage` listener. Or if the port was never
delivered to the other world.

I am also willing to use the phrase throws instead of may throw if you agree with that.

Contributor Author

I'd like to touch on the synchronicity aspect a bit more here (and I think return values are a natural byproduct of that). I know we discussed this briefly before, but I'd like some more clarity on what this enables. Given we have a synchronous dom.execute(), what are the use cases in which synchronous message passing (as opposed to just FIFO) to a port is critical? Is it for cases where the main world needs synchronous execution with the extension world? If so, why? Or for some other use case?

A not-necessarily-synchronous design leaves this more open -- we could potentially have a single world blocked while another is unblocked. I'm honestly not sure if this exists today in Chromium (I'm not sure about other browsers), but I could see definite situations in which it could (e.g., we might pause the main world, but allow extensions to continue running). If we guarantee this synchronicity of message passing, that means we also need to block execution in the extension world on any execution that may be happening in the main world (preventing extensions from doing work they may otherwise be able to).

Member

Generally, synchronous message passing enables the already injected main world script to reach back to the isolated world to look up relevant state that the script in the main world needs.

Other extension authors have already commented on some use cases at #284

As an example of a concrete use case from personal experience: I have an extension called Don't Track Me Google, which is just a content script that ensures that becomes. Among other things, the extension intercepts the setter of `HTMLAnchorElement.prototype.href`, to hook in on calls from the web page. It sanitizes it and calls the original setter.

To minimize interference from the web page, the sanitization logic should run in an isolated world. And because the original caller expects a synchronous API, the messaging API has to be synchronous.

This specific use case also shows why sometimes it is desirable to be able to receive DOM objects, so that the page and caller can pass the DOM elements (that they already share!) without requiring appending to the DOM.

Note: this extension was originally a MV2 extension. To convert to MV3, I had to duplicate the sanitization logic in the main world and use a shared DOM element for communicating specific state (preferences from the options page) from the content script to the main world script: https://github.com/Rob--W/dont-track-me-google/blob/a73ab8b3bfa9794c75cb4e111c6424e31e6c019e/main_world_script.js. The shared DOM element can be detected by the web page. In my case it's a reasonable trade-off, especially because I don't expect Google web properties to intentionally break the extension. This assumption is not necessarily true for extensions that are privacy-focused and designed for the whole web.

> A not-necessarily-synchronous design leaves this more open -- we could potentially have a single world blocked while another is unblocked. I'm honestly not sure if this exists today in Chromium (I'm not sure about other browsers), but I could see definite situations in which it could (e.g., we might pause the main world, but allow extensions to continue running). If we guarantee this synchronicity of message passing, that means we also need to block execution in the extension world on any execution that may be happening in the main world (preventing extensions from doing work they may otherwise be able to).

There is already a synchronous communication between worlds - they share the same DOM, and therefore DOM events.

@tophf, Sep 26, 2024

> Given we have a synchronous dom.execute()

Abusing it to pass messages synchronously would be very inefficient as running code requires a lot more internal work on top of passing the arguments. Userscripts may exchange thousands of messages between the worlds.

> use cases in which synchronous message passing [...] is critical

It allows implementing simple super-performant API via object getters exposed to the userscript or main worlds that receive the actual info from the secure world synchronously.

Tampermonkey, Violentmonkey, Firemonkey would like to keep supporting the legacy GM_xxx API, which is synchronous. This alone seems critical as there are thousands of userscripts that still work even though they are abandoned and no longer developed.

> we could potentially have a single world blocked while another is unblocked

They share the same JS/DOM thread, so what you describe is impossible.

> If we guarantee this synchronicity of message passing, that means we also need to block execution in the extension world on any execution that may be happening in the main world (preventing extensions from doing work they may otherwise be able to).

No, we just abandon the idea of MessagePort altogether and use CustomEvent mechanism internally, which already does it synchronously and which is what userscript extensions have been always using. The side benefit would be the API may be simplified as there's no need for a port.
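
To make the dispatch semantics under discussion concrete, here is a minimal in-process sketch. It is an assumption-laden model, not the proposed implementation: `createSyncPortPair` is a hypothetical helper, `onMessage` is modelled as an assignable property as in the proposal's example (rather than the event form mentioned in review), return values are left out, and `structuredClone` stands in for the serialization step.

```javascript
// Hypothetical model of a synchronous port pair living in one thread.
function createSyncPortPair() {
  const a = Object.create(null);
  const b = Object.create(null);
  const link = (self, peer) => {
    self.onMessage = null;
    self.sendMessage = (message) => {
      // Throws a DataCloneError if the message cannot be serialized.
      const copy = structuredClone(message);
      if (typeof peer.onMessage !== 'function') {
        throw new Error('Recipient is unavailable');
      }
      try {
        peer.onMessage(copy); // Runs to completion before sendMessage returns.
      } catch (e) {
        // A throwing listener does not make sendMessage itself fail.
      }
    };
  };
  link(a, b);
  link(b, a);
  return [a, b];
}

const [contentPort, mainPort] = createSyncPortPair();
const order = [];

// No listener assigned yet: the send is rejected.
let noListenerError;
try {
  contentPort.sendMessage('ping');
} catch (e) {
  noListenerError = e.message;
}

mainPort.onMessage = (message) => {
  order.push(`received ${message}`);
};
order.push('before send');
contentPort.sendMessage('getFoo');
order.push('after send');
// order is now ['before send', 'received getFoo', 'after send'].

// Functions are not structured-cloneable, so this send throws.
let cloneFailed = false;
try {
  contentPort.sendMessage(() => {});
} catch (e) {
  cloneFailed = true;
}
```

The ordering of `order` is the property the synchronous use cases rely on: the listener has already run by the time the sender continues.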

#### Example Usage

Script that executes in the main world:
```js
function getFoo() { ... }
function getBar() { ... }
// Other code dispatches 'foochanged' and 'barchanged' events.
```

Content script:
```js
// This function will execute in the main world.
function setUpMainWorld(port) {
  // This responds to a message received from the content script.
  port.onMessage = (message) => {
    if (message == 'getFoo') {
      const foo = getFoo();
      port.sendMessage({foo});
    } else if (message == 'getBar') {
      const bar = getBar();
      port.sendMessage({bar});
Comment on lines +109 to +113
Contributor

Aligning with the code below to make the examples more readable / prevent perceived differences where they do not apply.

Suggested change
- const foo = getFoo();
- port.sendMessage({foo});
- } else if (message == 'getBar') {
- const bar = getBar();
- port.sendMessage({bar});
+ port.sendMessage({foo: getFoo()});
+ } else if (message == 'getBar') {
+ port.sendMessage({bar: getBar()});

    }
  };

  // These notify the content script of changes.
Member

Suggested change
- // These notify the content script of changes.
+ // These notify the content script of changes.
+ // NOTE: addEventListener used in this example is not part of the API.

Adding emphasis on the fact that addEventListener is NOT part of the API. Because if it were, it would easily be intercepted by the web page.

Contributor Author

Ack; will incorporate

  addEventListener('foochanged', () => {
    port.sendMessage({foo: getFoo()});
  });
  addEventListener('barchanged', () => {
    port.sendMessage({bar: getBar()});
  });
}

// The rest of this code executes in the content script world.
const mainWorldPort = browser.dom.createPort({world: 'MAIN'});
Contributor

Considering the port needs to be passed to the main world anyway. What is the reason to require explicitly specifying the world?

Contributor Author

See above from Oliver's comment. This is to make it explicit and obvious to both the browser and the developer that this port is implicitly bound to one world, and disallow any accidental passing to another world or multiple worlds.


mainWorldPort.onMessage = (message) => {
  if (message.foo) {
    updateFoo(message.foo);
  } else if (message.bar) {
    updateBar(message.bar);
  }
};

browser.dom.execute(
    {
      func: setUpMainWorld,
      args: mainWorldPort,
Member

Suggested change
- args: mainWorldPort,
+ args: [mainWorldPort],

    });

function fetchFoo() {
  mainWorldPort.sendMessage('getFoo');
}

function fetchBar() {
  mainWorldPort.sendMessage('getBar');
}
```

##### Message serialization

Message arguments are serialized and deserialized using the Structured Cloning
algorithm. This allows for more flexibility than simply JSON conversion.
Message ports cannot pass other message ports in arguments.
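
As a rough illustration of the difference from JSON conversion, the sketch below uses the standard `structuredClone()` global as a stand-in for whatever serialization the port would perform. The sample object and variable names are made up for the example.

```javascript
const original = {
  when: new Date(0),
  tags: new Set(['a', 'b']),
  lookup: new Map([['k', 1]]),
};
original.self = original; // Cyclic reference.

// Structured cloning preserves built-in types and cycles.
const cloned = structuredClone(original);

// JSON cannot represent the cycle at all...
let jsonError;
try {
  JSON.stringify(original);
} catch (e) {
  jsonError = e.name;
}

// ...and loses type information even without the cycle.
const { self, ...acyclic } = original;
const viaJson = JSON.parse(JSON.stringify(acyclic));
// viaJson.when is a string; viaJson.tags and viaJson.lookup are empty objects.

// The clone is a copy, so later mutations are not shared.
original.tags.add('c');
```

After this runs, `cloned.tags` still has two entries while `original.tags` has three, which matches the "copied rather than directly shared" behaviour the proposal relies on.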

### New Permissions

There are no new permissions for this capability. This does not allow any new
data access, since it is only accessible once the extension has already
injected a script into the document. Extensions can also already interact with
the main world of the document through either appending script tags or directly
injecting with registered content or user scripts, or the
scripting.executeScript() method.

### Manifest File Changes

There are no necessary manifest changes.

## Security and Privacy

### Exposed Sensitive Data

This does not result in any new data being exposed.

### Abuse Mitigations

This doesn't enable any new data access. To reduce risk of cross-world
contamination, extensions must specify the world with which they want to
communicate, and all arguments are copied rather than directly shared.

### Additional Security Considerations

N/A.

Member

I'm suggesting to include a short write-up of the brittle safety offered and not-offered by the API. This clarifies what the API does and does not do to those who are not experts in the subject.

Suggested change
#### Secrecy of data
This API is designed to offer a secure communication channel between one world
and an untrusted world. It is the responsibility of the extension to maintain
further secrecy beyond that. In general, this requires avoiding the use of
prototype members.
For example, imagine a func that reads `message.foo` in the main world. This
is safe if `message` is always an object with the `foo` member. If this is not
the case, then it is a language feature of JavaScript to look up the `foo`
member in the prototype chain. Most objects ultimately inherit from
`Object.prototype`, which can be manipulated by other (untrusted) scripts to
intercept the message.
Example:
```javascript
// Attack:
Object.defineProperty(Object.prototype, "foo", {
get() { console.log("Intercepted via foo:", this); return "intercepted" }
});
// Safe use: foo always defined, web page cannot intercept access.
message = { secret: 1, foo: true };
console.log("message.foo=", message.foo);
// Logs "message.foo= true"
// Unsafe use: foo not set; web page can intercept access attempt.
message = { secret: 1 };
console.log("message.foo=", message.foo);
// Logs "Intercepted via foo: { secret: 1 }"
// Logs "message.foo= intercepted"
```

Contributor Author

Ack, will incorporate

## Alternatives

### Existing Workarounds

Today, to communicate between JS worlds, extensions can use custom events as
described in the [content script
documentation](https://developer.chrome.com/docs/extensions/develop/concepts/content-scripts#host-page-communication).
This is fragile and hacky, and can lead to leaking more data to the embedding
page.
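
The fragility can be sketched in a single realm. The example below is only a model: `sharedTarget` stands in for `document`, `DetailEvent` is a small substitute for `CustomEvent`, and the event name `ext-request` is invented. It models scripts that share the main world, where the page and an injected script see the same `EventTarget.prototype`.

```javascript
// Stand-in for the shared DOM node; in a page this would be `document`.
const sharedTarget = new EventTarget();

// Minimal CustomEvent substitute so the sketch does not depend on that global.
class DetailEvent extends Event {
  constructor(type, detail) {
    super(type);
    this.detail = detail;
  }
}

// "Page" script: patches addEventListener to learn event names and piggyback.
const observedByPage = [];
const originalAdd = EventTarget.prototype.addEventListener;
EventTarget.prototype.addEventListener = function (type, listener, options) {
  originalAdd.call(this, type, (event) => {
    observedByPage.push({ type: event.type, detail: event.detail });
  });
  return originalAdd.call(this, type, listener, options);
};

// Extension script injected into the main world: listens for requests.
const received = [];
sharedTarget.addEventListener('ext-request', (event) => {
  received.push(event.detail);
});

// "Content script": sends data through the shared target.
sharedTarget.dispatchEvent(new DetailEvent('ext-request', { token: 'secret' }));

// Restore the original so the sketch leaves no global side effects.
EventTarget.prototype.addEventListener = originalAdd;
```

Both `received` and `observedByPage` end up holding the payload, which is the kind of leak a dedicated port is meant to avoid.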

### Open Web API

The open web is unaware of the concept of multiple JavaScript worlds, so this
Member

While the open web does not support the concept of multiple worlds, it does already have a concept of a MessagePort (https://developer.mozilla.org/en-US/docs/Web/API/MessagePort). It's worth calling this out explicitly, and also mention that we cannot use this because (1) it is async and (2) the use of web prototypes would enable web pages to intercept messages (e.g. through MessagePort.prototype or Event.prototype).

Here is an example:

mc = new MessageChannel();
mc.port1.onmessage = e => console.log(e.data);
mc.port2.postMessage(2);
console.log(3);

Result:
3
2
// ( 2 logged after 3 means that the message delivery was async )

Member @dotproto, Sep 19, 2024

Should we rename MessagePort to SyncMessagePort? That would both address the naming collision called out in the previous comment and describe what is different about this interface.


> Should we rename MessagePort to SyncMessagePort?

Rather MessagePortSync because Sync at the end is the established pattern in the web platform (FileEntrySync) and nodejs (fs.readFileSync), whereas Sync at the beginning indicates it being the primary subject e.g. SyncEvent is for synchronization events.


> we cannot use MessagePort because it is async

Indeed, so maybe just make a new simpler API that doesn't use the concept of ports, but just establishes one secure synchronous channel.

wouldn't belong as an open web API.

## Implementation Notes

N/A.
Member

Expand with notes on the prototype (maybe at the behavior section?):

Suggested change
N/A.
### Prototype
To avoid interception by web pages, the prototype chain of the `MessagePort`
instance should terminate at `null` instead of `Object.prototype`.
Concretely, this means that an instance of `MessagePort` could either inherit
from a (hidden) MessagePort class \* (which itself has a `null` Prototype), or
be a plain object with a null Prototype and all specified members as own
property descriptors.
\* hidden means: not exposed as a global class. If anyone (e.g. the web page)
ever gets access to a `MessagePort` instance, it may see the hidden prototype
via `port.__proto__` or `port.constructor.prototype` and through that intercept
& interfere with all future port communications.

Member

I tried to find an existing web specification that has interfaces that define an interface or object that inherits from null. The closest thing I was able to find was the create an exports object algorithm from the WebAssembly JavaScript Interface working draft.
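
For readers unfamiliar with null-prototype objects, the following sketch shows the property the prototype suggestion depends on. It is illustrative only: the planted `onMessage` getter plays the role of a hostile page script, and `isolatedPort` is a plain object, not the proposed `MessagePort` type.

```javascript
// "Page" script: plants a getter on Object.prototype.
const leaks = [];
Object.defineProperty(Object.prototype, 'onMessage', {
  configurable: true,
  get() {
    leaks.push(this);
    return undefined;
  },
});

// Ordinary object: reading a missing member walks up to Object.prototype.
const ordinaryPort = {};
ordinaryPort.onMessage; // Triggers the planted getter.

// Null-prototype object: the lookup stops at the object itself.
const isolatedPort = Object.create(null);
isolatedPort.sendMessage = () => {};
const seen = isolatedPort.onMessage; // undefined; the getter is never reached.

// Clean up the global modification.
delete Object.prototype.onMessage;
```

Only `ordinaryPort` ends up in `leaks`. This covers lookups on the port object itself; it does not protect objects that the extension's own code creates with ordinary prototypes.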


## Future Work