Draft: p2p Real Time Collaborative Core Data Entity Editing #17964
🤯 I think this is going to take me a while to fully understand. |
Out of curiosity, how would this compare with a WebRTC-based solution — see the old exploration at #1930 — if the latter took advantage of the latest gains around entities, etc.? What kind of added reconciliation work would be needed on top of WebRTC? What trade-offs are we generally considering? |
This also uses WebRTC for transport under the hood. The main difference from a user's perspective is that instead of locking blocks, we do conflict resolution down to the individual block property level. We can do this easily because JSON objects without arrays are generally convertible to CRDTs by adding some type of vector clock system that makes deep-merge-style updates commutative. Here is a high level overview of how the library used does this. With that, we just need the trick described in The only case where this could start to break down would be when multiple peers are editing the same |
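As a rough illustration of property-level conflict resolution, here is a minimal sketch of a commutative merge over a flat attribute map. It assumes each property carries its own logical counter (`clock`) and that ties are broken by comparing serialized values. This is not Gun's actual algorithm or data format, and every name in it (`mergeProperty`, `mergeState`, `clock`) is hypothetical.

```javascript
// Hypothetical sketch: each property is stored as { value, clock }, where
// `clock` is a per-property logical counter (an assumption, not Gun's format).
function mergeProperty(a, b) {
  if (a === undefined) return b;
  if (b === undefined) return a;
  // The write with the higher counter wins.
  if (a.clock !== b.clock) return a.clock > b.clock ? a : b;
  // Equal counters: break the tie deterministically so every peer
  // picks the same winner regardless of merge order.
  return JSON.stringify(a.value) >= JSON.stringify(b.value) ? a : b;
}

// Merge two replicas of a flat attribute map, property by property.
function mergeState(left, right) {
  const merged = {};
  const keys = new Set([...Object.keys(left), ...Object.keys(right)]);
  for (const key of keys) {
    merged[key] = mergeProperty(left[key], right[key]);
  }
  return merged;
}

// Two peers concurrently edit different properties of the same block.
const base = {
  content: { value: 'Hello', clock: 1 },
  align: { value: 'left', clock: 1 },
};
const peerA = { ...base, content: { value: 'Hello world', clock: 2 } };
const peerB = { ...base, align: { value: 'center', clock: 2 } };

// The merge result is the same in either order.
const ab = mergeState(peerA, peerB);
const ba = mergeState(peerB, peerA);
```

Because the winner of each property depends only on the two candidate values and their counters, merging is commutative and idempotent, which is what lets peers apply updates in any order without locking the block.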
@epiqueras Where did you hear about Gun, or how did you come to it? Have you used it before? Any general info that's useful to read? |
Hacker News
No
|
Perhaps something that's interesting to read as they went for Operational Transformation: https://ckeditor.com/blog/Lessons-learned-from-creating-a-rich-text-editor-with-real-time-collaboration/. |
Yeah, I'm reading about Gun right now. Starting with https://gun.eco/distributed/matters.html. |
The better engine to use for free and open source collaborative rich text editing is GitHub.com/yjs/yjs. A demo using atlaskit with prosemirror can be found here: https://yjs-demos.now.sh/prosemirror-atlaskit/ Similar to GUN, Yjs can work completely peer-to-peer and also uses CRDTs (as would Automerge). The server code is also rather trivial for Yjs: it basically passes messages from client to client when WebRTC is not available or reliable. CKEditor did build a great commercial product, which they use to make money (good for them), but implementing OT for OSS projects is not so great as there are a lot of rules to implement. Building upon CRDTs has the added advantage that, e.g., collaboratively configuring components becomes a breeze, similar to how it was implemented here with gun.eco. For further background, we talked a lot about real-time collaborative editing and the lessons we learned implementing Yjs on our podcast series TagTeamTalks on YouTube. Yjs has been working great for us and I strongly believe that if we all invest in open source we could have real-time collaborative editing in every CMS in the future, with some companies providing Yjs-store-as-a-service, similar to how Solr works. |
Hello @LionsAd, thanks for the info here, and for joining the conversation 😄 I haven't looked too deeply into Gun vs. YJS, but my first impression is that Gun is more popular and robust, providing a lot more utilities for things that are auxiliary, although still very necessary for us, to the shared types, like user management and crypto based authentication. What makes you think YJS is a better choice? |
Hi, I'm the author of Yjs. Fabian just pointed me to this thread. Thanks for your positive review @LionsAd! Gun is an awesome project and I'm inspired by some of their concepts. Yjs and Gun share similar goals, mainly providing observable shared data types. But Yjs really focuses on providing an efficient backend for collaborative editing, offline editing, and showing differences between different states. Basically everything that you would expect from Google Docs, but p2p and OSS. To my knowledge, Yjs is the first CRDT implementation that is suited for rich-text editing on large documents. Yjs implements a CRDT algorithm that is highly optimized for collaborative (rich-)text editing. I don't want to discourage you from using Gun. I think it is impressive that you made it work on top of Gun. Would you be able to provide performance data on how Gun performs for text editing? I have not yet considered Gun as a backend for text editing. If you are interested you can see how Yjs compares to Automerge, a CRDT implementation that is also built for text editing: https://github.com/dmonad/crdt-benchmarks. |
Welcome @dmonad 😄! So the differences here are in performance for different use cases/data types? We should run the same benchmarks on Gun, but we also need to consider that in Gutenberg a lot of editing happens outside of rich text, namely block inserts/moves/removals and block attribute changes. These edits are just plain object and array mutations. I also see you are working on a new version of YJS. Are there plans for this version to have new APIs like Gun's user API? |
@epiqueras The single point where Yjs is better than gun.eco is that it supports text editing. While it's pretty simple to implement text editing with CRDTs, it's hard to do that efficiently for long texts and also for rich-text editing (e.g. bold, italic, etc.). Yjs has a YXMLFragment data type for that; gun.eco only had a prototype of collaborative text editing (no formatting). Indeed all the other things can be expressed as just mutations in a tree database and you could likely "lock" the content area as an alternative with gun.eco (though p2p locking is really hard). However, Yjs and gun.eco have the same properties here: gun.eco gives you a powerful tree database, while Yjs gives you data types like YArray, YMap, etc. to use for that, which can also be nested and hence used like a tree database in essence. gun.eco's encryption support for user authentication could also be used with Yjs, since it's orthogonal to the transport and syncing. While WebRTC with p2p is obviously the holy grail, usually a more practical solution to start with is the more google-docsy way of having a (nodejs) server for transport and then using a uuid for authentication. If you know the uuid, you can participate, but if you don't, you can't. For this initial handshake, if it really needs to be p2p, gun could be used (i.e. use the best technology for each job). However, at least for the sites where I work with a CMS, there is also a lot of rich-text formatting, undo, etc. needed as features of an editor, and there Yjs is more specialized. I hope that helps clarify a little bit :). |
Thanks for that. That's sort of what I understood as well. Gun gives you the full p2p setup with authentication out of the box, while YJS requires you to do more heavy lifting for that. Having us instead rely on a node server for transport is not possible as it wouldn't be supported by WordPress hosts. The rich text support, as I understand it, is not really a selling point for Gutenberg, because it relies on either binding to the Quill editor or integrating with the type's operations. For that we would need to refactor our rich text editor, potentially quite extensively, and I'm not sure that's warranted for slightly improving the user experience of editing the same rich text field. |
Exactly. Gun doesn't really provide a method to model text for concurrent operations. As you mentioned above, you simply replace the whole text-field so that only one user can work at a single section at a time. This might lead to a very bad user-experience when working offline or with a slow connection. Yjs is really great at manipulating large lists. As @LionsAd mentioned, you can even use the Yjs Text type to assign ranges on text attributes (like bold, italic, ..).
Yjs is completely agnostic to user management and transport protocols. You can definitely implement a similar authentication scheme by building your own connector. There are several connectors that you could use as a template. Yjs supports many popular communication protocols like Dat, IPFS, websockets, and webrtc as plugins. I think the y-dat connector might be pretty interesting to you if you want a backend-less solution (it is still in progress at this time). That said, I believe that in some scenarios you might want a central server to manage the data. Yjs gives you that option.
Types in Yjs are just represented as data types that you can manipulate with methods. Very much like Gun. Here is an example:
I think it would be fairly easy to port your approach to Yjs. If you are interested I'd be happy to help out. |
But YJS doesn't provide a model compatible with our rich text editor, so integrating it would probably be as hard as using Gun primitives to create a model from scratch. I would want to see the same performance benchmarks run on Gun before investing more time in either solution.
What's the tentative ETA for that? |
I investigated Gun a bit and I'm not sure if I can fairly compare gun with the other CRDTs. The benchmark mocks the network layer and Gun has its network layer deeply integrated. If I'm correct, Gutenberg has an immutable data structure - similar to ProseMirror (another editor that Yjs supports). I will create a minimal demo tomorrow. We can compare the approaches then. |
Correct.
That will be awesome, thank you 😄! |
Forgot to answer this: This could be a couple of months. I'm still figuring out the concepts and how to build a good dev-experience with dat. At the moment the dat browser-support is a bit unstable. |
Hi @epiqueras. I created a basic demo using Yjs as a sync engine and opened a PR for you to try it out #18357. Here is a live demo link: https://gutenberg-yjs.now.sh/ I'm looking forward to your feedback. You should definitely compare the propagation delay and message size between the two approaches. |
I hooked Yjs to the React editor state in the /playground. When an editor block changes, it currently overwrites the complete block content instead of applying the differences. This is basically the same syncing approach as described in WordPress#17964, so it should allow for a fair comparison. But Yjs also allows applying differences to the text object and is better suited to enabling multiple users to work on the same paragraph.
@dmonad Thank you so much! I commented there: #18357 (comment). |
Related to #1930
Background
I’ve been thinking a lot about full collaborative editing for Gutenberg, Google Docs style.
I know this was explored in a few PRs (#1930) with per-block locking, because “full collaborative” seemed too hard without relying on the WordPress host being able to run and keep up long-running syncing processes and whatnot. However, I think that if we reframe a few things we could achieve it relatively easily, fully client side, and avoid going with a solution that might make it harder to refactor into supporting “full collaborative” later on.
Here is the minimum code/effort approach that I think would work:
Convert block edits into conflict-free replicated data types (CRDT). This should be pretty straightforward because they are mostly Redux action POJOs. We would need to add a version vector for them and a new relative-index-based abstraction for storing block position, i.e. inserting a block between block 1 and block 2 should yield a new block at position 1.5 and not change the others’ indexes.
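The relative index idea can be sketched as follows. This is only an illustration under simple assumptions: positions are plain JavaScript numbers, blocks live in an object keyed by id, and concurrent inserts at the same position are ordered by id. The helper names (`positionBetween`, `orderedIds`) are hypothetical and not part of Gutenberg.

```javascript
// Hypothetical sketch: blocks carry a fractional `position` instead of an
// array index, so inserting never renumbers the existing blocks.
function positionBetween(prev, next) {
  if (prev === undefined && next === undefined) return 1; // first block
  if (prev === undefined) return next - 1; // insert at the start
  if (next === undefined) return prev + 1; // insert at the end
  return (prev + next) / 2; // insert between two neighbours
}

// Derive the visible order from positions. Ties (two peers inserting at the
// same spot concurrently) are broken by id so all peers agree on the order.
function orderedIds(blocks) {
  return Object.values(blocks)
    .sort((a, b) => a.position - b.position || a.id.localeCompare(b.id))
    .map((block) => block.id);
}

const blocks = {
  a: { id: 'a', position: 1 },
  b: { id: 'b', position: 2 },
};

// Insert block "c" between blocks "a" and "b": it gets position 1.5 and the
// positions of "a" and "b" are left untouched.
blocks.c = {
  id: 'c',
  position: positionBetween(blocks.a.position, blocks.b.position),
};

const order = orderedIds(blocks);
```

One caveat of using plain numbers is that repeatedly inserting into the same gap halves the interval each time and eventually exhausts floating-point precision, so a real implementation would likely need string or arbitrary-precision positions.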
This would essentially mean that all block edits would become idempotent and commutative. We would also need to implement undo in `block-editor`, instead of relying on `core-data`, so that we can keep the undo stack specific to the client.
For `RichText`, things will be a bit more complex and a CRDT refactor might be too costly, but we could easily implement diff-sync for it or use a compatible library.
Diff-sync, in short, is when the client keeps a copy of what the server has and the server keeps a copy of what each client has. When the client wants to sync, it can easily send a diff; the server then applies that diff both to its copy of the client’s contents and to itself. Now the diff between the server’s copy of the client’s contents and what the server has is a representation of what couldn’t be applied due to changes made to the server by other clients, and this is sent back to the client to be patched on top. There’s a bit more nuance to it for handling failures, but there are a lot of open source implementations for it that work with formatted text.
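To make the round trip concrete, here is a minimal sketch of one diff-sync exchange. To keep `diff` and `patch` trivial it works on flat key-value documents with non-null primitive values rather than on formatted text, and it runs the client and server in the same process with no failure handling. Those simplifications, and all names used (`diff`, `patch`, `sync`, `shadow`, `shadows`), are assumptions made for illustration.

```javascript
// Keys whose values differ between `from` and `to` (null marks a deletion).
function diff(from, to) {
  const changes = {};
  for (const key of new Set([...Object.keys(from), ...Object.keys(to)])) {
    if (from[key] !== to[key]) changes[key] = key in to ? to[key] : null;
  }
  return changes;
}

// Apply a set of changes to a copy of `doc`.
function patch(doc, changes) {
  const next = { ...doc };
  for (const [key, value] of Object.entries(changes)) {
    if (value === null) delete next[key];
    else next[key] = value;
  }
  return next;
}

// One sync round trip between a client and the server.
function sync(client, server) {
  // The client sends the diff against its copy of what the server has.
  const clientEdits = diff(client.shadow, client.doc);
  client.shadow = { ...client.doc };

  // The server applies it to its copy of this client and to its own document.
  server.shadows[client.id] = patch(server.shadows[client.id], clientEdits);
  server.doc = patch(server.doc, clientEdits);

  // Whatever still differs came from other clients; send it back.
  const serverEdits = diff(server.shadows[client.id], server.doc);
  server.shadows[client.id] = patch(server.shadows[client.id], serverEdits);
  client.shadow = patch(client.shadow, serverEdits);
  client.doc = patch(client.doc, serverEdits);
}

const initial = { title: 'Draft', body: 'Hi' };
const server = {
  doc: { ...initial },
  shadows: { a: { ...initial }, b: { ...initial } },
};
const clientA = { id: 'a', doc: { ...initial }, shadow: { ...initial } };
const clientB = { id: 'b', doc: { ...initial }, shadow: { ...initial } };

// Both clients edit locally, then sync in turn.
clientA.doc.title = 'Final';
clientB.doc.body = 'Hi there';
sync(clientA, server);
sync(clientB, server);
sync(clientA, server); // picks up client B's edit
```

After the three calls, both clients and the server hold the same document. Swapping in a text diff and a fuzzy patch would give the formatted-text variant, at the cost of a considerably more involved `diff` and `patch`.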
The only issue with diff-sync is that you need a host. Although not ideal, if the WordPress instance does not support being a host for this, we could fall back to a client acting as the host. Otherwise, we could embark on the `RichText` refactor or just lock `RichText` fields and send CRDT edits for their whole value.
Description
This PR explores a Conflict-free Replicated Data Type (CRDT) based p2p approach to collaborative `core-data` entity (#17368) editing.
It adds a new property to entity configs that allows you to specify edits that should be synced across peers and uses it for post blocks. The demo video below also shows how easily you can include other properties like post titles.
The p2p layer uses the Gun engine, which automatically converts JSON objects without arrays into CRDTs (with some compromises), plus some extensions to support array edits as CRDTs. The reasoning for the extensions is provided in the header comment in `packages/core-data/src/gun.js`.
The server-side requirements of Gun are very low. All that is needed is a lightweight stateless relay server, similar to WebRTC's public signaling servers, to bootstrap connections to peers without public IP addresses. The relay servers could be provided by a plugin provider or even be community-run and shared between all WordPress instances. They even support a Daisy-chain Ad-hoc Mesh-network mode to relay messages between peers without WebRTC, but of course, enabling that would make it more prohibitive to host publicly.
Authentication doesn't even have to happen at the relay server, because Gun supports subscribing only to subgraphs of the network, and subgraphs can be encrypted with, say, the hash of a token or password shared by a WordPress site's users with access to that data.
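The idea of encrypting synced data with a key derived from a shared secret can be sketched with Node's built-in `crypto` module. This is not Gun's API, and nothing here reflects how this PR implements it. The helper names (`keyFromToken`, `encryptEdit`, `decryptEdit`) and the choice of SHA-256 plus AES-256-GCM are assumptions made for illustration.

```javascript
const nodeCrypto = require('crypto');

// Derive a 32-byte key from a shared secret. A plain SHA-256 hash mirrors the
// "hash of a token" idea; a real system would likely prefer a slow KDF.
function keyFromToken(token) {
  return nodeCrypto.createHash('sha256').update(token).digest();
}

// Encrypt an edit so that a relay forwarding it cannot read it.
function encryptEdit(edit, key) {
  const iv = nodeCrypto.randomBytes(12);
  const cipher = nodeCrypto.createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([
    cipher.update(JSON.stringify(edit), 'utf8'),
    cipher.final(),
  ]);
  return { iv, data, tag: cipher.getAuthTag() };
}

// Decrypt an edit; throws if the key is wrong or the data was tampered with.
function decryptEdit({ iv, data, tag }, key) {
  const decipher = nodeCrypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  const plain = Buffer.concat([decipher.update(data), decipher.final()]);
  return JSON.parse(plain.toString('utf8'));
}

// Peers that know the shared token can read each other's edits.
const siteKey = keyFromToken('hypothetical-shared-site-token');
const sealed = encryptEdit({ postId: 1, title: 'Hello' }, siteKey);
const opened = decryptEdit(sealed, siteKey);
```

A peer or relay without the token only sees `iv`, `data` and `tag`, and an attempt to decrypt with a different key fails the authentication check instead of returning garbage.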
How has this been tested?
It was verified that running `npm run dev:gun` starts a local relay server and lets you edit collaboratively on `localhost`.
Video
https://youtu.be/8SejOZSTJ5I
Types of Changes
New Feature: Core Data entities now support p2p synced edits.
Checklist: