The Gutenberg database field shouldn’t contain any markup at all; instead, the blocks should employ the to-be-built Fields API to get structured, reusable content #2718

rpkoller · 2017-09-12T19:50:32Z

From a content modelling perspective, things haven’t changed much for the better with Gutenberg. Initially, the idea of structuring the blob of content in the WYSIWYG field with the aid of blocks seems perfect and long overdue. The problem is the implementation. First of all, the blocks aren’t saved in separate fields in the database but in a single field, and on top of that the field includes the markup for the blocks as well as the comments identifying each block by type. So what looks like an introduction of structured content chunks is basically still a blob of content, made heavier by extra markup, which makes reusability more or less impossible. In my humble opinion markup should never ever be saved in the database. Separation of concerns is the key - never ever merge content and markup.

Aside from that, with Gutenberg every custom post type will have the same set of blocks available, as far as I understand. That will probably lead to more clutter, especially in editorially heavy environments, because editors will use a different combination of blocks in each Gutenberg WYSIWYG field for the same custom post type. Instead of a consistent and controlled set of blocks for a particular custom post type, you have one single set for all content, which also has performance implications. It smells like clutter and big trouble in the long run.

Instead, I would suggest the following approach: introduce the Fields API to WordPress and make extensive use of it within Gutenberg. I would add a "Fields" item to the admin menu where you can create field types (a list of one to many different fields) and assign them to a custom post type. A Gutenberg field would then no longer be a uniform blob of data spiced up with markup, but just a set of fields, aka blocks, for which Gutenberg is simply the wrapper. The administrator or author creates a field type, you get a consistent set of fields for a given post type, and the data can be reused afterwards flawlessly.

In my opinion, this is a necessary step to finally move away from the blob of content towards a saner approach to handling content, making it structured and reusable.
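To make the proposal above concrete, here is a minimal sketch of what a per-post-type field group could look like. Everything in it is hypothetical: the Fields API did not exist in this form, and the names `FieldDef`, `FieldGroup`, `validate`, and the `recipe` post type are invented purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FieldDef:
    """One field (block) in a group; `kind` is a hypothetical type name."""
    name: str
    kind: str  # e.g. "text", "image", "quote"


@dataclass
class FieldGroup:
    """An ordered set of fields assigned to one custom post type."""
    post_type: str
    fields: list = field(default_factory=list)

    def validate(self, values):
        """Return the names in `values` that are not defined for this post type."""
        allowed = {f.name for f in self.fields}
        return sorted(name for name in values if name not in allowed)


# A hypothetical "recipe" post type with a controlled set of two fields.
recipe = FieldGroup("recipe", [FieldDef("intro", "text"), FieldDef("photo", "image")])

# Content is stored per field as plain values, with no markup mixed in.
stored = {"intro": "A quick weeknight curry.", "photo": "curry.jpg"}
```

With this shape, `recipe.validate(stored)` returns an empty list, while a value for an undefined field such as `gallery` would be reported, which is the kind of per-post-type control the proposal asks for.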

dmsnell · 2017-09-17T23:50:04Z

> things haven’t changed much for the better

Thanks for your feedback @rpkoller.

> In my humble opinion markup should never ever be saved in the database. Separation of concerns is the key - never ever merge content and markup.

One way to help the project with feedback like this is to try and see if there are specific problems being caused by the code or the decisions taken. I would like to ask how the existing serialization and parsing of the content is actually problematic. Have you experienced difficulties using Gutenberg or extending it when working with the code and editor, or is it the mere presence of this markup which is causing anxiety?

In other words, why shouldn't this method be used in this project, and what specific problems have you seen it introduce in this project? I personally have trust in our system and it seems to be working well; I suspect that if you dive into the discussions in this repository and in the #core-editor channel in the WordPress Slack, you may discover that the fears enumerated here have already been addressed.

Our goal was to remove ambiguity which leads to views of posts not meeting the expectations of the authors who wrote them. Our goal wasn't to make sure that we don't break any rules-of-thumb but rather to preserve the author's experience and also work within the context of a system which has been running for over a decade and is powering a quarter of the web. If you find that we can't reliably represent our structured data model in this way then please open an issue with the specific bug you have found 😄

> In my opinion, this is a necessary step to finally move away from the blob of content towards a saner approach to handling content, making it structured and reusable.

If you dig in, I think you will discover that the Gutenberg data model is in fact structured. The fact that information is stored in a string, or in a different form than we expect, doesn't immediately break its ability to be restored to the format with which we nominally interact. In the case of Gutenberg, we have a way to seamlessly transform back and forth between the post_content serialization and the normative JSON tree representing the post.
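The round trip described above can be sketched roughly as follows. This is not Gutenberg's actual parser: it is a simplified illustration that assumes flat (non-nested) blocks and a comment delimiter shaped like `<!-- wp:core/paragraph -->`, with optional JSON attributes. The names `parse_blocks` and `serialize_blocks` are hypothetical helpers for this example.

```python
import json
import re

# Assumed delimiter shape: <!-- wp:NAME {json attrs} --> ... <!-- /wp:NAME -->
BLOCK_RE = re.compile(
    r"<!--\s+wp:(?P<name>[a-z][a-z0-9/-]*)\s+(?P<attrs>\{.*?\}\s+)?-->"
    r"(?P<html>.*?)"
    r"<!--\s+/wp:(?P=name)\s+-->",
    re.DOTALL,
)


def parse_blocks(post_content):
    """Turn the serialized string into a list of block dicts (flat blocks only)."""
    blocks = []
    for m in BLOCK_RE.finditer(post_content):
        attrs = json.loads(m.group("attrs")) if m.group("attrs") else {}
        blocks.append(
            {"name": m.group("name"), "attrs": attrs, "html": m.group("html").strip()}
        )
    return blocks


def serialize_blocks(blocks):
    """Turn a list of block dicts back into the comment-delimited string."""
    parts = []
    for b in blocks:
        attrs = json.dumps(b["attrs"], separators=(",", ":")) + " " if b["attrs"] else ""
        parts.append(
            f"<!-- wp:{b['name']} {attrs}-->\n{b['html']}\n<!-- /wp:{b['name']} -->"
        )
    return "\n\n".join(parts)


sample = (
    "<!-- wp:core/paragraph -->\n<p>Hello world</p>\n<!-- /wp:core/paragraph -->\n\n"
    '<!-- wp:core/image {"id":5} -->\n<figure><img src="a.png" /></figure>\n'
    "<!-- /wp:core/image -->"
)
tree = parse_blocks(sample)
```

Here `tree` is a plain list of dicts that can be dumped as JSON, and `serialize_blocks(tree)` reproduces `sample`, so the structure survives being stored as a single string. A real parser would also need to handle nested blocks and content outside any block.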

ghost · 2017-11-23T22:45:44Z

> I would like to ask how the existing serialization and parse of the content is actually problematic. Have you experienced difficulties using Gutenberg

@dmsnell: A simple WordPress search for "paragraph", "core", or "image" (if an image was added) shows unexpected results, e.g. example.com/?s=paragraph. Tested with WP 4.8.3 and Gutenberg plugin 1.7.0.

dmsnell · 2017-11-23T23:13:01Z

Interesting point @TKES! I did a test and confirmed that shortcodes produce search matches as well 😱

After creating a post with a [gallery /] shortcode but no "gallery" text otherwise (although it appears in CSS class names in the render, I guess), the post shows up in the search results.

I don't believe this is a problem that Gutenberg is introducing, nor is it blocking. It sounds to me like a good discussion for core about what "search" means in this context. That is, if we want to introduce a new behavior where the search only looks at #text nodes in the rendered HTML, then I think it'd be feasible to make that happen.

I don't think this would be any easier if posts were stored in a separate data structure, do you? That is, regardless of how it's stored, we're probably looking at the difference between SELECT * FROM wp_posts WHERE post_content LIKE '%blarg%' and parsing documents.

Thoughts?
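The difference between the two search behaviors discussed above can be sketched like this. It is an illustration only, not how WordPress search is implemented: `raw_match` roughly imitates a `LIKE '%term%'` test on the stored string (assuming a case-insensitive collation), and `text_match` is a hypothetical variant that looks only at text nodes. The sample content assumes the `<!-- wp:core/paragraph -->` delimiter shape.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text nodes only; tags and HTML comments are ignored."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(post_content):
    """Return the text nodes of the content, with whitespace collapsed."""
    parser = TextOnly()
    parser.feed(post_content)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def raw_match(post_content, term):
    """Substring test on the stored string, markup and comments included."""
    return term.lower() in post_content.lower()


def text_match(post_content, term):
    """Substring test on text nodes only."""
    return term.lower() in visible_text(post_content).lower()


sample = "<!-- wp:core/paragraph -->\n<p>Hello world</p>\n<!-- /wp:core/paragraph -->"
```

With this sample, `raw_match(sample, "paragraph")` is true because the term appears in the block comment, while `text_match(sample, "paragraph")` is false and `text_match(sample, "hello")` is true. The same text-only filter would also ignore tag names and class names, which is the shortcode-like case mentioned above.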

ghost · 2017-11-24T00:02:31Z

@dmsnell: Shortcodes are optional, most posts do not contain any.

The question was

> how the existing [Gutenberg] serialization and parse of the content is actually problematic.

Gutenberg serialization markup is mandatory and leads to unexpected search results for the above-mentioned keywords and for many more keywords and keyword parts, like para, graph, text, but, butt, button, cat, ate, categories, code, over, cover, form, head, ding, html, late, latest, post, list, quote, tor, table, ...

rpkoller mentioned this issue Nov 3, 2017

Do you have a road map? sc0ttkclark/wordpress-fields-api#84

Closed

pento closed this as completed Nov 3, 2017

ghost mentioned this issue Nov 30, 2017

WordPress search, unexpected results due to Gutenberg serialization markup #3739

Closed
