Compat: Disable wpautop for block-based posts #2806

aduth · 2017-09-27T01:38:52Z

This pull request seeks to disable wpautop for Gutenberg posts. The markup structure of a block is defined as the result of the block implementation's save function, and does not need (and should not have) formatting of automatic paragraph application from wpautop. The changes here disable the_content's filter of wpautop for posts containing blocks.

The challenge here is in managing posts which were originally authored in the Classic editor, then later in Gutenberg to add new content blocks (which will be a common occurrence after Gutenberg is merged).

(See revised behavior: #2806 (comment))

To preserve paragraphs which had previously relied on the behavior of autop, the Gutenberg editor will apply autop during the parse step. This has the result of setting <p> into saved content, instead of relying on this to be applied later during the_content.

This is unlike some of the considerations proposed at #2708, where autop would be applied on a per-block basis in the_content. I am very wary of the performance impact of detecting and individually formatting blocks in content during a front-end filter. In the suggested use-case of an external editor, autop behavior would continue to apply for posts which do not contain blocks. It should never be the case that a post contains both blocks and legacy freeform content without explicit <p> tags if edited in Gutenberg, but it would be true that such an external client would need to pass new saved content in a block post with explicit tags as well. This is not complete compatibility, and it may be worth exploring whether wpautop should also be applied server-side when saving a post containing both blocks and freeform content to capture these instances.

Testing instructions:

Create a new post in the classic editor with a single paragraph of text
Save the post as a draft
Return to the posts list
Edit the post saved in Step 2 in the Gutenberg editor (hover in list, then click Gutenberg)
Add a new paragraph block with some text
Press Preview
Inspect content of preview
Note that both paragraphs are wrapped in <p> tags

ellatrix · 2017-09-27T12:47:31Z

I am very wary of the performance impact of detecting and individually formatting blocks in content during a front-end filter. [...] This is not complete compatibility, and it may be worth exploring whether wpautop should also be applied server-side when saving a post containing both blocks and freeform content to capture these instances.

What about just applying autop on freeform blocks server-side before save?

ellatrix · 2017-09-27T12:50:22Z

(This might capture two instances, and we wouldn't have to add a JS autop script.)

aduth · 2017-09-27T13:28:04Z

What about just applying autop on freeform blocks server-side before save?

Yeah, this came to mind as I was writing the pull request description. That might help simplify some things, particularly in the implementation of wpautop in the browser, which I've found to not faithfully follow the server-side variant.

aduth · 2017-09-27T13:33:58Z

One issue we might encounter is that TinyMCE will apply the paragraph tags to the freeform content, so on the initial edit of a legacy content post in Gutenberg, changes will be flagged even if they don't truly exist (merely focusing then blurring the freeform block). Since any post saved in Gutenberg would then have paragraphs applied (by a server-side filter in recent recommendations), I expect it'd only be an issue for that initial editing session.

This relates to Classic editor running removep on content before saving.

ellatrix · 2017-09-27T14:09:42Z

blocks/api/parser.js

-	name = name || getUnknownTypeHandlerName();
+	// If unknown type, use unknown handler and apply paragraph formatting.
+	if ( undefined === name ) {
+		rawContent = formatting.autop( rawContent );


Hm, don't we already do that here: https://github.com/WordPress/gutenberg/blob/v1.1.0/blocks/library/freeform/old-editor.js#L79?

Hm, don't we already do that here:

Good call, I did not see this, and it would likely be redundant. An issue with leaving the current implementation is that it needlessly flags the post as dirty because the original value of the block content does not have autop applied, only after setup has completed (in master, focusing and blurring a freeform block without changes will mark the post as needing save).

With ideas floating around server-side autop, I'll consider whether the existing autop call in OldEditor is still necessary.

aduth · 2017-09-28T18:30:06Z

Reference to original approach, in anticipation of dropping commit during rebase: 392d554

aduth · 2017-09-28T20:59:59Z

I've updated this branch with a revised approach, where the client is no longer expected at all to apply autop behavior. Instead, when loading Gutenberg, the content of the post is formatted to apply autop selectively to freeform content only (i.e. content is parsed and autop applied on a block-by-block basis).

To compare:

We no longer need to maintain or account for differences in wpautop implementation differences between the client and server, since we rely entirely on the server
We could apply this on save, but due to the behavior described at Compat: Disable wpautop for block-based posts #2806 (comment), we would still want autop applied before the editor session begins, either on the server (as implemented currently) or in the client as part of the parse (difficult to faithfully recreate)
- Even if another client would save with removep, if ever the user reopened the post in Gutenberg, the content should still be effectively the same

Where I hesitate:

I don't like that we modify the post entity from the server-side REST call when preparing the post for the editor. Unfortunately, there is already some precedent here in how we reset the Auto Draft title to an empty string. Alternatives here might be: Modifying content during REST filters, creating a separate content property (e.g. Add block data to REST API #2649 (comment)), or bootstrapping the page with a custom object shape for the post (not implying strict adherence to REST API schema).
Recreating serialize behavior on the server seems useful, but I'm skeptical how well it mimics the behavior we have in the client: one in particular was slash encoding applied by default in PHP's json_encode, which can be disabled in PHP 5.4+ with the JSON_UNESCAPED_SLASHES flag. Since we can't target 5.4+, I manually recreated this effect with a str_replace. I may need to check whether we need JSON_UNESCAPED_UNICODE here as well.

dmsnell · 2017-09-29T14:37:56Z

lib/blocks.php

+		$attrs_json = json_encode( $block['attrs'] );
+
+		// In PHP 5.4+, we would pass the `JSON_UNESCAPED_SLASHES` option to
+		// `json_encode`. To support older versions, we must apply manually.


Nice comment!

Also, check out the JS serializer to see the other transforms we make in order to safeguard the attributes

gutenberg/blocks/api/serializer.js

Line 120 in e7f1223

export function serializeAttributes( attrs ) {

Also, check out the JS serializer to see the other transforms we make in order to safeguard the attributes

Ah, right, will make a pass to port over this behavior and equivalent tests.

dmsnell · 2017-09-29T14:43:16Z

lib/blocks.php

+		$blocks[ $i ]['rawContent'] = $content;
+	}
+
+	return gutenberg_serialize_blocks( $blocks );


This is one place where I have concern about using HTML strings as Editable (cc: @iseulde @youknowriad @mtias).

If we could iterate over the text nodes themselves and know that they contain no HTML then it seems like we could guarantee behaviors with autop. In this mixture, however, it feels like we leave the door open to bugs. Maybe it's not too big of a deal. I think < is illegal inside attributes and tags, but not illegal inside comments.

Not sure also how autop works so please read my review with that in mind.

I'm not quite sure I understand the concern here. With these changes and #2708, we're effectively turning off wpautop for blocks, so it shouldn't be the case that we need to worry about how rich text (via HTML or other structure) is treated.

more of a general concern about existing apply_filter() code

dmsnell · 2017-09-29T14:46:52Z

lib/register.php

+/**
+ * Determine whether a content string contains blocks.
+ *
+ * @since 1.2.0


do you think it would be worth adding a note on the limited ability this has to accurately determine blocks? I would think it obvious from looking at the code but maybe someone wouldn't realize this is only a first-order approximation.

Yeah, could be a good idea. The implementation is dubious, but... do you have in mind any scenarios in which it would not be accurate?

just the rare case where someone writes something like…

This is not a  block. 

…or when something odd like that accidentally crops up.

Not really saying we need to go overboard at this point to
detect that stuff, but it would be good to point out that it's
an optimization against parsing the document to find at
least one valid block.

dmsnell · 2017-09-29T14:47:42Z

phpunit/fixtures/wpautop-original.html

+
+Touch
+
+This


mkaz · 2017-10-02T22:43:36Z

Testing this change and I don't get the initial fail state.

With a fresh check out and build from master, following the instructions above to create paragraph in Classic, and then in Gutenberg - both paragraphs get created with <p> tags around them. Editing also shows as expected with Classic Block and Gutenberg Paragraph Block.

I also tested on the branch with the change, and it behaves the same.

aduth · 2017-10-04T22:25:09Z

Rebased to resolve conflicts, and with a few revisions included in the rebase.

The rebased e8d802b ports string replacements for comment escaping from the JavaScript serializer to PHP, with an additional unit test for verifying serialized -> parsed -> serialized equivalence.
The rebased 2d2775b expands on the DocBlock to clarify accuracy of gutenberg_content_has_blocks. It could be reasonable to introduce a "strict" flag akin to the third argument of PHP's in_array which would change the logic to running the parser and testing the result, but since we don't have an immediate need for this, it would be best to postpone for a future revision.
I added 86a9792 which applies wpautop to freeform content for any save which includes block content. While this is arguably unnecessary and a waste of resources in parsing for saved content coming from the Gutenberg or classic editors, it could improve compatibility of saved content originating from other third-party editors which don't fall under Gutenberg or the classic editor changes from Compat: Disable wpautop in Classic Editor when post contains block #2708
I omitted changes to remove the gutenberg_ prefixing from block content functions. We may want to consider going down this path for easier merge, but should be done separately from this pull request.

dmsnell · 2017-10-04T22:55:29Z

lib/blocks.php

+function gutenberg_serialize_block( $block ) {
+	// Return content of unknown block verbatim.
+	if ( ! isset( $block['blockName'] ) ) {
+		return $block['rawContent'];


quick ear-mark here as I think we're going to probably rename rawContent to something like innerHtml

quick ear-mark here as I think we're going to probably rename rawContent to something like innerHtml

I think that would be good. One idea I had originally was that, in parsing blocks, we could save effort in reserializing them since we should already know the original string text from which the block was parsed. Do you think it could be reasonable to still have something like rawContent returned from the parser which is assigned as the raw text of the block, comments and all?

Then again, there could be value in having the serializer available anyways, and not relying on blocks maybe having this rawContent property available (e.g. not available if created from PHP, only if parsed from text).

my original intention in bringing up rawContent was an exception - that most blocks would return a tree of children but a few blocks which "need" the raw HTML would get it instead of the tree.

I can imagine use cases for the full body of the block but at the same time anticipate that providing as much is more of an enabler for working around the system than with it.

the way the PEG generator works this can be somewhat tricky to provide both - not sure why - which means that it may take more time if we want that to happen (we end up having to carry the rawContent attribute to each level of token and then reduce at the top level to stitch them together)

dmsnell

I did not test but I did review the code. Thanks for adding the comments on has_blocky_stuff()!

aduth · 2017-10-27T13:00:24Z

Rebased to resolve conflicts. Would be good to have a few more eyes on this one, particularly after the latest changes.

codecov · 2017-10-27T13:16:45Z

Codecov Report

Merging #2806 into master will not change coverage.
The diff coverage is 0%.

@@           Coverage Diff           @@
##           master    #2806   +/-   ##
=======================================
  Coverage   31.98%   31.98%           
=======================================
  Files         249      249           
  Lines        6901     6901           
  Branches     1254     1254           
=======================================
  Hits         2207     2207           
  Misses       3945     3945           
  Partials      749      749

Impacted Files	Coverage Δ
blocks/library/freeform/old-editor.js	`1.51% <0%> (ø)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 48ccc03...098218d. Read the comment docs.

mtias · 2017-10-27T13:56:22Z

lib/blocks.php

@@ -58,6 +58,80 @@ function gutenberg_parse_blocks( $content ) {
 }

 /**
+ * Given an array of parsed blocks, returns content string.
+ *
+ * @since 1.3.0


need to update these before merge

mkaz · 2017-10-27T17:45:57Z

Testing out this change it works good, tested with various blocks of test with varying p tags and worked as expected.

However, the issue in #2817 is back when Jetpack Markdown is enabled, I'm reopening that for @pento to take a look

aduth · 2017-10-28T13:42:39Z

Noting that this branch incidentally resolves the issue where focusing then blurring a freeform block from a post authored in the classic editor would flag the post as needing to be saved.

aduth · 2017-10-30T18:59:54Z

However, the issue in #2817 is back when Jetpack Markdown is enabled, I'm reopening that for @pento to take a look

By "back", do you mean reintroduced by these changes? Or is this issue present on master as well?

mkaz · 2017-10-30T20:49:25Z

I see the issue still on master and confused by the state of #2817.
Primarily due to @pento comment which was part of 2bfed02

Revert the revert of #2817.

The change he made looks right, so I think he was trying to put back my change that got overwritten.

mkaz · 2017-10-30T20:53:54Z

ok, I'm pretty sure I know why it's not working, the loading of scripts changed from gutenberg.php to lib/load.php and the require for lib/plugin-compat.php was not included.

I'll create a new PR for it.

A parsed block has no awareness of where inner blocks exist in its innerHTML, so it cannot safely reserialize. There are a few options: - Since we merely skip wpautop for known blocks, we could avoid reserialization and return the block's original HTML verbatim if we had access to its outerHTML. See nylen/phpegjs#3 - Move wpautop behavior for freeform content to the editor client. This may align well with desires to transparently upgrade legacy paragraph content to paragraph blocks. This would also allow the server to avoid any preprocessing before showing a post on the front-end, assuming that the saved content has had wpautop applied already. Acknowledging that this effectively reverts large parts of #2806

* Framework: Drop server-side block serialization, wpautop A parsed block has no awareness of where inner blocks exist in its innerHTML, so it cannot safely reserialize. There are a few options: - Since we merely skip wpautop for known blocks, we could avoid reserialization and return the block's original HTML verbatim if we had access to its outerHTML. See nylen/phpegjs#3 - Move wpautop behavior for freeform content to the editor client. This may align well with desires to transparently upgrade legacy paragraph content to paragraph blocks. This would also allow the server to avoid any preprocessing before showing a post on the front-end, assuming that the saved content has had wpautop applied already. Acknowledging that this effectively reverts large parts of #2806 * Parser: Apply autop to fallback block content

aduth added [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f [Status] In Progress Tracking issues with work in progress labels Sep 27, 2017

ellatrix reviewed Sep 27, 2017

View reviewed changes

aduth force-pushed the remove/autop-gutenberg branch 2 times, most recently from 468d000 to 113d161 Compare September 28, 2017 20:39

aduth removed the [Status] In Progress Tracking issues with work in progress label Sep 28, 2017

dmsnell reviewed Sep 29, 2017

View reviewed changes

phpunit/fixtures/wpautop-original.html

Touch

This

Copy link

Member

dmsnell Sep 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔨 🕐

mkaz added a commit that referenced this pull request Oct 2, 2017

Move content_has_blocks to register.php to be compat with PR #2806

2836321

mkaz added a commit that referenced this pull request Oct 3, 2017

Move content_has_blocks to register.php to be compat with PR #2806

9377638

aduth mentioned this pull request Oct 3, 2017

Compat: Disable wpautop in Classic Editor when post contains block #2708

Merged

aduth force-pushed the remove/autop-gutenberg branch from c55ebb1 to 86a9792 Compare October 4, 2017 22:17

dmsnell reviewed Oct 4, 2017

View reviewed changes

dmsnell approved these changes Oct 4, 2017

View reviewed changes

aduth force-pushed the remove/autop-gutenberg branch from 86a9792 to ebf95c0 Compare October 27, 2017 12:59

aduth force-pushed the remove/autop-gutenberg branch from ebf95c0 to 91424c4 Compare October 27, 2017 13:16

mtias reviewed Oct 27, 2017

View reviewed changes

aduth added 3 commits November 9, 2017 11:59

Compat: Disable wpautop for posts containing blocks

cae4e46

Compat: Apply autop to freeform content for Gutenberg

a7b6c1e

Compat: Apply autop to freeform block content on post save

098218d

aduth force-pushed the remove/autop-gutenberg branch from 91424c4 to 098218d Compare November 9, 2017 17:00

aduth merged commit 475cd88 into master Nov 9, 2017

aduth deleted the remove/autop-gutenberg branch November 9, 2017 18:13

youknowriad mentioned this pull request Nov 10, 2017

Centered button identified as "modified externally" #3426

Closed

mcsf mentioned this pull request Nov 13, 2017

Unslash post content before parsing during save #3455

Merged

3 tasks

aduth mentioned this pull request Nov 18, 2017

Parsing: Replace server serialization by parsed outerHTML #3545

Closed

aduth mentioned this pull request Nov 30, 2017

Blocks: Support block nesting (Add Columns block, InnerBlocks component) #3745

Merged

aduth mentioned this pull request Dec 14, 2017

Parser: Client-side wpautop #4005

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compat: Disable wpautop for block-based posts #2806

Compat: Disable wpautop for block-based posts #2806

aduth commented Sep 27, 2017 •

edited

Loading

ellatrix commented Sep 27, 2017

ellatrix commented Sep 27, 2017 •

edited

Loading

aduth commented Sep 27, 2017

aduth commented Sep 27, 2017

ellatrix Sep 27, 2017

aduth Sep 28, 2017

aduth commented Sep 28, 2017

aduth commented Sep 28, 2017

dmsnell Sep 29, 2017

aduth Oct 4, 2017

dmsnell Sep 29, 2017

aduth Oct 4, 2017

dmsnell Oct 4, 2017

dmsnell Sep 29, 2017

aduth Oct 4, 2017

dmsnell Oct 4, 2017 •

edited

Loading

dmsnell Sep 29, 2017

mkaz commented Oct 2, 2017

aduth commented Oct 4, 2017

dmsnell Oct 4, 2017

aduth Oct 4, 2017

dmsnell Oct 4, 2017

dmsnell left a comment

aduth commented Oct 27, 2017 •

edited

Loading

codecov bot commented Oct 27, 2017 •

edited

Loading

mtias Oct 27, 2017

mkaz commented Oct 27, 2017

aduth commented Oct 28, 2017

aduth commented Oct 30, 2017

mkaz commented Oct 30, 2017

mkaz commented Oct 30, 2017

Compat: Disable wpautop for block-based posts #2806

Compat: Disable wpautop for block-based posts #2806

Conversation

aduth commented Sep 27, 2017 • edited Loading

ellatrix commented Sep 27, 2017

ellatrix commented Sep 27, 2017 • edited Loading

aduth commented Sep 27, 2017

aduth commented Sep 27, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aduth commented Sep 28, 2017

aduth commented Sep 28, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dmsnell Oct 4, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mkaz commented Oct 2, 2017

aduth commented Oct 4, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dmsnell left a comment

Choose a reason for hiding this comment

aduth commented Oct 27, 2017 • edited Loading

codecov bot commented Oct 27, 2017 • edited Loading

Codecov Report

Choose a reason for hiding this comment

mkaz commented Oct 27, 2017

aduth commented Oct 28, 2017

aduth commented Oct 30, 2017

mkaz commented Oct 30, 2017

mkaz commented Oct 30, 2017

aduth commented Sep 27, 2017 •

edited

Loading

ellatrix commented Sep 27, 2017 •

edited

Loading

dmsnell Oct 4, 2017 •

edited

Loading

aduth commented Oct 27, 2017 •

edited

Loading

codecov bot commented Oct 27, 2017 •

edited

Loading