Compat: Disable wpautop for block-based posts #2806
What about just applying |
(This might capture two instances, and we wouldn't have to add a JS |
Yeah, this came to mind as I was writing the pull request description. That might help simplify some things, particularly in the implementation of |
One issue we might encounter is that TinyMCE will apply the paragraph tags to the freeform content, so on the initial edit of a legacy content post in Gutenberg, changes will be flagged even if they don't truly exist (merely focusing then blurring the freeform block). Since any post saved in Gutenberg would then have paragraphs applied (by a server-side filter in recent recommendations), I expect it'd only be an issue for that initial editing session. This relates to Classic editor running |
blocks/api/parser.js
Outdated
name = name || getUnknownTypeHandlerName();
// If unknown type, use unknown handler and apply paragraph formatting.
if ( undefined === name ) {
	rawContent = formatting.autop( rawContent );
Hm, don't we already do that here: https://github.com/WordPress/gutenberg/blob/v1.1.0/blocks/library/freeform/old-editor.js#L79?
Hm, don't we already do that here:
Good call, I did not see this, and it would likely be redundant. An issue with leaving the current implementation is that it needlessly flags the post as dirty because the original value of the block content does not have autop applied, only after setup has completed (in master, focusing and blurring a freeform block without changes will mark the post as needing save).
With ideas floating around server-side autop, I'll consider whether the existing autop call in OldEditor is still necessary.
Reference to original approach, in anticipation of dropping commit during rebase: 392d554
468d000 to 113d161
I've updated this branch with a revised approach, where the client is no longer expected at all to apply autop behavior. Instead, when loading Gutenberg, the content of the post is formatted to apply autop selectively to freeform content only (i.e. content is parsed and autop applied on a block-by-block basis). To compare:
Where I hesitate:
$attrs_json = json_encode( $block['attrs'] );

// In PHP 5.4+, we would pass the `JSON_UNESCAPED_SLASHES` option to
// `json_encode`. To support older versions, we must apply manually.
Nice comment!
Also, check out the JS serializer to see the other transforms we make in order to safeguard the attributes
gutenberg/blocks/api/serializer.js
Line 120 in e7f1223
export function serializeAttributes( attrs ) {
Also, check out the JS serializer to see the other transforms we make in order to safeguard the attributes
Ah, right, will make a pass to port over this behavior and equivalent tests.
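As a rough illustration of the kind of safeguarding being discussed, here is a minimal sketch of an attribute serializer in JavaScript. The exact set of replacements is an assumption made for illustration and is not copied from the serializer at the referenced commit; the point is only that the JSON is post-processed so it cannot break out of the HTML comment that delimits a block. Note that `JSON.stringify` leaves forward slashes unescaped, which is the behavior the PHP comment above emulates manually.

```javascript
// Hypothetical sketch, not the actual Gutenberg implementation.
function serializeAttributes( attrs ) {
	return JSON.stringify( attrs )
		// A double dash could terminate the enclosing HTML comment early.
		.replace( /--/g, '\\u002d\\u002d' )
		// Escape characters that are risky inside markup (assumed set).
		.replace( /</g, '\\u003c' )
		.replace( />/g, '\\u003e' )
		.replace( /&/g, '\\u0026' );
}

const serialized = serializeAttributes( {
	url: 'https://example.com/a',
	note: 'a --> b',
} );

// The escaped form still parses back to the original values.
const roundTripped = JSON.parse( serialized );
```

Because the replacements are valid JSON unicode escapes, `JSON.parse` restores the original strings, so the escaping is invisible to consumers of the attributes.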
lib/blocks.php
Outdated
		$blocks[ $i ]['rawContent'] = $content;
	}

	return gutenberg_serialize_blocks( $blocks );
This is one place where I have concern about using HTML strings as Editable (cc: @iseulde @youknowriad @mtias).
If we could iterate over the text nodes themselves and know that they contain no HTML then it seems like we could guarantee behaviors with autop. In this mixture, however, it feels like we leave the door open to bugs. Maybe it's not too big of a deal. I think < is illegal inside attributes and tags, but not illegal inside comments.
Not sure also how autop works so please read my review with that in mind.
I'm not quite sure I understand the concern here. With these changes and #2708, we're effectively turning off wpautop for blocks, so it shouldn't be the case that we need to worry about how rich text (via HTML or other structure) is treated.
more of a general concern about existing apply_filter() code
lib/register.php
Outdated
/**
 * Determine whether a content string contains blocks.
 *
 * @since 1.2.0
do you think it would be worth adding a note on the limited ability this has to accurately determine blocks? I would think it obvious from looking at the code but maybe someone wouldn't realize this is only a first-order approximation.
Yeah, could be a good idea. The implementation is dubious, but... do you have in mind any scenarios in which it would not be accurate?
just the rare case where someone writes something like…
This is not a <!-- wp --> block. <!-- wp:comment -->
…or when something odd like that accidentally crops up.
Not really saying we need to go overboard at this point to
detect that stuff, but it would be good to point out that it's
an optimization against parsing the document to find at
least one valid block.
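To make the "first-order approximation" concrete, here is a small JavaScript sketch of a substring-based check. The function name and the exact needle are assumptions for illustration (the PHP function under review is not shown in full here); the sketch only demonstrates why such a check is a cheap optimization rather than a real parse.

```javascript
// Hypothetical helper: a cheap substring test, not a parser.
// It answers "might this content contain blocks?" without validating
// that any block delimiter is well formed or properly closed.
function contentHasBlocks( content ) {
	return content.indexOf( '<!-- wp:' ) !== -1;
}

const blockPost = contentHasBlocks(
	'<!-- wp:paragraph --><p>Hi</p><!-- /wp:paragraph -->'
); // true

const legacyPost = contentHasBlocks( 'Plain legacy content.' ); // false

// The reviewer's example: the check still answers true here, because the
// needle appears in the text even though no complete block is present.
const oddPost = contentHasBlocks(
	'This is not a <!-- wp --> block. <!-- wp:comment -->'
); // true
```

A fully accurate answer would require parsing the document to find at least one valid block, which is the cost the heuristic is trying to avoid.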
Touch

This
🔨 🕐
Testing this change and I don't get the initial fail state. With a fresh check out and build from master, following the instructions above to create paragraph in Classic, and then in Gutenberg - both paragraphs get created with I also tested on the branch with the change, and it behaves the same.
c55ebb1 to 86a9792
Rebased to resolve conflicts, and with a few revisions included in the rebase.
lib/blocks.php
Outdated
function gutenberg_serialize_block( $block ) {
	// Return content of unknown block verbatim.
	if ( ! isset( $block['blockName'] ) ) {
		return $block['rawContent'];
quick ear-mark here as I think we're going to probably rename rawContent to something like innerHtml
quick ear-mark here as I think we're going to probably rename rawContent to something like innerHtml
I think that would be good. One idea I had originally was that, in parsing blocks, we could save effort in reserializing them since we should already know the original string text from which the block was parsed. Do you think it could be reasonable to still have something like rawContent returned from the parser which is assigned as the raw text of the block, comments and all?
Then again, there could be value in having the serializer available anyways, and not relying on blocks maybe having this rawContent property available (e.g. not available if created from PHP, only if parsed from text).
my original intention in bringing up rawContent was an exception - that most blocks would return a tree of children but a few blocks which "need" the raw HTML would get it instead of the tree.
I can imagine use cases for the full body of the block but at the same time anticipate that providing as much is more of an enabler for working around the system than with it.
the way the PEG generator works this can be somewhat tricky to provide both - not sure why - which means that it may take more time if we want that to happen (we end up having to carry the rawContent attribute to each level of token and then reduce at the top level to stitch them together)
I did not test but I did review the code. Thanks for adding the comments on has_blocky_stuff()!
86a9792 to ebf95c0
Rebased to resolve conflicts. Would be good to have a few more eyes on this one, particularly after the latest changes.
ebf95c0 to 91424c4
Codecov Report
@@ Coverage Diff @@
## master #2806 +/- ##
=======================================
Coverage 31.98% 31.98%
=======================================
Files 249 249
Lines 6901 6901
Branches 1254 1254
=======================================
Hits 2207 2207
Misses 3945 3945
Partials 749 749
lib/blocks.php
Outdated
@@ -58,6 +58,80 @@ function gutenberg_parse_blocks( $content ) {
}

/**
 * Given an array of parsed blocks, returns content string.
 *
 * @since 1.3.0
need to update these before merge
Noting that this branch incidentally resolves the issue where focusing then blurring a freeform block from a post authored in the classic editor would flag the post as needing to be saved.
ok, I'm pretty sure I know why it's not working, the loading of scripts changed from I'll create a new PR for it.
91424c4 to 098218d
A parsed block has no awareness of where inner blocks exist in its innerHTML, so it cannot safely reserialize. There are a few options:

- Since we merely skip wpautop for known blocks, we could avoid reserialization and return the block's original HTML verbatim if we had access to its outerHTML. See nylen/phpegjs#3
- Move wpautop behavior for freeform content to the editor client. This may align well with desires to transparently upgrade legacy paragraph content to paragraph blocks. This would also allow the server to avoid any preprocessing before showing a post on the front-end, assuming that the saved content has had wpautop applied already.

Acknowledging that this effectively reverts large parts of #2806
* Framework: Drop server-side block serialization, wpautop

  A parsed block has no awareness of where inner blocks exist in its innerHTML, so it cannot safely reserialize. There are a few options:

  - Since we merely skip wpautop for known blocks, we could avoid reserialization and return the block's original HTML verbatim if we had access to its outerHTML. See nylen/phpegjs#3
  - Move wpautop behavior for freeform content to the editor client. This may align well with desires to transparently upgrade legacy paragraph content to paragraph blocks. This would also allow the server to avoid any preprocessing before showing a post on the front-end, assuming that the saved content has had wpautop applied already.

  Acknowledging that this effectively reverts large parts of #2806

* Parser: Apply autop to fallback block content
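The reserialization problem described in the commit message can be illustrated with a short JavaScript sketch. The block names, the shape of the parsed object (`blockName`, `innerHTML`, `innerBlocks`), and the `naiveSerialize` helper are all assumptions made for illustration; they are not the parser's actual output format. The sketch only shows that once inner blocks are lifted out of `innerHTML`, their original position is lost.

```javascript
// Original saved content: an inner block sits in the middle of the
// outer block's markup.
const original =
	'<!-- wp:columns --><div>' +
	'<!-- wp:paragraph --><p>A</p><!-- /wp:paragraph -->' +
	'<p>tail</p></div><!-- /wp:columns -->';

// Hypothetical parse result: innerHTML holds the outer markup with the
// inner block removed, and nothing records where the inner block was.
const parsed = {
	blockName: 'columns',
	innerHTML: '<div><p>tail</p></div>',
	innerBlocks: [
		{ blockName: 'paragraph', innerHTML: '<p>A</p>', innerBlocks: [] },
	],
};

// A naive serializer has to guess a position; here it appends inner
// blocks after the outer block's own HTML.
function naiveSerialize( block ) {
	const inner = block.innerBlocks.map( naiveSerialize ).join( '' );
	return (
		`<!-- wp:${ block.blockName } -->` +
		block.innerHTML +
		inner +
		`<!-- /wp:${ block.blockName } -->`
	);
}

const reserialized = naiveSerialize( parsed );
const isLossless = reserialized === original; // false
```

Returning the block's original outer HTML verbatim, as the first option suggests, sidesteps the guess entirely.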
Closes #2736
This pull request seeks to disable wpautop for Gutenberg posts. The markup structure of a block is defined as the result of the block implementation's save function, and does not need (and should not have) formatting of automatic paragraph application from wpautop. The changes here disable the_content's filter of wpautop for posts containing blocks.

The challenge here is in managing posts which were originally authored in the Classic editor, then later edited in Gutenberg to add new content blocks (which will be a common occurrence after Gutenberg is merged).

(See revised behavior: #2806 (comment))

To preserve paragraphs which had previously relied on the behavior of autop, the Gutenberg editor will apply autop during the parse step. This has the result of setting <p> into saved content, instead of relying on this to be applied later during the_content.

This is unlike some of the considerations proposed at #2708, where autop would be applied on a per-block basis in the_content. I am very wary of the performance impact of detecting and individually formatting blocks in content during a front-end filter. In the suggested use-case of an external editor, autop behavior would continue to apply for posts which do not contain blocks. It should never be the case that a post contains both blocks and legacy freeform content without explicit <p> tags if edited in Gutenberg, but it would be true that such an external client would need to pass new saved content in a block post with explicit tags as well. This is not complete compatibility, and it may be worth exploring whether wpautop should also be applied server-side when saving a post containing both blocks and freeform content to capture these instances.

Testing instructions:
<p> tags
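The idea of applying autop only to freeform content, block by block, can be sketched as follows in JavaScript. This is a rough illustration under stated assumptions: `simpleAutop` is a deliberately simplified stand-in and does not reproduce WordPress's wpautop, `autopFreeformOnly` is a hypothetical name, and the block shape (an optional `blockName` plus `rawContent`) mirrors the PHP fragments quoted in this review rather than a confirmed API.

```javascript
// Simplified stand-in for autop (NOT wpautop): wraps chunks separated by
// blank lines in <p> tags, leaving chunks that already start with <p>.
function simpleAutop( text ) {
	return text
		.split( /\n\s*\n/ )
		.map( ( chunk ) => chunk.trim() )
		.filter( ( chunk ) => chunk.length > 0 )
		.map( ( chunk ) =>
			/^<p[\s>]/i.test( chunk ) ? chunk : `<p>${ chunk }</p>`
		)
		.join( '\n\n' );
}

// Hypothetical helper: known blocks (those with a blockName) pass through
// untouched; only freeform content receives paragraph formatting.
function autopFreeformOnly( blocks ) {
	return blocks.map( ( block ) =>
		block.blockName
			? block
			: { ...block, rawContent: simpleAutop( block.rawContent ) }
	);
}

const blocks = [
	{ rawContent: 'First legacy paragraph.\n\nSecond legacy paragraph.' },
	{ blockName: 'core/quote', rawContent: '<blockquote>Quoted</blockquote>' },
];

const formatted = autopFreeformOnly( blocks );
```

Since the stand-in skips chunks that already begin with a paragraph tag, running it a second time on its own output leaves the content unchanged, which is the property that keeps a post from being flagged as modified on a later load.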