Parser: Allow block nesting #2743

dmsnell · 2017-09-17T00:22:25Z

In this patch we're opening up a new avenue for allowing nested blocks
in the data structure.

For each block:

Nested blocks appear as innerBlocks as a sequential list
The contained HTML without the nested blocks appear as a string
property innerHtml which replaces rawContent (update it doesn't replace it yet)

Testing

Smoke test in the browser.
Load the grammar into the explorer (you can use the fetcher URL in the explorer for the grammar link) then load in some test content and play with nesting. You should be able to see how innerBlocks grows and when blocks do nest, that the snippet which represents them disappears from innerHtml.

Input

Front

<!-- wp:core/void /-->

<!-- wp:core/nest -->
Before
<!-- wp:core/inside -->
This is working
<!-- /wp:core/inside -->
Middle
<!-- wp:core/inside -->
Another in the <!-- wp:core/mark -->jump here<!-- /wp:core/mark -->list
<!-- /wp:core/inside -->
After
<!-- /wp:core/nest -->

End

Output

[
  {
    "blockName": "core/freeform",
    "rawContent": "Front\n\n"
  },
  {
    "blockName": "core/void",
    "attrs": null,
    "innerBlocks": [],
    "rawContent": ""
  },
  {
    "blockName": "core/freeform",
    "rawContent": "\n\n"
  },
  {
    "blockName": "core/nest",
    "attrs": null,
    "innerBlocks": [
      {
        "blockName": "core/inside",
        "attrs": null,
        "innerBlocks": [],
        "rawContent": "\nThis is working\n"
      },
      {
        "blockName": "core/inside",
        "attrs": null,
        "innerBlocks": [
          {
            "blockName": "core/mark",
            "attrs": null,
            "innerBlocks": [],
            "rawContent": "jump here"
          }
        ],
        "rawContent": "\nAnother in the list\n"
      }
    ],
    "rawContent": "\nBefore\n\nMiddle\n\nAfter\n"
  },
  {
    "blockName": "core/freeform",
    "rawContent": "\n\nEnd"
  }
]

Notes

This could end up going really slowly on account of how we are parsing a list of non-token characters and then eventually joining them.
I haven't tested any of this. Main purpose of the PR is to get something going to explore and see if it's good. We can update and fix all things later if we need to.

BE-Webdesign · 2017-09-17T17:37:17Z

dmsnell to the rescue! I think allowing nested blocks from the parser will be a good step forward! Awesome work so far!

Slightly off topic, but definitely tied to this, although they are more next step questions:

How do you think we will handle block normalization, or if we even want to do normalization?
How do we handle reordering of child blocks after parse ( tied to above )?

How do we parse this correctly?:

{
    "blockName": "core/inside",
    "attrs": null,
    "innerBlocks": [
        {
          "blockName": "core/mark",
          "attrs": null,
          "innerBlocks": [],
          "rawContent": "jump here"
        }
    ],
    "rawContent": "\nAnother in the list\n"
}

"jump here" is supposed to live in between the words "the" and "list", how do we denote that location in our block data? Do we not allow other blocks within text to avoid the need for cursor data or do we need to figure out a data model that can accommodate nested blocks better, or do we create some sort of template syntax to abstract our render tree?

I don't know if this is already a solved problem, but I think the above example, in my opinion, shows that we really want to create a data spec for blocks first, and then build our API around that. Maybe with the news of changing view libraries, this could be a good time to reflect on the current data model for blocks state.

mtias · 2017-09-17T21:54:13Z

"jump here" is supposed to live in between the words "the" and "list", how do we denote that location in our block data.

There are two aspects involved in determining how a block is rendered. The parsing step handled by the grammar breaks up the ingredients, but doesn't know how they are combined—that is determined by the block edit and save interfaces. Children are an inner boundary of a block, so they are extracted into innerBlocks. From there, the block author will received as part of edit arguments { attributes, innerBlocks ... }. This will be passed to a <BlockList children={ innerBlocks } /> sort of component that would handle the UI to manage this inner list of blocks.

dmsnell · 2017-09-17T22:56:35Z

thanks for jumping in @BE-Webdesign (sorry I should have added you to this one).

Awesome work so far!

much obliged 🙇

for starters I can say sorry for pushing out a PR without much description, explanation of motivation, or context. I've been talking in person with @mtias and some others and wanted to make sure that I could record in code some of the ideas we've spoken about. will add more of the written part later as we continue to think through the implications here.

How do you think we will handle block normalization, or if we even want to do normalization?
How do we handle reordering of child blocks after parse ( tied to above )?
How do we parse this correctly?:

mostly I'll just defer to the response that @mtias gave, but we had lots of discussion on this very point. this PR is the realization of the ideas we landed at.

BE-Webdesign · 2017-09-18T12:30:04Z

There are two aspects involved in determining how a block is rendered. The parsing step handled by the grammar breaks up the ingredients, but doesn't know how they are combined—that is determined by the block edit and save interfaces. Children are an inner boundary of a block, so they are extracted into innerBlocks. From there, the block author will received as part of edit arguments { attributes, innerBlocks ... }. This will be passed to a <BlockList children={ innerBlocks } /> sort of component that would handle the UI to manage this inner list of blocks.

That sounds like a cool approach matias. What I am trying to figure out is how our block state will give those components enough information to render in the correct order. Maybe parsing each text node as a block would be a good solution?

BE-Webdesign · 2017-09-18T13:13:47Z

(sorry I should have added you to this one)

No need to apologize. I don't expect to be added to anything. I am only a casual contributor. Handling the parsing/serialization and inbetween for nested blocks will be a challenge, and hopefully I can constructively aid that discussion and path forward.

belcherj · 2017-10-03T17:28:44Z

@dmsnell

You have a merge conflict, it is much easier to resolve if you squash before rebasing
You have a lint issue: https://github.com/belcherj/gutenberg/pull/1/commits/e1ec93f5ba2149ea2cbc2f00933e909c8ce416bb
I think to fix the test all of the fixtures need to be updated due to the changes to the parser: https://github.com/belcherj/gutenberg/pull/1/commits/3f31d1927e3cad158ddeec433efe27377ab6ef2d

dmsnell · 2017-10-03T17:36:04Z

You have a merge conflict, it is much easier to resolve if you squash before rebasing

Thanks @belcherj - was from the work @aduth did with the more tag - I resolved manually from inside of GitHub

You have a lint issue: belcherj/gutenberg@e1ec93f

Best to defer until this merges - the innerBlocks is there to serve as a landmark that it's available. We need to add that data into the blocks inside the app as well.

I think to fix the test all of the fixtures need to be updated due to the changes to the parser:

Thanks! when I recreated the fixtures before they didn't all pass the tests and I didn't have the chance to dig in to find out why. Are they passing tests in your branch?

dmsnell · 2017-10-03T21:24:43Z

Thanks @belcherj! (oops look like tests are still failing)

Is anyone opposed to merging this PR? The ability to nest blocks doesn't force blocks to nest and as-is, this parser should be compliant with all existing posts.

Would anyone be interested in helping out by testing this against several different existing posts?

belcherj · 2017-10-03T21:37:38Z

@dmsnell still a couple tests failing. Could not quickly determine how to fix.

aduth · 2017-10-06T14:33:10Z

blocks/api/parser.js

@@ -220,7 +220,8 @@ export function createBlockWithFallback( name, rawContent, attributes ) {
 */
 export function parseWithGrammar( content ) {
 	return grammarParse( content ).reduce( ( memo, blockNode ) => {
-		const { blockName, rawContent, attrs } = blockNode;
+		// eslint-disable-next-line no-unused-vars


~~If we're truly not using innerBlocks, what's the intent with disabling the rule?~~

Noting your comment at #2743 (comment) , I could go either way on whether it's sensible to leave as a placeholder. It should be fine.

aduth · 2017-10-06T14:36:15Z

blocks/api/parser.js

@@ -220,7 +220,8 @@ export function createBlockWithFallback( name, rawContent, attributes ) {
 */
 export function parseWithGrammar( content ) {
 	return grammarParse( content ).reduce( ( memo, blockNode ) => {
-		const { blockName, rawContent, attrs } = blockNode;
+		// eslint-disable-next-line no-unused-vars
+		const { blockName, innerBlocks, innerHtml: rawContent, attrs } = blockNode;


Should we replace all references to rawContent as innerHTML ? Here or in a separate pull changeset?

or now I thought a second PR would be good so that this can be introduced in as minimal of a way as possible - I'm open to input

aduth · 2017-10-06T14:37:33Z

blocks/api/post.pegjs

-  / WP_Block_Html
+    return [
+      freeform( pre ),
+      ...ts.reduce( ( out, [ t, h ] ) => [ ...out, t, freeform( h ) ], [] ),


I recall we had issues previously because we don't run Babel on the .pegjs JavaScript, so I have a feeling this spread syntax won't work in IE11 and we'd either need to port to ES5 equivalent or investigate transpiling the result of the file.

fixed in 4e41437

aduth · 2017-10-06T14:38:49Z

blocks/api/post.pegjs

+    ts:(t:Token html:$((!Token .)*) { /** <?php return $t; ?> **/ return [ t, html ] })*
+    post:$(.*)
+  { /** <?php
+    $blocks = [];


The square bracket array syntax was added in PHP 5.4. We need to use the more-verbose array() for support back to PHP 5.2.4

http://php.net/manual/en/language.types.array.php

fixed in 4e41437

aduth · 2017-10-06T14:40:00Z

blocks/api/post.pegjs

+function freeform( s ) {
+  return s.length && {
+    blockName: 'core/freeform',
+    innerHtml: s


We might want to consider the capitalization here. I created #2511 a while back to open discussion on a convention. I've started leaning more toward capitalizing abbreviations (innerHTML) but really just care to enforce some consistency.

personally I don't care. I'll use HTML

changed in 4e41437

aduth · 2017-10-06T14:44:01Z

blocks/api/post.pegjs

  {
    /** <?php
+    $innerBlocks = array_filter( $children, function( $a ) {


Anonymous functions are only available in PHP 5.3+. To support PHP 5.2.4+, we'd need this to be a named function, passing the string callable reference to array_filter.

http://php.net/manual/en/functions.anonymous.php

fixed in 4e41437

aduth · 2017-10-06T14:44:28Z

blocks/api/post.pegjs

+    } );
+
+    $innerHtml = array_filter( $children, function( $a ) {
+      return is_string( $a );


Or here we could probably do array_filter( $children, 'is_string' );

good catch. I actually intended on maybe creating an array_partition function and maybe will still do that. I always get confused which core functions are functions and which are operators (like empty()) and cannot be used as Callables

changed in 4e41437

dmsnell · 2017-10-25T05:15:03Z

All, here are some big updates since the last time I addressed this: more complete PHP support, some big comments, and a few functions extracting out specific logic for return value shapes.

I've been inspired by some alternative parser generators in how they separate actions on a rule from the grammar itself, allowing the host language to define them separately.

For example, Canopy lets you call these actions on a rule by adding something like %combine_blocks and then in the language-specific file you supply an actions map like this:

const actions = {
	combine_blocks: function( pre, tokens, post ) {
		…
	}
}

Ohm also has this notion and they consider it separating the syntax from the semantics of the language being parsed.

Anyway, something to think about as we start introducing more code. I'd love to have a grammar completely free from any code.

aduth

Looks like fixtures need to be regenerated per innerHtml -> innerHTML change:

npm run fixtures:regenerate

Resolves some, but not all PHPUnit errors.

It looks like we no longer guarantee that attrs will exist on the returned parsed block object, so there are PHP errors occurring here where we assume to be able to access the property:

gutenberg/lib/blocks.php

Line 77 in 6e984b0

$attributes = is_array( $block['attrs'] ) ? $block['attrs'] : array();

1) Dynamic_Blocks_Render_Test::test_dynamic_block_rendering
Undefined index: attrs

aduth · 2017-10-25T12:15:44Z

blocks/api/post.pegjs

 ?> **/

+function freeform( s ) {
+  return s.length && {
+    blockName: 'core/freeform',


Elsewhere we have made assumptions that freeform is the absence of a block name. Specifically, this test case now fails for a few reasons:

In block registration we allow unknown type handler to be assigned to any block name, not assuming core/freeform

The content of the freeform block now includes the block delimiters  which was not the case previously and arguably not desirable (see also Blocks: Preserve unknown block, remove freeform block comment delimiters #2513)

aduth · 2017-10-25T12:16:30Z

blocks/api/post.pegjs

+    var truthy = [];
+    var falsey = [];
+
+    // nod to performance over a simpler reduce


❤️ A good place for micro-optimizations.

aduth · 2017-10-25T12:17:05Z

blocks/api/post.pegjs

+        predicate( item )
+            ? truthy.push( item )
+            : falsey.push( item );
+    };


Unnecessary semi-colon. Guessing it wouldn't be very easy for us to apply linting to this file, would it?

is it? I thought I would be following style guidelines by including it.

Yes, unnecessary: http://bit.ly/2y5BO9c

dmsnell · 2017-10-26T00:12:53Z

Thanks @aduth - I have updated with your feedback!

codecov · 2017-10-26T20:41:21Z

Codecov Report

Merging #2743 into master will not change coverage.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master    #2743   +/-   ##
=======================================
  Coverage   32.05%   32.05%           
=======================================
  Files         249      249           
  Lines        6885     6885           
  Branches     1249     1249           
=======================================
  Hits         2207     2207           
  Misses       3934     3934           
  Partials      744      744

Impacted Files	Coverage Δ
blocks/api/parser.js	`98.21% <100%> (ø)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1cbe5e6...4e0b589. Read the comment docs.

aduth · 2017-10-26T20:48:48Z

blocks/api/parser.js

@@ -225,7 +225,8 @@ export function createBlockWithFallback( name, rawContent, attributes ) {
 */
 export function parseWithGrammar( content ) {
 	return grammarParse( content ).reduce( ( memo, blockNode ) => {
-		const { blockName, rawContent, attrs } = blockNode;
+		// eslint-disable-next-line no-unused-vars
+		const { blockName, innerBlocks, innerHTML: rawContent, attrs } = blockNode;


We probably want to normalize this toward innerHTML and drop rawContent, for consistency, and also because objects of innerBlocks will keep their innerHTML property. Should be fine to do separately.

aduth

Oops, looks like there's still an issue: Support for passing strings as callables in PHP was only added in PHP 5.4, which does not meet our support requirements of 5.2.4:

http://php.net/manual/en/language.types.callable.php
https://wordpress.org/about/requirements/

aduth · 2017-11-02T16:17:07Z

It was easy enough to drop the callable type hint and have it work just as well. See c344da6.

I'd personally like to see us push forward with block nesting. It would help in other cases such as #3047 and #2854 where I think we've had to make some compromises due to lack of proper nesting support.

dmsnell · 2017-11-02T18:15:29Z

@aduth what’s the point of removing the type hint?

aduth · 2017-11-02T18:19:17Z

@dmsnell See #2743 (review)

dmsnell · 2017-11-02T18:31:46Z

oh roger @aduth - I thought that the act itself of passing a string callable was the issue, not adding the type hint

In this patch we're opening up a new avenue for allowing nested blocks in the data structure. For each block: - Nested blocks appear as `innerBlocks` as a sequential list - The contained HTML _without_ the nested blocks appear as a string property `innerHtml` which replaces `rawContent` Also: - Remove `WP_` prefix on grammar terms - not needed - was there from the earliest iterations where different parts were prefixed by which spec they implemented, such as HTML_, URL_, etc... Regenerates fixtures based on updated parser Disable eslint for line so tests will run Fix rebase issue Update based on PR feedback I'm breaking my own rules here by introducing more code into the parser but I'm also not sure how we can escape this without placing higher demands on some post-processing after the parse. Changes: - Do away with non-supported language features - Abstract `joinBlocks( pre, tokens, post )` into a function basically just need to join non-empty items into a flat list Tiny fix and big header comment Added comment to start of rules Actually return blocks from peg_join_blocks() Minor adjustments to parser to preserve existing behavior Update parser and fix bug in updates When the `Balanced_Block` was rebuilt to be defined as a starting block, some number of tokens and non-closing HTML, finished by a closing block, I used a `+` to indicate that we needed at least _some_ content inside of the block to be valid. In some regards this is true because empty blocks should be void blocks. On the other hand, it's very likely that we'll receive empty non-void blocks in practice and the parser should not invalidate one because it has chosen the wrong syntax. This update replaces the `+` with a `*` such that we can have empty blocks and they will be treated as normal. Remove unnecessary semicolon Blocks: Access block content by innerHTML Parser: Drop callable type hint Type hint supported only in PHP 5.4+, above support level. Only use is in call_user_func, supportable with expected string callable input prior to 5.4

dmsnell · 2017-11-09T16:05:11Z

Thanks for all your help on this one @aduth!

nylen · 2017-11-16T06:36:39Z

There are no test cases for this major and foundational change to the parser, even though there is a framework in place for creating them.

What is the plan for adding these test cases?

dmsnell added the [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f label Sep 17, 2017

dmsnell requested review from youknowriad and mtias September 17, 2017 00:22

This was referenced Sep 20, 2017

Block List: Display inserter button after block on hover, focus #2752

Closed

Inserter: Remove "global" (toolbar) inserter #2755

Closed

dmsnell mentioned this pull request Sep 21, 2017

Parser: Start tokenizing HTML within the blocks #2286

Closed

mtias mentioned this pull request Sep 22, 2017

Add button to cover image #2765

Closed

dmsnell force-pushed the parser/allow-block-nesting branch from bcb4c07 to 8a47e27 Compare October 1, 2017 00:03

dmsnell force-pushed the parser/allow-block-nesting branch from 21c08a3 to 395bb69 Compare October 3, 2017 17:47

aduth reviewed Oct 6, 2017

View reviewed changes

dmsnell force-pushed the parser/allow-block-nesting branch from 42e3c23 to 4e41437 Compare October 25, 2017 04:07

dmsnell force-pushed the parser/allow-block-nesting branch from 460e57f to 74a2fcb Compare October 25, 2017 05:22

aduth reviewed Oct 25, 2017

View reviewed changes

aduth approved these changes Oct 26, 2017

View reviewed changes

aduth requested changes Oct 26, 2017

View reviewed changes

aduth approved these changes Nov 2, 2017

View reviewed changes

mtias added this to the Roadmap milestone Nov 3, 2017

aduth force-pushed the parser/allow-block-nesting branch from c344da6 to 4e0b589 Compare November 9, 2017 15:38

aduth merged commit 48ccc03 into master Nov 9, 2017

aduth deleted the parser/allow-block-nesting branch November 9, 2017 15:50

aduth mentioned this pull request Nov 9, 2017

Framework: Rename rawContent references as innerHTML #3411

Merged

mcsf mentioned this pull request Nov 12, 2017

Centered button identified as "modified externally" #3426

Closed

aduth mentioned this pull request Nov 17, 2017

Use input_substr to retrieve text nylen/phpegjs#3

Merged

aduth mentioned this pull request Nov 30, 2017

Blocks: Support block nesting (Add Columns block, InnerBlocks component) #3745

Merged

Parser: Allow block nesting #2743

Parser: Allow block nesting #2743

Conversation

dmsnell commented Sep 17, 2017 • edited Loading

BE-Webdesign commented Sep 17, 2017 • edited Loading

mtias commented Sep 17, 2017

dmsnell commented Sep 17, 2017

BE-Webdesign commented Sep 18, 2017

BE-Webdesign commented Sep 18, 2017 • edited Loading

belcherj commented Oct 3, 2017

dmsnell commented Oct 3, 2017

dmsnell commented Oct 3, 2017 • edited Loading

belcherj commented Oct 3, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dmsnell Oct 25, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dmsnell commented Oct 25, 2017 • edited Loading

aduth left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dmsnell commented Oct 26, 2017

codecov bot commented Oct 26, 2017 • edited Loading

Codecov Report

Choose a reason for hiding this comment

aduth left a comment

Choose a reason for hiding this comment

aduth commented Nov 2, 2017

dmsnell commented Nov 2, 2017

aduth commented Nov 2, 2017

dmsnell commented Nov 2, 2017

dmsnell commented Nov 9, 2017

nylen commented Nov 16, 2017

dmsnell commented Sep 17, 2017 •

edited

Loading

BE-Webdesign commented Sep 17, 2017 •

edited

Loading

BE-Webdesign commented Sep 18, 2017 •

edited

Loading

dmsnell commented Oct 3, 2017 •

edited

Loading

dmsnell Oct 25, 2017 •

edited

Loading

dmsnell commented Oct 25, 2017 •

edited

Loading

codecov bot commented Oct 26, 2017 •

edited

Loading