Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue: Remove dependency on WP_HTML_Tag_Processor #47349

Closed
2 of 9 tasks
ockham opened this issue Jan 23, 2023 · 11 comments
Closed
2 of 9 tasks

Tracking issue: Remove dependency on WP_HTML_Tag_Processor #47349

ockham opened this issue Jan 23, 2023 · 11 comments
Assignees
Labels
[Priority] High Used to indicate top priority items that need quick attention

Comments

@ockham
Copy link
Contributor

ockham commented Jan 23, 2023

For WP 6.2, we've decided that we are not going to include WP_HTML_Tag_Processor just yet, since it's not quite ready yet (see).

However, this means that we'll have to replace all GB code that depends on WP_HTML_Tag_Processor.

Since Feature Freeze is on Feb 7, we'll need to treat this with relatively hight priority -- it'd be great to get it in within the next 7 days or so so that it can sit a bit before the FF 😄

Here's a tentative list:

@ockham
Copy link
Contributor Author

ockham commented Jan 23, 2023

Flagging this for @tellthemachines @andrewserong @ramonjd @aristath @georgeh since I've noticed y'all have used WP_HTML_Tag_Processor in blocks and/or block-supports. It'd be great if y'all help us remove the dependency from GB 😊

@andrewserong
Copy link
Contributor

Thanks for the ping! It looks like @dmsnell commented more recently than this issue (here: #47187 (comment)) about assessing reverting vs proceeding with the tag processor.

It's been really great being able to use it for some of the features lately (e.g. targeting inner block wrapper in layout, etc), so it'd be a shame to have to revert, but I also totally understand the challenge of determining the right time to merge the class to core, as it will very much lock the API in place. Out of curiosity, is the main concern about stabilising the shape of the API, or the flexibility of speed of development in Gutenberg? If it's more about the latter, one of the things @ramonjd set up in the style engine package in Gutenberg is that we can continue to develop the class in GB where webpack outputs the main class with a _Gutenberg suffix, so things can still change fairly rapidly in Gutenberg if need be, with the knowledge that those changes will eventually be backported at the next major WP release. Of course, it still commits to supporting the API shape in a backwards compatible way, so might not be a solution if the tag processor's overall shape is still in flux.

For replacing usage of WP_HTML_Tag_Processor, what would you recommend as an alternative? It looks like some features (e.g. the rel attribute for social icons that landed in #45469 or the layout classnames on inner block wrappers in #44600) were built using the processor, so might not have an obvious previously working version to revert to.

My main concern at this stage would be the potential for introducing regressions as we work through the existing usages, so when removing the tag processor dependency, we'll likely need to be quite careful with any alternative we put in place.

All that said, happy to help out with the updates! Let us know here if you think we should proceed, and I'm happy to dig into position / layout tasks first since they're the ones I'm most familiar with.

@dmsnell
Copy link
Member

dmsnell commented Jan 24, 2023

@andrewserong for me it's a mixture of the development pace, since we are still making regular updates to that file in concert with the evolution of the HTML parsing tools built on top of it (many changes, such as supporting the modification on markup content inside a tag or specifying a CSS selector to find a tag, are paired with changes to the tag processor file. not only that, but the unit tests for the tag processor are extensive and each time we find something that needs asserting we're adding those tests, again regularly.

so a move to Core makes all this frustratingly inconvenient since we have to pair changes across two repositories, we lose the ease of working with the tests, and it becomes harder to work in parallel as we have been with many ideas as we explore the design evolution to an unsure future.

another part of me is reluctant to expose this outside of the Core Gutenberg work. it's not that the API is unstable still; I don't anticipate making significant changes to it (apart from one PR that's open right now which is exactly proposing a small change in behavior), but I expect this evolution to be the more sure API that people will reach for and I want to avoid pushing out one thing only to tell people to change their code for yet another better system shortly after they initially rewrote their code to use the tag processor.

the use in core has been good and important to help us understand how the API should be shaped, and it has helped us uncover a couple defects in the tag processor, I just wish we could continue to have those handful of cases continue to use it without needing this PHP to live in the WordPress repo 🤷‍♂️

@Mamaduka
Copy link
Member

I just wish we could continue to have those handful of cases continue to use it without needing this PHP to live in the WordPress repo.

I just had an idea about how we can do this.

  • Remove WP_HTML_Tag_Processor usage from packages/block-library. PHP files are bundled with the package, so the final version of the code should be in the Gutenberg repo.
  • Continue using a processor in lib/block-supports files. These are manually migrated. It will be a little more work, but WP_HTML_Tag_Processor usage can be changed/omitted during the backports.

What do you think?

@andrewserong
Copy link
Contributor

Thanks for sharing your thoughts, Dennis!

I just wish we could continue to have those handful of cases continue to use it without needing this PHP to live in the WordPress repo 🤷‍♂️

That seems to get to the heart of it — it's great for us to be able to use the new shiny thing in practice on real world features, it both unlocks easier pathways to implementing those features, as well as helping shape the future of the API. But by doing so, we then unwittingly commit to a snapshot of that very shape. It is frustrating that we don't have a mechanism to keep the APIs private to the Gutenberg work in PHP like we do with all the experimental JS work!

Not sure if it helps, but with the style engine PHP, the class landed in core flagged as being not ready (yet) for extenders / plugins / themes (here), and the current work continues to be done in the Gutenberg repo, along with storing all the tests here. It does mean like you mention, maintaining a thing in two places, and the backport process involves copying the tests and renaming calls to remove the _Gutenberg suffix, etc, so it definitely has some friction. The plus side is that it's still possible to build on it in GB in faster iterations, though, and experiment with things before proposing they be rolled into core.

Continue using a processor in lib/block-supports files. These are manually migrated. It will be a little more work, but WP_HTML_Tag_Processor usage can be changed/omitted during the backports.

I think my main concern with that is that it's helpful if the backports can match the code in Gutenberg as closely as possible. If the backporting process involves swapping the usage for a one-off tricky-to-review regex, I could imagine backport PRs getting held up at the review stage.

Overall, I'm happy to support whatever you decide here, though! Please take my comments as excitement and enthusiasm over all the great work y'all have been doing so far on building a much-desired tool 😊

I suppose the the main thing to figure out will be some reliable one-off code for each of the features we have currently that use the class.

@Mamaduka
Copy link
Member

I think my main concern with that is that it's helpful if the backports can match the code in Gutenberg as closely as possible. If the backporting process involves swapping the usage for a one-off tricky-to-review regex, I could imagine backport PRs getting held up at the review stage.

Agreed, this would make things complicated.

@dmsnell
Copy link
Member

dmsnell commented Jan 24, 2023

As far as "one-off tricky-to-review regex" is concerned I don't consider them a very big deal. We have plenty of those already and we've dealt with the bugs they create.

Since most of the major changes I want to make are cosmetic or can occur through subclassing, I think it's possible to keep some development pace if we merge this. Unit tests will be a pain but I can copy and paste the existing ones.

Honestly I hadn't considered a suffixed class but that seems like a reasonable solution. I'm not even worried about changing the Core code that's currently using the tag processor because I think those uses are good and locked in place. We'd have to be careful about applying updates in Core to the class and porting them into the Gutenberg plugin, but otherwise creating WP_HTML_Tag_Processor_Gutenberg would let us continue to quickly iterate on the tag processor and its evolution without disturbing core.

I'd love to warn people against using this in Core for now. Again it's a weird thing because I'm fairly certain the API is stable and locked in and reliable, but since it turns out that much better things are possible and around the corner, I'd like to hold off encouraging plugin authors to use this because if the more robust and higher-level "HTML processor" works out, then this one (the tag processor) will live happily in the corner as the lower-level library that powers the more usable and friendly API.

if that's confusing I'll try and restate: the tag processor has been working great, but some usage is awkward and intentionally so because its operation matches the real limitations of what we can do with it. however, a more convenient API for interacting with HTML is possible by using it and building over those awkward details. the more convenient API is what I'd rather see people use except for where performance or certain guarantees are more critical (such as Core itself), but it's not ready for this release (might be for the next one).

thanks all - I'm grateful for all your help with this and your advice. if nothing becomes truly clear here by Friday I guess, we can just leave things as they are (and revert the revert that @ockham already made) and merge into Core. it was my mistake, so I can take the developmental inconvenience.

@adamziel
Copy link
Contributor

adamziel commented Jan 24, 2023

WebFonts API was initially shipped without publicly exposing the classes. The trick was to rewrite the classes as variables and lambda functions enclosed in a private scope. I wonder if there's a way to do it here, too? It would save us all the reverting and the difficulty of backporting.

For example:

function _tag_processor_comand( $html, $command, $args = array() ) {
    // Tag processor rewritten using lambdas:
    $next_tag = function() { /* ... */ };
    $get_attribute = function($attr) { /* ... */ };

    // Command handling:
    if($command === 'get_attribute') {
         $next_tag();
         return $get_attribute($args[0]);
    } else if( /* ... */ ) {
        // ...
    }
}

function gutenberg_get_classnames_from_last_tag( $html ) {
    return _tag_processor_comand( $html, 'get_attribute', 'class' );
}

This exposes some API without exposing all the semantics, intricacies, and behaviors of the larget WP_HTML_Tag_Processor class.

Note I chose an easy use-case. Here's a more difficult one:

function block_core_heading_render( $attributes, $content ) {
	if ( ! $content ) {
		return $content;
	}

	$p = new WP_HTML_Tag_Processor( $content );

	$header_tags = array( 'H1', 'H2', 'H3', 'H4', 'H5', 'H6' );
	while ( $p->next_tag() ) {
		if ( in_array( $p->get_tag(), $header_tags, true ) ) {
			$p->add_class( 'wp-block-heading' );
			break;
		}
	}

	return $p->get_updated_html();
}

I'd rather not expose all these methods via _tag_processor_comand! There's another way:

function _tag_processor_comand( $html, $command, $args = array() ) {
    // Tag processor rewritten using lambdas:
    $next_tag = function() { /* ... */ };
    $get_attribute = function($attr) { /* ... */ };

    if($command === 'block_core_heading_render' ) {
        // do what block_core_heading_render() does
    } else if( /* ... */ ) {
        // ...
    }
}

function block_core_heading_render( $attributes, $content ) {
    return _tag_processor_comand( $html, 'block_core_heading_render', [ $attributes, $content ] );
}

Once WP_HTML_Tag_Processor becomes a public API, the command-handling code would be moved back to block_core_heading_render. The _tag_processor_comand would then get deprecated AND updated to call the original function directly.

What do you think?

@ndiego ndiego moved this to ❓ Triage in WordPress 6.2 Editor Tasks Jan 24, 2023
@dmsnell
Copy link
Member

dmsnell commented Jan 24, 2023

@adamziel I appreciate that as a way to hide the class, but I think I'd reach for a straight-up merge or GB-suffix before tangling things in that way. I'd prefer "abuse" with a clean interface to hiding with the web.

I'm kind of sprinting to see how far we can get right now, merging #46680 and trying to push out some small refactors (mostly renaming things) and see if that makes it feel solid enough to commit to for now.

@ockham
Copy link
Contributor Author

ockham commented Jan 25, 2023

Thank you all for weighing in on this, and for your suggestions for methods to ensure a continued speedy pace of development even after backporting!

After some more discussion with @dmsnell, we're agreeing that the API should be stable enough for inclusion in Core 🎉 Dennis has thankfully agreed to file the backport PR against wordpress-develop 😊

For continued development, we'll use the copy of the file that will remain in GB -- essentially following established practices. (This means we almost certainly won't even encapsulate the code into an npm package (like the Style Engine does), nor into privately scoped variables and lambda functions (as the WebFonts API does), as it's probably overkill for what we expect to change the API only minimally.)

Thank you all for bearing with me, and apologies for the back-and-forth; I think there was some unclarity around the implications of including the relevant code in Core (and its impact on development) that we've now resolved.

I'll close this issue and will file a PR to revert the revert (#47350). I'll also post an update to the 6.2 tracking issue.

@ockham ockham closed this as completed Jan 25, 2023
@github-project-automation github-project-automation bot moved this from ❓ Triage to 🔎 Needs Review in WordPress 6.2 Editor Tasks Jan 25, 2023
@richtabor richtabor moved this from 🔎 Needs Review to ✅ Done in WordPress 6.2 Editor Tasks Jan 25, 2023
@dmsnell
Copy link
Member

dmsnell commented Jan 25, 2023

Agreed. What I didn't realize again is that we can retain the source files in Gutenberg post-merge, and also that we have to in order to support running the plugin on older WP installs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Priority] High Used to indicate top priority items that need quick attention
Projects
No open projects
Development

No branches or pull requests

5 participants