HTML tags in post titles are gone when editing existing posts #38668

Closed
Rottinator opened this issue Feb 9, 2022 · 24 comments · Fixed by #54718
Labels
Needs Dev Ready for, and needs developer efforts [Package] Block editor /packages/block-editor [Status] In Progress Tracking issues with work in progress [Type] Bug An existing feature does not function as intended

Comments

@Rottinator

Description

Hello,
Since the update to WordPress 5.9 we've noticed some strange behavior when editing existing posts that contain HTML tags in their post title.
The problem is that when editing the post, the HTML tags in the post title are not displayed in the editor. The tags are just gone.

The behavior arrived with WordPress 5.9. I quickly tested the issue with a completely new and empty 5.8.3 install and a completely new and empty 5.9 install. In 5.8.3 this behavior is not reproducible; in 5.9 it is. I used the official Docker images for that (wordpress:5.8.3 and wordpress:latest), with a completely new setup and no plugins or custom themes.

I've also created a ticket at trac.wordpress.org, where I got the hint that this may be a Gutenberg issue and that I should create a ticket here: https://core.trac.wordpress.org/ticket/55125

Best regards
Christoph

Step-by-step reproduction instructions

1. Create a new post with HTML tags in the title, for example: "<em>This is my title</em>"
2. Save and publish the post with this title
3. Go back to the "All posts" list
4. The post is correctly displayed with "<em>This is my title</em>" in the list (the "Quick Edit" functionality also correctly displays the <em> tags)
5. Edit the post
6. The post title has lost the <em> tags
7. Change the post title to "This is my title 2"
8. Save the post
9. The <em> tags are completely gone, even in the "All posts" list.

Screenshots, screen recording, code snippet

Post title in the "All posts" list:
image

Post title, when editing the post in Gutenberg (<em>-Tags are gone):
image

Environment info

  • WordPress version: 5.9 (latest Docker image wordpress:latest)
  • Browser: Reproducible in latest Firefox and latest Chrome (not tested in other browsers, but I don't think it is a browser-specific issue)
  • Devices: PC with Windows 10 and latest macOS (not tested on other systems, but I don't think it is a system-specific issue)

Please confirm that you have searched existing issues in the repo.

Yes

Please confirm that you have tested with all plugins deactivated except Gutenberg.

Yes

@getdave getdave added Needs Technical Feedback Needs testing from a developer perspective. Needs Testing Needs further testing to be confirmed. labels Feb 9, 2022
@bph
Contributor

bph commented Feb 9, 2022

I was able to reproduce it, somewhat. The stored title (with HTML tags) appears to be stripped of its HTML tags in the block editor.

The block editor in the edit screen saves entered HTML tags and displays them on the front end. Only on subsequent re-edits are the HTML tags stripped for the view in the edit screen and also not displayed on the front end.

It happens in 5.8.3 as well as 5.9
Also, tested with Gutenberg 12.5.4 and no plugin.

@bph bph added [Package] Block editor /packages/block-editor [Type] Bug An existing feature does not function as intended and removed Needs Testing Needs further testing to be confirmed. Needs Technical Feedback Needs testing from a developer perspective. labels Feb 9, 2022
@ossgeek314

ossgeek314 commented Feb 10, 2022

I can verify this behavior. It is when loading previously saved posts that have titles with html tags that the tags are stripped away when the title is put in the block editor. The data flowing from the backend contains the tags and if you "re-add" the tags and save the post they are indeed saved to the backend.
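
The round trip described here can be modeled in a few lines. This is only a minimal sketch, not Gutenberg's actual code: `stripTags`, `loadTitleIntoEditor`, and `savePost` are hypothetical names, and the regex is a crude stand-in for whatever the editor does internally when it populates the title field.

```javascript
// Hypothetical model of the reported round trip; not Gutenberg's real code.

// Crude tag stripper, for illustration only (not a safe HTML sanitizer).
function stripTags(html) {
  return html.replace(/<[^>]*>/g, '');
}

// The backend returns the stored title with its tags intact...
const stored = { title: '<em>This is my title</em>' };

// ...but the editor's title field is populated with the tags removed.
function loadTitleIntoEditor(post) {
  return stripTags(post.title);
}

// Saving writes back whatever the title field currently holds.
function savePost(post, editorTitle) {
  return { ...post, title: editorTitle };
}

const shown = loadTitleIntoEditor(stored); // what the author sees
const edited = shown + ' 2';               // the author edits the visible text
const saved = savePost(stored, edited);    // the tag-free text is persisted

console.log(shown);       // This is my title
console.log(saved.title); // This is my title 2
```

Under these assumptions the tags are never lost in storage until the author edits the already-stripped text, which matches the reports above.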

We noticed this after upgrade to 5.9.

@ntsekouras
Contributor

Thanks for reporting this @Rottinator! This seems to be a result of this PR: #31569. @ellatrix do you have any thoughts on whether we can handle this with RichText?

@ntsekouras
Contributor

Tried to understand what's going on in this draft PR. I have details in the description and any feedback is welcome.

@MadtownLems

MadtownLems commented Oct 26, 2022

Hi there. I'm still experiencing this oddity with the 6.1-RC-3.

Steps to reproduce:

  1. Author a new post.
  2. Switch to Code Editor, and enter a post title such as:
    "This is my <strong>Bold</strong> Post"
  3. Publish the post.
  4. View the post. You'll see that Bold is, in fact, bold.
  5. Edit the post. At this point, the <strong> tag has been stripped out from the post title. If you edit the title and save the post, the title will no longer be bold. You can always add it back, but every time you edit the title, it will get stripped out again.

@ofmarconi

No solution for this?
The classic editor doesn't let that happen, what's the difficulty with this Gutenberg?

Do you force us to use these solutions and not fix an error like this in 12 months?

@bozzmedia

Related:
#46823 (open)
#38637 (closed, but perhaps should re-open)

@Rottinator
Author

Hello,
It's now over a year since I opened this issue, and it is still present and annoying. Are there any plans to fix this in an upcoming release?

@MadtownLems

MadtownLems commented Jul 14, 2023

Confirming this is still the case in 6.2.2. The behavior is definitely far from expected.

On 6.2.2:

Create a new post title: This is my post with <em>Emphasis</em> in the title
Publish the post.
View the post.
The word 'Emphasis' will be emphasised.
Return to edit the post.
The post title bar in the editor will simply say "This is my post with Emphasis in the title." The word emphasis will not be shown to be wrapped in tags, nor will it have emphasis.
Update the BODY of the post, but not the title.
View the post.
The title will still show the word "Emphasis" with emphasis.
Edit the post.
The title bar in the editor will still show no signs of emphasis being emphasized (either with tags or actually displaying as it)
Edit the title by appending " that has been updated" to the title
Update the post.
View the post.
At this point, the <em> tags have finally been stripped. 'Emphasis' is no longer emphasized.
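
One explanation consistent with these steps is that the editor only sends back the fields the author actually changed. That is an assumption, not something confirmed in this thread. The sketch below models it with a hypothetical `createEditorSession` helper; none of these names come from Gutenberg.

```javascript
// Hypothetical sketch of "only changed fields are saved"; names are illustrative.
function createEditorSession(post) {
  // The title field shows the stored title with its tags removed
  // (crude regex, for illustration only).
  const displayedTitle = post.title.replace(/<[^>]*>/g, '');
  const edits = {}; // fields the author actually changed

  return {
    displayedTitle,
    editTitle(newTitle) { edits.title = newTitle; },
    editContent(newContent) { edits.content = newContent; },
    // Saving merges only the changed fields over the stored post.
    save() { return { ...post, ...edits }; },
  };
}

const post = {
  title: 'This is my post with <em>Emphasis</em> in the title',
  content: 'Original body',
};

// Case 1: only the body changes, so the stored title (with tags) survives.
const first = createEditorSession(post);
first.editContent('Updated body');
const afterBodyEdit = first.save();

// Case 2: the title is edited, so the tag-free text overwrites the stored title.
const second = createEditorSession(afterBodyEdit);
second.editTitle(second.displayedTitle + ' that has been updated');
const afterTitleEdit = second.save();

console.log(afterBodyEdit.title);
console.log(afterTitleEdit.title);
```

In this model the first save leaves the `<em>` tags in place and the second one drops them, mirroring the sequence above.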

@ellatrix
Member

Looks like HTML tags are stripped since #35825. Cc @getdave

@annezazu annezazu added the Needs Dev Ready for, and needs developer efforts label Aug 3, 2023
@getdave
Contributor

getdave commented Sep 15, 2023

Can I confirm the expected behaviour here?

Are Posts allowed to have HTML in their titles? If not then this should be validated and sanitized on the REST API.

If Posts should be allowed HTML in their titles then we should revert #35825 and ensure that HTML tags are either:

  • displayed in their raw form - this will allow them to be perceived and removed if necessary
  • displayed in rendered form (i.e. bold would appear as bold) - this will ensure editor matches front of site.

If we're unsure then I'd lean towards a PR reverting #35825 whilst also ensuring that HTML tags are rendered in raw form (e.g. This is a post title with <em>emphasis</em> tags in it rather than "This is a post title with emphasis tags in it").

@MadtownLems

Are Posts allowed to have HTML in their titles?

I believe so. It has been allowed in post titles for as long as I can remember. I was digging up old conversations/tickets about it, and it looks like it goes back at least 16 years, and I'm guessing it's the full 20. Regressing this functionality now would be a huge change. Moreover, I believe there are plenty of valid uses of HTML in post titles. The most common we see is including the scientific names of plants. As a University, it's important that we present these correctly (which is always italicized).

displayed in their raw form - this will allow them to be perceived and removed if necessary

This is my preferred approach. It makes it obvious that there is HTML there, and obvious how to change it. It remains kind of a 'hidden feature' for power users, which I think is reasonable.

displayed in rendered form (i.e. bold would appear as bold) - this will ensure editor matches front of site.

This would still make it very difficult to edit, unless you also start including some kind of rich text editing toolbar for the title (bold, italics, etc) - which I think sounds ugly AND could lead people to editing the title when it's not really necessary or helpful (I'm sure our users would just start bolding all their post titles if they saw a button to do it, even though the theme handles that). If you want to 'ensure editor matches front of site', a fancier (but obviously more complicated) approach could be something like: If HTML is detected in the Post Title, also include an additional bit of information that shows how it would look rendered.

In conclusion, I think the best approach is to get back to a place where HTML tags are rendered in raw form in the editor.

Thank you so much for your time and attention here!

@ellatrix
Member

Yes, HTML is allowed in both post title and site titles afaik.

@madfcat

madfcat commented Sep 15, 2023

6.3.1
The title does not show HTML tags. I am using tag. It's not shown in the Visual Editor. It's not shown in the Code Editor 🤦‍♀️
I can still see the tag from the admin Posts page.

@getdave
Contributor

getdave commented Sep 19, 2023

I've looked into this further.

Any HTML tags in the Post Title are persisted to the database. Therefore it's possible that HTML is rendered on the front of the site - this will be either raw or rendered depending on if the Theme properly escapes the output.
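
To make the "raw or rendered" distinction concrete, here is a minimal sketch of the two outcomes. It is written in JavaScript for illustration; a real theme would do this in PHP, and `escapeHtml` is a hand-written helper assumed for this example, not a WordPress or Gutenberg function.

```javascript
// Illustrative only: models escaped vs. unescaped title output in a template.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;') // must run first so later entities are not double-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

const storedTitle = 'This is my <strong>Bold</strong> Post';

// Unescaped output: the browser interprets the tags, so "Bold" renders bold.
const renderedMarkup = `<h1>${storedTitle}</h1>`;

// Escaped output: the browser shows the tags as literal text.
const rawMarkup = `<h1>${escapeHtml(storedTitle)}</h1>`;

console.log(renderedMarkup);
console.log(rawMarkup);
```

Either way the stored value is the same; only the template's handling decides what a visitor sees.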

As a result we must show any stored HTML to the user in some form within the editor in order that they can choose to remove it if they wish (note we will not enforce this).

As discussed above we could show raw HTML tags but this would run counter to other parts of the editor which do not render raw HTML but rather simply render the output. Perhaps we could do the same?

To do this we could allow the formats to render in the editor, but without any formatting controls available.

This means:

  • you cannot add formatting unless you manually add HTML to the title outside of the editor.
  • you can perceive when there is HTML in the title (this includes other HTML such as anchor tags).
  • you can delete the title to strip the HTML and simply rewrite it again

We should be able to do this by switching the following line to false:

__unstableDisableFormats: true,

Here's how that might look:

Screen Shot 2023-09-19 at 10 12 35

In addition we should retain the change from #35825 which stripped HTML on paste into the visual editor. This will help users to avoid inadvertently creating titles with HTML when copy/pasting from external documents.
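
As a rough illustration of paste-time stripping, the sketch below reads only the plain-text flavor of the clipboard. It is not the code from #35825: `plainTextFromPaste` is a hypothetical helper, and the fake event object exists only so the example runs outside a browser. `clipboardData.getData('text/plain')` is the standard browser API it assumes.

```javascript
// Hypothetical paste handler sketch; not the code from #35825.
// `event` is expected to look like a browser ClipboardEvent.
function plainTextFromPaste(event) {
  // Ask for the text/plain flavor so markup from the source document is ignored.
  const text = event.clipboardData.getData('text/plain');
  // Titles are single-line: collapse line breaks and repeated whitespace.
  return text.replace(/\s+/g, ' ').trim();
}

// Minimal stand-in for a ClipboardEvent so the sketch runs outside a browser.
const fakePasteEvent = {
  clipboardData: {
    getData(type) {
      const flavors = {
        'text/html': '<b>Quarterly</b>\n<i>report</i>',
        'text/plain': 'Quarterly\nreport',
      };
      return flavors[type] || '';
    },
  },
};

console.log(plainTextFromPaste(fakePasteEvent)); // Quarterly report
```

This only affects what enters the title through a paste. A title already stored with HTML is untouched by it.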

I'd very much appreciate @ellatrix's thoughts on this approach.

@MadtownLems

This means: you cannot add formatting unless you manually add HTML to the title outside of the editor.

To confirm, you mean the editor would have NO way to add HTML (including basic formatting) to the title? If so, this would be a huge problem for us (and I imagine many others). As mentioned, our content providers need to be able to do things like italicize scientific plant names in Titles. I feel we need to have SOME way to do this via the editor.

@getdave
Contributor

getdave commented Sep 21, 2023

To confirm, you mean the editor would have NO way to add HTML (including basic formatting) to the title?

Understood. Just noting however, that this is currently how the editor behaves (i.e. you cannot add formats to the titles).

I believe the immediate objective should be to provide a means for users to perceive that the title contains HTML. In the future we can consider allowing adding formatting if that is a consensus view.

@getdave
Contributor

getdave commented Sep 21, 2023

Pinging @scruffian for his opinion also.

@MadtownLems

Just noting however, that this is currently how the editor behaves (i.e. you cannot add formats to the titles).

I apologize for belaboring this point, but I feel it's still unclear and important. When you say things like "you cannot add formatting unless you manually add HTML to the title outside of the editor" and "this is currently how the editor behaves (i.e. you cannot add formats to the titles)", are you saying that:

a) it is not possible to use the block editor to add html to the title. you must find another way such as editing the database content or using some third party tool

or

b) the editor provides no toolbars or keyboard shortcuts for formatting, but manually typing in HTML does allow formatting

Thanks for helping clarify, and again for all your work here 🙏

@getdave
Contributor

getdave commented Sep 21, 2023

No problem at all. Clarity is important.

What I'm saying is I wish to take an iterative approach to resolving this problem. The first step is to restore things to how they were prior to #35825.

From there I propose that we make it possible to perceive that any existing HTML is present in the title. This can be done by either:

  • rendering the HTML.
  • displaying the HTML in plaintext (raw).

Once done we can then decide whether it is appropriate for users to be able to add HTML to the post title. If so then we can decide how that is best to be achieved.

I appreciate that this may seem drawn out and laborious but given the previous misstep I'm keen to move forward with caution.

Away from GitHub @ellatrix has proposed using the new plaintext v2 to render the post title in plaintext form which I believe would display the HTML in raw form.

I think the best next step is to raise a few draft PRs showcasing the various approaches and then assess them on their merits.

@getdave
Contributor

getdave commented Sep 22, 2023

Here is a prototype for consideration. It will:

  • render text (including HTML) when in "Visual Mode"
  • render raw text (including raw HTML tags) when in "Code Mode".
Screen.Capture.on.2023-09-22.at.09-53-17.1.mp4

For me this solves two issues:

  1. The ability of authors to perceive that HTML is present in a title.
  2. The ability for authors to remove/edit the HTML in the title.
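
The idea behind the prototype can be sketched as a single stored string that is interpreted differently per mode. This is an illustrative model only, not the prototype's code: `titleFieldFor` and the `'visual'` / `'code'` mode names are assumptions made for the example.

```javascript
// Hypothetical sketch of the two display modes; names are illustrative.
function titleFieldFor(mode, storedTitle) {
  if (mode === 'visual') {
    // Visual mode: hand the stored markup to the renderer as HTML.
    return { kind: 'html', value: storedTitle };
  }
  if (mode === 'code') {
    // Code mode: show the same string as plain text, tags and all.
    return { kind: 'text', value: storedTitle };
  }
  throw new Error(`Unknown editor mode: ${mode}`);
}

const title = 'This is my post with <em>Emphasis</em> in the title';

console.log(titleFieldFor('visual', title));
console.log(titleFieldFor('code', title));
```

In a browser, the `html` case would correspond to rendering the value as markup and the `text` case to placing it in a plain text field. In both cases the stored string is passed through unchanged, so nothing is lost on load.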

@MadtownLems

I'm assuming that, in the prototype, if you type in HTML tags in Text mode, it will escape them and thus be presented on the front end as text containing tags instead of the rendered HTML, right?

I was really hoping for a solution that didn't require Code Editor to enter HTML in the title, because we have Code Editor disabled for anyone but Network Admins in our environment. That being said, I fully understand that the final implementation might not be perfect for every single use case; just sharing ours.

@getdave
Contributor

getdave commented Sep 25, 2023

More information about HTML in Post Titles:

From what I can see in the Classic Editor you can add raw HTML tags to post titles. Obviously the block editor is different, but I think we can draw a rough parallel between Classic Editor's post title field and the Code View of the block editor.

So I believe my PR is going in the right direction.

I'm assuming that, in the prototype, if you type in HTML tags in Text mode, it will escape them and thus be presented on the front end as text containing tags instead of the rendered HTML, right?

I understand that by "Text mode" you may mean "Code view" mode? If so then anything entered into the Post Title field would not be escaped in the editor. This means you would be able to add HTML to post titles and have that rendered on the front of the site (i.e. not in raw form).

Note that when it comes to saving and persisting the post, the same rules should apply as documented above about only privileged users being able to add <script> tags...etc. This would be handled by WP Core via the REST API which is how posts are saved in the block editor.

I was really hoping for a solution that didn't require Code Editor to enter HTML in the title

I understand. However, I think for now the best option is to not allow formatting HTML in Visual Mode as that would be a big change.

That doesn't mean it can never happen, but it won't land for the 6.4 release and we need a bug fix asap.

I think we should consider carefully whether providing a UI to add HTML to post titles is a good idea. The best way to do that would be outside of a major release cycle, thereby affording sufficient time for the change to be tested in the Gutenberg Plugin before being released to Core.

You should also be able to test the PR using this link.

@MadtownLems

You should also be able to test the PR using this link.

Everything is working here, as I'd expect it to based on the decisions made.
