Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow pasting tables with list content inside the cells. #46775

Open
wants to merge 3 commits into
base: trunk
Choose a base branch
from

Conversation

mpkelly
Copy link
Contributor

@mpkelly mpkelly commented Dec 23, 2022

(recreated this PR from #46512 after polluting the commit log)

Fixes #45774

What?

It's based on requirement 2 from #45774. If you paste a table which includes ul or ol elements, these elements will be converted to simple content and formatted with whitespace to try and keep the original list structure. This avoids having to support complex nested content in tables.

Why?

Increase parity between Gutenberg and other editors.

How?

Allow the list schema to be embedded into td and th. Add a transform function for tables that can convert these tables into simple content that is already supported in table cells.

Testing Instructions

  1. Check out this branch and create a table in Google Docs that has ordered or unordered in its cells. See the screenshots below.
  2. Paste the table into Gutenberg to see the result
  3. Try adding content before/after the list.
  4. Try nesting lists to any depth.

Testing Instructions for Keyboard

Screenshots or screencast

Source table (Google docs)

image

Gutenberg editor after pasting the table above

image

How the td which contained the list looks in the DOM when previewing

image

Example of nested order lists

Google Docs
image

Gutenberg
image

… is converted to simple content which is formatted to look like a list using whitespace.
… is converted to simple content which is formatted to look like a list using whitespace.
@mpkelly
Copy link
Contributor Author

mpkelly commented Dec 23, 2022

Copying @danielbachhuber's comment here from the old PR.

Sweet! This seems to work pretty well to me from a product perspective. I'll defer to others on the codes.

For Example 4 (below), do we want to alternate between letters and numbers?

Example 1

Google Docs

image

Paste Into Gutenberg

image

Example 2

Google Docs

image

Gutenberg Paste

image

Example 3

Google Docs

image

Paste Into Gutenberg

image

Example 4

Google Docs

image

Paste Into Gutenberg

image

Example 5

Google Docs

image

Paste Into Gutenberg

image

Originally posted by @danielbachhuber in #46512 (review)

@mpkelly
Copy link
Contributor Author

mpkelly commented Dec 23, 2022

For Example 4 (below), do we want to alternate between letters and numbers?

@danielbachhuber, I added some logic to do this.

@danielbachhuber
Copy link
Member

FYI - I merged trunk because I had troubles with npm install

@danielbachhuber danielbachhuber self-requested a review January 3, 2023 14:18
Copy link
Member

@danielbachhuber danielbachhuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really cool, @mpkelly. Thanks for your continued work on it!

One last issue I noticed: when indentation decreases again, the switch between alpha and numeric is incorrect.

Google Doc

image

Paste into Gutenberg

image

Edit: it also seems like there's a space character inserted unexpectedly:

image

Maybe it would be helpful to further abstract the code so we could have unique unit tests against all of the various list HTML -> text transformations?

@danielbachhuber
Copy link
Member

Requested feedback on the overall approach: https://wordpress.slack.com/archives/C02QB2JS7/p1672921120174529

@mpkelly
Copy link
Contributor Author

mpkelly commented Jan 11, 2023

Thanks for the feedback request, @danielbachhuber. I also tried to get some here. I get the feeling not everyone is cool with this change. Maybe the way I explained it in the PR is making it scarier than it sounds.

I have fixed that glaring bug so the numeric bullets now work. I will remove the space and then look at the tests.

@danielbachhuber
Copy link
Member

I get the feeling not everyone is cool with this change.

@mpkelly Out of curiosity, what gives you that sense? I don't have a similar sense...

@mpkelly
Copy link
Contributor Author

mpkelly commented Jan 11, 2023

I get the feeling not everyone is cool with this change.

@mpkelly Out of curiosity, what gives you that sense? I don't have a similar sense...

There was an earlier PR (two actually), which proposed adding the same thing, so this change has been on the table for a good while now, but there hasn't been much interest outside of us two. Maybe I'm just not used to the pace of open-source projects.

@annezazu
Copy link
Contributor

@ellatrix might you have time to review this? You come to mind as a go-to expert for these sorts of PRs related to the writing experience, including raw handling.

const tableContentPasteSchema = ( { phrasingContentSchema } ) => ( {
tr: {
allowEmpty: true,
children: {
th: {
allowEmpty: true,
children: phrasingContentSchema,
children: getListContentSchema( { phrasingContentSchema } ),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I like this. Why not use '*' for children and let raw handling parse the cell contents as blocks, then transform that to the markdown syntax. OR instead of parsing as blocks, just convert the HTML to mark down?

{
type: 'raw',
schema: () => ( {
blockquote: {
children: '*',
},
} ),
selector: 'blockquote',
transform: ( node, handler ) => {
return createBlock(
'core/quote',
// Don't try to parse any `cite` out of this content.
// * There may be more than one cite.
// * There may be more attribution text than just the cite.
// * If the cite is nested in the quoted text, it's wrong to
// remove it.
{},
handler( {
HTML: node.innerHTML,
mode: 'BLOCKS',
} )
);
},
},

Suggested change
children: getListContentSchema( { phrasingContentSchema } ),
children: '*',

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, when it's parsed as block, it's harder to convert all blocks to something textual (like heading to plain text etc.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we build this into the paste handler and generally convert lists to plain text if the allowed children is phrasingContentSchema? This behaviour could be useful in other places. What if there's a list in a caption? Generalising this would be great.

So inside the paste handler, I think we need a filter before removeInvalidHTML that converts list that are not allowed in these places (not in schema) to the text version of the list.

Let me know if you need help here.

row.cells.forEach( ( cell ) => {
transformContent( cell );
} );
} );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not transform node before running getBlockAttributes? That way we don't need to parse the HTML of every cell.

Copy link
Member

@ellatrix ellatrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rephrasing my earlier comments:

The idea is good! We should just generalise it more inside of the paste handler (could be useful elsewhere like captions). This list format is nice in places where the list element is not part of the schema (and it should remain omitted from the schema in the table block).

@jordesign
Copy link
Contributor

Just checking in on #45774 - and wanted to see how progress was going on this PR?

Copy link
Member

@ellatrix ellatrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mpkelly are you planning to continue work on this PR? Otherwise I can create an alternative.

What I'd like to see: change filterInlineHTML inside the paste handler in the blocks package to convert the lists to a pseudo inline list so this behaviour is generalised and works for all blocks (not just tables, but also captions etc.)

@mpkelly
Copy link
Contributor Author

mpkelly commented Oct 10, 2023

@mpkelly are you planning to continue work on this PR? Otherwise I can create an alternative.

@ellatrix, I won't work on it until next week at the earliest, so go ahead and implement your recommendation instead if you have bandwidth.

@ellatrix
Copy link
Member

@mpkelly In that case, I'll wait for you to adjust it. :)

@mpkelly
Copy link
Contributor Author

mpkelly commented Oct 26, 2023

Thanks, @ellatrix. I am going to work on this today. I will follow your advice regarding filterInlineHTML.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Block] Table Affects the Table Block [Feature] Paste
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Raw Handling: unexpected results when pasting table with list from Google Docs
5 participants