Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow html without \n after #1438

Merged
merged 4 commits into from
Mar 25, 2019
Merged

allow html without \n after #1438

merged 4 commits into from
Mar 25, 2019

Conversation

UziTech
Copy link
Member

@UziTech UziTech commented Mar 7, 2019

Marked version: master

Markdown flavor: Markdown.pl|CommonMark|GitHub Flavored Markdown|n/a

Description

Fix html without a new line after

marked('<img src="sdfg">\n');
// expected: '<img src="sdfg">'
// actual:   '<img src="sdfg">'

marked('<img src="sdfg">');
// expected: '<img src="sdfg">'
// actual:   '<p><img src="sdfg"></p>'

Contributor

  • Test(s) exist to ensure functionality and minimize regression

Committer

In most cases, this should be a different person than the contributor.

  • Draft GitHub release notes have been updated.
  • CI is green (no forced merge required).
  • Merge PR

@styfle
Copy link
Member

styfle commented Mar 8, 2019

There are some merge conflicts

@@ -27,8 +27,8 @@ var block = {
+ '|<![A-Z][\\s\\S]*?>\\n*' // (4)
+ '|<!\\[CDATA\\[[\\s\\S]*?\\]\\]>\\n*' // (5)
+ '|</?(tag)(?: +|\\n|/?>)[\\s\\S]*?(?:\\n{2,}|$)' // (6)
+ '|<(?!script|pre|style)([a-z][\\w-]*)(?:attribute)*? */?>(?=\\h*\\n)[\\s\\S]*?(?:\\n{2,}|$)' // (7) open tag
+ '|</(?!script|pre|style)[a-z][\\w-]*\\s*>(?=\\h*\\n)[\\s\\S]*?(?:\\n{2,}|$)' // (7) closing tag
Copy link
Member Author

@UziTech UziTech Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Feder1co5oave I'm not sure what the \\h* is supposed to do. As far as I know \H isn't a special character in regex.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\h has meaning in some languages.

Language Meaning of \h
Ruby Hex chars, e.g. [0-9a-fA-F]
Java, PHP, Perl Horizontal whitespace, cf. \v

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...so I imagine this regex originated in a language (or in someone's head) where \h had meaning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...I've written a research paper on this topic if @UziTech @styfle @Feder1co5oave would be interested.

Copy link
Contributor

@davisjam davisjam Mar 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Is there a test case that includes horizontal whitespace where the \h was?
  • It's possible the original regex should have used something like [ \t] instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was intended to be [ \t]* I guess, I just don't remember why I used \h instead, and why it required a newline after the tag :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added [ \t]* in place of \h but with some manual testing it doesn't seem to change anything. All of the tests I can think of still pass with or without [ \t]*

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could be due to the fact that we strip white space before we check the output.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still strip whitespace when testing? Wasn't html-differ supposed to fix this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were only using html-differ for the spec tests. The files were still using the manual method.

I fixed it in #1449

@UziTech UziTech requested review from davisjam and joshbruce March 9, 2019 17:37
@UziTech UziTech mentioned this pull request Mar 9, 2019
12 tasks
@UziTech
Copy link
Member Author

UziTech commented Mar 13, 2019

@davisjam @joshbruce This should be ready to merge. Once this is merged we can release v0.6.2

@UziTech
Copy link
Member Author

UziTech commented Mar 21, 2019

@davisjam @joshbruce We need one of you to approve this in order to merge it and release a new version of marked.

@joshbruce
Copy link
Member

Sorry. I'm gonna be a little slow on things for at least the next few days (family-related keeping me offline for long stretches). Maybe we should consider an alternative to keep from deadlock. Though not sure it will happen often.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants