Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use correct options in specs #1511

Merged
merged 13 commits into from
Jul 2, 2019
Merged

use correct options in specs #1511

merged 13 commits into from
Jul 2, 2019

Conversation

UziTech
Copy link
Member

@UziTech UziTech commented Jul 1, 2019

Marked version: master

Description

I am trying to implement the changes from @Feder1co5oave in #1407

  • CommonMark tests are running with gfm: false
  • GFM tests include CommonMark tests but are using gfm: true
  • Original tests are running with pedantic: true
  • BREAKING CHANGE Move fences to CommonMark
  • BREAKING CHANGE Move tables to GFM
  • Improve heading, lheading, and paragraph

These changes improve CommonMark compliance significantly:

--------------------------------------------------------
|                        CommonMark                    |
|                                         before  after|
| Tabs                                       91%   91% |
| Precedence                                100%  100% |
| Thematic breaks                           100%  100% |
| ATX headings                               83%   94% |
| Setext headings                            81%   89% |
| Indented code blocks                      100%  100% |
| Fenced code blocks                         97%   97% |
| HTML blocks                               100%  100% |
| Link reference definitions                 89%   89% |
| Paragraphs                                100%  100% |
| Blank lines                               100%  100% |
| Block quotes                               88%   92% |
| List items                                 73%   73% |
| Lists                                      50%   58% |
| Inlines                                   100%  100% |
| Backslash escapes                          85%   85% |
| Entity and numeric character references    76%   76% |
| Code spans                                 91%   91% |
| Emphasis and strong emphasis               64%   64% |
| Links                                      78%   78% |
| Images                                     68%   68% |
| Autolinks                                  79%  100% |
| Raw HTML                                   90%   90% |
| Hard line breaks                          100%  100% |
| Soft line breaks                          100%  100% |
| Textual content                           100%  100% |
--------------------------------------------------------

Issues

closes #1407
fixes #1510

Contributor

  • Test(s) exist to ensure functionality and minimize regression

Committer

In most cases, this should be a different person than the contributor.

  • Draft GitHub release notes have been updated.
  • CI is green (no forced merge required).
  • Merge PR

@UziTech UziTech requested review from styfle, davisjam and joshbruce July 1, 2019 14:01
@UziTech
Copy link
Member Author

UziTech commented Jul 1, 2019

I also agree with @Feder1co5oave that we should remove the tables option. I can't see why anyone would want GFM without tables.

lib/marked.js Outdated Show resolved Hide resolved
hr: /^ {0,3}((?:- *){3,}|(?:_ *){3,}|(?:\* *){3,})(?:\n+|$)/,
heading: /^ *(#{1,6}) *([^\n]+?) *(?:#+ *)?(?:\n+|$)/,
nptable: noop,
heading: /^ {0,3}(#{1,6}) +([^\n]*?)(?: +#+)? *(?:\n+|$)/,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK

table: noop,
lheading: /^([^\n]+)\n {0,3}(=|-){2,} *(?:\n+|$)/,
paragraph: /^([^\n]+(?:\n(?!hr|heading|lheading| {0,3}>|<\/?(?:tag)(?: +|\n|\/?>)|<(?:script|pre|style|!--))[^\n]+)*)/,
lheading: /^([^\n]+)\n {0,3}(=+|-+) *(?:\n+|$)/,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

lheading: /^([^\n]+)\n {0,3}(=+|-+) *(?:\n+|$)/,
// regex template, placeholders will be replaced according to different paragraph
// interruption rules of commonmark and the original markdown spec:
_paragraph: /^([^\n]+(?:\n(?!hr|heading|lheading|blockquote|fences|list|html)[^\n]+)*)/,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

fences: /^ {0,3}(`{3,}|~{3,})([^`\n]*)\n(?:|([\s\S]*?)\n)(?: {0,3}\1[~`]* *(?:\n+|$)|$)/,
paragraph: /^/,
heading: /^ *(#{1,6}) +([^\n]+?) *#* *(?:\n+|$)/
nptable: /^ *([^|\n ].*\|.*)\n *([-:]+ *\|[-| :]*)(?:\n((?:.*[^>\n ].*(?:\n|$))*)\n*|$)/,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super-linear.

  1. The construct .*\|.* can be exploited through a run of |||...|. Can we replace with [^|]*|[^|]* ?
  2. I believe the same problem applies later in the (?:.*[^>\n ].* section.

paragraph: /^/,
heading: /^ *(#{1,6}) +([^\n]+?) *#* *(?:\n+|$)/
nptable: /^ *([^|\n ].*\|.*)\n *([-:]+ *\|[-| :]*)(?:\n((?:.*[^>\n ].*(?:\n|$))*)\n*|$)/,
table: /^ *\|(.+)\n *\|?( *[-:]+[-| :]*)(?:\n((?: *[^>\n ].*(?:\n|$))*)\n*|$)/
Copy link
Contributor

@davisjam davisjam Jul 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same genre of problem as the nptable.

if (/^ *\|(.+)\n *\|?( *[-:]+[-| :]*)(?:\n((?: *[^>\n ].*(?:\n|$))*)\n*|$)/.exec(' | \n' + ' '.repeat(50000))) {
  console.log('match');
}

(Note, here and elsewhere I'm just checking regexes, not full exploitability).

Copy link
Contributor

@davisjam davisjam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To discuss: super-linear regexes.

Not sure if the regexes are new or just re-org'd (in which case we should note them but they shouldn't block this PR).

@UziTech
Copy link
Member Author

UziTech commented Jul 2, 2019

I updated the fences regex but nptable and table kept breaking tests when I was trying to fix them.

nptable and table were just moved from block.tables anyway.

The only changed regexes were heading, lheading, and paragraph

Copy link
Contributor

@davisjam davisjam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM then

@UziTech
Copy link
Member Author

UziTech commented Jul 2, 2019

I did remove the tables option since this PR would make it do nothing anyway.

@Lmillan123

This comment was marked as spam.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

md title parse error
4 participants