Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools: flatten apidoc headers #21936

Closed
wants to merge 1 commit into from

Conversation

rubys
Copy link
Member

@rubys rubys commented Jul 22, 2018

ensure optional parameters are not treated as markedown links by
replacing the children of headers nodes with a single text node
containing the raw markup;

Fixes issue identified in #21490 (comment)

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines

ensure optional parameters are not treated as markedown links by
replacing the children of headers nodes with a single text node
containing the raw markup;

Fixes issue identified in nodejs#21490 (comment)
@nodejs-github-bot nodejs-github-bot added doc Issues and PRs related to the documentations. tools Issues and PRs related to the tools directory. labels Jul 22, 2018
@vsemozhetbyt
Copy link
Contributor

@vsemozhetbyt
Copy link
Contributor

Node.js Collaborators, please, add 👍 here if you approve fast-tracking.

@vsemozhetbyt
Copy link
Contributor

Thank you. We already had issues of the similar type with the old toolchain:
#19913
#20087
So it is good to be on the safe side with these cases.

Copy link
Member

@Trott Trott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Aside: Does this workaround a bug in our markdown and ultimately should be fixed in the markdown itself? Or is it more ambiguous than that?

@rubys
Copy link
Member Author

rubys commented Jul 23, 2018

@Trott a short answer is that a more complete solution might be to write a remark processor to explicitly convert 'broken' links everywhere to text.

Longer answer, markdown is more of a meme than a standard.

Consider the following input:

Following is a reference style link:

[a][b]

[b]: foo

Markdown processors tend to ignore broken links:

[options]

Not all markdown processors ignore broken reference style links:

https.get(url[, options][, callback])

Here's the output that $ remark --use remark-html test.md produces:

<p>Following is a reference style link:</p>
<p><a href="foo">a</a></p>
<p>Markdown processors tend to ignore broken links:</p>
<p>[options]</p>
<p>Not all markdown processors ignore broken reference style links:</p>
<p>https.get(url<a href="">, options</a>)</p>

For the moment, I observed that none of the headers in any of the API documents are intended to use embedded markdown, so I opted to replace any headers that appear to contain potential markdown with text nodes.

As to what the right answer is: ¯\_(ツ)_/¯. Writing a processor that converts unknown references with their original source as text might work a greater percentage of the time, but is an accident waiting to happen. All it takes is for somebody to define [options]: https://... and boom. Perhaps a better answer it to decide as a project that references strictly match /^\w+$/. That won't catch the options case, but most optional parameter definitions contain a comma.

@rubys
Copy link
Member Author

rubys commented Jul 24, 2018

@Trott follow-up: I found an explicit test in the unified/remark/rehype codebase that ensures that there is no fallback for unresolved link references: https://github.com/syntax-tree/mdast-util-to-hast/blob/f2e3ec5fc6476c3628a85cd07456c14d1e887c61/test/link-reference.js#L22 . That doesn't mean that it isn't wrong, but does mean that the behavior is intentional.

@rubys
Copy link
Member Author

rubys commented Jul 24, 2018

Issue opened: syntax-tree/mdast-util-to-hast#20

vsemozhetbyt pushed a commit that referenced this pull request Jul 25, 2018
ensure optional parameters are not treated as markedown links by
replacing the children of headers nodes with a single text node
containing the raw markup;

Fixes issue identified in
#21490 (comment)

PR-URL: #21936
Reviewed-By: Vse Mozhet Byt <[email protected]>
Reviewed-By: Minwoo Jung <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
@vsemozhetbyt
Copy link
Contributor

Landed in 83474ae
Thank you!

targos pushed a commit that referenced this pull request Jul 26, 2018
ensure optional parameters are not treated as markedown links by
replacing the children of headers nodes with a single text node
containing the raw markup;

Fixes issue identified in
#21490 (comment)

PR-URL: #21936
Reviewed-By: Vse Mozhet Byt <[email protected]>
Reviewed-By: Minwoo Jung <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
@targos targos mentioned this pull request Jul 31, 2018
@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Aug 1, 2018

@rubys It seems we have a regression with this change.

Previously we had some escapes in headings to prevent underscores to be treated as markdown emphases or computed properties to be treated as markdown links. Now, these escapes are rendered:

https://nodejs.org/download/nightly/v11.0.0-nightly20180731be322bd9ad/docs/api/globals.html#globals_dirname

https://nodejs.org/download/nightly/v11.0.0-nightly20180731be322bd9ad/docs/api/url.html#url_urlsearchparams_symbol_iterator

I have tried to fix this:

  1. Grep .md files for the RegExp ^#.*\\ and remove the escapes.

  2. Replace this code:

node/tools/doc/json.js

Lines 445 to 448 in 02badc4

// To include constructs like `readable\[Symbol.asyncIterator\]()`
// or `readable.\_read(size)` (with Markdown escapes).
const simpleId = r`(?:(?:\\?_)+|\b)\w+\b`;
const computedId = r`\\?\[[\w\.]+\\?\]`;

with this one:

// To include constructs like `readable[Symbol.asyncIterator]()`
// or `readable._read(size)`.
const simpleId = r`(?:_+|\b)\w+\b`;
const computedId = r`\[[\w\.]+\]`;

In HTML files, all seems fixed. In JSON files, the removed underscore escapes also seem to not produce any troubles, but these cleaned computed properties are wrong in the new JSON files:

https://nodejs.org/download/nightly/v11.0.0-nightly20180731be322bd9ad/docs/api/url.html#url_urlsearchparams_symbol_iterator

https://nodejs.org/download/nightly/v11.0.0-nightly20180731be322bd9ad/docs/api/stream.html#stream_readable_symbol_asynciterator

Could you look into this issue? Maybe we should do something like this PR for JSON heading processing as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Issues and PRs related to the documentations. tools Issues and PRs related to the tools directory.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants