doc: all.html is seriously broken link-wise #20100

Closed
vsemozhetbyt opened this issue Apr 17, 2018 · 3 comments
Labels: doc, tools

Comments

@vsemozhetbyt
Contributor

vsemozhetbyt commented Apr 17, 2018

Cause of the state

all.html is assembled from all.md by first gluing all the .md sources into one blob as they are. This is done by the chain tools\doc\generate.js -> tools\doc\preprocess.js -> tools\doc\html.js.

This means that all the bottom references are merged as they are, with no collisions resolved.
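
To make the effect of gluing concrete, here is a minimal sketch that collects bottom references from several sources and reports labels defined with more than one target. It is not the actual tools/doc code: `collectDefinitions`, `findCollisions`, and the tiny sample sources are hypothetical, and the regular expression only covers simple one-line `[label]: target` definitions.

```javascript
// Hypothetical helpers, not part of tools/doc.
// Matches simple one-line definitions of the form `[label]: target`.
const DEFINITION = /^\[([^\]]+)\]:[ \t]*(\S+)[ \t]*$/gm;

function collectDefinitions(sources) {
  // label (lowercased, since Markdown reference labels are case-insensitive)
  // -> list of { file, target }
  const byLabel = new Map();
  for (const [file, text] of Object.entries(sources)) {
    for (const match of text.matchAll(DEFINITION)) {
      const label = match[1].toLowerCase();
      if (!byLabel.has(label)) byLabel.set(label, []);
      byLabel.get(label).push({ file, target: match[2] });
    }
  }
  return byLabel;
}

function findCollisions(sources) {
  // A collision here means: same label, at least two different targets.
  const collisions = [];
  for (const [label, defs] of collectDefinitions(sources)) {
    const targets = new Set(defs.map((d) => d.target));
    if (targets.size > 1) collisions.push({ label, defs });
  }
  return collisions;
}

// Sample sources (shortened, for illustration only).
const sources = {
  'crypto.md': 'See [Caveats].\n\n[Caveats]: #crypto_support_for_weak_or_compromised_algorithms\n',
  'fs.md': 'See [Caveats].\n\n[Caveats]: #fs_caveats\n[debugger]: debugger.html\n',
  'repl.md': '[debugger]: debugger.html\n',
};

const collisions = findCollisions(sources);
console.log(collisions.map((c) => c.label)); // [ 'caveats' ]
```

Identical duplicates such as `[debugger]` are not reported, because they resolve to the same target whichever definition is used.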

Types of collisions

Currently, we have many homonymous bottom references (same label, defined in several files) in the .md files. They can be grouped into 3 types by the severity of the collision impact:

  1. Completely identical bottom references that pose no danger, for example:
[Android building]: https://github.com/nodejs/node/blob/master/building.md#androidandroid-based-devices-eg-firefox-os
[Android building]: https://github.com/nodejs/node/blob/master/building.md#androidandroid-based-devices-eg-firefox-os

[Common System Errors]: errors.html#errors_common_system_errors
[Common System Errors]: errors.html#errors_common_system_errors

[debugger]: debugger.html
[debugger]: debugger.html

[MSDN-Rel-Path]: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247.aspx#fully_qualified_vs._relative_paths
[MSDN-Rel-Path]: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247.aspx#fully_qualified_vs._relative_paths

[Native Abstractions for Node.js]: https://github.com/nodejs/nan
[Native Abstractions for Node.js]: https://github.com/nodejs/nan

...Many of them...
  2. Bottom references with negligible differences (they produce the same result via a redirect, refer to the same section either inside all.html or in a separate [module].html doc, or refer either to the doc itself (above the doc TOC) or to its top heading (below the doc TOC)):
[Chrome Debugging Protocol]: https://chromedevtools.github.io/debugger-protocol-viewer
[Chrome Debugging Protocol]: https://chromedevtools.github.io/debugger-protocol-viewer/

[`'finish'`]: #stream_event_finish
[`'finish'`]: stream.html#stream_event_finish

[`'uncaughtException'`]: #process_event_uncaughtexception
[`'uncaughtException'`]: process.html#process_event_uncaughtexception
[`'uncaughtException'`]: process.html#process_event_uncaughtexception

[`__dirname`]: #modules_dirname
[`__dirname`]: modules.html#modules_dirname

[`__filename`]: #modules_filename
[`__filename`]: modules.html#modules_filename

[REPL]: repl.html
[REPL]: repl.html#repl_repl

[stream]: stream.html
[stream]: stream.html
[stream]: stream.html
[Stream]: stream.html#stream_stream

[TTY]: tty.html
[TTY]: tty.html#tty_tty

...Many of them...
  3. Bottom references that point to different places, either close enough (doc top vs. doc main class) or completely different (different docs or different external pages). These are all I could collect so far, almost 30 collisions:
[Caveats]: #crypto_support_for_weak_or_compromised_algorithms
[Caveats]: #fs_caveats

[Duplex]: #stream_class_stream_duplex
[Duplex]: stream.html#stream_duplex_and_transform_streams

[ICU]: http://icu-project.org/
[ICU]: intl.html#intl_internationalization_support
[ICU]: intl.html#intl_options_for_building_node_js

[Punycode]: https://tools.ietf.org/html/rfc3492
[Punycode]: https://tools.ietf.org/html/rfc5891#section-4.4

[Readable]: #stream_class_stream_readable
[Readable]: stream.html#stream_readable_streams
[Readable]: stream.html#stream_readable_streams

[Writable Stream]: stream.html#stream_class_stream_writable
[Writable Stream]: stream.html#stream_class_stream_writable
[Writable Stream]: stream.html#stream_writable_streams

[Writable]: #stream_class_stream_writable
[Writable]: stream.html#stream_writable_streams
[Writable]: stream.html#stream_writable_streams

[`'checkContinue'`]: #http2_event_checkcontinue
[`'checkContinue'`]: #http_event_checkcontinue

[`'close'`]: #dgram_event_close
[`'close'`]: #net_event_close

[`'data'`]: #net_event_data
[`'data'`]: #stream_event_data

[`'drain'`]: #net_event_drain
[`'drain'`]: #stream_event_drain

[`'end'`]: #net_event_end
[`'end'`]: #stream_event_end

[`'error'`]: #child_process_event_error
[`'error'`]: #net_event_error_1

[`'exit'`]: #child_process_event_exit
[`'exit'`]: #process_event_exit
[`'exit'`]: process.html#process_event_exit

[`'message'`]: child_process.html#child_process_event_message
[`'message'`]: process.html#process_event_message

[`'request'`]: #http2_event_request
[`'request'`]: #http_event_request

[`Agent`]: #https_class_https_agent
[`Agent`]: #http_class_http_agent

[`Buffer`]: buffer.html
[`Buffer`]: buffer.html
[`Buffer`]: buffer.html#buffer_buffer
[`Buffer`]: buffer.html#buffer_class_buffer
[`Buffer`]: buffer.html#buffer_class_buffer

[`ChildProcess`]: #child_process_child_process
[`ChildProcess`]: child_process.html#child_process_class_childprocess

[`EventEmitter`]: events.html
[`EventEmitter`]: events.html
[`EventEmitter`]: events.html#events_class_eventemitter
[`EventEmitter`]: events.html#events_class_eventemitter
[`EventEmitter`]: events.html#events_class_eventemitter
[`EventEmitter`]: events.html#events_class_eventemitter
[`EventEmitter`]: events.html#events_class_eventemitter
[`EventEmitter`]: events.html#events_class_eventemitter
[`EventEmitter`]: events.html#events_class_eventemitter
[`EventEmitter`]: events.html#events_class_eventemitter
[`EventEmitter`]: events.html#events_class_eventemitter

[`require()`]: globals.html#globals_require
[`require()`]: modules.html#modules_require

[`response.end()`]: #http2_response_end_data_encoding_callback
[`response.end()`]: #http_response_end_data_encoding_callback

[`response.setHeader()`]: #http2_response_setheader_name_value
[`response.setHeader()`]: #http_response_setheader_name_value

[`response.socket`]: #http2_response_socket
[`response.socket`]: #http_response_socket

[`response.write()`]: #http2_response_write_chunk_encoding_callback
[`response.write()`]: #http_response_write_chunk_encoding_callback

[`response.writeContinue()`]: #http2_response_writecontinue
[`response.writeContinue()`]: #http_response_writecontinue

[`response.writeHead()`]: #http2_response_writehead_statuscode_statusmessage_headers
[`response.writeHead()`]: #http_response_writehead_statuscode_statusmessage_headers

[`server.close()`]: #net_server_close_callback
[`server.close()`]: net.html#net_event_close
[`server.close()`]: net.html#net_server_close_callback

[`URL`]: url.html#url_class_url
[`URL`]: url.html#url_class_url
[`URL`]: url.html#url_the_whatwg_url_api
[`URL`]: url.html#url_the_whatwg_url_api
[`URL`]: url.html#url_the_whatwg_url_api

What are the results of the collisions?

  1. The first type of collision is completely safe.

  2. The second type is safe data-wise but may be confusing or carry a performance penalty: all.html is huge, and links that point now to an internal section, now to another document may baffle readers or cause repeated reloading of the big page (even when cached, it puts a significant load on the browser during reparsing). It seems there is nothing simple we can do to resolve this.

  3. The third type of collision is severe and may cause many misunderstandings. The reference from the last included doc wins and overrides all previous links with the same label. This is easy to check.
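
The "last included doc wins" behaviour can be modelled with a naive resolver that keeps one table of definitions for the whole merged blob. This is only a sketch of the behaviour reported in this issue, not the renderer's actual code; `resolveLastWins` and the sample text are hypothetical. (For comparison, the CommonMark specification gives precedence to the first matching definition, so other renderers may behave differently.)

```javascript
// Hypothetical model of the reported behaviour, not the real renderer.
const DEFINITION_LINE = /^\[([^\]]+)\]:[ \t]*(\S+)[ \t]*$/gm;

function resolveLastWins(mergedMarkdown) {
  const table = new Map();
  for (const [, label, target] of mergedMarkdown.matchAll(DEFINITION_LINE)) {
    // A later definition silently overwrites an earlier one with the same label.
    table.set(label.toLowerCase(), target);
  }
  return table;
}

// Two docs glued together as they are (shortened, for illustration only).
const merged = [
  '# HTTP',
  'Call [`response.end()`] to finish.',
  '[`response.end()`]: #http_response_end_data_encoding_callback',
  '# HTTP/2',
  'Call [`response.end()`] to finish.',
  '[`response.end()`]: #http2_response_end_data_encoding_callback',
].join('\n');

const table = resolveLastWins(merged);
// Every use of the label, including the one in the HTTP section,
// now resolves to the HTTP/2 anchor.
console.log(table.get('`response.end()`')); // #http2_response_end_data_encoding_callback
```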

What can we do?

  • Manually disambiguate at least all the links of the third type. This may produce verbose and cumbersome link texts and does not prevent future collisions, but it is a quick workaround.
  • Add a doc linting rule to prevent inter-doc link collisions.
  • Make the doc tools pre-transform links inside docs before merging. Our internal URL hash system is safe enough for that.
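
As a rough illustration of the pre-transform option, the sketch below resolves each doc's reference-style links against that doc's own definitions before the docs are glued together, so no labels are left to collide. It is an assumption about how such a step could look, not the existing tooling: `inlineReferences` is a hypothetical name, and the regular expressions ignore code spans, code blocks, and multi-line definitions. It also leaves the second type of collision (`#hash` vs. `file.html#hash`) untouched.

```javascript
// Hypothetical pre-transform, not the actual tools/doc code.
function inlineReferences(markdown) {
  const definition = /^\[([^\]]+)\]:[ \t]*(\S+)[ \t]*$/;
  const lines = markdown.split('\n');

  // Pass 1: collect this doc's own definitions (labels are case-insensitive).
  const targets = new Map();
  for (const line of lines) {
    const match = definition.exec(line);
    if (match) targets.set(match[1].toLowerCase(), match[2]);
  }

  // Pass 2: drop the definition lines and rewrite [text][label], [label][]
  // and [label] into inline links. Existing inline links and labels without
  // a definition are left unchanged.
  const usage = /\[([^\]]+)\](?:\[([^\]]*)\])?(?!\()/g;
  return lines
    .filter((line) => !definition.test(line))
    .map((line) => line.replace(usage, (whole, text, label) => {
      const target = targets.get((label || text).toLowerCase());
      return target === undefined ? whole : `[${text}](${target})`;
    }))
    .join('\n');
}

// Sample docs (shortened, for illustration only).
const docs = {
  'http.md': 'Call [`response.end()`].\n\n[`response.end()`]: #http_response_end_data_encoding_callback',
  'http2.md': 'Call [`response.end()`].\n\n[`response.end()`]: #http2_response_end_data_encoding_callback',
};

// Transform each doc on its own, then merge.
const mergedDocs = Object.values(docs).map(inlineReferences).join('\n');
console.log(mergedDocs);
```

After this step each `[`response.end()`]` link carries the target from its own doc, and the merged text contains no bottom references at all.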

I can do the first task if it is wanted, but the others are currently beyond my knowledge of the doc tooling.

Maybe there are some other ways to fix this.

vsemozhetbyt added the doc and tools labels Apr 17, 2018
@vsemozhetbyt
Contributor Author

cc @nodejs/documentation

@BridgeAR
Member

> Make the doc tools pre-transform links inside docs before merging. Our internal URL hash system is safe enough for that.

I think this is the only viable way to handle this. Should it not already be enough to just fully generate each doc and then merge the content?

@vsemozhetbyt
Contributor Author

I will try to investigate this later if nobody gets to it sooner.

rubys added a commit to rubys/node that referenced this issue Jun 26, 2018
Combine the toc and api contents from the generated doc/api/*.html
files.  This ensures that the single page version of the documentation
exactly matches the individual pages.

Fixes nodejs#20100
rubys added a commit to rubys/node that referenced this issue Jun 30, 2018
Combine the toc and api contents from the generated doc/api/*.html
files. This ensures that the single page version of the documentation
exactly matches the individual pages.

Fixes nodejs#20100
targos pushed a commit that referenced this issue Jul 3, 2018
Combine the toc and api contents from the generated doc/api/*.html
files. This ensures that the single page version of the documentation
exactly matches the individual pages.

PR-URL: #21568
Fixes: #20100
Reviewed-By: James M Snell
Reviewed-By: Vse Mozhet Byt
rubys added a commit to rubys/node that referenced this issue Jul 7, 2018
Notes:

1) Removed a number of root properties that did not seem relevant: source,
   desc, and introduced_in.  There no longer is a source, and the other two are
   from the first include and do not reflect the entire API.

2) As with nodejs#20100, the current "desc"
   properties sometimes contained in-page links, other times referenced another
   page, and often did not match the links in the original HTML or JSON file.
   I chose to standardize on external links as "desc" values are isolated
   snippets as opposed to all.html which can be viewed as a standalone and self
   contained document.

3) Eliminated preprocessing for @include entirely, including the test case
   for this function.

4) _toc.md was renamed to index.md.

5) index comments no longer appear in embedded TOCs (left hand side column in
   the generated documentation).
vsemozhetbyt pushed a commit that referenced this issue Jul 9, 2018
Notes:

1) Removed a number of root properties that did not seem relevant:
   source, desc, and introduced_in.  There no longer is a source, and
   the other two are from the first include and do not reflect the
   entire API.

2) As with #20100, the current
   "desc" properties sometimes contained in-page links, other times
   referenced another page, and often did not match the links in the
   original HTML or JSON file. I chose to standardize on external links
   as "desc" values are isolated snippets as opposed to all.html which
   can be viewed as a standalone and self contained document.

3) Eliminated preprocessing for @include entirely, including the test
   case for this function.

4) _toc.md was renamed to index.md.

5) index comments no longer appear in embedded TOCs (left hand side
   column in the generated documentation).

PR-URL: #21637
Reviewed-By: Vse Mozhet Byt
Reviewed-By: Rich Trott
targos pushed a commit that referenced this issue Jul 12, 2018
Notes:

1) Removed a number of root properties that did not seem relevant:
   source, desc, and introduced_in.  There no longer is a source, and
   the other two are from the first include and do not reflect the
   entire API.

2) As with #20100, the current
   "desc" properties sometimes contained in-page links, other times
   referenced another page, and often did not match the links in the
   original HTML or JSON file. I chose to standardize on external links
   as "desc" values are isolated snippets as opposed to all.html which
   can be viewed as a standalone and self contained document.

3) Eliminated preprocessing for @include entirely, including the test
   case for this function.

4) _toc.md was renamed to index.md.

5) index comments no longer appear in embedded TOCs (left hand side
   column in the generated documentation).

PR-URL: #21637
Reviewed-By: Vse Mozhet Byt
Reviewed-By: Rich Trott
2 participants