Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Anchor links in generated markdown are not unique #300

Closed
danielparks opened this issue Sep 22, 2022 · 4 comments · Fixed by #303
Closed

Anchor links in generated markdown are not unique #300

danielparks opened this issue Sep 22, 2022 · 4 comments · Fixed by #303
Labels

Comments

@danielparks
Copy link
Contributor

Describe the Bug

The anchor names in markdown generated documentation for a module are not unique. This is a particular problem for attributes such as ensure. Each attribute with the same name is given the same anchor link name, which breaks linking.

For example, see the docs for my rustup defined type. The ensure attribute links to #ensure, but the browser (Safari, in my case) can’t distinguish between the ensure attribute for the rustup::global class and the rustup defined type.

You can also conflicting anchor names for types as well, since it just ignores characters like :.

Expected Behavior

All anchor names should be unique.

Steps to Reproduce

❯ cat >test.pp     
# @summary test::ab class
# @param ensure attribute on test::ab
class test::ab ( $ensure ) {
}

# @summary test::a::b class
# @param ensure attribute on test::a::b
class test::a::b ( $ensure ) {
}
^D
❯ puppet strings generate --format markdown test.pp >/dev/null
❯ fgrep name= REFERENCE.md                                
### <a name="testab"></a>`test::a::b`
##### <a name="ensure"></a>`ensure`
### <a name="testab"></a>`test::ab`
##### <a name="ensure"></a>`ensure`

Environment

  • Puppet 7.19.0 (installed from brew)
  • pdk 2.5.0.0(installed from brew)
  • macOS 12.6.0
@danielparks danielparks changed the title Anchor links in generated markdown are not uniqe Anchor links in generated markdown are not unique Sep 22, 2022
danielparks added a commit to danielparks/puppet-rustup that referenced this issue Sep 22, 2022
There is [a bug in puppet strings][] that makes anchors non-unique. This
particularly affects parameter names, because many of them are shared by
multiple resources.

This adds a script that rewrites REFERENCE.md to make the anchor names
to be more likely to be unique.

[a bug in puppet strings]: puppetlabs/puppet-strings#300
danielparks added a commit to danielparks/puppet-rustup that referenced this issue Sep 22, 2022
There is [a bug in puppet strings][] that makes anchors non-unique. This
particularly affects parameter names, because many of them are shared by
multiple resources.

This adds a script that rewrites REFERENCE.md to make the anchor names
to be more likely to be unique.

[a bug in puppet strings]: puppetlabs/puppet-strings#300
@binford2k
Copy link
Contributor

Hey buddy!

Interestingly, the anchors appears to be created at https://github.com/danielparks/puppet-rustup/blob/f405dac71b966d875b0d2e96984b0a2166032fcf/REFERENCE.md#ensure-1, but the link to it doesn't keep the index

@danielparks
Copy link
Contributor Author

Hey Ben! Been awhile!

I think that’s an artifact of GitHub generating the HTML. Looks like it appends a number onto each successive anchor, but as you say, it doesn’t fix the links (presumably it would have to guess).

Looks like Forge does basically the same thing, though it also doesn’t fix the links.

Probably it makes sense to look at the underlying markdown: https://raw.githubusercontent.com/danielparks/puppet-rustup/f405dac71b966d875b0d2e96984b0a2166032fcf/REFERENCE.md


I actually fixed this (mostly) with post-processing: https://github.com/danielparks/puppet-rustup/pull/15/files#diff-1aacbb5834541d8fb2daa76bafbe0339f4f637bdea56c4c6411fe5d29c603995

In retrospect, maybe the issue “anchor links not unique” is too large — to deal with all the (uncommon) cases you basically have to construct a big index as you go and make sure you’re referring to the right one — or, make some sort of reversible encoding that supports whatever characters are allowed in an anchor.

The script I linked above is a giant hack but it probably deals with the vast majority of cases that don’t involve type names like test::a::b and test::ab. All it does is:

  1. Construct anchor names to parameters including the type, e.g. my::class::$param (though as you can see it does it incorrectly… oops).
  2. Replace each invalid character with _, e.g. my__class___param.

You can see the results in GitHub HTML and raw Markdown.

@danielparks
Copy link
Contributor Author

I made a few improvements to my script. I’m pretty sure it’s impossible for it to generate non-unique anchor names, but I’m not 100%.

danielparks added a commit to danielparks/puppet-strings that referenced this issue Sep 23, 2022
Previously, anchors in generated Markdown were not unique. There were
two problems:

  1. Each parameter in a class, type, function, etc. had its own anchor,
     but the `name` of the anchor was just the normalized name of the
     parameter. This meant that all parameters with the same name would
     have the same anchor.

  2. Classes, types, etc. had their `::`s removed, whiched opened the
     (rare) possibility of having non-unique anchor names for unique
     type names.

This fixes those problems by:

  1. TODO: Using the full name of the parameter, e.g.
     `$my::class::param`, to generate an anchor name.

  2. Using a better anchor normalization routine. Each invalid
     character, i.e. not `\[a-zA-Z0-9_-\]`, is converted to `-`. This
     should always result in unique anchor names for standard Puppet
     identifiers, since the onlid invalid characters sequences are `::`
     and `$`, and `-` never appears.
danielparks added a commit to danielparks/puppet-strings that referenced this issue Sep 23, 2022
Previously, anchors in generated Markdown were not unique. There were
two problems:

  1. Each parameter in a class, type, function, etc. had its own anchor,
     but the `name` of the anchor was just the normalized name of the
     parameter. This meant that all parameters with the same name would
     have the same anchor.

  2. Classes, types, etc. had their `::`s removed, whiched opened the
     (rare) possibility of having non-unique anchor names for unique
     type names.

This fixes those problems by:

  1. TODO: Using the full name of the parameter, e.g.
     `$my::class::param`, to generate an anchor name.

  2. Using a better anchor normalization routine. Each invalid
     character, i.e. not `[a-zA-Z0-9_-]`, is converted to `-`. This
     should always result in unique anchor names for standard Puppet
     identifiers, since the onlid invalid characters sequences are `::`
     and `$`, and `-` never appears.
danielparks added a commit to danielparks/puppet-strings that referenced this issue Sep 23, 2022
Previously, anchors in generated Markdown were not unique. There were
two problems:

  1. Each parameter in a class, type, function, etc. had its own anchor,
     but the `name` of the anchor was just the normalized name of the
     parameter. This meant that all parameters with the same name would
     have the same anchor.

  2. Classes, types, etc. had their `::`s removed, whiched opened the
     (rare) possibility of having non-unique anchor names for unique
     type names.

This fixes those problems by:

  1. Using the full name of the parameter, e.g.  `$my::class::param`, to
     generate an anchor name.

  2. Using a better anchor normalization routine. Each invalid
     character, i.e. not `[a-zA-Z0-9_-]`, is converted to `-`. This
     should always result in unique anchor names for standard Puppet
     identifiers, since the only invalid characters sequences are `::`
     and `$` (which only appears once at the beginning of the string).
     Furthermore, `-` never appears in a valid Puppet identifier, so
     those two cases are the only way for it to be generated.
chelnak pushed a commit to danielparks/puppet-strings that referenced this issue Sep 26, 2022
Previously, anchors in generated Markdown were not unique. There were
two problems:

  1. Each parameter in a class, type, function, etc. had its own anchor,
     but the `name` of the anchor was just the normalized name of the
     parameter. This meant that all parameters with the same name would
     have the same anchor.

  2. Classes, types, etc. had their `::`s removed, whiched opened the
     (rare) possibility of having non-unique anchor names for unique
     type names.

This fixes those problems by:

  1. Using the full name of the parameter, e.g.  `$my::class::param`, to
     generate an anchor name.

  2. Using a better anchor normalization routine. Each invalid
     character, i.e. not `[a-zA-Z0-9_-]`, is converted to `-`. This
     should always result in unique anchor names for standard Puppet
     identifiers, since the only invalid characters sequences are `::`
     and `$` (which only appears once at the beginning of the string).
     Furthermore, `-` never appears in a valid Puppet identifier, so
     those two cases are the only way for it to be generated.
danielparks added a commit to danielparks/puppet-strings that referenced this issue Sep 26, 2022
Previously, anchors in generated Markdown were not unique. There were
two problems:

  1. Each parameter in a class, type, function, etc. had its own anchor,
     but the `name` of the anchor was just the normalized name of the
     parameter. This meant that all parameters with the same name would
     have the same anchor.

  2. Classes, types, etc. had their `::`s removed, whiched opened the
     (rare) possibility of having non-unique anchor names for unique
     type names.

This fixes those problems by:

  1. Using the full name of the parameter, e.g.  `$my::class::param`, to
     generate an anchor name.

  2. Using a better anchor normalization routine. Each invalid
     character, i.e. not `[a-zA-Z0-9_-]`, is converted to `-`. This
     should always result in unique anchor names for standard Puppet
     identifiers, since the only invalid characters sequences are `::`
     and `$` (which only appears once at the beginning of the string).
     Furthermore, `-` never appears in a valid Puppet identifier, so
     those two cases are the only way for it to be generated.
chelnak added a commit that referenced this issue Sep 26, 2022
@chelnak chelnak linked a pull request Sep 26, 2022 that will close this issue
@chelnak
Copy link
Contributor

chelnak commented Sep 26, 2022

Fixed in #300 - Thank you @danielparks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants