Hugo Pipes: Add link checker as a post-render transformer #5080

Open
rdwatters opened this issue Aug 15, 2018 · 14 comments

@rdwatters
Contributor

I appreciate this was previously considered out of scope three years ago, but I'm putting in this request in hopes that a link checking feature could be set as a flag to either (a) check only internal links within a site (i.e. if a user isn't already using all of Hugo's awesome cross reference shortcodes, etc) or (b) check both internal and external links (this seems like the bigger perf killer).

Managing links is a critical part of content management. I'd love for Hugo to take care of this for me rather than adding additional build steps, external tools, third-party apps, etc.

Thank you.

See #1430 and related uncontained.io issue.

@kaushalmodi
Contributor

The internal link checking sort of auto-works now if you use the ref or relref shortcode. If you use the shortcode to point to an invalid file, the build will fail.

@rdwatters
Contributor Author

rdwatters commented Aug 16, 2018

The internal link checking sort of auto-works now if you use the ref or relref shortcode.

Yup. They're fantastic. That's why I wrote the following above:

(i.e. if a user isn't already using all of Hugo's awesome cross reference shortcodes, etc)

But I don't think that all sites are going to use the shortcodes, especially if the sites are ported over from other generators or perhaps just a series of typical commonmark-ish .md files.

@bep
Member

bep commented Aug 16, 2018

Also recently added:

refLinksErrorLevel (“ERROR”)
When using ref or relref to resolve page links and a link cannot be resolved, it will be logged with this log level. Valid values are ERROR (default) or WARNING. Any ERROR will fail the build (exit -1).

refLinksNotFoundURL
URL to be used as a placeholder when a page reference cannot be found in ref or relref. Is used as-is.

@rdwatters
Contributor Author

Thanks @bep. Just to clarify, I think the way Hugo handles cross references/link management, etc, is best in class. The use case here is definitely for sites that have been ported over or for content authors who stick to just basic markdown (i.e. authors who want to keep their content as portable as possible). For example, if a Hugo user brings over something as simple as a blog he/she has been writing for the last 5 years with 500 posts, it’s likely that some of those internal and especially external links are going to be broken. Broken links are no bueno for both UX and SEO, and the ability to manage them is table stakes for any content manager in 2018. Of course, Hugo doesn’t have to handle this with all the other tools out there, but I’d definitely love the feature added for this new era of all-in-one-pure-Hugo workflow with asset pipeline, images, fingerprinting, etc 😄

I’m going to give this golang project a whirl during lunch as well:

https://github.com/raviqqe/muffet

@earthboundkid
Contributor

I have had in the back of my mind for a while the idea of writing a link checker for this purpose. As you said, internal links are pretty painless to check, but external can take time. So far, I've been using https://github.com/baltimore-sun-data/linkcheck when I want to check links, but I'd like to do a rewrite that is specifically designed for static sites.
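
To illustrate why internal links are the cheap case: checking them needs no network access, only a walk over the rendered output. The following is a minimal sketch, not how Hugo or any of the tools named in this thread work. It assumes the site has already been built into a local directory, that the site is served from the root of its domain (no sub-path in the base URL), and that directory-style links resolve to an `index.html` file. The names `find_broken_internal_links`, `is_internal`, and `resolve` are hypothetical helpers made up for this example. Fragment-only links, external links, and anchors within a page are skipped rather than verified.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit
import posixpath


class _HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def is_internal(href):
    """Treat a link as internal when it has neither a scheme nor a host."""
    parts = urlsplit(href)
    return not parts.scheme and not parts.netloc


def resolve(root, page, href):
    """Map an internal href found on `page` to a filesystem path under `root`.

    Returns None for fragment-only links such as "#top", which this
    sketch does not check.
    """
    path = unquote(urlsplit(href).path)
    if not path:
        return None
    if path.startswith("/"):
        # Root-relative link: assumes the site lives at the domain root.
        rel = path.lstrip("/")
    else:
        # Page-relative link: resolve against the directory of the page.
        rel = posixpath.join(page.parent.relative_to(root).as_posix(), path)
    return root / posixpath.normpath(rel)


def find_broken_internal_links(root):
    """Return (page, href) pairs for internal links whose target file is missing."""
    root = Path(root)
    broken = []
    for page in sorted(root.rglob("*.html")):
        collector = _HrefCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        for href in collector.hrefs:
            if not is_internal(href):
                continue  # external links would need HTTP requests
            target = resolve(root, page, href)
            if target is None:
                continue
            if target.is_dir():
                # Assumption: directory-style URLs are served from index.html.
                target = target / "index.html"
            if not target.is_file():
                broken.append((page.relative_to(root).as_posix(), href))
    return broken
```

Called as `find_broken_internal_links("public")` after a build (assuming the output directory is named `public`), it returns a list of `(page, href)` pairs that a build script could print and use to set a non-zero exit status. External links are the slow part because each one needs a network round trip, plus handling for timeouts and rate limits.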

@Jos512

Jos512 commented Aug 31, 2018

htmltest (https://github.com/wjdp/htmltest) also offers broken link checking (including external links) and a bunch of other features to test the generated HTML that Hugo makes.

I have been using it with Hugo for a while (~6 months) now and it works great. It also runs quite fast, even in Gulp:

[17:03:19] Starting 'htmltest'...
htmltest started at 05:03:19 on deploy
====================================
✔✔✔ passed in 588.0336ms
tested 48 documents
[17:03:19] Finished 'htmltest' after 670 ms

@rdwatters
Contributor Author

Thanks for the intel @Jos512 and @carlmjohnson. I'll be sure to check these out too (btw, muffet reference up top is pretty cool). This request is mostly to have that sorta all-in-one awesome sauce by including this feature within Hugo rather than relying on the external tools 😄 Cheers!

@kaushalmodi
Contributor

FWIW, I tried out both muffet and htmltest, and htmltest is far better in accuracy and depth of checks.

Muffet ended up with a lot of false timeouts and 400, 404, 503, etc. errors when in fact all of those links worked fine, even after setting its --timeout to 100 (seconds). Looks like it just boasts of speed, but without accuracy.

htmltest on the other hand is super-accurate and has a good set of configuration options too.

@stale

stale bot commented Dec 30, 2018

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help.
If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open.
If this is a feature request, and you feel that it is still relevant and valuable, please tell us why.
This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

@stale stale bot added the Stale label Dec 30, 2018
@stale stale bot closed this as completed Feb 6, 2019
@rdwatters
Contributor Author

Bummer. Looks like HTMLTest it is!
Thanks y’all for the feedback and protips!

@Merovex

Merovex commented Mar 28, 2019

This should be re-opened as the feature still has interest. Just because people got quiet is not sufficient to silence it.

@budparr

budparr commented Mar 28, 2019

@Merovex I think this issue is solved by using htmltest. It works really well.

@bep bep changed the title Hugo linkchecker (enhancement) Hugo Pipes: Add link checker as a post-render transformer Mar 28, 2019
@bep bep reopened this Mar 28, 2019
@stale stale bot removed the Stale label Mar 28, 2019
@bep
Member

bep commented Mar 28, 2019

I reopened this. This ties tightly into #5632 -- when that is solved (it will be) we can look at this.

@stale

stale bot commented Jul 26, 2019

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help.
If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open.
If this is a feature request, and you feel that it is still relevant and valuable, please tell us why.
This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.
