Suggestion: Disable Indexing on Non-Wiki Pages #133

Closed
bwfiq opened this issue Aug 28, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@bwfiq

bwfiq commented Aug 28, 2024

Currently, Google has indexed all the URLs available on my otterwiki instance. Normally this would be fine, but a lot of the search results point to sub-URLs such as the specific revision pages (URL including ?revision=) and the source pages.

I've solved this on my own system by modifying nginx confs to provide noindex headers, but this might be a good option to have for less technically inclined users. A settings dropdown could disable indexing for the non-wiki pages or even the whole wiki, for cases where an instance is only meant for personal documentation.
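As a rough illustration of the rule described above, here is a minimal Python sketch of the decision "which URLs get a noindex header". It is not the nginx configuration mentioned in this comment and not An Otter Wiki's code; the function name `robots_headers`, the example host, and the list of path suffixes are assumptions chosen for the example. Only the `?revision=` query parameter and the source pages come from this report.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical suffixes for non-wiki views; adjust them to the actual instance.
NON_WIKI_SUFFIXES = ("/source", "/history", "/blame")


def robots_headers(url: str) -> dict[str, str]:
    """Return extra response headers for a URL, or an empty dict.

    URLs that carry a ?revision= parameter or end in one of the assumed
    non-wiki suffixes get an X-Robots-Tag header asking crawlers not to
    index them.
    """
    parts = urlsplit(url)
    # keep_blank_values=True so that a bare "?revision=" is still detected.
    query = parse_qs(parts.query, keep_blank_values=True)
    path = parts.path.rstrip("/")
    if "revision" in query or path.endswith(NON_WIKI_SUFFIXES):
        return {"X-Robots-Tag": "noindex, nofollow"}
    return {}


if __name__ == "__main__":
    for example in (
        "https://wiki.example.com/Home",
        "https://wiki.example.com/Home?revision=abc123",
        "https://wiki.example.com/Home/source",
    ):
        print(example, robots_headers(example))
```

The same check could be applied in a reverse proxy or in the application itself; the sketch only shows the matching logic.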

@redimp redimp added the enhancement New feature or request label Aug 29, 2024
@redimp
Owner

redimp commented Aug 29, 2024

Hey @bwfiq,

Thank you for bringing this up! An Otter Wiki up to version 2.5.2 sets <meta name="robots" content="noindex, nofollow"/> only for page history, page attachments and page blame. Even as a sane default, this is not enough. I will add it at least to the changelog and to pages displayed with a given revision.

To make this configurable in a convenient way, I will follow what you proposed. My first idea for implementing this is to add a settings option that controls the generated /robots.txt.

When indexing is allowed, the robots.txt is generated as

User-agent: *
Allow: /

otherwise as

User-agent: *
Disallow: /

For more complex configurations users should provide a custom robots.txt.
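The two generated variants could come from a single boolean setting. The following is only a sketch of that idea, not the implementation that was later committed; the function name `generate_robots_txt` and the parameter `allow_indexing` are hypothetical.

```python
def generate_robots_txt(allow_indexing: bool) -> str:
    """Build the robots.txt body for the two cases described above.

    allow_indexing is a hypothetical name for the proposed settings option.
    """
    rule = "Allow: /" if allow_indexing else "Disallow: /"
    return f"User-agent: *\n{rule}\n"


if __name__ == "__main__":
    print(generate_robots_txt(True))
    print(generate_robots_txt(False))
```

A route serving /robots.txt would return this string as text/plain, unless the user supplies a custom file.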

redimp added a commit that referenced this issue Aug 29, 2024
…e revisions

This was brought up in #133. Removing the changelog and forms from
search indices makes sense in general. Adding the meta tag to pages with
a specific revision helps to make sure that traffic coming from search
engines is directed to the most recent version of a page.
@redimp redimp closed this as completed in 782baff Sep 14, 2024