Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Be more careful about special-casing index.html #604

Conversation

dscho
Copy link
Contributor

@dscho dscho commented Apr 23, 2024

Pagefind has code to strip a trailing index.html from a URL. The intention is obviously to avoid showing /folder/index.html when /folder/ would already do.

However, the code is a bit lax about checking that index.html is the full file name: It would also strip that suffix for, say, /folder/my_index.html. That, however, leads to a broken link...

I noticed this because I want to convert https://git-scm.com/ to use Pagefind, but any search hitting /docs/git-checkout-index.html would mistakenly link to /docs/git-checkout-.

... and only if it is the full file name...

I noticed this because I want to convert https://git-scm.com/ to use
Pagefind, but any search hitting `/docs/git-checkout-index.html` would
mistakenly link to `/docs/git-checkout-`.

Signed-off-by: Johannes Schindelin <[email protected]>
@dscho dscho force-pushed the be-careful-about-stripping-index.html-from-the-url branch from f7bc8a2 to 3d68de2 Compare April 23, 2024 21:35
@dscho
Copy link
Contributor Author

dscho commented Apr 24, 2024

Hrm. The tests pass for me on Windows... These failures all seem to be variations of this symptom:

      Step failed: \\?\D:\a\pagefind\pagefind\pagefind\features\anchors.feature:58:9
      Captured output: 
      Step panicked. Captured output: called `Result::unwrap()` on an `Err` value: Timeout

However, the output does not suggest that anything timed out?

      Civilization {
          tmp_dir: Some(
              TempDir {
                  path: "C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\.tmp6ItJKt",
              },
          ),
          last_command_output: Some(
              CommandOutput {
                  stdout: "\nRunning Pagefind v0.0.0 (Extended)\nRunning from: \"C:\\\\Users\\\\RUNNER~1\\\\AppData\\\\Local\\\\Temp\\\\.tmp6ItJKt\"\nSource:       \"public\"\nOutput:       \"public\\\\pagefind\"\n\n[Walking source directory]\nFound 4 files matching **/*.{html}\n\n[Parsing files]\nFound a data-pagefind-body element on the site.\n↳ Ignoring pages without this tag.\n\n[Reading languages]\nDiscovered 1 language: en\n\n[Building search indexes]\nTotal: \n  Indexed 1 language\n  Indexed 3 pages\n  Indexed 49 words\n  Indexed 0 filters\n  Indexed 0 sorts\n\nFinished in 0.028 seconds\n",
                  stderr: "",
              },
          ),
          browser: None,
          assigned_server_port: Some(
              18145,
          ),
          threads: [
              JoinHandle {
                  id: Id(
                      5,
                  ),
              },
          ],
          handles: [
              ServerHandle {
                  cmd_tx: UnboundedSender {
                      chan: Tx {
                          inner: Chan {
                              tx: Tx {
                                  block_tail: 0x000001a38f51fa10,
                                  tail_position: 1,
                              },
                              semaphore: Semaphore(
                                  1,
                              ),
                              rx_waker: AtomicWaker,
                              tx_count: 1,
                              rx_fields: "...",
                          },
                      },
                  },
              },
          ],
          env_vars: {
              "PAGEFIND_SITE": "public",
          },
      }

I also notice that main fails in the same way. @bglw any ideas?

@bglw
Copy link
Contributor

bglw commented Apr 24, 2024

Yeah that won’t be you — the test suite is a little flaky right now, especially on the windows runners lately. I have a new testing framework on the way which should resolve that.

I’ll give the tests some pokes before I merge, which will be next week as I’m out for the weekend.

(Change looks good by the way, thank you! 🙏)

@dscho
Copy link
Contributor Author

dscho commented Apr 24, 2024

I’m out for the weekend.

On a Wednesday? CloudCannon must be a nice place to work at, do you have openings? 😁

Yeah that won’t be you — the test suite is a little flaky right now, especially on the windows runners lately. I have a new testing framework on the way which should resolve that.

Great!

@dscho dscho changed the title Only strip index.html if it is at the end Be more careful about special-casing index.html May 4, 2024
@bglw
Copy link
Contributor

bglw commented May 5, 2024

Sorry for the delay! My test suite has caught up with me (and/or GitHub is only giving me the runt-of-the-litter machines 😓)

I'll run the tests locally on my mac and merge tomorrow.

@bglw bglw merged commit b09760a into CloudCannon:main May 7, 2024
2 of 4 checks passed
@bglw
Copy link
Contributor

bglw commented May 8, 2024

@dscho this is now released on 1.1.1-alpha.1 if you're fine using the prerelease for a while :)

@dscho dscho deleted the be-careful-about-stripping-index.html-from-the-url branch June 21, 2024 13:36
@bglw
Copy link
Contributor

bglw commented Sep 3, 2024

Released in v1.1.1 🙂

@dscho
Copy link
Contributor Author

dscho commented Sep 3, 2024

Released in v1.1.1 🙂

Thank you so much for all your help @bglw!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants