help: generate dynamic sitemap and include external sitemap #359

Closed
gsabater opened this issue Nov 23, 2024 · 6 comments
Labels
help wanted Extra attention is needed

Comments

@gsabater

📚 What are you trying to do?

Hello,

I would like to create a new sitemap that merges the current project routes with an external sitemap from an API.

🔍 What have you tried?

I’m using Nuxt Sitemap to generate a list of URLs with the crawler module, and it's working as expected with the following configuration:

sitemap: {
  enabled: true,
  exclude: ['/tabler*', '/account/**', '/dev/**'],
  cacheMaxAgeSeconds: 3600 * 24,
}

However, I want to include a large and expensive sitemap from an API, which contains over 40,000 game links. I’ve partially achieved this by adding the API URL as a source. Even though the URL is .xml, the data is dynamically loaded from the database:

sources: [
  'https://api.backlog.rip/sitemap.xml',
]

Finally, during the build process, I want to generate a static sitemap.xml file that merges the crawled URLs with the external API URLs so that the application no longer queries the API at runtime.
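
A minimal sketch of the merge step described above, assuming the crawled paths and the external URLs are already available as plain arrays at build time. The helper names (`mergeUrls`, `renderUrlset`) and the `example.com` URLs are hypothetical and are not part of Nuxt Sitemap; the module's own build pipeline may work differently.

```typescript
// Hypothetical helpers for illustration only; not the Nuxt Sitemap API.

// Escape the five XML special characters so URLs are safe inside <loc>.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Turn site-relative crawled paths into absolute URLs, append the external
// URLs, and drop duplicates while keeping first-seen order.
function mergeUrls(
  siteUrl: string,
  crawledPaths: string[],
  externalUrls: string[]
): string[] {
  const base = siteUrl.replace(/\/+$/, "");
  const absoluteCrawled = crawledPaths.map(
    (path) => base + (path.startsWith("/") ? path : "/" + path)
  );
  return Array.from(new Set([...absoluteCrawled, ...externalUrls]));
}

// Render a flat <urlset> document that can be written out as sitemap.xml.
function renderUrlset(urls: string[]): string {
  const body = urls
    .map((url) => `  <url><loc>${escapeXml(url)}</loc></url>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    body,
    "</urlset>",
  ].join("\n");
}

// Example input: two crawled routes plus two URLs from the external API,
// one of which duplicates a crawled route.
const merged = mergeUrls(
  "https://example.com/",
  ["/", "/about"],
  ["https://example.com/games/1", "https://example.com/about"]
);
const staticSitemap = renderUrlset(merged);
console.log(staticSitemap);
```

The sitemap protocol caps a single file at 50,000 URLs, so roughly 40,000 game links plus the crawled routes would still fit in one file; beyond that, a sitemap index would be needed.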

I’ve tried consulting GPT, reviewing the documentation, and checking GitHub issues, but I haven’t been able to find a definitive solution. I’ve experimented with several approaches, but I’m unsure if this is achievable with the current setup.

Thanks for your time and any guidance you can provide!

ℹ️ Additional context

No response

@gsabater gsabater added the help wanted Extra attention is needed label Nov 23, 2024
@dtogias

dtogias commented Jan 5, 2025

I would also like to fetch sitemap.xml from an external source (proxied sitemap).

I tried using nitro.routeRules to force a redirect for that specific file, but this does not work for the sitemap. I believe the sitemap module bypasses route rules.

@gsabater
Author

gsabater commented Jan 5, 2025

@dtogias
In case my experience helps you: I ended up ditching the SEO module and setting up a sitemap index that points to other dynamic sitemaps served by my API. Here is an example:

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://api.example.com/sitemap/pages.xml</loc>
    <lastmod>2024-12-22T10:00:00+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://api.example.com/sitemap/games.xml</loc>
    <lastmod>2024-12-22T10:00:00+00:00</lastmod>
  </sitemap>
</sitemapindex>
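
A sitemap index like the one above could be generated with a small function along these lines. This is only a sketch: `renderSitemapIndex` and the `SitemapIndexEntry` type are hypothetical names, and the entries simply mirror the example URLs in this comment.

```typescript
// Hypothetical helper for illustration; not taken from any library.
interface SitemapIndexEntry {
  loc: string;
  lastmod?: string; // W3C datetime string, e.g. 2024-12-22T10:00:00+00:00
}

// Escape XML special characters so URLs are safe inside <loc>.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a <sitemapindex> document; <lastmod> is emitted only when provided.
function renderSitemapIndex(entries: SitemapIndexEntry[]): string {
  const items = entries.map((entry) => {
    const lastmod = entry.lastmod
      ? `\n    <lastmod>${escapeXml(entry.lastmod)}</lastmod>`
      : "";
    return `  <sitemap>\n    <loc>${escapeXml(entry.loc)}</loc>${lastmod}\n  </sitemap>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...items,
    "</sitemapindex>",
  ].join("\n");
}

const indexXml = renderSitemapIndex([
  {
    loc: "https://api.example.com/sitemap/pages.xml",
    lastmod: "2024-12-22T10:00:00+00:00",
  },
  { loc: "https://api.example.com/sitemap/games.xml" },
]);
console.log(indexXml);
```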

@dtogias

dtogias commented Jan 5, 2025

@gsabater Interesting approach, but in my case one of the sitemaps points to our blog website and I don't want to point to another domain. I need all sitemaps to be under our main website.

@harlan-zw
Owner

Hi, the issue isn't exactly clear. Are you saying you want to be able to have an XML source that gets merged?

At the moment, a source only supports a JSON response of sitemap entries (<loc> data).
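
As a workaround sketch for the JSON-only limitation, an XML sitemap could be converted into an array of JSON entries before being handed to a source. The `xmlToEntries` function and the `{ loc, lastmod }` entry shape are assumptions for illustration; the regex-based parsing is a simplification that ignores CDATA sections and namespace prefixes, so a real XML parser would be more robust.

```typescript
// Hypothetical converter for illustration; the entry shape is an assumption.
interface SitemapEntry {
  loc: string;
  lastmod?: string;
}

// Reverse the standard XML escapes; &amp; goes last to avoid double-unescaping.
function unescapeXml(value: string): string {
  return value
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

// Extract each <url> block and pull out its <loc> and optional <lastmod>.
function xmlToEntries(xml: string): SitemapEntry[] {
  const entries: SitemapEntry[] = [];
  const urlBlock = /<url>([\s\S]*?)<\/url>/g;
  let match: RegExpExecArray | null;
  while ((match = urlBlock.exec(xml)) !== null) {
    const loc = /<loc>\s*([\s\S]*?)\s*<\/loc>/.exec(match[1]);
    if (!loc) continue; // skip malformed blocks without a <loc>
    const lastmod = /<lastmod>\s*([\s\S]*?)\s*<\/lastmod>/.exec(match[1]);
    const entry: SitemapEntry = { loc: unescapeXml(loc[1]) };
    if (lastmod) entry.lastmod = lastmod[1];
    entries.push(entry);
  }
  return entries;
}

const sampleXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/a?x=1&amp;y=2</loc>
    <lastmod>2024-12-22</lastmod>
  </url>
  <url><loc> https://example.com/b </loc></url>
</urlset>`;

const jsonEntries = xmlToEntries(sampleXml);
console.log(JSON.stringify(jsonEntries, null, 2));
```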

@dtogias

dtogias commented Jan 15, 2025

@harlan-zw In our scenario, we have a WordPress website that is proxied through our www website.

{{baseURL}}/blog -> a Cloudflare Worker fetches the page from WordPress and serves it under www.

In that scenario, the WordPress sitemap.xml needs to be included in the top-level sitemap.xml, since it is part of www.

Here is our top-level XML:

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>{{baseURL}}/brands-sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>{{baseURL}}/static-sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>{{baseURL}}/category-sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>{{baseURL}}/products-sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>{{baseURL}}/wordpress-sitemap.xml</loc>
  </sitemap>
</sitemapindex>

For the wordpress-sitemap.xml we have the following entry in nuxt.config.ts.

 wordpress: {
   urls: [],
 },

which is a dummy entry that generates the top-level sitemap entry. The actual file is served by WordPress.

We would suggest the following options:

  1. Allow an external XML source to be downloaded at build time and used as is. Currently only JSON is supported, and you have to set up the other sitemap attributes yourself.
  2. Allow a proxy to be set up in Nitro routeRules. Currently this does not work. We tried with a middleware and with a routeRule; judging by the code, the sitemap module overrides this option.

Hope this helps

@harlan-zw
Owner

You can now use XML sitemaps as source paths in the latest release, so I'll consider this closed.


3 participants