fix(sitemap): add lastmod, prioritize people pages, drop low-value URLs #1206

Merged
ericgregory merged 1 commit into main from sitemap-2026-6 on Jun 23, 2026
@LiamRandall

Member

Summary

Sitemap optimization centered on the new /people/ pages, plus best-practice cleanup found while auditing the generated sitemap.xml.

  • Enable lastmod (Docusaurus defaults it to null, so it was absent on all 681 URLs). This is the freshness signal search engines actually act on for crawl scheduling and discovery of new pages — far more impactful than priority/changefreq.
  • Accurate lastmod for /people/ routes. These are synthesized via addRoute() with no source file, so the sitemap plugin can't derive a date. The people-pages plugin now sets metadata.lastUpdatedAt on each profile route to that profile's most recent blog post or meeting date; the /people/ index uses the date of its latest active profile.
  • Drop /docs/0.82/* — they're Disallow'd in robots.txt, so listing them in the sitemap is a contradictory signal Search Console flags (99 URLs).
  • Drop blog & community pagination + tag archives — thin, paginated, near-duplicate listing pages with no standalone ranking value (~57 URLs).
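
The two "drop" rules above can be sketched as a small predicate. This is an illustration only, not the code from this PR: the helper names `isLowValueUrl` and `filterSitemapPaths` are hypothetical, and the path shapes (`/blog/page/N`, `/blog/tags/...`, and the same under `/community/`) are assumptions based on typical Docusaurus blog routes. In an actual Docusaurus config the same effect would more likely come from the sitemap plugin's `lastmod` and `ignorePatterns` options.

```typescript
// Hypothetical helper: decides whether a pathname should be left out of the sitemap.
// The prefix and path shapes are assumptions, not taken from the PR diff.
const ROBOTS_BLOCKED_PREFIX = "/docs/0.82/";

// Matches pagination (/blog/page/2, /community/page/10) and
// tag archives (/blog/tags, /blog/tags/foo, /community/tags/...).
const THIN_LISTING = /^\/(blog|community)\/(page\/\d+|tags(\/.*)?)\/?$/;

function isLowValueUrl(pathname: string): boolean {
  if (pathname.startsWith(ROBOTS_BLOCKED_PREFIX)) {
    // Disallow'd in robots.txt, so it should not also be advertised in the sitemap.
    return true;
  }
  return THIN_LISTING.test(pathname);
}

// Keeps only the paths that should still appear in the sitemap.
function filterSitemapPaths(paths: string[]): string[] {
  return paths.filter((p) => !isLowValueUrl(p));
}
```

For example, `filterSitemapPaths(["/people/", "/blog/page/3", "/docs/0.82/x", "/docs/v1/x"])` keeps only `/people/` and `/docs/v1/x`.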

/people/ (priority 0.8) and all 15 maintainer profiles (0.7) remain in the sitemap, now with accurate lastmod dates.
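
As a rough sketch of how the lastmod dates for the synthesized /people/ routes could be derived: the `Profile` and `RouteSketch` types and both function names below are hypothetical stand-ins (the plugin's real data model is not shown in this PR), and `lastUpdatedAt` is assumed to be an epoch-milliseconds timestamp. "Latest active profile" is read here as the newest activity date among active profiles, which is also an assumption.

```typescript
// Local stand-ins for illustration; not the people-pages plugin's actual types.
interface Profile {
  slug: string;
  active: boolean;
  blogPostDates: string[]; // ISO dates of blog posts by this person
  meetingDates: string[]; // ISO dates of meetings this person appeared in
}

interface RouteSketch {
  path: string;
  metadata: { lastUpdatedAt: number | undefined };
}

// Most recent blog post or meeting date as epoch milliseconds,
// or undefined when the profile has no dated activity.
function latestActivity(profile: Profile): number | undefined {
  const times = [...profile.blogPostDates, ...profile.meetingDates]
    .map((d) => Date.parse(d))
    .filter((t) => !Number.isNaN(t));
  return times.length > 0 ? Math.max(...times) : undefined;
}

function buildPeopleRoutes(profiles: Profile[]): RouteSketch[] {
  const profileRoutes: RouteSketch[] = profiles.map((p) => ({
    path: `/people/${p.slug}/`,
    metadata: { lastUpdatedAt: latestActivity(p) },
  }));

  // The index page takes the newest date among active profiles.
  const activeTimes = profiles
    .filter((p) => p.active)
    .map((p) => latestActivity(p))
    .filter((t): t is number => t !== undefined);

  const indexRoute: RouteSketch = {
    path: "/people/",
    metadata: {
      lastUpdatedAt: activeTimes.length > 0 ? Math.max(...activeTimes) : undefined,
    },
  };

  return [indexRoute, ...profileRoutes];
}
```

Each returned object has the shape that would be handed to `addRoute()`, minus the component and other fields a real route needs.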

Result

| | Before | After |
|---|---|---|
| Total sitemap URLs | 681 | 525 |
| URLs with lastmod | 0 | 514 |
| /docs/0.82/* (robots-blocked) | 99 | 0 |
| blog/community pagination + tags | 57 | 0 |
| /docs/v1/* (kept, still in use) | 190 | 190 |
| /people/* | 16 | 16 |

The 11 remaining URLs without lastmod are the 2 blog/community index pages and 9 auto-generated /docs/v1/category/* index pages. None has a source file to date from, and all are real indexable pages, so this is expected.
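
The counts in the table reconcile as follows. The figures are taken from the table and the sentence above; the split of the undated pages into 2 index pages plus 9 category pages is inferred from the stated total of 11, and the variable names are only for illustration.

```typescript
// Figures from the Result table above.
const urlsBefore = 681;
const droppedRobotsBlocked = 99; // /docs/0.82/*
const droppedThinListings = 57; // blog/community pagination + tags

const urlsAfter = urlsBefore - droppedRobotsBlocked - droppedThinListings; // 525

const urlsWithLastmod = 514;
const urlsWithoutLastmod = urlsAfter - urlsWithLastmod; // 11

// Inferred split: blog index + community index, plus the category index pages.
const undatedIndexPages = 2;
const undatedCategoryPages = 9;
const undatedAccountedFor = undatedIndexPages + undatedCategoryPages; // 11
```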

Notes / context

  • /people/ was already in the sitemap on this branch; the real gaps were the missing lastmod and the robots-blocked + thin pages.
  • An Ahrefs check showed ~0 branded demand for "wasmcloud maintainers/team", and search volume for individual maintainer names is dominated by namesakes, so no per-maintainer priority tiering was added (it would chase the wrong people). Flat 0.8/0.7 is intentional.
  • Image/video sitemap extensions were considered and intentionally skipped: the site already emits VideoObject/ImageObject JSON-LD (91 pages), which is an accepted alternative to those extensions.

Verification

  • npm run build passes (exit 0, SEO post-build checks included) after each change.
  • Generated build/sitemap.xml inspected: counts above confirmed, 0 noindex pages leak into the sitemap.
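
A check like the sitemap inspection above could be scripted along these lines. This is a minimal sketch, not the SEO post-build check the repository actually runs: the function names are hypothetical, the parsing is a simple regex pass that assumes a flat `<urlset>` of `<url>` entries, and it only covers the counts (it does not detect noindex pages).

```typescript
interface SitemapEntry {
  loc: string;
  lastmod: string | undefined;
}

// Extracts <loc> and optional <lastmod> from each <url> block.
function parseSitemap(xml: string): SitemapEntry[] {
  const entries: SitemapEntry[] = [];
  const urlBlock = /<url>([\s\S]*?)<\/url>/g;
  let match: RegExpExecArray | null;
  while ((match = urlBlock.exec(xml)) !== null) {
    const body = match[1] ?? "";
    const loc = /<loc>\s*([^<]+?)\s*<\/loc>/.exec(body);
    if (!loc) continue; // skip malformed entries without a location
    const lastmod = /<lastmod>\s*([^<]+?)\s*<\/lastmod>/.exec(body);
    entries.push({ loc: loc[1], lastmod: lastmod ? lastmod[1] : undefined });
  }
  return entries;
}

// Counts total URLs, URLs carrying lastmod, and URLs under a path prefix
// (for example a prefix that robots.txt disallows).
function summarizeSitemap(entries: SitemapEntry[], pathPrefix: string) {
  const pathOf = (loc: string) => loc.replace(/^https?:\/\/[^/]+/, "");
  return {
    total: entries.length,
    withLastmod: entries.filter((e) => e.lastmod !== undefined).length,
    underPrefix: entries.filter((e) => pathOf(e.loc).startsWith(pathPrefix)).length,
  };
}
```

Reading `build/sitemap.xml` into a string and calling `summarizeSitemap(parseSitemap(xml), "/docs/0.82/")` would give the total, the lastmod count, and the number of robots-blocked URLs still listed, which should be 0 after this change.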

Follow-up (operational, not in this PR)

  • Submit sitemap.xml in Google Search Console + Bing Webmaster Tools.
  • After deploy, URL-inspect a couple of /people/<slug>/ pages and request indexing.

@LiamRandall LiamRandall requested a review from a team as a code owner June 22, 2026 23:38
@netlify

netlify Bot commented Jun 22, 2026


Deploy Preview for dreamy-golick-5f201e ready!

Name Link
🔨 Latest commit 5caac98
🔍 Latest deploy log https://app.netlify.com/projects/dreamy-golick-5f201e/deploys/6a39ca570b647d0008c8b7f3
😎 Deploy Preview https://deploy-preview-1206--dreamy-golick-5f201e.netlify.app

@github-actions

github-actions Bot commented Jun 22, 2026

Contributor

🔎 Structured data validation

  • Mode: PR-incremental (base: origin/main)
  • Files checked: 0
  • JSON-LD payloads: 0
  • Errors: 0
  • Warnings: 0

✅ All checks pass and no authoring warnings to backfill.

⚙️ Generated by ci_structured_data.yml · structured-data spike

- Enable lastmod (was off by default) so search engines get the freshness
  signal that actually drives crawl/discovery of new pages.
- Emit accurate lastmod for the synthesized /people/ routes (no source
  file) by attaching each profile's most recent blog post / meeting date
  as route metadata; the /people/ index uses its latest active profile.
- Drop /docs/0.82/* from the sitemap — they are Disallow'd in robots.txt,
  so listing them is a contradictory signal Search Console flags.
- Drop blog & community pagination and tag archives — thin, paginated,
  near-duplicate listing pages with no standalone ranking value.

/people/ (0.8) and the 15 maintainer profiles (0.7) remain in the sitemap
with accurate lastmod dates. Sitemap: 681 -> 525 URLs.

Signed-off-by: Liam Randall <liam@cosmonic.com>
@ericgregory ericgregory merged commit fcb294f into main Jun 23, 2026
8 checks passed
@ericgregory ericgregory deleted the sitemap-2026-6 branch June 23, 2026 13:20

2 participants