Sitemap and Robots: Technical SEO Basics That Still Matter
How to keep indexing clean with a valid sitemap.xml and predictable robots.txt rules.
Sitemap should reflect reality, not intention
Only include canonical, indexable URLs that return a 200 status. Remove redirecting URLs, pages blocked by robots.txt, and URLs marked noindex.
Treat the sitemap as a strong hint to crawlers, not a directive. Noise in the sitemap wastes crawl budget.
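As a sketch of that audit, the script below fetches each sitemap entry and flags redirects, non-200 responses, and noindex signals. It assumes the third-party requests library, and the sitemap URL is a placeholder.

```python
# Audit sketch: flag sitemap entries that should not be there.
import xml.etree.ElementTree as ET
import requests  # third-party; pip install requests

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(sitemap_url):
    """Yield <loc> values from a plain urlset sitemap."""
    root = ET.fromstring(requests.get(sitemap_url, timeout=10).content)
    for loc in root.findall(".//sm:loc", NS):
        yield loc.text.strip()

def audit(url):
    """Return a problem description, or None if the URL belongs in the sitemap."""
    r = requests.get(url, allow_redirects=False, timeout=10)
    if 300 <= r.status_code < 400:
        return f"redirects to {r.headers.get('Location')}"
    if r.status_code != 200:
        return f"returns {r.status_code}"
    if "noindex" in r.headers.get("X-Robots-Tag", "").lower():
        return "noindex via X-Robots-Tag header"
    if "noindex" in r.text.lower():  # crude signal for a meta robots tag
        return "possible meta noindex in body"
    return None

for url in sitemap_urls(SITEMAP_URL):
    problem = audit(url)
    if problem:
        print(f"REMOVE {url}: {problem}")
```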
Robots rules must be simple and explicit
Keep wildcard rules minimal. Broad disallow patterns often block CSS, JavaScript, or whole sections of the site by mistake.
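For illustration, a minimal robots.txt with explicit, narrowly scoped rules; all paths here are hypothetical:

```
User-agent: *
# A broad pattern like "Disallow: /*admin" would also block
# unrelated pages such as /blog/administering-backups.
Disallow: /admin/   # explicit directory, predictable scope
Disallow: /cart/

Sitemap: https://example.com/sitemap.xml
```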
When in doubt, run the test URL path through the Robots.txt Tester and verify the verdict both for the wildcard user-agent (*) and for each specifically targeted bot.
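The same verification can be scripted locally with Python's standard urllib.robotparser; the rules and URLs below are hypothetical:

```python
# Verify robots.txt decisions locally with the standard library.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/

User-agent: Googlebot
Disallow: /cart/
"""  # hypothetical rules; load your real file in practice

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Check every (bot, path) pair you care about, not just one.
for agent in ("*", "Googlebot"):
    for path in ("/admin/settings", "/cart/checkout", "/products/widget"):
        allowed = parser.can_fetch(agent, f"https://example.com{path}")
        print(f"{agent:10s} {path:20s} {'allowed' if allowed else 'BLOCKED'}")
```

Note that a bot with its own group (Googlebot here) ignores the * group entirely, so /admin/settings stays open to it. That is exactly the kind of surprise worth checking for.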
Use monitoring, not one-time checks
Validate the sitemap and robots.txt after every deployment. Route changes and framework upgrades frequently create silent regressions.
A small CI check on generated URLs can prevent weeks of indexing loss.
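One possible shape for that check, with hypothetical file names and a plain-text routes manifest: fail the build when a generated route is missing from the sitemap or disallowed in robots.txt.

```python
# CI sketch: fail the build if generated routes drift from sitemap/robots.
# File paths and the routes manifest format are hypothetical.
import sys
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

SITEMAP_FILE = "dist/sitemap.xml"  # hypothetical build output
ROBOTS_FILE = "dist/robots.txt"
ROUTES_FILE = "dist/routes.txt"    # hypothetical: one generated route per line

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_paths():
    """Collect the URL paths listed in the generated sitemap."""
    root = ET.parse(SITEMAP_FILE).getroot()
    return {urlsplit(loc.text.strip()).path for loc in root.findall(".//sm:loc", NS)}

def robots():
    """Parse the generated robots.txt into a RobotFileParser."""
    rp = RobotFileParser()
    with open(ROBOTS_FILE) as f:
        rp.parse(f.read().splitlines())
    return rp

def main():
    with open(ROUTES_FILE) as f:
        routes = {line.strip() for line in f if line.strip()}
    in_sitemap = sitemap_paths()
    rp = robots()
    errors = []
    for route in sorted(routes):
        if route not in in_sitemap:
            errors.append(f"{route}: missing from sitemap")
        if not rp.can_fetch("*", f"https://example.com{route}"):
            errors.append(f"{route}: blocked by robots.txt")
    for e in errors:
        print(e)
    sys.exit(1 if errors else 0)

if __name__ == "__main__":
    main()
```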