What is an XML Sitemap?
An XML Sitemap is a structured file (typically sitemap.xml) that lists a website's URLs along with optional metadata (e.g., last-modified date, change frequency, priority) to help search engines like Google discover and crawl pages efficiently.
Technical Structure
A basic XML Sitemap follows this schema:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/page1</loc>
    <lastmod>2023-10-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
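To make the structure concrete, here is a minimal sketch that reads a sitemap back with Python's standard library. The helper name read_sitemap and the SAMPLE string are illustrative assumptions, not part of any tool. The point to notice is the namespace: every element lives in the sitemaps.org namespace, so lookups have to be qualified.

```python
import xml.etree.ElementTree as ET

# Namespace declared on <urlset>; "sm" is an arbitrary local prefix.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Sample document, copied from the basic structure shown above.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/page1</loc>
    <lastmod>2023-10-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
"""


def read_sitemap(xml_text):
    """Return a list of {"loc", "lastmod"} dicts, one per <url> entry."""
    # Encode to bytes so the XML declaration's encoding is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            # <lastmod> is optional, so this may be None.
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
        })
    return entries


if __name__ == "__main__":
    for entry in read_sitemap(SAMPLE):
        print(entry["loc"], entry["lastmod"])
```

Only loc is mandatory inside each url element; the other three children can be left out.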
Why Use an XML Sitemap?
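- Improves crawlability for large/dynamic websites.
- Signals relative importance and update cadence via <priority> and <changefreq>. These are hints only, and Google has stated that it ignores both values.
- Supports canonicalization: listing only the canonical URL of each page hints at which version should be indexed, but it does not replace rel="canonical" tags or redirects.
- Carries extra metadata for Google News, video, and image results through sitemap extensions. These are recommended rather than strictly required.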
How to Generate an XML Sitemap?
- Automated Tools:
  - CMS plugins (e.g., Yoast SEO for WordPress).
  - Crawlers like Screaming Frog or Sitebulb.
- Custom Scripts: Python (xml.etree.ElementTree) or JavaScript.
- Dynamic Sitemaps: For frequently updated content (e.g., e-commerce).
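As an illustration of the custom-script route, the following is a minimal sketch using xml.etree.ElementTree. The function name build_sitemap and the (loc, lastmod) tuple shape are assumptions made for this example, and the xml_declaration argument requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML (bytes) from (loc, lastmod) tuples.

    lastmod may be None, in which case the element is omitted.
    """
    # Setting xmlns as a plain attribute keeps the output free of prefixes.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes &, < and > in text, as the protocol requires.
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical pages, for illustration only.
    pages = [
        ("https://example.com/page1", "2023-10-01"),
        ("https://example.com/search?q=shoes&page=2", None),
    ]
    with open("sitemap.xml", "wb") as fh:
        fh.write(build_sitemap(pages))
```

For a dynamic sitemap, the same function could be called from a web route or a scheduled job, with the entries pulled from the site's database instead of a hard-coded list.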
Best Practices
- Limit each file to 50,000 URLs and 50 MB uncompressed (split into multiple sitemaps if needed).
- Submit via Google Search Console.
- Exclude non-indexable pages (e.g., /admin/).
- Use lastmod accurately to reduce unnecessary crawls.
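Two of these practices, the per-file URL limit and the exclusion of non-indexable pages, can be sketched in a few lines of Python. The names chunk_urls, is_indexable, and EXCLUDED_PREFIXES are hypothetical, and the prefix list would need to match a real site's layout.

```python
from itertools import islice
from urllib.parse import urlparse

MAX_URLS_PER_SITEMAP = 50_000

# Assumed path prefixes that should never appear in a sitemap.
EXCLUDED_PREFIXES = ("/admin/",)


def is_indexable(url):
    """Return False for URLs whose path starts with an excluded prefix."""
    return not urlparse(url).path.startswith(EXCLUDED_PREFIXES)


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield lists of at most `size` URLs, one list per sitemap file."""
    it = iter(urls)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Hypothetical URL set: 120,000 product pages plus one admin page.
    all_urls = [f"https://example.com/p/{i}" for i in range(120_000)]
    all_urls.append("https://example.com/admin/login")
    indexable = filter(is_indexable, all_urls)
    for n, batch in enumerate(chunk_urls(indexable), start=1):
        print(f"sitemap-{n}.xml would hold {len(batch)} URLs")
```

This only enforces the URL count. The 50 MB uncompressed size limit would need a separate check on the serialized file.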
Advanced: Sitemap Index Files
For large sites, use a sitemap index (sitemap-index.xml) to group multiple sitemaps:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
  </sitemap>
</sitemapindex>
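An index file like the one above can be generated the same way as a regular sitemap. The sketch below assumes the child sitemaps already exist at the given URLs; build_sitemap_index and the sitemap-pages.xml file name are illustrative, not a fixed convention.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls, lastmod=None):
    """Build sitemap index XML (bytes) pointing at the given sitemap URLs.

    If lastmod is given, the same date is written for every child sitemap.
    """
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        if lastmod:
            ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    children = [
        "https://example.com/sitemap-posts.xml",
        "https://example.com/sitemap-pages.xml",  # hypothetical second file
    ]
    with open("sitemap-index.xml", "wb") as fh:
        fh.write(build_sitemap_index(children, lastmod="2023-10-01"))
```

Only the index URL then needs to be submitted to the search engine, since crawlers follow it to each child sitemap.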