Building an XML Sitemap
An XML sitemap feed lists all of the pages on your website that you want the search engines to know about. In theory, the search engines should be able to find all of your pages by following links, but a sitemap still helps for completeness and lets you take advantage of the benefits that the webmaster tools offer.
For SEO purposes, it is essential that you (a) build an XML sitemap and (b) keep it up-to-date to improve spiderability and help ensure that all the important pages on your site are crawled and indexed. XML sitemaps give the search engines a complete list of the pages you want indexed, along with supplemental information about those pages, including how frequently the pages are updated. This does not guarantee that all pages will be crawled or indexed, but it can help.
It’s worth pointing out that an XML sitemap is different from the standard sitemap that you include on your site. XML sitemaps are feeds designed for search engines; they’re not for people. They are merely lists of URLs with some optional
XML sitemaps were designed to help sites that historically could not be crawled by the search engines (sites with dynamic content, Flash or Ajax) get their content spidered and listed in the index. That’s not to say that using an XML sitemap is a way around building a spiderable website, however, since all it does is hand a list of available URLs to the search engines. When creating a new site, you want to make sure that you are creating it from a sound search engine optimization standpoint. Creating a sitemap will not pass on any link popularity, nor will it help with subject theming.
An XML sitemap is written in XML (Extensible Markup Language), a markup language commonly used on the web that lets you define your own tags to share information. The required XML tags are <urlset>, <url>, and <loc>. The <urlset> tag wraps the whole file, each <url> tag wraps the entry for one page, and <loc> holds that page's URL.
Optional metadata tags are:
- <lastmod> – the date the page was last modified, in W3C date format (for example, 2008-01-01)
- <changefreq> – how often the page changes; the allowed values are always, hourly, daily, weekly, monthly, yearly and never
- <priority> – how important the page is relative to the other pages on your site, from 0.0 (the lowest) to 1.0 (the highest)
Site owners aren’t required to use these tags, but the engines may consult them when deciding how often they should re-crawl pages. Google states in their Webmaster Guidelines that while they take these tags into consideration, they do not base their spidering decisions on
An XML sitemap listing for a URL looks like this:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.bruceclay.com/jp//</loc>
    <lastmod>2008-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>
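For a small site, a listing in this format can also be produced with a few lines of script. The following is a minimal sketch in Python using only the standard library; the function name build_sitemap, the dictionary keys and the example.com URL are illustrative assumptions, not part of the Sitemaps protocol or of any particular generator tool.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def build_sitemap(pages):
    """Return sitemap XML as a string.

    `pages` is a list of dicts (a hypothetical input shape): each needs a
    'loc' key and may carry 'lastmod', 'changefreq' and 'priority'.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # <loc> is required; ElementTree escapes characters such as & for us.
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional tags are written only when the caller supplied them.
        for tag in OPTIONAL_TAGS:
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        {"loc": "https://www.example.com/", "lastmod": "2008-01-01",
         "changefreq": "monthly", "priority": 1.0},
    ]))
```

Building the file through an XML library rather than by string concatenation keeps the output well-formed even when a URL contains characters that need escaping.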
If you don’t want to have to type that out for each of your site’s pages, fear not. There are quite a few sitemap generators that can spider your site and automatically build an XML sitemap file for you. Two of our favorites are:
Be careful to set up the sitemap generator tool properly to avoid spidering pages you do not want indexed.
For very large websites, your XML sitemap feed should be broken up into multiple files. Google has set a limit of 50,000 URLs and a file size of
User-agent: *
Sitemap: http://www.your-domain-name.com/sitemap.xml
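To confirm that a robots.txt file advertises the sitemap the way you intended, you can read the directive back out. The sketch below assumes Python 3.8 or later, where the standard library's RobotFileParser provides a site_maps() method; the helper name sitemaps_from_robots is our own.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Sitemap: http://www.your-domain-name.com/sitemap.xml\n"
    )
    print(sitemaps_from_robots(sample))
```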
Google and Bing also offer engine-specific ways for you to alert them to your XML sitemap.
| Google: | You can submit your sitemap through Google Search Console (formerly known as Webmaster Tools). This will allow you to see when Google last downloaded your sitemap and any errors that may have occurred. Once you have validated your site, you can also view information such as Crawl Errors (including pages that were not found or timed out), Google |
| Bing: | M |
| Yahoo: | When you submit your XML |
| Ask: | Ask supports |
There are many benefits to creating an XML sitemap. If you launch a new site, issue a redesign or perform a large update, submitting a sitemap is
Once you create your XML sitemap and let the search engines know about it, keep it up-to-date. If you add or remove pages, update the sitemap to reflect that. You should also check Google Search Console frequently to ensure that Google is not finding any errors in your sitemap.
The goal of XML sitemaps is to help search engines crawl efficiently and thoroughly. Use the tags described above accurately so the engines can work out how best to crawl your site. You can find more information about the Sitemaps protocol and XML schema at http://www.sitemaps.org.
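Before submitting a sitemap, a quick local check can catch obvious tag mistakes. The following is a rough sketch, not a replacement for validating against the official schema: it only checks that each entry has a <loc>, that <changefreq> uses one of the values the protocol defines, and that <priority> falls between 0.0 and 1.0. The function name check_sitemap and the wording of the messages are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def check_sitemap(xml_text):
    """Return a list of human-readable problems found in a sitemap."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != NS + "urlset":
        problems.append("root element is not <urlset> in the sitemap namespace")
    for i, url in enumerate(root.findall(NS + "url"), start=1):
        # <loc> is the only required child of <url>.
        loc = url.findtext(NS + "loc", default="").strip()
        if not loc:
            problems.append(f"entry {i}: missing <loc>")
        freq = url.findtext(NS + "changefreq")
        if freq is not None and freq.strip() not in CHANGEFREQ_VALUES:
            problems.append(f"entry {i}: unknown <changefreq> value")
        priority = url.findtext(NS + "priority")
        if priority is not None:
            try:
                value = float(priority)
            except ValueError:
                value = None
            if value is None or not 0.0 <= value <= 1.0:
                problems.append(f"entry {i}: <priority> must be 0.0 to 1.0")
    return problems
```

An empty list means none of these basic checks failed; errors reported by the search engines' own tools still take precedence.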