XML Sitemaps: Why URL Sequencing Matters Even if Google Says It Doesn’t
There are a great many things that SEOs debate – do this, don’t do that, this makes a difference but that doesn’t.
No one knows the secrets of Google’s algorithms except Google (and sometimes I wonder if the algorithm is even too complex for some at Google to know how it works).
That said, there are some things that Google flat-out says don’t matter whether we do them or not. Does that mean we shouldn’t do them? No, it doesn’t.
In a perfect world, our websites and Google will perform exactly how they should and in our favor. In reality, any number of things can go sideways when search engines crawl a site.
If it doesn’t harm a website to implement something that may make it better for search engines to crawl and understand – and it’s easy enough to do – then why not do it?
Within XML sitemaps, priority tags, change frequency tags, and URL ordering are some of those debated SEO tactics. Let’s discuss:
- XML sitemap basics
- Priority tags
- Change frequency tags
- So, should we use the tags?
- URL sequencing
- FAQ: How does the priority tag affect Google’s crawling behavior?
XML Sitemap Basics
An XML sitemap is a file that webmasters create and put on their site to tell search engines like Google and Bing about the pages, images, and videos that are on the site.
The sitemap works like a map, helping ensure more thorough crawling and indexing. However, an XML sitemap does not guarantee that search engines will index or crawl all pages, nor will a sitemap impact your rankings.
(Check out our primer on XML sitemaps for more information.)
There are optional tags that some people like to include in their sitemap, such as the priority tag and the change frequency tag. Google’s official stance is that it ignores both.
Then there is the practice of URL sequencing (ordering the URLs in your sitemap in order of your priority). Google says it ignores this, too.
Priority Tags
The priority tag tells Google how important a page is relative to the other pages on your site, on a scale from 1.0 (the highest) to 0.0 (the lowest).
Priority values look like so: 1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1 and 0.0. For instance, 1.0 would likely be the homepage.
If you use this tag, it doesn’t mean that Google will most often crawl the pages you consider most important. It is up to the algorithm to determine what is most relevant, and its view may not match your priority values. You can try to figure out which pages get crawled the most by looking at server logs.
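As a rough illustration of that server-log check, here is a minimal Python sketch that counts requests per URL path from visitors identifying themselves as Googlebot. It assumes your server writes the common "combined" access log format, and the sample lines and the `googlebot_hits` name are made up for this example. Keep in mind that a user-agent string can be spoofed, so treat the counts as an approximation.

```python
import re
from collections import Counter

# Pulls the request path and user agent out of a combined-format log line.
# Assumption: the log uses the combined format; adjust the pattern otherwise.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per path where the user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

# Invented sample lines, for illustration only.
sample_lines = [
    f'66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Oct/2023:14:02:11 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Oct/2023:14:05:42 +0000] "GET /about/ HTTP/1.1" 200 2048 "-" "{GOOGLEBOT}"',
    f'203.0.113.7 - - [10/Oct/2023:14:06:00 +0000] "GET /contact/ HTTP/1.1" 200 1024 "-" "{BROWSER}"',
]

# Most-crawled paths first.
for path, count in googlebot_hits(sample_lines).most_common():
    print(path, count)
```

Comparing these counts with the priorities you assigned is one way to see whether the crawler’s behavior lines up with what you consider important.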
In general, when using the priority tag you’d assign values like this:
- 1.0 – 0.8 = Category pages, homepage, top landing pages
- 0.7 – 0.4 = Blog articles, secondary category pages, subcategory pages
- 0.3 – 0.0 = Pages that are not as important like outdated content or utility-type pages
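To make those tiers concrete, here is a small Python sketch that builds a single sitemap entry with a priority tag. The page-type names and the exact values in `PRIORITY_BY_TYPE` are assumptions picked from within the ranges above, not a standard; the fallback of 0.5 is the default the sitemaps protocol assigns when no priority is given.

```python
import xml.etree.ElementTree as ET

# Hypothetical page types mapped to values inside the tiers listed above.
PRIORITY_BY_TYPE = {
    "homepage": "1.0",
    "category": "0.9",
    "landing": "0.8",
    "blog": "0.6",
    "subcategory": "0.5",
    "utility": "0.2",
}


def url_entry(loc, page_type):
    """Build one <url> element carrying <loc> and <priority>."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    # Unknown page types fall back to the protocol default of 0.5.
    ET.SubElement(url, "priority").text = PRIORITY_BY_TYPE.get(page_type, "0.5")
    return url


entry = url_entry("https://www.example.com/", "homepage")
print(ET.tostring(entry, encoding="unicode"))
# <url><loc>https://www.example.com/</loc><priority>1.0</priority></url>
```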
Change Frequency Tags
The change frequency tag is meant to tell Google how often pages are updated on your website. In theory, Google would view it and make a decision on whether to crawl that page again based on when it was last updated.
Again, Google’s official stance is that it ignores the change frequency tag. And, Google’s John Mueller has clarified that it is better to “specify the time stamp directly so that we can look into our internal systems and say we haven’t crawled since this date therefore we should crawl again.”
This tag is most certainly irrelevant if it’s not accurate. As Mueller mentions, “we see a lot of sites they give us this information in the sitemap, they said it changes daily or weekly, and we look in our database and it hasn’t changed in a month or years.”
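One way to follow Mueller’s advice is to write the actual modification time into a lastmod tag rather than a change frequency guess. The Python sketch below assumes you can get a timezone-aware modification time for each page from your CMS or file system; the `url_with_lastmod` name is made up for this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def url_with_lastmod(loc, modified_at):
    """Build a <url> element whose <lastmod> is the real modification time.

    Assumption: modified_at is a timezone-aware datetime.
    """
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    # Convert to UTC and format as a W3C datetime with an explicit offset.
    stamp = modified_at.astimezone(timezone.utc).isoformat(timespec="seconds")
    ET.SubElement(url, "lastmod").text = stamp
    return url


changed = datetime(2024, 3, 5, 14, 30, tzinfo=timezone.utc)
entry = url_with_lastmod("https://www.example.com/pricing/", changed)
print(ET.tostring(entry, encoding="unicode"))
```

The value is only useful if it is truthful, so it should change when the page content changes and not every time the sitemap is regenerated.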
So, Should We Use the Tags?
The fact that the tags are mentioned in Google’s XML sitemap documentation at all as “optional” is confusing, especially when Google says it ignores them. Does Googlebot really ignore them every time? It’s hard to know. In our view, it is better to use them, especially last revision dates (<lastmod>).
I believe there is another way to use your sitemap to indicate how you want the search engines to crawl. Google says it ignores this method, too, but stick with me.
URL Sequencing
URL sequencing is the practice of ordering the URLs in your sitemap according to priority, but not in the way the priority tag works. Because Google ignores priority tags, it treats all pages as “equal,” so sequence should not matter, at least for priority. But pages are not equal in real life, especially if your crawl budget is limited.
We strongly recommend that you sequence the URLs in your XML sitemap in a way that addresses both the concepts of priority and change frequency without using the tags. Here, the goal is to get key pages indexed faster.
And BTW, only use URLs that match the canonical tags found somewhere on your site or that are important pages!
For example, this is a sequence emphasizing the most recently modified pages:
- One-day-old recently changed entries (new redirect targets, new or revised pages) [500 entries per XML page]
- One-week-old entries as above, just a bit older [500 entries per XML page]
- Remaining pages that return a 200 status code (sorted descending by impressions) [1000 entries per page]
- Any other redirecting pages (30x codes) [5k per page]
- Images and videos [500 per page]
- 404 pages [10k per page]
- The rest [10k per page]
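The sequence above can be sketched in Python as a bucketing step followed by paging. This is only an illustration under several assumptions: the `Entry` fields, the rule that images and videos always go to their own bucket, and the rule that only pages returning 200 count as "recently changed" are choices made for this sketch, not something the list above spells out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Entry:
    loc: str
    status: int            # HTTP status code the URL returns
    modified_at: datetime  # timezone-aware last modification time
    impressions: int = 0
    kind: str = "page"     # "page" or "media" (image or video)


# Entries per XML page for each bucket, in the order listed above.
PAGE_SIZES = [500, 500, 1000, 5000, 500, 10000, 10000]


def bucket_of(entry, now):
    """Return the index of the bucket an entry belongs to (assumed rules)."""
    age = now - entry.modified_at
    if entry.kind == "media":
        return 4
    if entry.status == 200:
        if age <= timedelta(days=1):
            return 0
        if age <= timedelta(days=7):
            return 1
        return 2
    if 300 <= entry.status < 400:
        return 3
    if entry.status == 404:
        return 5
    return 6


def sequence(entries, now):
    """Group entries into buckets, then split each bucket into XML pages."""
    buckets = [[] for _ in PAGE_SIZES]
    for entry in entries:
        buckets[bucket_of(entry, now)].append(entry)
    # Remaining 200 pages are sorted by impressions, highest first.
    buckets[2].sort(key=lambda e: e.impressions, reverse=True)
    pages = []
    for bucket, size in zip(buckets, PAGE_SIZES):
        for start in range(0, len(bucket), size):
            pages.append([e.loc for e in bucket[start:start + size]])
    return pages
```

Each inner list returned by `sequence` corresponds to one XML sitemap page, and the order of the outer list is the order in which the pages would be presented to the search engine.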
Essentially what you are doing is providing a roadmap to the search engines about the pages you believe need to be crawled based on newness. You want all pages to get into the index, and you assume that the pages spidered last week are already there but that new pages are not. Use our sequencing directives above.
Why bother doing this if Google has explicitly said it ignores priority and sequence?
Given unlimited crawl budget, few website redirects, and no errors, all files get crawled, and priority and sequence do not matter. Google is right if those assumptions hold.
But they don’t hold. There is a crawl budget, and it is eaten away by redirects and 404s. If you have anything other than a page (image, video, hreflang, etc.), the time estimates are wrong. Throw in any significant errors and the remaining files in the XML sitemap are ignored.
While in an ideal world priority and sequence do not matter, in a crawl budget world they do matter. In my experience, the sequence in which the URLs are presented to the search engine is all that matters.
We have found that this increases the number of pages spidered and decreases the “abandoned due to error” issues.
Yes, it is up to the bots to decide how to handle XML sitemaps. However, implementing something that could potentially help search engines crawl and index your website content – if easy enough – is never a bad idea.
FAQ: How does the priority tag affect Google’s crawling behavior?
Priority tags within XML sitemaps are intended to signal the importance of web pages to search engines like Google. As an expert in SEO, I can attest to their usefulness as one part of optimizing website crawling and indexing.
When it comes to Google’s crawling behavior, priority tags act as signals that inform the search engine about the relative importance of different pages on a website. By assigning values from 1.0 (highest) to 0.0 (lowest) to each page, webmasters can guide Google to focus on crawling the most critical content first. However, it’s essential to understand that while priority tags provide guidance, the final decision on crawling and indexing still lies with Google’s complex algorithms.
Webmasters assign higher priority values (usually between 1.0-0.8) to pages they consider important such as category pages, homepages, and landing pages in order to ensure they are crawled often and therefore more visible to users. Lower priority values tend to be given for less significant pages such as blog articles or utility pages.
Despite the option to use priority tags, it’s crucial to recognize that Google may not always adhere strictly to these assigned values. The search engine’s algorithms continuously evaluate various factors, such as the relevance and freshness of content, user engagement, and crawl budget constraints, to determine the optimal crawling strategy.
As an expert in the field, I advise webmasters to complement the use of priority tags with other SEO best practices. One such approach involves optimizing URL sequencing to further guide Google’s crawling behavior. By structuring URLs in a sequence that reflects the importance and freshness of content, webmasters can potentially influence crawl frequency positively.
To leverage priority tags effectively, webmasters should regularly monitor server logs and analyze how Google’s bots interact with the site. This data can provide valuable insights into which pages are crawled more frequently, helping webmasters refine their XML sitemap strategy over time.
While it’s true that priority tags can be a valuable tool for guiding Google’s crawling behavior, they should not be relied upon in isolation. Expert SEO practitioners understand that a holistic approach, combining priority tags, URL sequencing, and other SEO best practices, yields the best results. By staying informed about Google’s evolving algorithms and maintaining a user-focused approach, webmasters can optimize their website’s crawling and indexing, ultimately leading to improved search engine visibility and user experience.
Step-by-Step Procedure: Optimizing Priority Tags for Google’s Crawling
- Identify Critical Pages: Analyze your website’s structure and content to determine the most important pages that require frequent crawling.
- Assign Priority Values: Assign priority values from 1.0 to 0.0 to each page in your XML sitemap, with 1.0 indicating the highest priority.
- Prioritize Top Landing Pages: Ensure that your homepage and other critical landing pages receive high-priority values, reflecting their importance.
- Categorize Content: Group similar pages into categories and assign appropriate priority values based on their significance to your website’s goals.
- Consider Freshness: Factor in the freshness of content when assigning priority values. Newly updated or time-sensitive content may warrant higher priorities.
- Monitor Server Logs: Regularly review server logs to track Googlebot’s crawling activity and identify pages with low crawl rates.
- Analyze User Behavior: Use web analytics to understand user behavior on your site, helping prioritize pages that attract higher user engagement.
- Optimize URL Sequencing: Structure URLs in a sequence that aligns with content importance and freshness, reinforcing priority tag signals.
- Implement Last Revision Dates: Include lastmod tags in your XML sitemap to provide additional information on content freshness.
- Regularly Update XML Sitemap: Keep your XML sitemap up to date with new content and changes to ensure accurate priority signaling.
- Focus on User Experience: Keep your website user-friendly and engaging, as positive user signals can influence crawling behavior indirectly.
- Address Crawl Errors: Identify and fix crawl errors promptly to ensure that valuable pages receive appropriate attention from Googlebot.
- Regularly Review Crawl Budget: Monitor your website’s crawl budget and adjust your priority tag strategy accordingly.
- Avoid Keyword Stuffing: Use priority tags strategically and avoid keyword stuffing, as it can lead to ranking penalties.
- Utilize XML Sitemap Index: If your website has a large number of pages, consider using an XML sitemap index to manage crawling signals efficiently.
- Consult Google’s Webmaster Guidelines: Stay informed about Google’s recommendations and updates related to XML sitemaps and priority tags.
- Test and Iterate: Continuously monitor the effectiveness of your priority tag strategy and make adjustments as needed.
- Foster High-Quality Backlinks: Quality backlinks can enhance your website’s authority and may indirectly impact crawling behavior.
- Leverage Structured Data: Implement structured data to provide additional context to search engines about your content.
- Seek Professional Advice: If in doubt, consult with SEO experts or professionals to fine-tune your priority tag strategy and improve crawling efficiency.
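For the sitemap index step, here is a minimal Python sketch that writes an index file pointing at several child sitemaps. The file names are invented for this example. The sitemaps protocol caps each sitemap file at 50,000 URLs, which is why large sites split their URLs across several files, and listing the child sitemaps in your preferred crawl order keeps the index in line with the URL sequencing idea.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_index(sitemap_urls):
    """Return sitemap index XML listing child sitemaps in the order given."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(root, encoding="unicode")


# Hypothetical child sitemap files, freshest content first.
print(sitemap_index([
    "https://www.example.com/sitemap-changed-today.xml",
    "https://www.example.com/sitemap-changed-this-week.xml",
    "https://www.example.com/sitemap-pages.xml",
]))
```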
By following these steps and adopting a comprehensive approach to SEO, you can optimize priority tags effectively and improve Google’s crawling behavior on your website. Remember to regularly review and adjust your strategy based on performance metrics and industry best practices.