BACK TO BASICS: XML Sitemaps Defined

Part Two: Building Google-Specific XML Sitemaps

By Susan Esparza, May 15, 2009

In our previous article, we focused on the difference between traditional HTML site maps and XML Sitemaps and established that both serve valuable purposes in your search engine marketing strategy. Google has taken XML Sitemaps a step further and developed protocols for specialized XML Sitemaps for news, mobile, code, video and geographic content. These special protocols can seem daunting at first, so we're going to break them down and explain what each one is, whether your site needs it and how to build it if you do.

News Sitemaps

News organizations often publish multiple stories throughout the day. A traditional RSS feed is a good way to get this content out to multiple sources, but if you want to feed content directly into Google News, consider building and maintaining a News Sitemap.

News Sitemaps are very similar to regular XML Sitemaps but they include a second namespace for the news schema. Google provides an example of this:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">

To ensure that your News Sitemap will be accepted, take a few precautions when formatting your URLs and Sitemaps:

  • Verify your site. If you haven't verified your site in Google Webmaster Tools, you can't submit URLs for that site. However, you only need to verify the domain; sub-domains can be submitted together. For example, sports.dallasnews.com, entertainment.dallasnews.com and politics.dallasnews.com can all be submitted in the same News Sitemap.
  • Drop any session IDs. Session IDs in the URL can lead to duplicate content and may result in your pages not being indexed.
  • Keep it current. Any URL older than three days should be taken out of the Sitemap. Format your publication dates to comply with the W3C date format; Google will use the date to determine if the article is recent enough for inclusion. Here's Google's example of an article publication date tag:
    <news:publication_date>2006-08-14T03:30:00Z</news:publication_date>
  • Keep it concise. A News Sitemap can have no more than 1000 URLs. If you have more than 1000 URLs, you'll need to generate multiple Sitemaps.
  • Make it readable. Readable means that both your server and Google will be able to read it. Google cannot read some ASCII characters (such as an unescaped ampersand) or other special characters. Submitting a Sitemap that contains them will result in an error message.
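
The checklist above lends itself to a small pre-processing step before the Sitemap is written. The following is a minimal sketch in Python, not Google's own tooling: the session parameter names in SESSION_PARAMS and the function names are our own assumptions, and the three-day window and 1000-URL limit are simply the figures quoted above.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names; adjust to whatever your platform uses.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}
MAX_URLS_PER_SITEMAP = 1000    # limit quoted in the checklist
MAX_AGE = timedelta(days=3)    # freshness window quoted in the checklist


def strip_session_id(url):
    """Return url with session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in SESSION_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def w3c_date(dt):
    """Format a timezone-aware datetime as a W3C date in UTC."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def prepare_news_entries(articles, now):
    """articles: iterable of (url, published) pairs with aware datetimes.

    Returns a list of batches, each holding at most MAX_URLS_PER_SITEMAP
    (clean_url, w3c_date_string) pairs for articles no older than MAX_AGE.
    """
    fresh = [(strip_session_id(url), w3c_date(published))
             for url, published in articles
             if now - published <= MAX_AGE]
    return [fresh[i:i + MAX_URLS_PER_SITEMAP]
            for i in range(0, len(fresh), MAX_URLS_PER_SITEMAP)]
```

Each batch returned by prepare_news_entries would become its own News Sitemap file. Note that this sketch only removes session IDs passed as query parameters; session IDs embedded in the path would need separate handling.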

Each entry in the Sitemap can also include a comma-separated list of keywords describing the content of the article. Consider making the keywords match the current Google News categories to give Google a better idea of where the article belongs. The code example below (courtesy Google) demonstrates what an XML Sitemap with just one news article listing would look like:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
   <url>
      <loc>http://example.com/article123.html</loc>
      <news:news>
         <news:publication_date>2006-08-14T03:30:00Z</news:publication_date>
         <news:keywords>Business, Mergers, Acquisitions</news:keywords>
      </news:news>
   </url>
</urlset>
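
If your articles live in a database or CMS, you will probably generate this file rather than write it by hand. Here is one possible sketch using Python's standard xml.etree.ElementTree module. The function name build_news_sitemap and the shape of its input are our own assumptions; the two namespace URIs are the ones from the <urlset> declaration shown earlier in this section.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"

# Serialize with the sitemap schema as the default namespace and "news:" as a prefix.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("news", NEWS_NS)


def build_news_sitemap(entries):
    """entries: iterable of (url, publication_date, keywords) tuples.

    publication_date is an already formatted W3C date string and
    keywords is a list of strings. Returns the Sitemap body as a string
    (without the XML declaration).
    """
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for url, published, keywords in entries:
        url_el = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(url_el, "{%s}loc" % SITEMAP_NS).text = url
        news_el = ET.SubElement(url_el, "{%s}news" % NEWS_NS)
        ET.SubElement(news_el, "{%s}publication_date" % NEWS_NS).text = published
        if keywords:
            # Keywords are emitted as a comma-separated list.
            ET.SubElement(news_el, "{%s}keywords" % NEWS_NS).text = ", ".join(keywords)
    return ET.tostring(urlset, encoding="unicode")
```

Building the tree with a library rather than string concatenation means special characters in URLs, such as ampersands, are escaped for you when the file is serialized.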

Mobile Sitemaps

Every year for the last several years has been declared the year of the mobile Web. And while it seems that the true mobile revolution is still in the future, savvy Web marketers will be getting ready for it now. Don't leave your company playing catch-up when the revolution comes.

Mobile Sitemaps are very similar to regular XML Sitemaps but they use a specific tag and namespace requirement to identify mobile content. Feeding a Mobile Sitemap to Google will enable your mobile site to be found and indexed in Google's Mobile Search Index. While mobile browsers are becoming more sophisticated and require less tailoring of their formats, it's still a good idea to remember that your customers will be browsing on a very small screen, usually while on the go.

Build your Mobile Sitemap as you would your regular XML Sitemap and include only mobile pages in it. Never mix types of content in your specialty Sitemaps; entries of the wrong type will just be ignored. Right now the protocol recognizes four types of content: non-mobile (which includes most types of files), XHTML Mobile Profile, WML and cHTML, so program carefully.

This sample piece of code from Google shows what a Mobile Sitemap file would look like with just one entry included.

<?xml version="1.0" encoding="UTF-8" ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0">
    <url>
        <loc>http://mobile.example.com/article100.html</loc>
        <mobile:mobile/>
    </url>
</urlset>
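
Because entries without the mobile tag do not belong in a Mobile Sitemap, it can be worth checking a file before you submit it. The sketch below is one way to do that with Python's standard library. The function name urls_missing_mobile_tag is hypothetical, and the check only looks for the presence of the <mobile:mobile/> tag; it does not inspect the pages themselves.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used in the Mobile Sitemap sample above.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "mobile": "http://www.google.com/schemas/sitemap-mobile/1.0",
}


def urls_missing_mobile_tag(sitemap_xml):
    """Return the <loc> values of entries that lack a <mobile:mobile/> tag."""
    root = ET.fromstring(sitemap_xml)
    missing = []
    for url_el in root.findall("sm:url", NS):
        if url_el.find("mobile:mobile", NS) is None:
            loc = url_el.findtext("sm:loc", default="", namespaces=NS)
            missing.append(loc.strip())
    return missing
```

An empty list means every entry carries the mobile tag; anything returned is a URL to tag or move to your regular XML Sitemap.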

Code Search Sitemaps

Code Search Sitemaps are a relatively new feature from Google. They allow you to include code on your Web site in Google Code Search. This type of Sitemap protocol is targeted towards Web sites that host public source code, which means that searchers can find function definitions or sample code from sites that host publicly accessible code. This is a useful tool for any IT department, but only use one of these for your site if you want your code to be accessible through Google Code Search.

A Code Search Sitemap is like a regular Sitemap, but it has some Code Search-specific information. You can submit it like a regular XML Sitemap, but it's not recommended that you use the Sitemap Generator to create one. Google has outlined the accepted archive formats as the following:

Suffix      Archive Type
.tar        tarfile (Tape Archive)
.tar.z      tarfile compressed with "compress"
.tar.gz     tarfile compressed with gzip
.tgz        tarfile compressed with gzip
.tar.bz2    tarfile compressed with bzip2
.tbz        tarfile compressed with bzip2
.tbz2       tarfile compressed with bzip2
.zip        Zip archive

You should use the suffix value to create your Code Search Sitemap.
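
If you are listing many downloads, a lookup keyed on the suffix table above can tell you which files count as archives. This is only an illustrative sketch: the dictionary mirrors the table, while the function name archive_type and the case-insensitive matching are our own assumptions.

```python
# Suffixes and descriptions copied from the table above.
ARCHIVE_SUFFIXES = {
    ".tar": "tarfile (Tape Archive)",
    ".tar.z": 'tarfile compressed with "compress"',
    ".tar.gz": "tarfile compressed with gzip",
    ".tgz": "tarfile compressed with gzip",
    ".tar.bz2": "tarfile compressed with bzip2",
    ".tbz": "tarfile compressed with bzip2",
    ".tbz2": "tarfile compressed with bzip2",
    ".zip": "Zip archive",
}


def archive_type(filename):
    """Return the archive description for filename, or None if its
    suffix is not in the table (for example, a single source file)."""
    name = filename.lower()  # assumption: treat suffixes case-insensitively
    for suffix, description in ARCHIVE_SUFFIXES.items():
        if name.endswith(suffix):
            return description
    return None
```

A file for which archive_type returns None would be listed as an individual code file rather than as an archive.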

A Code Search Sitemap uses the XML Sitemap protocol with additional Code Search-specific tags. An entry using those tags looks something like this:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:codesearch="http://www.google.com/codesearch/schemas/sitemap/1.0">
<url>
   <loc>http://mysite.org/download/myfile.c</loc>
   <codesearch:codesearch>
       <codesearch:filetype>C</codesearch:filetype>
       <codesearch:license>LGPL</codesearch:license>
   </codesearch:codesearch>
</url>
<url>
   <loc>http://mysite.org/download/myproject.tgz</loc>
   <codesearch:codesearch>
       <codesearch:filetype>archive</codesearch:filetype>
       <codesearch:license>Apache</codesearch:license>
       <codesearch:packagemap>packagemap.xml</codesearch:packagemap>
   </codesearch:codesearch>
</url>
</urlset>

Each URL in a Code Search Sitemap can point to an archive file or a code file.

Video Sitemaps

The integration of video content and other Engagement Objects™ into your site's content is a necessary and important factor to search engine marketing success. The indexing and ranking of video files provides a strong opportunity for gaining greater traffic and mindshare in the search results.

Video Sitemaps can be built using the Media RSS protocol (mRSS), which allows a great deal of information about the videos to be passed to Google and helps Google identify the content of the videos and their relevancy to various search queries.

You can also build your Video Sitemap using the XML Sitemaps protocol, as long as it includes the namespace declaration that identifies it as video content:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">

Other video specific tags provide further indexing information. Some key information provided by a Video Sitemap includes:

  • The location of the video (where it can be watched by a user)
  • The title and description of each video
  • How long the video is
  • The location of a thumbnail image for the video that can be shown in SERPs
  • The location of an embeddable version of the video

Here is a sample of a Video Sitemap entry using video-specific tags:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
<url>
  <loc>http://www.example.com/videos/some_video_landing_page.html</loc>
    <video:video>
      <video:content_loc>http://www.site.com/video123.flv</video:content_loc>
      <video:player_loc allow_embed="yes">http://www.site.com/videoplayer.swf?video=123</video:player_loc>
      <video:thumbnail_loc>http://www.example.com/thumbs/123.jpg</video:thumbnail_loc>
      <video:title>Grilling steaks for summer</video:title>
      <video:description>Get perfectly done steaks every time</video:description>
      <video:rating>4.2</video:rating>
      <video:view_count>12345</video:view_count>
      <video:publication_date>2007-11-05T19:20:30+08:00</video:publication_date>
      <video:expiration_date>2009-11-05T19:20:30+08:00</video:expiration_date>
      <video:tag>steak</video:tag>
      <video:tag>meat</video:tag>
      <video:tag>summer</video:tag>
      <video:category>Grilling</video:category>
      <video:family_friendly>yes</video:family_friendly>
      <video:duration>600</video:duration>
    </video:video>
</url>
</urlset>
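
With this many tags per entry, it is easy to leave one out. The following sketch reads a Video Sitemap and reports which of the key pieces of information listed earlier are absent from each entry. The function names and the selection in KEY_FIELDS are our own; this is a convenience check under those assumptions, not a statement of which tags Google requires.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used in the Video Sitemap sample above.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "video": "http://www.google.com/schemas/sitemap-video/1.1",
}

# Our own selection of tags to report on, based on the list of key information.
KEY_FIELDS = ["title", "description", "duration", "thumbnail_loc", "player_loc"]


def _text(parent, path):
    """Return the stripped text of the first match, or None if absent or empty."""
    value = parent.findtext(path, namespaces=NS)
    return value.strip() if value and value.strip() else None


def summarize_videos(sitemap_xml):
    """Return one dict per <video:video> entry with its landing page,
    title, duration in seconds and a list of missing key fields."""
    root = ET.fromstring(sitemap_xml)
    report = []
    for url_el in root.findall("sm:url", NS):
        for video in url_el.findall("video:video", NS):
            fields = {name: _text(video, "video:" + name) for name in KEY_FIELDS}
            duration = fields["duration"]
            report.append({
                "landing_page": _text(url_el, "sm:loc"),
                "title": fields["title"],
                "duration_seconds": int(duration) if duration else None,
                "missing": [name for name in KEY_FIELDS if fields[name] is None],
            })
    return report
```

Running summarize_videos over your file before submitting it gives you a quick list of entries that still need a thumbnail, description or player location.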

Geo Sitemaps

Google Geo Sitemaps enable companies to publish explicitly geographic content directly to Google Maps and Google Earth. Geolocated content is a major advantage for any location-based company. With the recent trend in Google algorithms to favor personalization over globalization, local results are becoming more and more prominent in the search results.

Any company with multiple instances of location-based content would be wise to include a Geo Sitemap in its feeds to Google. A pizza franchise could include all of its restaurant locations, while a travel company might want to mark up popular travel destinations and packages.

As with all other XML Sitemaps, Google does not guarantee that it will add the information, but it welcomes the additional information as a way to keep its index up to date.

If you're looking to build a Geo XML Sitemap, the first thing to do is figure out which format you're going to use. Google currently supports the KML and GeoRSS formats, so focus on those two for now. Google does plan to add support for more formats in the future.

Building an XML Sitemap for geospatial content is very similar to building a traditional XML Sitemap. You'll need to be sure to add attribution tags, as those will show in the Google results for your content. Optimize those tags and then post your content to the Web. Create a Sitemap file as you would normally do and post that file in the root of your Web site.


For permission to reprint or reuse any materials, please contact us. To learn more about our authors, please visit the Bruce Clay Authors page. Copyright 2009 Bruce Clay, Inc.