Focusing SEO Efforts with Server Log Data
by Virginia Nussey, November 16, 2009
The ability to track a consumer’s interactions with a brand or business from initial exposure to completed conversion is one of the most exciting and useful developments to happen to commerce since the advent of the Internet itself. Finally, business owners are able to see, in records generated straight from the source as they happen, what’s working or not working with customers visiting a Web site.
One of the most basic sources of this data is a Web server log. A server is a computer or computer program that delivers Web content. A server log is a file, automatically produced and maintained by the server, that records all the activity the server has performed. A server usually generates more than one log: an access log records the requests the server has handled, while an error log records what happened when things went wrong.
Analyzing the data stored within a Web server log provides a business owner or Web site optimizer with an educational resource about the behavior of visitors to the site. Knowing where visitors to the site are coming from and what they are interested in once they’re there is an invaluable step in the optimization of any site.
By knowing the most popular pages, products or resources on the site, an SEO can devote time to bettering these profitable areas. By identifying stumbling blocks or less popular areas of the site, an SEO can budget time for fixing these resource drains. Understanding current interactions on the site makes it possible to improve future interactions and experiences.
What the Server Log Reveals
Through analysis of a server log, a site owner or optimizer can learn:
- When a request to the server was made.
- The IP address of the system that requested the information.
- The Web address that referred the system making the request.
- The domain from which the request was made.
- The type of browser making the request.
- The amount of data transferred in response to the request.
- Any errors that occurred in response to the request.
With this information, a site can be improved and tuned to best serve visitors and to drive more traffic. This insight into consumers’ actions can be capitalized upon as the lessons learned are applied to the site.
How to Obtain a Server Log
Many hosting companies allow access to the raw logs through a control panel or FTP, but some may not allow any access at all. For those instances, it may be beneficial to use popular log analyzers such as AWStats or Webalizer, which can generate reports on relevant data points and allow for quick and easy access. If the host doesn’t allow access to the raw logs and doesn’t allow stats through a free log analyzer, we recommend finding a new host.
Further information about locating and configuring Apache server log files can be obtained from the Apache Software Foundation. For an IIS server, Microsoft has outlined a series of steps about how to find and view log files.
Reading a Server Log
The server log file looks complicated at first glance. It appears as a jumble of numbers and letters, with a typical recorded action looking something like this:
216.147.35.400 - - [16/Nov/2009:12:06:03 -0800] "GET /examples/ HTTP/1.1" 200 2462 "http://example.net/index.html" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" "example.net"
Log analysis software can help users parse and distinguish the unique pieces of information contained within server log files. However, this raw data can also be read without such software by understanding what each piece of information means:
Example | Explanation |
216.147.35.400 | The IP address of the system that requested the information. |
[16/Nov/2009:12:06:03 -0800] | Time stamp of when the request to the server was made, in the server’s local time. “-0800” is the offset from Coordinated Universal Time (UTC): the server’s clock is 8 hours behind UTC. |
"GET /examples/ HTTP/1.1" | A request to the server to deliver something. |
200 | The status code of the server’s response to the request. “200” (OK) and “304” (Not Modified) are common response codes that indicate normal delivery. |
2462 | The number of bytes of data transferred in response to the request. |
"http://example.net/index.html" | The referring Web address or URL. |
"Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" | The type of browser and the operating system from which the request was made. |
"example.net" | The domain used to access the resource and display the data delivered. |
With the information provided in the server log at hand, an SEO can focus optimization efforts. An optimizer can devote time to the pages that receive the most traffic, to the sites referring the most visitors and to the common conversion paths on the site.
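To make the table above concrete, here is a minimal Python sketch that splits one log line into the fields it describes. It assumes the log follows the layout of the example line (Apache's combined format followed by a quoted domain) and that the time stamp is closed with a bracket. The names `LOG_PATTERN` and `parse_line` are illustrative and do not come from any log analyzer.

```python
import re
from datetime import datetime

# Assumed layout: combined log format, optionally followed by the quoted domain.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
    r'(?: "(?P<domain>[^"]*)")?'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it cannot be read."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    try:
        # "16/Nov/2009:12:06:03 -0800" becomes a timezone-aware datetime.
        entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return None
    entry["status"] = int(entry["status"])
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    # The request field holds the method, the path and the protocol.
    parts = entry["request"].split()
    entry["path"] = parts[1] if len(parts) == 3 else None
    return entry


SAMPLE = (
    '216.147.35.400 - - [16/Nov/2009:12:06:03 -0800] "GET /examples/ HTTP/1.1" '
    '200 2462 "http://example.net/index.html" '
    '"Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" "example.net"'
)

entry = parse_line(SAMPLE)
print(entry["ip"], entry["path"], entry["status"], entry["bytes"])
```

Lines that do not fit the assumed layout come back as `None`, so a differently configured server would need its own pattern.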
Identify traffic patterns: From a server log, an SEO can discover the pages that get the most and the least traffic, the sites that are referring visitors, the most popular browsers and operating systems among visitors, and when search engine spiders visit the site. Spider visits can be told apart from human visits through the IP address and browser fields of the log record. Take advantage of this information by focusing efforts and capitalizing upon opportunities.
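As a rough illustration of finding the most and least visited pages, the Python sketch below tallies successfully delivered requests by path. The sample lines, the `REQUEST_PATTERN` expression and the `count_page_hits` function are assumptions made for the example; a real log would be read from a file, line by line.

```python
import re
from collections import Counter

# Picks the requested path and the status code out of a log line.
REQUEST_PATTERN = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def count_page_hits(lines):
    """Count requests per path, keeping only those answered with status 200."""
    hits = Counter()
    for line in lines:
        match = REQUEST_PATTERN.search(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits


# Hypothetical log lines, for illustration only.
SAMPLE_LINES = [
    '192.0.2.1 - - [16/Nov/2009:12:06:03 -0800] "GET /examples/ HTTP/1.1" 200 2462 "-" "Mozilla/4.0"',
    '192.0.2.2 - - [16/Nov/2009:12:07:10 -0800] "GET /examples/ HTTP/1.1" 200 2462 "-" "Mozilla/4.0"',
    '192.0.2.3 - - [16/Nov/2009:12:09:44 -0800] "GET /about/ HTTP/1.1" 200 1190 "-" "Mozilla/4.0"',
    '192.0.2.3 - - [16/Nov/2009:12:09:50 -0800] "GET /missing/ HTTP/1.1" 404 310 "-" "Mozilla/4.0"',
]

hits = count_page_hits(SAMPLE_LINES)
ranked = hits.most_common()  # most visited first, least visited last
print("Most visited:", ranked[0])
print("Least visited:", ranked[-1])
```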
Analyze each piece of data reported in a server log, paying special attention to the potential for optimization. For instance, identify what time of day the majority of visitors are coming to your site. If you find that the bulk of visits occur at one point in the day, keep an eye on that time of day to make sure that all requests to the server are being satisfied. Analysis of the time data may lead to the discovery that the server is being overloaded with requests at some point in the day. With this knowledge, an SEO or webmaster can optimize server capacity by balancing the load.
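One possible way to spot the busiest time of day is to count requests by the hour recorded in each time stamp, as in the Python sketch below. The hours are taken as written in the log, which is the server's local time. The sample lines and the names `TIME_PATTERN`, `requests_per_hour` and `busiest_hour` are assumptions for the example.

```python
import re
from collections import Counter

# Captures the hour from a time stamp such as [16/Nov/2009:12:06:03 -0800].
TIME_PATTERN = re.compile(
    r'\[\d{2}/\w{3}/\d{4}:(?P<hour>\d{2}):\d{2}:\d{2} [+-]\d{4}\]'
)


def requests_per_hour(lines):
    """Count requests for each hour of the day (0-23) found in the log lines."""
    per_hour = Counter()
    for line in lines:
        match = TIME_PATTERN.search(line)
        if match:
            per_hour[int(match.group("hour"))] += 1
    return per_hour


def busiest_hour(per_hour):
    """Return the hour with the most requests, or None for an empty count."""
    return max(per_hour, key=per_hour.get) if per_hour else None


# Hypothetical log lines, for illustration only.
SAMPLE_LINES = [
    '192.0.2.1 - - [16/Nov/2009:12:06:03 -0800] "GET / HTTP/1.1" 200 2462 "-" "Mozilla/4.0"',
    '192.0.2.2 - - [16/Nov/2009:12:17:40 -0800] "GET / HTTP/1.1" 200 2462 "-" "Mozilla/4.0"',
    '192.0.2.3 - - [16/Nov/2009:12:48:11 -0800] "GET / HTTP/1.1" 200 2462 "-" "Mozilla/4.0"',
    '192.0.2.4 - - [16/Nov/2009:03:02:59 -0800] "GET / HTTP/1.1" 200 2462 "-" "Mozilla/4.0"',
]

per_hour = requests_per_hour(SAMPLE_LINES)
for hour in sorted(per_hour):
    print(f"{hour:02d}:00  {per_hour[hour]} requests")
print("Busiest hour:", busiest_hour(per_hour))
```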
Establish connections: Another opportunity presented by the data in server logs is the potential to build relationships on the Web. If a webmaster recognizes that a single site continually appears in the referring sites portion of server log data, he or she may want to contact that site and begin building a relationship. The data suggests that the referring site’s visitors are finding value in the content available on your site.
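A short Python sketch of how recurring referring sites might be surfaced: it reads the referring address from each line, reduces it to a host name and counts the hosts, skipping the site's own domain. The sample lines, the `REFERRER_PATTERN` expression and the `top_referring_sites` function are assumptions for the example.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# The referring address is the first quoted field after the status and byte count.
REFERRER_PATTERN = re.compile(r'" \d{3} (?:\d+|-) "(?P<referrer>[^"]*)"')


def top_referring_sites(lines, own_domain):
    """Count referring host names, ignoring the site's own domain and subdomains."""
    sites = Counter()
    for line in lines:
        match = REFERRER_PATTERN.search(line)
        if not match:
            continue
        # A missing referrer is logged as "-", which has no host name.
        host = urlsplit(match.group("referrer")).hostname
        if not host:
            continue
        if host == own_domain or host.endswith("." + own_domain):
            continue
        sites[host] += 1
    return sites


# Hypothetical log lines, for illustration only.
SAMPLE_LINES = [
    '192.0.2.1 - - [16/Nov/2009:12:06:03 -0800] "GET /examples/ HTTP/1.1" 200 2462 "http://blog.example.org/post" "Mozilla/4.0"',
    '192.0.2.2 - - [16/Nov/2009:12:09:15 -0800] "GET /examples/ HTTP/1.1" 200 2462 "http://blog.example.org/other" "Mozilla/4.0"',
    '192.0.2.3 - - [16/Nov/2009:12:11:42 -0800] "GET /about/ HTTP/1.1" 200 1190 "http://www.example.net/index.html" "Mozilla/4.0"',
    '192.0.2.4 - - [16/Nov/2009:12:15:00 -0800] "GET / HTTP/1.1" 200 3100 "-" "Mozilla/4.0"',
]

for host, count in top_referring_sites(SAMPLE_LINES, "example.net").most_common(5):
    print(host, count)
```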
One possible way to grow this relationship is to volunteer to write articles for that site, including a link back to your site. A relationship of mutual respect between the sites can join the two communities, driving additional traffic to your site and providing your site visitors with a worthwhile resource.
Control robot behavior: Along with optimizing the human visitor experience of your site, you’ll also want to optimize non-human interactions with your site. By tracking the spiders visiting your site, you’ll be able to block spiders you don’t like and clear the way for spiders you like. Some bots, like those from the search engines, are welcome guests on the site; however, some bots, such as email harvesters and other content scrapers, do more harm than good. Blocking bad bots via the .htaccess file will reserve your site’s bandwidth and resources for more desirable traffic.
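Before blocking anything, it helps to see which visitors look like bots. The Python sketch below groups IP addresses by what the browser field claims to be. The token lists are illustrative assumptions rather than a complete or authoritative catalog, and the browser field is self-reported, so a badly behaved bot can disguise itself. The sample lines and the names `label_agent` and `ips_by_label` are also hypothetical.

```python
import re
from collections import defaultdict

QUOTED_FIELD = re.compile(r'"([^"]*)"')

# Illustrative tokens only; extend these lists from what your own logs show.
SEARCH_ENGINE_TOKENS = ("googlebot", "bingbot", "slurp")
UNWANTED_TOKENS = ("emailcollector", "emailsiphon")


def label_agent(agent):
    """Label a browser field as 'unwanted', 'search engine' or 'other'."""
    lowered = agent.lower()
    if any(token in lowered for token in UNWANTED_TOKENS):
        return "unwanted"
    if any(token in lowered for token in SEARCH_ENGINE_TOKENS):
        return "search engine"
    return "other"


def ips_by_label(lines):
    """Group the requesting IP addresses by the label of their browser field."""
    groups = defaultdict(set)
    for line in lines:
        # Quoted fields appear in order: request, referring address, browser.
        fields = QUOTED_FIELD.findall(line)
        if len(fields) < 3:
            continue
        ip = line.split(" ", 1)[0]
        groups[label_agent(fields[2])].add(ip)
    return groups


# Hypothetical log lines, for illustration only.
SAMPLE_LINES = [
    '192.0.2.10 - - [16/Nov/2009:03:15:00 -0800] "GET /robots.txt HTTP/1.1" 200 120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [16/Nov/2009:03:20:41 -0800] "GET /contact/ HTTP/1.1" 200 1800 "-" "EmailCollector/1.0"',
    '198.51.100.7 - - [16/Nov/2009:12:06:03 -0800] "GET /examples/ HTTP/1.1" 200 2462 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"',
]

groups = ips_by_label(SAMPLE_LINES)
print("Candidates to block:", sorted(groups["unwanted"]))
```

The resulting list is only a starting point for review; each address should be checked before it is added to a block list.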
Additionally, it may be beneficial to pay attention to the browsers accessing your site. Log file analysis will indicate the popular browsers, as well as the rising popularity of new browsers as they are launched and adopted by users. When new, popular browsers are discovered, an SEO can devote time to making sure the site displays as desired within those browsers.
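A simple way to gauge browser popularity is to tally the browser field of each request, as in the Python sketch below. The matching is a deliberately rough heuristic based on a few well-known tokens, checked in a fixed order because some browsers mention others in their identification string. The sample lines and the names `browser_family` and `browser_share` are assumptions for the example.

```python
import re
from collections import Counter

QUOTED_FIELD = re.compile(r'"([^"]*)"')

# Order matters: Chrome's string also contains "Safari", and older Opera
# strings can contain "MSIE", so the more specific tokens are checked first.
BROWSER_TOKENS = (
    ("Opera", "Opera"),
    ("Chrome", "Chrome"),
    ("Firefox", "Firefox"),
    ("Safari", "Safari"),
    ("MSIE", "Internet Explorer"),
)


def browser_family(agent):
    """Map a browser field to a coarse browser name, or 'Other' if unknown."""
    for token, name in BROWSER_TOKENS:
        if token in agent:
            return name
    return "Other"


def browser_share(lines):
    """Count requests per browser family across the log lines."""
    share = Counter()
    for line in lines:
        # Quoted fields appear in order: request, referring address, browser.
        fields = QUOTED_FIELD.findall(line)
        if len(fields) >= 3:
            share[browser_family(fields[2])] += 1
    return share


# Hypothetical log lines, for illustration only.
SAMPLE_LINES = [
    '192.0.2.1 - - [16/Nov/2009:12:06:03 -0800] "GET / HTTP/1.1" 200 2462 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"',
    '192.0.2.2 - - [16/Nov/2009:12:07:19 -0800] "GET / HTTP/1.1" 200 2462 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"',
    '192.0.2.3 - - [16/Nov/2009:12:08:30 -0800] "GET / HTTP/1.1" 200 2462 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5"',
    '192.0.2.4 - - [16/Nov/2009:12:09:02 -0800] "GET / HTTP/1.1" 200 2462 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.0 (KHTML, like Gecko) Chrome/3.0.195.33 Safari/532.0"',
]

for name, count in browser_share(SAMPLE_LINES).most_common():
    print(name, count)
```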
Develop site content: Along with optimizing visitor experience, bot herding and relationship building, the site content itself can be optimized thanks to the data provided in server log files. The log file indicates what content from the site is being requested. Once the most popular content is identified, contextual text can be optimized and the popular content can be featured in prominent locations on the site and outside the site.
For example, an image may be a big draw on the site. Knowing this, an SEO can optimize the metadata for that image, create additional opportunities on the site to showcase it, and submit links to it within online social communities that are interested in the subject. Additional content along the same theme can be created to build upon the topic as well.
None of the above actions will be taken if the webmaster does not monitor server logs for potential issues and areas of improvement. Server logs are a reliable record of the actions taken on the site and the content viewed by visitors. The data held within server logs points to areas that need optimization as well as sections of the site that are performing well. With this information, an SEO can focus time and energy on making beneficial improvements to the site.
For permission to reprint or reuse any materials, please contact us. To learn more about our authors, please visit the Bruce Clay Authors page. Copyright 2009 Bruce Clay, Inc.