Web Server Log Files
A web server log file is a text file that the web server writes to as it handles requests. Log files record a variety of data about each request made to your web server. Examples of the data collected and stored include the date, time, client IP address, referrer, user agent, service name, server name, and server IP.
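As a rough illustration of how such a file can be read, the sketch below parses a few lines in the W3C Extended Log File Format, where a "#Fields:" directive names the columns that follow. The sample entries, the field selection, and the function name parse_w3c_log are assumptions made for this example; a real server may log different fields, and this sketch assumes values contain no spaces.

```python
from typing import Dict, Iterable, Iterator


def parse_w3c_log(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per request from W3C Extended Log Format lines."""
    fields = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive line; "#Fields:" names the columns of later entries.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        # Skip entries that do not match the declared columns.
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))


# Hypothetical sample log (documentation-only IP addresses and domains).
SAMPLE_LOG = """\
#Version: 1.0
#Fields: date time c-ip cs-method cs-uri-stem sc-status cs(User-Agent) cs(Referer)
2024-01-15 10:00:01 203.0.113.7 GET /index.html 200 Mozilla/5.0+(Windows+NT+10.0) https://www.example.com/
2024-01-15 10:00:05 203.0.113.7 GET /products.html 200 Mozilla/5.0+(Windows+NT+10.0) -
"""

records = list(parse_w3c_log(SAMPLE_LOG.splitlines()))
for record in records:
    print(record["date"], record["time"], record["c-ip"], record["cs-uri-stem"])
```

Each record is a plain dictionary keyed by field name, so later analysis can pick out only the columns it needs.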
Your server logs act as a visitor sign-in sheet. They can answer questions such as: Who visits your website? What browsers do they use? Where do they go in your site? What pages do they view? Your server log files can tell you:
- What pages get the most and the least traffic
- What sites refer visitors to your site
- The pages that your visitors view
- The browsers and operating systems used to access your site
- When search robots and directory editors visit your site
The above data can help you identify specific problems on your website. If you have many visitors but few sales, check your server logs to learn how many visitors view your product offerings. Do you need to measure the return on investment (ROI) of your search marketing campaigns? Your server logs can reveal the traffic and conversions generated by those campaigns.
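The questions about page traffic and referring sites come down to counting. A minimal sketch, assuming the log has already been parsed into (page, referrer) pairs; the data below is made up for illustration, and "-" is assumed to mean that no referrer was recorded.

```python
from collections import Counter

# Hypothetical, already-parsed requests: (requested page, referrer).
requests = [
    ("/index.html", "https://search.example/"),
    ("/products.html", "-"),
    ("/index.html", "https://partner.example/links"),
    ("/index.html", "https://search.example/"),
    ("/contact.html", "-"),
    ("/products.html", "https://search.example/"),
]

# Hits per page, and hits per referring site (ignoring missing referrers).
page_hits = Counter(page for page, _ in requests)
referrer_hits = Counter(ref for _, ref in requests if ref != "-")

ranked_pages = page_hits.most_common()  # sorted from most to least hits
most_viewed = ranked_pages[0]
least_viewed = ranked_pages[-1]

print("Most viewed: ", most_viewed)
print("Least viewed:", least_viewed)
print("Top referrer:", referrer_hits.most_common(1)[0])
```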
Advantages of Web Server Log Files
Data Ownership: Your web server creates the log file as requests are served, and the data is collected and stored on your own equipment. Regardless of server location, your log files can be stored on the same network serving your web pages (unless your website is on a shared hosting environment).
Data Collection Flexibility: Your web servers can be instructed to collect specific data while ignoring other data. This gives you the ability to choose the information, file types, server errors, redirects, etc., that you want to analyze.
Easy Implementation: No page tagging or other page coding is needed.
Database Integration: Some web servers permit direct requests to a database application. If you have advanced SQL developers, you can answer many of your web analytics questions without using an expensive analytics application.
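To give a sense of the kind of question SQL can answer, the sketch below loads a few hypothetical log rows into an in-memory SQLite database and queries them. SQLite, the table name requests, and its columns are stand-ins chosen for this example; the database and schema your web server actually uses will differ.

```python
import sqlite3

# In-memory database standing in for wherever the server stores its log data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests (day TEXT, client_ip TEXT, page TEXT, status INTEGER)"
)

# Hypothetical log rows.
rows = [
    ("2024-01-15", "203.0.113.7", "/index.html", 200),
    ("2024-01-15", "203.0.113.7", "/products.html", 200),
    ("2024-01-15", "198.51.100.4", "/index.html", 200),
    ("2024-01-16", "198.51.100.4", "/missing.html", 404),
    ("2024-01-16", "192.0.2.9", "/index.html", 200),
]
conn.executemany("INSERT INTO requests VALUES (?, ?, ?, ?)", rows)

# Which pages were served successfully most often?
top_pages = conn.execute(
    "SELECT page, COUNT(*) AS hits FROM requests "
    "WHERE status = 200 GROUP BY page ORDER BY hits DESC, page"
).fetchall()

# Which requests produced errors?
error_pages = conn.execute(
    "SELECT page, COUNT(*) AS hits FROM requests "
    "WHERE status >= 400 GROUP BY page ORDER BY hits DESC, page"
).fetchall()

print(top_pages)
print(error_pages)
conn.close()
```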
Ability to Measure Robot Traffic: It is important to exclude robot and spider traffic from your reports on website use. However, it can be useful to have the robot activity data when analyzing the effect of SEO efforts as this indicates indexing frequency.
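One common way to separate robot traffic is to inspect the user agent string. The sketch below is a simple heuristic, not a complete detector: the marker list ROBOT_MARKERS and the function is_robot are assumptions for this example, and robots that do not identify themselves will slip through.

```python
# Illustrative substrings that often appear in robot user agents.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_robot(user_agent: str) -> bool:
    """Return True if the user agent contains a known robot marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]

# Keep the two groups separate: one for usage reports, one for SEO analysis.
human_traffic = [ua for ua in user_agents if not is_robot(ua)]
robot_traffic = [ua for ua in user_agents if is_robot(ua)]

print(len(human_traffic), "human requests,", len(robot_traffic), "robot requests")
```

Keeping the robot requests in their own list, rather than discarding them, preserves the data needed to see how often search robots return.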
Disadvantages of Web Server Log Files
Proxy Caching: Proxy servers speed the delivery of web pages to users, but they have a negative effect on web server log files because a request answered from the proxy's cache never reaches the web server. With caching, the requested information is kept on the Internet service provider's machines, which are closer to the user than the content's original site, to speed delivery.
Browser Caching: Browser caching refers to the ability of your web browser to store frequently or recently viewed information on your computer's hard drive for speedy retrieval. This is why you get immediate delivery with your "back" and "forward" buttons. Because the information is served from the user's hard drive, no request reaches the web server, and the page view is missing from the web server log files. Since users frequently use their back/forward buttons, this part of visitor navigation is not available in the web server log file.
IP Address as Unique Identifier: Since the IP address is always available to the web server log file, one would think it would be a good way to determine visitor uniqueness. This is not so, because proxy servers are frequently used to pass requests for information to web servers. The result is that many different users are identified by the same IP address.
Together, this issue and the proxy caching and browser caching issues are the most serious disadvantages of using web server log files as a data source. Estimates of the information lost have been pegged at 40 percent or higher. This would translate to a 40 percent undercount of your website's traffic if your web analytics provider relies on log file analysis.
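The IP address problem described above can be shown with a small, made-up example. The "visitor" column below is ground truth that a real log would not contain; it is included only to show how counting distinct IP addresses undercounts people who share a proxy.

```python
# Hypothetical requests: (client IP seen by the server, actual visitor).
requests = [
    ("203.0.113.10", "alice"),   # alice, bob and carol share one proxy
    ("203.0.113.10", "bob"),
    ("203.0.113.10", "carol"),
    ("198.51.100.22", "dave"),   # dave connects directly
    ("203.0.113.10", "alice"),
]

# What the log can tell us versus what actually happened.
unique_ips = {ip for ip, _ in requests}
actual_visitors = {visitor for _, visitor in requests}

print("Distinct IP addresses:", len(unique_ips))
print("Actual visitors:      ", len(actual_visitors))
```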
Upfront Costs: When using a web server log file analyzer, you must purchase all the software, hardware, and expertise in advance. This differs from the application service provider (ASP) model used for client-side data collection, where you pay a monthly fee.