Error Log Analysis

If you have suggestions for improvement, please let me know.

Introduction

This report will help you find problems people are having accessing your site. The errors are summarized because, when going through the logs by hand, you come across duplicate errors and have to remember whether you have already dealt with them. In the analysis, they are already sorted by type and filename.

NOTE: If you fixed an error after receiving your last error log analysis, it may still show up in the next report. The reported errors may have occurred after the previous report but before you fixed the problem.

Why both this and MOMspider?

Whereas MOMspider walks through your site, this program uses your weekly error log. MOMspider will help you find and fix bad internal links on your site and bad links to external sites; this program will help you locate external sites that have bad links to your site.

Referrers

When a page links to another page or contains an image, and a user clicks on the link or loads the page with autoloading of images enabled (for an image), the browser will send, as part of the HTTP standard, the address of the page that the reference to the requested image or page came from.

[NONE]

There are a couple of ways for the server to not have received the referring page for a request:
  • If someone types in a URL by hand or uses a bookmark, then the referring page will show up as '[NONE]'.
  • Some robots and spiders will also not send a referrer, so it will show up as '[NONE]' for these as well (Excite's webcrawler is one of these).
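
As a rough illustration of the two cases above, here is a minimal sketch that pulls the referring page out of a raw HTTP request and falls back to '[NONE]' when none was sent. The HTTP header is spelled "Referer". The function name `referrer_of` and the sample requests are made up for this example; this is not the code the analysis program uses.

```python
def referrer_of(raw_request: str) -> str:
    """Return the Referer header value of a raw HTTP request, or '[NONE]'."""
    # Header lines follow the request line and end at the first blank line.
    head = raw_request.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        # Header names are case-insensitive; the standard spells it "Referer".
        if sep and name.strip().lower() == "referer":
            return value.strip() or "[NONE]"
    return "[NONE]"


# A browser following a link from foo.html sends the referring page.
followed_link = (
    "GET /~clarke/test.html HTTP/1.0\r\n"
    "Referer: http://www.clarkecomputer.com/~clarke/foo.html\r\n"
    "\r\n"
)
# A typed URL, a bookmark, or some robots: no Referer header at all.
typed_url = "GET /robots.txt HTTP/1.0\r\n\r\n"

print(referrer_of(followed_link))
print(referrer_of(typed_url))
```

The first request reports the page that held the link; the second has no Referer header, so it is reported as '[NONE]'.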

    robots.txt

    You will see references to a file 'robots.txt' with a referrer of '[NONE]'. Most robots, spiders, etc. will ask for this file to determine what pages they are allowed to access. If the file does not exist, they will assume that any pages on the system are fair game. See A Standard for Robot Exclusion for more details.
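
To give a feel for how a well-behaved robot might use this file, here is a small sketch using Python's standard `urllib.robotparser` module. The robots.txt contents, the robot name "ExampleBot", and the host www.example.com are all hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the path is illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
# Parse the rules from memory instead of fetching them over the network.
parser.parse(ROBOTS_TXT.splitlines())

base = "http://www.example.com"
# A robot asks before fetching each page.
blocked = parser.can_fetch("ExampleBot", base + "/cgi-bin/ads.cgi")
allowed = parser.can_fetch("ExampleBot", base + "/index.html")
print(blocked, allowed)
```

With these rules, a robot that honors the file stays out of /cgi-bin/ but may fetch everything else.
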
If '[NONE]' is the only referrer, then someone may be snooping around your site, or may have just mistyped a URL. It could also be a robot or spider looking for a file that you used to have on your site. These are included to give you an idea of what is being looked for on your site.

Sample output

==================================================
Error Log report for error_log
==================================================
Lost connection          : 168
Connection refused       : 4
Timed out                : 56
Network unreachable      : 184
CGI script warning lines : 5
 
------------------------------
Invalid URI              : 5
------------------------------
       1 :  image/jpeg, image/pjpeg
       1 : , image/x-xbitmap, image/jpeg, image/pjpeg
       1 : age/gif, image/x-xbitmap, image/jpeg, image/pjpeg
       1 : e/jpeg, image/pjpeg
       1 : eg
 
------------------------------
Bad headers              : 3
------------------------------
       3 : /home/clarke/public_html/cgi-bin/ads.cgi - referred by:
           1 : [NONE]
           2 : http://www.clarkecomputer.com/~clarke/test.html
 
 
------------------------------
Does not exist           : 24
------------------------------
       2 : /submit_tips.html>Good Search Engine Tips</A><LI><a HREF= - referred by:
           2 : http://www.badhtmlhome.com/search_engines.html
 
       2 : /home/clarke/public_html/jkafjdjkljdf.jpg - referred by:
           2 : http://www.clarkecomputer.com/~clarke/foo.html
 
       1 : /home/clarke/public_html/not_here_or_there.html - referred by:
           1 : http://www.clarkecomputer.com/~clarke/foo.html
 
       1 : /www/htdocs/oldsearch.html - referred by:
           1 : [NONE]

      18 : /www/htdocs/robots.txt - referred by:
          18 : [NONE]
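
The grouping behind a report like the one above can be sketched as follows. This is only an illustration under stated assumptions: the entries are hypothetical and already parsed into (error type, file, referrer) tuples, and the names `summarize` and `format_report` are invented here. It is not the analysis program's actual code, and the layout only approximates the sample.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed error log entries:
# (error type, requested file, referrer or None when none was sent).
entries = [
    ("Does not exist", "/www/htdocs/robots.txt", None),
    ("Does not exist", "/www/htdocs/robots.txt", None),
    ("Does not exist", "/www/htdocs/oldsearch.html", None),
    ("Bad headers", "/home/clarke/public_html/cgi-bin/ads.cgi",
     "http://www.clarkecomputer.com/~clarke/test.html"),
]


def summarize(entries):
    """Count errors by type, then by file, then by referrer."""
    summary = defaultdict(lambda: defaultdict(Counter))
    for error_type, filename, referrer in entries:
        summary[error_type][filename][referrer or "[NONE]"] += 1
    return summary


def format_report(summary):
    """Render the nested counts in a layout similar to the sample output."""
    lines = []
    for error_type, files in summary.items():
        total = sum(sum(refs.values()) for refs in files.values())
        lines += ["-" * 30, f"{error_type:<25}: {total}", "-" * 30]
        for filename in sorted(files):
            refs = files[filename]
            lines.append(f"{sum(refs.values()):8} : {filename} - referred by:")
            # List the most frequent referrers first.
            for referrer, count in refs.most_common():
                lines.append(f"{count:12} : {referrer}")
            lines.append("")
    return "\n".join(lines)


print(format_report(summarize(entries)))
```

Because duplicates collapse into a single counted line, each distinct problem only has to be looked at once.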

Types of errors

The first part of the output includes some general notices. Not much can be done about these; they are there mainly to let you know what is happening.

Lost connection
The connection was dropped for some reason before it was properly closed by both ends. The user may have clicked on a link before the page finished loading, hit reload, etc. Or there may have been some networking problems between the user's machine and the server.
Timed out
A response or close of the connection was not received within the timeout period. Some browsers don't respond properly and could cause this, or it could just be a network problem.
Network unreachable or No route to host
A networking problem occurred and we couldn't talk to the browser.
CGI script warning lines
This is the number of lines in the error log that don't appear in the standard format. These are produced by CGI scripts and are usually warnings, but could be errors (they are definitely errors if you also have "Bad headers"). You will have to look at the error log for more details.
Does not exist
This file was not found on the system. Look at the referrer to see which page thinks it should be there. If it is referred from a search engine, you may be able to tell the search engine to delete references to the file. Check the search engine's pages to find out how.
Can't read directory
The server has permission to access files in the directory, but not permission to read it. The main reason to try and read the directory is for an index of that directory.
Invalid URI
There was garbage where the server expected a request to be. The main form I've seen this take is when the browser sends a goofed-up Accept: header (which lists the MIME types the browser supports).
Filename too long
Just what it says. Usually caused by an unterminated HREF.
Bad headers
A CGI program produced some output before it produced valid HTTP headers. Usually caused by an error in the program.
Permission denied
Either a page doesn't have read permission for the web server, or a CGI program doesn't have execute permission set. This also happens when the requested item is not a normal file or directory.
Unable to include
Couldn't include the file in a Server Side Include for some reason. Most often because the file doesn't exist.
Bad user
Not a recognized name for a page requiring authorization.
Password mismatch
An incorrect password was entered for the user.
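
For the "Bad headers" case described above, the usual fix is to make sure the CGI program writes its header lines, then one blank line, before any other output. Below is a minimal sketch of that ordering. The helper name `cgi_response` is hypothetical, and the assumption that the server copies the script's stderr into the error log should be checked against your own server.

```python
import sys


def cgi_response(body: str, content_type: str = "text/html") -> str:
    """Build CGI output: header line, one blank line, then the body."""
    return f"Content-Type: {content_type}\r\n\r\n{body}"


if __name__ == "__main__":
    # Diagnostics go to stderr (assumed here to end up in the error log),
    # never to stdout ahead of the headers.
    sys.stderr.write("example.cgi: starting\n")
    sys.stdout.write(cgi_response("<html><body>Hello</body></html>\n"))
```

If anything is printed to standard output before the Content-Type line, the server sees it where it expects headers and reports "Bad headers".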

Please send any questions or comments to: clarke@clarkecomputer.com
Phone: (970) 482-6785.
© 1995-2015 Clarke Computer Company