Random stuff in my head. Updated Occasionally

I discovered a new tool today and, in trying it out, found that I needed to make corrections to some of my sites.

The robots.txt file that sits quietly at the root of most web sites tells spiders what they can index and what they cannot. All responsible search engines respect what the robots.txt file tells them to crawl and what to leave alone. This file doesn’t, in any way, actually restrict access to your web site’s subdirectories, but it does help the good guys concentrate on the good stuff and not waste their time on the stuff you don’t want out there.
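To see how a well-behaved spider might read the file, here is a minimal sketch using Python's standard urllib.robotparser module. The sample rules, the "ExampleBot" name, and the example.com URLs are all made up for illustration; they are not taken from any of my sites.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def build_parser(robots_text):
    """Parse robots.txt text the way a polite crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A polite spider asks before fetching each URL.
print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))
print(parser.can_fetch("ExampleBot", "http://www.example.com/private/notes.html"))
```

Nothing here stops a rude spider from fetching /private/ anyway; the check only works because the crawler chooses to ask.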

There is a particular syntax for robots.txt. It is not extensive, but I was surprised that some of the software I was installing didn’t respect what little syntax there is.

I discovered a web page that will check the syntax of your robots.txt file. Just go to http://tool.motoricerca.info/robots-checker.phtml and enter your web site address followed by robots.txt, and you will receive a page that verifies the syntax of the file and lets you know what, if anything, needs to be corrected.
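For a rough idea of the kind of check involved, here is a small sketch of a syntax checker in Python. It is my own simplified guess, not the logic the tool above actually uses: the check_robots_syntax function and the list of accepted field names are assumptions, and a real checker looks at much more.

```python
# Field names this sketch accepts; the exact set is an assumption.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def check_robots_syntax(text):
    """Return a list of (line_number, message) for lines that look wrong."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS:
            problems.append((number, f"unknown field '{field}'"))
        elif field == "user-agent":
            seen_user_agent = True
        elif field in ("disallow", "allow") and not seen_user_agent:
            problems.append((number, f"'{field}' before any User-agent line"))
    return problems


# Hypothetical file with a misspelled field and a misplaced rule.
broken = "Disallow: /x\nUser-agent: *\nDisalow: /y\n"
for number, message in check_robots_syntax(broken):
    print(f"line {number}: {message}")
```

A misspelled field name like the one in the sample is exactly the sort of thing that slips through unnoticed, since spiders tend to skip lines they don't understand rather than complain.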