An interesting news item has been circulating this week. WebmasterWorld.com is a highly respected news and discussion forum for webmasters, run by search engine optimization guru Brett Tabke. Brett decided to ban all search engine spiders from crawling the WebmasterWorld site because out-of-control spiders were eating up his bandwidth and hurting his server performance. The ban he implemented via his robots.txt file also covered the big search engine spiders from Google, Yahoo! and MSN.
According to Brett, he expected to have about 60 days before his pages started getting dropped from the major search engines. As it turned out, it took only two days for Google to drop his site completely. MSN currently shows only one link (without any description), while Yahoo! still shows 134,000 matches as of today.
What does this mean to you?
Firstly, don’t panic. Brett’s site was dropped from Google because he changed the settings in his own robots.txt file. robots.txt is a plain text file placed in a website’s root directory to tell some or all search engine spiders not to crawl your site, or particular parts of it. For example, you might not want a spider to crawl your shopping cart page or the secure pages of your website.
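As a sketch, a robots.txt along these lines would keep all well-behaved spiders out of a shopping cart area while leaving the rest of the site crawlable (the /cart/ and /checkout/ paths here are made-up examples; substitute the actual directories on your own site):

    # Applies to every spider
    User-agent: *
    # Path prefixes compliant spiders will skip
    Disallow: /cart/
    Disallow: /checkout/

The asterisk matches every spider, and each Disallow line names a path prefix that compliant robots will stay away from. Keep in mind that robots.txt is advisory only: badly behaved spiders can simply ignore it.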
No one but you (or your webmaster, web designer, etc., and assuming your site hasn’t been hacked) can implement or change your robots.txt file. So don’t be afraid that your worst competitor could change your robots.txt file and get your site dropped from Google – this cannot happen.
However, the story does highlight how much weight the search engines give to the information in your robots.txt file, and how quickly they act on it. Unless you have specific problems like Brett’s load issues, I’d recommend against excluding any of the major spiders from your site. If you want to exclude particular areas of your site and are unsure of how to do this, consult an expert first before making avoidable (and very costly!) mistakes – a safe default is sketched below.
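For reference, the fully permissive robots.txt looks like this (an empty Disallow means nothing is off-limits):

    # Allow every spider to crawl everything
    User-agent: *
    Disallow:

The blanket ban Brett imposed is the same file with a single slash added, which is exactly why this file deserves care:

    # Ban every spider from the entire site
    User-agent: *
    Disallow: /

One character is the difference between being fully indexed and being dropped entirely, so double-check any change before you upload it.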