Yahoo! recently announced that its index now contains 20 billion objects. My colleague Nancy Hackett recently wrote about this announcement in her article Search Engine Technology: Yahoo! – 20 Billion Objects Indexed. Compare this with the “mere” 8.2 billion web pages indexed by rival Google.
Whilst this makes for great PR for Yahoo! (public relations, not PageRank), it doesn’t necessarily do anything for the consumer, the person who types in the keywords and expects a quality result. A bigger search engine index does not necessarily mean a quality search engine index. In fact, I would say it is quite the opposite.
Just having an object in the search engine index doesn’t make it a quality object. If it isn’t a quality object, it is highly unlikely ever to be presented in the results to the user, i.e. it is either deemed irrelevant or is so poorly optimized that it ranks in the bottom 10% of the SERPs.
It also does not surprise me that Yahoo! has managed to increase its index size. Yahoo! Slurp (the hybrid, best-of-breed spider technology which grew out of the Inktomi, AltaVista and Fast spiders) has been very aggressive of late, requesting many pages from servers around the world.
Furthermore, Yahoo! has one huge advantage over Google. Through Yahoo! Search Submit Pro, Yahoo! is provided with many tens of thousands of pages that it might never have found through traditional crawling methods. Google does not have, and according to all of its statements on public record will never have, a Paid Inclusion program. This does put Google at a disadvantage in the bragging rights over whose index is bigger.
The most accurate measure (I believe) is search user satisfaction, where according to the latest study Google still leads. The gap is narrowing, though, with Yahoo! catching up: it scored 80 out of 100 versus Google’s 82.
This reinforces my point that many of the extra objects in the Yahoo! index are adding no value, other than bragging rights. As is often the case, bigger doesn’t mean better.
Note: an object is defined as a web page, an image or any other file type (video, MP3, etc.).