Google has given us an insight into the effectiveness of sitemaps through a recent study.
Sitemaps are an easy way for web site owners to help search engines index pages on their web sites. Google initially announced sitemaps as an experiment back in 2005, with Yahoo! and Microsoft jumping on board not long after.
Until now, there has been little research into how effective sitemaps really are.
The study was based on three website case studies: Amazon, CNN, and PubMed.
Amazon’s sitemaps include around 20 million URLs. Amazon also takes care to indicate the canonical (best) URL version of each product page in its XML sitemaps.
CNN’s approach to XML sitemaps focuses on helping search engines discover the many new URLs added daily, and on addressing canonical issues with its pages.
PubMed has a huge archive of URLs listed in its XML sitemaps; however, these are not updated often (the listed URLs change only monthly).
It was quite a detailed study, with the final paper running 10 pages. Knowing that your lives are no doubt too busy to read it all, here is a rundown of some interesting findings:
- Approximately 35 million Sitemaps were published, as of October 2008.
- The 35 million Sitemaps include “several billion” URLs.
- The most popular Sitemap formats were XML (77%), unknown (17.5%), plain URL lists (3.5%), Atom (1.6%) and RSS (0.11%).
- 58% of URLs in Sitemaps contain the last modification date.
- 7% of URLs contain the change frequency field.
- 61% of URLs contain the priority field.
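For context, the last modification date, change frequency, and priority mentioned above are the three optional per-URL elements defined by the sitemap protocol; only the URL itself is required. A minimal XML sitemap using all three might look like this (the example.com URL is a placeholder):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- loc is the only required child element -->
    <loc>http://www.example.com/product/widget</loc>
    <!-- lastmod: date the page last changed (W3C Datetime format) -->
    <lastmod>2008-10-01</lastmod>
    <!-- changefreq: a hint to crawlers, not a command -->
    <changefreq>weekly</changefreq>
    <!-- priority: importance relative to other URLs on this site, 0.0 to 1.0 -->
    <priority>0.8</priority>
  </url>
</urlset>
```

The study's low figure for the change frequency field (7%) versus the priority field (61%) suggests most publishers fill in only the fields their sitemap tools generate by default.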
If you would really like to read the full 10-page paper from the study, then you can find it here. Otherwise, check out the great summary from Barry Schwartz at Search Engine Land.
If you’re not using XML sitemaps on your web site, then this study highlights the need for you to consider adding them.