Google have launched a new tool in the Google Webmaster Centre that can help reduce duplicate content issues on your website. The feature, called parameter handling, enables you to specify up to 15 URL parameters that Google should ignore when crawling your site.
URL parameters (such as session IDs, or source or language codes) can result in many different listings in Google’s index that all essentially point to the same content. For example, Google might index both of the following URLs:
In this case, the second URL simply has some source tracking appended, which does not alter the content of the page. Using the parameter handling setting, you can specify “source” as a parameter for Google to ignore, ensuring the two URLs don’t cause duplicate content issues.
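To make the idea concrete, here is a minimal sketch of what “ignoring” a parameter amounts to: URLs that differ only by an ignored query parameter normalize to the same canonical form. This is an illustration of the concept, not Google’s actual implementation, and the parameter names in the ignore list are hypothetical examples.

```python
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

# Hypothetical set of parameters we have told Google to ignore.
IGNORED_PARAMS = {"source", "sessionid", "lang"}

def canonicalize(url: str, ignored=IGNORED_PARAMS) -> str:
    """Strip ignored query parameters so URLs that differ only by
    tracking parameters reduce to the same canonical string."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in ignored]
    return urlunparse(parts._replace(query=urlencode(kept)))

# Two URLs that differ only by a "source" tracking parameter:
a = canonicalize("http://mysite.com/goldfish")
b = canonicalize("http://mysite.com/goldfish?source=newsletter")
print(a == b)  # True: both normalize to the same URL
```

A crawler applying this normalization before indexing would treat both addresses as one page, which is the deduplication effect the parameter handling feature is after.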
Google will suggest some parameters it has found on your site, and you can simply choose to confirm or ignore each one:
As Search Engine Land points out, this can benefit your site by resolving the following issues:
- Crawl efficiency problems: if search engine bots crawl the same page via multiple URLs, they may not have resources to crawl as many unique pages on the site
- PageRank dilution that can lead to lowered search rankings: if external sites link to multiple versions of a page, each version receives less PageRank value than if all links pointed to one version
- Display and branding problems: search engines display only one version of the URL; you ideally want the canonical version of a URL to display (mysite.com/goldfish) rather than a version with extraneous parameters (mysite.com/goldfish?adid=1205123&sid=452006&sort=high-rating&loc=sea)