Last week, Google announced it is now using RSS and Atom feeds to find and index new web pages.
Historically, Google has relied on URL submissions and links from other web pages as its primary way to find new online content. Using these submissions and links, Google’s spiders crawl and index the relevant content they uncover.
RSS and Atom Feeds aren’t new, having come to prominence with the introduction of blogging. Most blogging platforms include some form of RSS and Atom feed service as a way for publishers to push their content online.
For those of you unfamiliar with RSS feeds, Wikipedia explains them as:
RSS (most commonly translated as “Really Simple Syndication” but sometimes “Rich Site Summary”) is a family of web feed formats used to publish frequently updated works—such as blog entries, news headlines, audio, and video—in a standardized format. An RSS document (which is called a “feed”, “web feed”, or “channel”) includes full or summarized text, plus metadata such as publishing dates and authorship. Web feeds benefit publishers by letting them syndicate content automatically.
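To make that description concrete, here is a minimal sketch in Python of what a feed looks like and how a program might read it. The sample feed, its URLs, titles and dates are all made up for illustration, and this is only one simple way of reading RSS 2.0, not a description of how Google processes feeds.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; every title, URL, and date here is invented.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://www.example.com/</link>
    <description>Posts from an example site</description>
    <item>
      <title>First post</title>
      <link>http://www.example.com/first-post</link>
      <pubDate>Mon, 02 Nov 2009 09:00:00 GMT</pubDate>
      <author>editor@example.com (Example Editor)</author>
      <description>A short summary of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://www.example.com/second-post</link>
      <pubDate>Tue, 03 Nov 2009 09:00:00 GMT</pubDate>
      <author>editor@example.com (Example Editor)</author>
      <description>A short summary of the second post.</description>
    </item>
  </channel>
</rss>"""


def list_feed_items(feed_xml):
    """Return one dict per <item>, holding the text and metadata a feed carries."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
            "author": item.findtext("author", default=""),
            "summary": item.findtext("description", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in list_feed_items(SAMPLE_FEED):
        print(entry["published"], "|", entry["title"], "|", entry["link"])
```

Each item carries a link to the new page along with its publishing date, which is what makes a feed a convenient list of fresh URLs for anything that reads it.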
Google has been using RSS and Atom feeds to index content into its Blog and News search indexes for some time, so it seems this latest move will simply be an extension of this practice, with the goal of feeding its core web search database.
As pressure in the real time search arena heats up, the use of RSS and Atom feeds to find content was an obvious next step – and in real terms, probably long overdue.
Unlike its Blog and News search engines, where feeds are submitted for consideration, Google is suggesting it will use existing aggregators to find content.
We may use many potential sources to access updates from feeds including Reader, notification services, or direct crawls of feeds. Going forward, we might also explore mechanisms such as PubSubHubbub to identify updated items.
So how should you ensure your content is being found? If your site publishes an RSS feed for new content, be that web pages or blog posts, you should seek to have it aggregated as much as possible.
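Before chasing aggregators, it is worth confirming that your pages actually advertise a feed. A common convention (not something mentioned in Google's announcement) is a link tag in the page head with rel="alternate" and an RSS or Atom type. The sketch below, in Python, looks for such tags in a page's HTML. The sample page and its URLs are hypothetical, and the check covers only this one convention.

```python
from html.parser import HTMLParser

# MIME types conventionally used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects href values from <link rel="alternate"> tags that point at feeds."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. a bare attribute), so normalise to "".
        attr = {name: (value or "") for name, value in attrs}
        rel_values = attr.get("rel", "").lower().split()
        if "alternate" in rel_values and attr.get("type", "").lower() in FEED_TYPES:
            self.feeds.append(attr.get("href", ""))


def find_feed_links(html):
    """Return the feed URLs advertised in the given HTML, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """<html><head>
<title>Example Blog</title>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="Example Blog RSS" href="http://www.example.com/feed.rss">
<link rel="alternate" type="application/atom+xml" title="Example Blog Atom" href="http://www.example.com/feed.atom">
</head><body></body></html>"""

if __name__ == "__main__":
    for url in find_feed_links(SAMPLE_PAGE):
        print(url)
```

If the function returns an empty list for your own pages, feed readers and aggregators may have a harder time discovering your feed from the page alone.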
As a starter, get your new web content syndicated or published through the following services
While I am sure there’ll be much debate about the spam risks of indexing via RSS, it’s a method that lies at the foundation of Google’s real time search plans. Accordingly, expect to see this practice grow, with Google sure to find a way to maintain the integrity of its search database.
Let us know…
- Do you currently have a site that publishes an RSS feed?
- Will this make you consider changing your CMS platform to one that includes RSS?
- Is indexing via RSS a better way for Google to index pages than links or submissions?
We look forward to reading your thoughts via our comments section below.