
Welcome to ineedhits Blog

Welcome to the ineedhits Search Engine Marketing blog, where we share the latest search engine and online marketing news, releases, industry trends and great DIY tips and advice.

Tuesday, August 22, 2006

AOL’s Search Data Blunder – Heads Are Rolling

Posted by @ 2:49 am

What Happened?

In late July, AOL made a data set of 20 million search queries by 658,000 of its users publicly available. These searches were conducted between March and May 2006, and users were not asked for their permission to release this data. The data sample represents about 1.5% of AOL’s total users in May 2006 and about 0.33% of total searches conducted on AOL in the time period in question.

The data had not been sufficiently anonymized, so individual users could, in theory, be identified. AOL user names had been replaced with random ID numbers, but the searches can still be analyzed by ID number, which gives you a picture of all of the searches that an individual person conducted. Since people often search for their own name or include social security numbers in searches, it is possible to trace a search trail back to an individual.
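To make the "search trail" idea concrete, here is a minimal sketch in Python. It assumes a log in which every row holds only an anonymous ID and a query; the rows, the IDs and the `build_search_trails` helper are all invented for illustration and are not taken from AOL's actual file. It shows that swapping a user name for a number does not stop anyone from collecting every query that belongs to the same number.

```python
from collections import defaultdict

# Hypothetical sample rows in the shape (anonymous_id, query).
# Every value here is invented; none comes from the real data set.
SAMPLE_LOG = [
    (1001, "garden furniture sale"),
    (2002, "cheap flights perth"),
    (1001, "jane example springfield"),
    (1001, "dentists near springfield"),
    (2002, "weather perth"),
]


def build_search_trails(rows):
    """Group queries by anonymous ID, keeping them in log order."""
    trails = defaultdict(list)
    for user_id, query in rows:
        trails[user_id].append(query)
    return dict(trails)


trails = build_search_trails(SAMPLE_LOG)
for user_id, queries in trails.items():
    print(user_id, queries)
```

In this made-up sample, ID 1001 ends up with a name-like query next to two location-specific ones. That combination is the kind of trail that could point back to one person, even though no user name appears anywhere in the log.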

The search trails can be embarrassing for the people who searched, or can even suggest that illegal or criminal activity has been or could be committed. So really, the data is potentially explosive!

AOL’s original intent was to give the research community a data set for studying search behavior. After all, researchers and marketers would love to have better information on search patterns, among other things so that they can target advertising offers better.

After the enormity of the privacy breach was picked up and reported in blogs, AOL was quick to take down the data and apologize, but the damage was done – hundreds of people had already downloaded the data set, and duplicates of the data set were available on other websites.

Why is AOL’s Search Data Blunder Such a Big Deal?

Well, because the fundamental issue of online privacy is at the core of the problem. The release of personally identifiable data without permission constitutes a breach of privacy rights. It also raises the question of what data companies should be allowed to store about their users, and for how long this data may be stored. The level of security surrounding storage of such data is also under scrutiny – how easy is potential abuse, and how likely is another screw-up or malicious breach of privacy?

There are strong conflicting interests at play – privacy, civil liberties, security, crime prevention, and lastly commercial interests of marketing companies and advertisers.

Is There Anything Good About the Search Data Blunder?

Interestingly, AOL’s slice of search data reveals how much there is still to learn about behavioral patterns in searching:

  • Is everyone really searching for themselves or could they be searching on behalf of others?
  • Over how much time can a search pattern stretch? How do you communicate with a searcher who searches for related keyword terms over periods of several months or longer?
  • Are you really sure a search trail can be attributed to one single person? What about multiple people using the same computer?
  • How do you best guess the intent behind very broad keyword searches?

As a marketer, I find these questions fascinating, and I can see how a better understanding of these kinds of behavioral issues could lead to the development of more relevant (for the users) and effective (for the advertiser) advertising products.

As a search engine user, I understand that a huge amount of data is collected about my search behavior every single day. I still don’t like the thought that “Tell Me What You Search and I’ll Tell You Who You Are” could be applied to me personally. The thought of this information in the wrong hands is quite chilling.

What Will Be The Fallout For AOL and Others?

It remains to be seen what long-term damage this search data blunder may or may not do to AOL. The blunder has, with some delay, caused some heads to roll internally: this Monday, it was reported that Chief Technology Officer Maureen Govern and two other employees have been suspended by AOL. Two civil liberties and privacy advocacy groups, the Electronic Frontier Foundation and the World Privacy Forum, also filed complaints against AOL last week over its violation of its users’ privacy.

The risk of search data either unintentionally or maliciously landing in the wrong hands has definitely been highlighted – and this should not only concern AOL, but also Google, Yahoo!, MSN and any other online entities collecting and storing behavioral data of users.

Discussion (1 - comment)

In light of AOL’s release of its users’ search information, two free services just launched to help users protect their privacy while searching online. The first, called, allows you to register your search engine cookie for AOL,, Google, MSN, or Yahoo. The lost in the crowd servers then run random queries on your behalf on a regular basis. The second, called Track Me Not, works as a Firefox extension that submits queries for random things directly from your browser. Both services work on the idea that if you submit enough random “noise”, any “signal” which may reveal your personal identity will get lost, making it difficult for the search engines, or anyone who may subpoena their data, to figure out who you really are.

By Eric - August 22, 2006
