Friday, June 12, 2009

The KnowPrivacy Project: Google and Yahoo (and many others) are watching you

We're all vaguely aware that our use of the internet isn't really private. However, you may be surprised to find out who is monitoring you, how they're doing it, and what they're doing with the information.

UC Berkeley's KnowPrivacy Project ("KnowPrivacy," of course, also sounds like "No Privacy") has recently issued a report of its findings on internet data collection, and it makes for some pretty troubling reading. Here are summaries of a few of their findings:

1. Collecting information with web bugs and beacons: Everyone knows about cookies--the small text files that sites like Amazon and Gmail place on your computer to personally identify you when you visit them. And if you want to order something online or read your web-based e-mail, you have to enable cookies from those sites. However, when you do that, cookies can also be placed on your machine by third parties such as advertisers; since an advertiser can have ads on many sites, it can collect personally identifiable information about your browsing behavior across the web.
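
To make the cross-site point concrete, here is a minimal sketch in Python of how a single advertiser's cookie could tie together visits to unrelated sites. Everything in it is an assumption made for illustration: the AdServer class, the serve_ad method, and the example.* addresses are hypothetical and do not describe any real advertiser's system.

```python
import uuid
from collections import defaultdict


class AdServer:
    """Hypothetical third-party advertiser whose ads appear on many sites."""

    def __init__(self):
        # Maps each cookie ID to the list of pages where its ads were loaded.
        self.profiles = defaultdict(list)

    def serve_ad(self, cookie_id, page_url):
        # First contact: the browser has no cookie yet, so issue a new ID.
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        # Every later ad request carries the same ID, whatever site it is on.
        self.profiles[cookie_id].append(page_url)
        return cookie_id


# Three unrelated (made-up) sites that all carry this advertiser's ads.
pages_visited = [
    "http://news.example/politics",
    "http://health.example/symptoms",
    "http://shop.example/cart",
]

ad_server = AdServer()
browser_cookie = None  # the browser starts with no cookie from the advertiser
for page in pages_visited:
    browser_cookie = ad_server.serve_ad(browser_cookie, page)

profile = ad_server.profiles[browser_cookie]
print(profile)
```

In this toy model the advertiser ends up with one profile listing all three pages, even though none of the three sites shared anything with the others directly.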

But even if you set your browser to disable third-party cookies, your internet use can be monitored by web bugs. Web bugs are typically 1 x 1 pixel images that are invisibly embedded in the background of a web page, an ad, or an e-mail. Whenever that web page is opened in your browser, the bug informs the server of your IP address (that is, your computer), the time, and the URL (that is, what you're viewing). The only way to disable web bugs is to block all third-party content, but that means (for example) you couldn't view a YouTube video embedded in someone's blog post. Web bugs can also be embedded in e-mails, alerting the sender when the message has been opened (so a spammer can discover that your e-mail address is valid, even if you don't click on any link and immediately delete the e-mail).
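
As a rough illustration of the mechanism described above, the following Python sketch shows what a web bug might look like in a page's markup and what a tracking server could record when the image is requested. The function names, the tracker.example address, and the log format are assumptions made for this example; real trackers vary and are considerably more elaborate.

```python
from datetime import datetime, timezone


def web_bug_tag(tracker_url):
    """Build the markup for an invisible 1 x 1 image (hypothetical tracker URL)."""
    return '<img src="{}" width="1" height="1" alt="">'.format(tracker_url)


def record_hit(remote_addr, headers, when):
    """Return what the tracking server learns from one image request.

    remote_addr is the IP address the request came from; the Referer
    header, when the browser sends it, names the page being viewed.
    """
    return {
        "ip": remote_addr,
        "time": when.isoformat(),
        "url": headers.get("Referer", "unknown"),
    }


tag = web_bug_tag("http://tracker.example/bug.gif")

# Simulate the browser fetching the image while displaying a blog post.
hit = record_hit(
    "203.0.113.7",
    {"Referer": "http://blog.example/2009/06/post.html"},
    datetime(2009, 6, 12, 9, 30, tzinfo=timezone.utc),
)
print(tag)
print(hit)
```

Nothing here requires a cookie: simply requesting the image hands over the three items mentioned above (IP address, time, and URL).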

Often, of course, you may want to share information about your internet use with a website operator--it enables the site to be customized to your preferences. Examples include things like Amazon's recommendations for users based on their browsing and purchasing patterns, or Netflix's recommendations based on previous rentals and ratings. Consenting to such information-sharing can make these sites more functional for users. However, the whole point of web bugs is that they do not require your consent to gather information about you.

2. Who is collecting your information: Dozens of advertisers and website operators attempt to track your internet use, but among the leaders in web bug placement is Google. The KnowPrivacy project found that Google-owned sites are saturated with web bugs--in March 2009, 100 separate web bugs were found on Blogspot, 44 on Google and 31 on Blogger. (Typepad, a rival blogging site, had 75 separate web bugs.)

While blogs can contain ads that have their own bugs, many of the bugs on the blogging sites are placed by the bloggers themselves in order to track their traffic. But popular tracking bugs such as Google Analytics, for example, allow bloggers to share that information with the parent company (and in fact, Google offers incentives to do so). Google-owned trackers--Analytics, DoubleClick, AdSense, FriendConnect, and Widgets--appeared on more than 88% of the 394,000 distinct domains visited by the KnowPrivacy Project participants. Clearly, a lot of information on web use is being gathered without the explicit consent of users.

3. Sharing of information: So what's being done with this information? For one thing, companies like Google use it to sell ads targeted to specific users. I have a Gmail account, and my messages are obviously bugged and scanned for keywords. When I open a message containing specific keywords, a text ad that has been matched to those words appears on my screen. But I have to confess that I don't even consciously notice most of these, and the ads were part of the deal I accepted when I set up the account.

But information gathered about you by websites can be shared--that is, sold, rented, or offered as part of a commercial agreement--with other companies without your knowledge or consent. In order to protect yourself, you might try to read websites' privacy policies. But many privacy policies have language that refers to things like "affiliates," "marketing partners" and "third parties." It is almost impossible to find out which companies are getting information about you, and under what constraints.

Of 50 privacy policies analyzed by the KnowPrivacy Project, 36 stated that third-party tracking is allowed, but "the data collection practices of these third parties were outside the coverage of the privacy policy" (p. 27). And as for affiliates, the report points out that "it appears that users have no practical way of knowing with whom their data will be shared" (p. 28). As an example, "MySpace, one of the most popular social networking sites (especially among younger users), is owned by NewsCorp, which has over 1500 subsidiaries....Information pulled from these websites could potentially find its way to all of these affiliated companies" (p. 28).

In my own experience, Yahoo's privacy policy seems particularly confusing and unclear. For example, it states that Yahoo doesn't share personal information about you without your consent...except "to trusted partners who work on behalf of or with Yahoo! under confidentiality agreements. These companies may use your personal information to help Yahoo! communicate with you about offers from Yahoo! and our marketing partners."

In other words, Yahoo can share any information it gathers about you with any entity working "on behalf of or with" Yahoo, although these "trusted partners" aren't supposed to further share your information. Who are these "trusted partners"? "Yahoo! works with vendors, partners, advertisers, and other service providers in different industries and categories of business." Clicking on the offered "reference links" takes you to a page that includes more than 100 links detailing the "privacy practices" of various Yahoo products and services, including more than a dozen "Acquired Companies with Different Privacy Policies." Presumably, the use of your information by these acquired companies--which include AltaVista, Flickr, and Yahoo Search Marketing--is governed by their "different privacy policies," even though they are owned by Yahoo.

There's also some other troubling language in the Yahoo policy about how merely viewing an ad implies consent. "Yahoo! displays targeted advertisements based on personal information....[B]y interacting with or viewing an ad you are consenting to the possibility that the advertiser will make the assumption that you meet the targeting criteria used to display the ad."

So based on personal information it has collected about you, Yahoo sells display ads on pages you visit. Merely by viewing these ads, which display automatically, you are consenting to the assumption that you fit the profile of users at whom the ad is aimed. Among Yahoo advertisers are "financial service providers (such as banks, insurance agents, stock brokers and mortgage lenders)." The assumptions that financial companies such as insurance companies and lenders make about you can have potentially huge impacts, of course.

I recommend reading the full KnowPrivacy report; it is available through the KnowPrivacy website, which summarizes the report's findings. How personal information is being collected and disseminated should be of concern to everyone who uses the web.

If you're interested in additional resources, the Electronic Frontier Foundation monitors important developments in internet privacy, as well as other issues like free speech, innovation, government transparency, and intellectual property--see my response to Memsaab's comment below.

Joshua Gomez, Travis Pinnick, and Ashkan Soltani, "KnowPrivacy" (June 1, 2009)


  1. As my brother often says: the best protection against privacy invasion and identity theft is the total and utter insignificance of our lives.


  2. Your brother's comment is hilarious, Memsaab. But I think the concern is not that there is someone out there in an office monitoring how many times you're watching Dharmendra videos on YouTube--though if you're doing it at work, there is someone doing just that. And as the RIAA music downloading lawsuits show, sometimes copyright owners do care about your computer use--a lot--and a subpoena can enable them to find out about it in detail.

    My concern is more that data about sites you visit and products you purchase are being aggregated automatically to create a profile of you (and users like you) which can have real-world consequences. Medical information is an example that's often used; insurance companies would certainly be interested in knowing who has visited breast cancer sites, for example. As more and more of our lives are conducted online, this information is potentially very detailed and valuable. It can also be highly misleading (what if you've visited that breast cancer site to find information for a friend?), but you have no way of correcting any misinformation or unwarranted assumptions.

    There are also the issues of knowledge, consent and control. Web bugs collect information about you without your knowledge or consent, and it is impossible for an ordinary user to discover what happens to this information after it has been gathered.

    If you share my concerns about these technologies, the Electronic Frontier Foundation monitors important developments in internet privacy, as well as other issues like free speech, innovation, government transparency, and intellectual property.

    And they have an excellent privacy policy. The one worrisome aspect of that policy is that the EFF site's search capability is provided by Yahoo. Although EFF states that no personally identifiable information such as IP address is transmitted to Yahoo when you search EFF's site, it also says that "Information submitted to the search function is...subject to the search engine provider's privacy policy." Be forewarned.

    (I'm going to post the information about EFF on the main entry for the benefit of visitors who don't read the comments.)

  3. The more I hear about internet applications, cell phones, digital cable, etc., the more I feel like shutting myself up in a stone-age mansion and never using any modern gadgets! Since that is not possible, I wonder what we can do to fight against such invasion of privacy. It just isn't possible to exist without email and search engines (funny, a decade ago I didn't think so!) and it's impractical to think I can read (and understand!) the privacy agreements of any and every web page I visit.

  4. Bollyviewer, you're absolutely right that technologies such as e-mail have become essential for conducting modern life. And while it's obviously impossible to completely protect your privacy, you can do a few things that may help to some extent:

    1. Block cookies from all sites except those that you trust and, if you can, block all third-party cookies. Clear the cookies stored by your browser at the end of every session.

    2. It's not possible to read the privacy policies of all sites you visit, but you should read the policies of sites that you buy things from, log in to, or otherwise explicitly share personal information with.

    3. Try to use sites that protect your privacy. Yauba is a search engine that claims not to use cookies or store your personal information (including searches), and says that if you access third-party websites through their "anonymiser privacy filter" those sites will also be unable to collect your personal information. I haven't yet seen these claims assessed independently, but they're certainly worth looking into.

    4. Support organizations (like the Electronic Frontier Foundation) that are trying to raise awareness about these issues and that are active in trying to represent the privacy interests of internet users.



  5. You can also hide your IP address if you want by surfing through a site like Anonymouse. It may not be perfect (or maybe it is) but it's something (and it's easy and free)...

  6. Great suggestion, Memsaab. The Anonymouse homepage shows an example of the data that is gathered automatically by the websites you visit, and how it looks when you use their anonymizer. I can't vouch for the service personally, but it's worth investigating.

    Interestingly, the day I visited the homepage of Anonymouse, it carried an ad for the Google Chrome web browser--and as the KnowPrivacy report shows, web bugs can be embedded in ads. So Google might well be tracking visitors to Anonymouse!