Google Analytics — the deep web

I was talking to a friend of mine the other day about [[|Google's]] [[|purchase of Urchin]], a San Diego-based web statistics company, a few months back. What we were trying to figure out was this: what did Google stand to gain by making Urchin free (as [[|Google Analytics]])?

I immediately said “ads, of course!” But my friend pointed out that the audience for Analytics is different from the regular search engine or Gmail user. Analytics users are people who have their own websites, and this group is arguably orders of magnitude smaller than the number of people who use Google search. So even if Analytics were to put ads on its pages (which thankfully it doesn’t, yet), the revenue generated would likely be much less than their regular ad revenue (AdSense and AdWords combined).

Anyway, as we were talking about this, it suddenly struck me: the Urchin acquisition may well be the single most strategic acquisition by Google in recent times. Just imagine: by using Analytics, you are effectively handing over your web server logs to Google. Sure, JavaScript-based logging doesn’t capture everything the web server sees, but Analytics still has a handle on a //lot// of information: which link on your web page is the most popular, which is the most popular //outgoing// link, which is the top referrer for your website, and so on.
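To make that concrete, here is a rough sketch of how a handful of page-tag events could be rolled up into exactly those three answers. This is not how Analytics works internally; the event fields (`page`, `referrer`, `clicked`) and the `summarize` function are names I made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical events, roughly what a JavaScript page tag could report.
# The field names are invented for this sketch, not taken from Analytics.
EVENTS = [
    {"page": "/", "referrer": "https://www.google.com/search?q=urchin",
     "clicked": "/about.html"},
    {"page": "/", "referrer": "https://www.google.com/search?q=stats",
     "clicked": "https://example.org/paper.pdf"},
    {"page": "/about.html", "referrer": "https://blog.example.net/post",
     "clicked": "https://example.org/paper.pdf"},
    {"page": "/", "referrer": "", "clicked": "/about.html"},
    {"page": "/about.html", "referrer": "https://www.google.com/",
     "clicked": None},
]


def summarize(events):
    """Count referring hosts, internal link clicks, and outgoing link clicks."""
    referrers = Counter()
    internal = Counter()
    outgoing = Counter()
    for event in events:
        referrer = event.get("referrer")
        if referrer:
            # Group referrers by host rather than by full URL.
            referrers[urlparse(referrer).netloc] += 1
        link = event.get("clicked")
        if link:
            # A link with a host part leaves the site; a bare path stays on it.
            if urlparse(link).netloc:
                outgoing[link] += 1
            else:
                internal[link] += 1
    return referrers, internal, outgoing


referrers, internal, outgoing = summarize(EVENTS)
print("Top referrer:", referrers.most_common(1))
print("Most popular link:", internal.most_common(1))
print("Most popular outgoing link:", outgoing.most_common(1))
```

Nothing here needs the web server's logs: everything is derived from what the visitor's browser reports, which is the point.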

This is what I call the “deep” web: information that regular search queries and regular web crawls don’t provide. The Google crawler essentially creates a //snapshot// of the web, which they then post-process to power Google search. With Analytics, however, they’ve enabled a way of tracking user behavior “live”, as it happens around the web. I think the potential for this kind of in-depth information is just **immense**. The more I think about it, the more I get excited, and scared.

For example, AdSense works fine for most people in its current state, using contextual information to place relevant ads. But if Google incorporated the information it captures through Analytics to improve your AdSense ads (because it now knows exactly what people do on your website), your earnings could improve. Analytics also knows //where// your users are coming from, so ads could be made geography-aware.

Imagine if Google starts piecing together the Analytics information from across the whole web. That would probably be the largest corpus of user behavior ever created. Everything from browsing habits to think times to which page layouts work best could potentially be answered with that kind of data.

Perhaps this is common knowledge, but for me it was truly a revelation. I was happy at my insight :-) but at the same time a little scared thinking about the ever-increasing amount of my data Google has access to :-(


  1. Boris Gruschko

    Actually, I think it’s not about your site at all. The thing is that Analytics sets a cookie in every user’s browser. A much more powerful source of information is that Google now gets to know which pages an individual visits, regardless of the referral source.

    Consider the following scenario: user U visits sites A, B and C. A and C use Google Analytics. Now Google knows that U visited A and C, without A and C actually being affiliated with Google. This can certainly be used to optimise ads for user U. His identity can be easily determined by, for example, correlating the Analytics data with U’s Gmail account.

    The power of this is that a good portion of websites will probably be using Google Analytics, due to its origins (everyone loves Google products). The benefit Google gets from all this information certainly grows exponentially with the number of sites using Analytics.

  2. Diwaker Gupta

    *@boris*: perhaps I’m missing how cookies work, but I don’t see why a cookie set by Analytics should store any information about what other websites I visit. My understanding is that cookies are website specific. So if I visit a web page that has the Analytics JS snippet, it may set a cookie on my browser. But why would that cookie be updated if I visit some other web page?

    Actually, scratch all the above. I just read your example, and I think you’re basically saying the same thing in a different way :-) I would be really interested in finding out if Google is doing this kind of reverse lookup, that is, figuring out the browsing profiles of individuals.

    As for Google being able to figure out my identity: I don’t think that’s a big concern at this point. Google already knows more about me than even I probably do.

    *@gregor*: Thanks for that pointer, I didn’t know the PDF was online already. I was just searching for the big table paper a couple of days back :)

  3. Boris Gruschko

    @Diwaker: it’s not like this is a big concern for me either. Google has all my personal mail written in the last two years. That should be sufficient to know pretty much everything about me :)
