Tagged: analytics

Looking glass: search keywords


I’ve been meaning to post some analysis of the traffic that drives my site for a while now but never got around to it. This is the first in a series of posts looking at what people have been reading on my site, what works and what doesn’t. The data comes from [[http://analytics.google.com|Google Analytics]], with the following caveats:

* there was no monitoring for most of 2005–2006
* there is no data for at least half of 2006–2007

So really the only solid data set is from 2007–2008. Nevertheless, some data is better than no data, so I’m looking at it anyway. Finally, I will also try to correlate this with the top keywords as reported by WordPress Stats and AWStats to see how well they match. Without further ado, here are the top ten search keywords for each of the past three years:

^ Rank ^ 2005–2006 ^ 2006–2007 ^ 2007–2008 ^
| 1 | procastrination | airtel call home | quickstar |
| 2 | quickstart amway | airtel callhome | bww |
| 3 | apahran | airtel india call | airtel call home |
| 4 | riya | airtel call india | wordpress widgets |
| 5 | udai | amazon ec2 | __macosx |
| 6 | dell linux irda 8600 | wordpress widgets | reservation in higher education |
| 7 | fluxiom | udai | airtel india call |
| 8 | “web2.0 office” | mapmyindia | udai |
| 9 | amway quickstart | mit sketching | airtel call india |
| 10 | cryptography | latex beamer | web based password manager |

Clearly [[http://floatingsun.net/2006/12/18/more-on-airtel-call-home/|my]] [[http://floatingsun.net/2006/12/18/airtel-call-home/|posts]] on Airtel’s [[http://airtelcallhome.com|Call Home]] service stole the show in 2006–2007, and they remained strong contenders the following year as well. However, in 2007–2008 almost everything else was drowned out by my [[http://floatingsun.net/2005/05/06/quickstartamwaybww/|Quixtar/BWW post]]. What the table doesn’t show is by //how much// the ranks differ. Suffice it to say that hits on the Quixtar post outnumbered hits on pretty much everything else put together.
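To see which keywords stick around from year to year, here is a minimal Python sketch that takes the three top-ten lists from the table above and reports every keyword that shows up in more than one period, along with its rank in each. The lists are typed in by hand from the table, and the function name ''recurring_keywords'' is just something I made up for illustration; this is not how Analytics computes anything.

```python
# Top-ten keyword lists copied by hand from the table above (rank 1 first).
# The quotes around "web2.0 office" are dropped here.
top_keywords = {
    "2005-2006": [
        "procastrination", "quickstart amway", "apahran", "riya", "udai",
        "dell linux irda 8600", "fluxiom", "web2.0 office",
        "amway quickstart", "cryptography",
    ],
    "2006-2007": [
        "airtel call home", "airtel callhome", "airtel india call",
        "airtel call india", "amazon ec2", "wordpress widgets", "udai",
        "mapmyindia", "mit sketching", "latex beamer",
    ],
    "2007-2008": [
        "quickstar", "bww", "airtel call home", "wordpress widgets",
        "__macosx", "reservation in higher education", "airtel india call",
        "udai", "airtel call india", "web based password manager",
    ],
}


def recurring_keywords(lists):
    """Map each keyword seen in more than one period to its rank per period."""
    ranks = {}
    for period, keywords in lists.items():
        for rank, keyword in enumerate(keywords, start=1):
            ranks.setdefault(keyword, {})[period] = rank
    # Keep only keywords that appear in at least two periods.
    return {kw: by_period for kw, by_period in ranks.items() if len(by_period) > 1}


recurring = recurring_keywords(top_keywords)
for keyword, by_period in sorted(recurring.items()):
    print(keyword, by_period)
```

The same function should work for comparing sources instead of years: key the dictionary by "Google Analytics", "WordPress Stats" and "AWStats" rather than by period, and it will list the keywords the tools agree on. Note that it only matches exact strings, so "airtel call home" and "airtel callhome" count as different keywords.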

There don’t seem to be any surprises here either. I’m not sure if I should feel good or bad about it. The hope is that by taking stock of these keywords, I can get a better sense of exactly what it is that people come to this site for. Not that I’m writing for an “audience” per se, but there’s no harm in writing what people actually like to read :-) As I said before, I’ll try to follow up with some more stats soon.

Google Analytics — the deep web


I was talking to a friend of mine the other day about [[http://google.com|Google's]] [[http://www.seroundtable.com/archives/001729.html|purchase of Urchin]], a San Diego-based stats company, a few months back. What we were trying to figure out was this: what did Google stand to gain by making Urchin free (as [[http://analytics.google.com|Google Analytics]])?

I immediately said “ads, of course!”. But my friend pointed out that the audience for Analytics is different from the regular search engine user or Gmail user. Analytics users are people who have their own websites, and this group is arguably orders of magnitude smaller than the number of users of Google the search engine. So even if Analytics were to put ads on its pages (which thankfully it doesn’t, yet), the revenue generated would likely be much less than their regular ad revenue (AdSense and AdWords combined).

Anyways, as we were talking about this, it suddenly struck me: the Urchin acquisition may well be the single most strategic acquisition by Google in recent times. Just imagine: by using Analytics, you are effectively handing over your web server logs to Google. Sure, JavaScript-based logging doesn’t capture everything the web server sees, but Analytics still has a handle on a //lot// of information. Things like which link on your web page is the most popular, which is the most popular //outgoing// link, which is the top referrer for your website and so on.
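To make the comparison with server logs concrete, here is a rough Python sketch of how the “top referrer” number could be pulled out of a plain access log. It assumes the log is in Apache’s combined log format; the function name ''top_referrers'' and the sample lines are invented for illustration, and this says nothing about how Analytics actually computes its reports.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Combined Log Format: the last two quoted fields are the referrer and the user agent.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def top_referrers(lines, own_host, limit=10):
    """Count referring hosts, skipping empty referrers and links from our own site."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # not a combined-format line; ignore it
        referrer = match.group("referrer")
        if referrer in ("", "-"):
            continue  # direct visit or referrer withheld
        host = urlparse(referrer).netloc.lower()
        if host and host != own_host:
            counts[host] += 1
    return counts.most_common(limit)


# Made-up sample lines, only to show the shape of the input.
sample = [
    '192.0.2.1 - - [10/Oct/2007:13:55:36 -0700] "GET /2006/12/18/airtel-call-home/ HTTP/1.1" 200 2326 "http://www.google.com/search?q=airtel+call+home" "Mozilla/5.0"',
    '192.0.2.2 - - [10/Oct/2007:13:56:01 -0700] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '192.0.2.3 - - [10/Oct/2007:13:57:12 -0700] "GET /2005/05/06/quickstartamwaybww/ HTTP/1.1" 200 4096 "http://www.google.com/search?q=bww" "Mozilla/5.0"',
    '192.0.2.4 - - [10/Oct/2007:13:58:40 -0700] "GET /about/ HTTP/1.1" 200 1024 "http://floatingsun.net/" "Mozilla/5.0"',
    '192.0.2.5 - - [10/Oct/2007:13:59:05 -0700] "GET / HTTP/1.1" 200 5120 "http://example.org/links.html" "Mozilla/5.0"',
]

print(top_referrers(sample, own_host="floatingsun.net"))
```

The server log only sees requests that reach the server, whereas the JavaScript snippet runs in the visitor’s browser, so the two counts will not match exactly.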

This is what I call the “deep” web: information that regular search queries and regular web crawls don’t provide. The Google crawler essentially creates a //snapshot// of the web, which they then post-process to power Google search. With Analytics, however, they’ve enabled a way of tracking user behavior “live”, as it happens around the web. I think the potential for this kind of in-depth information is just **immense**. The more I think about it, the more I get excited, and scared.

For example, AdSense works fine for most people in its current state, using contextual information to place relevant ads. But if Google incorporated the information it is capturing from Analytics to improve your AdSense ads (because it now really knows exactly what people do on your website), your earnings could improve. Analytics also knows //where// your users are coming from, so ads could be made geography-aware.

Imagine if Google starts piecing together the Analytics information from across the whole web: that would probably be the largest corpus of user behavior ever created. Everything from browsing habits to think times to which page layouts work best could potentially be answered with that kind of data.

Perhaps this is common knowledge, but for me it was truly a revelation. I was happy at my insight :-) but at the same time a little scared thinking about the ever-increasing amount of my data Google has access to :-(