Interesting (but disappointing) stats

Website stats have long been a [[|point of debate]] over at [[|Textdrive]]. While it’s possible to install Awstats, it takes a [[|non-trivial amount of effort]], and depending on your configuration, it might actually create problems in a shared hosting environment. Google later re-launched [[|Urchin]] for free as [[|Google Analytics]], but my initial experiences with it were mostly moderate, and I have since abandoned it for no particularly strong reason. [[|Mint]] looked good, but I didn’t feel like shelling out 30 bucks, especially since mine is not a commercial website and my interest in stats is driven purely by curiosity, not by financial or other reasons. And so for a while now (since November 2005, to be precise), I have been using [[|Stephen Wettone's]] excellent [[|Slimstat]]. I really like it so far: it’s almost like Mint, it’s server-side (so clients don’t have to load some JavaScript sitting on Google) and it’s free!

Anyhow, back to the subject of this post. I was looking at the aggregate stats for my website (primarily my blog, because I haven’t coded up the static content to be monitored by Slimstat) and the stats were kind of interesting and somewhat disappointing. Interesting because that’s the nature of stats: the things people (or machines) read most on my blog are not what I expect them to be. And disappointing for the same reason.

Here are a few:

* The number of hits and visits to my blog has been slowly but steadily increasing (I’m not putting the exact numbers in here, because they are still embarrassingly small :-) )
* The [[|aside on Paritrana]] is the **second most popular** link on my blog. Remember that it was posted **less than a month** back and it was completely devoid of **any** content. I simply linked to a news story. It also happens to be one of the most commented entries on my blog. Furthermore, if you [[|look at the comments]], you will see that the majority of the comments ([[|also on this post]]) don’t seem to come from the regular blog-reader-poster type people. More on why this is interesting later in the post.
* The next most popular page is the page for the [[|tag "movies"]]. This is sort of disappointing because this blog doesn’t even remotely target movies (if it targets anything at all!). All I do is write occasionally about the movies I’ve seen. More interestingly perhaps, the vast majority of these hits come from Google Image search, which is a bit surprising to me. I still don’t fully know how or why.
* I’m happy to see that [[|my post with some Vim7 screenshots]] has quickly risen into the top 10 on my blog.
* Among the search strings that led people to my blog, “Paritrana” is the most searched keyword and by a **huge** margin. I mean, we’re talking two orders of magnitude here.
* A little sadly, 51% of the viewers of this blog are still using IE. Only about 30% are using Firefox. Come on people, what’s wrong with you?

Alright, so about Paritrana. The extremely high level of interest in Paritrana on my blog (and so I infer, elsewhere as well) is interesting for two reasons. First, it //is// getting a lot of attention: people are regularly Googling for it, which hopefully means that people are interested enough to find out more about the party and the people. Second, and perhaps more importantly, as the comments on my posts show, this interest seems to be widespread and not limited to the “web savvy” people, which I think is very encouraging. A political party has to reach out to the masses, and the number of people who read and comment on blogs in India is a minuscule fraction of the population.

Moreover, the fact that these people took the pains to leave a comment shows that they do want to get involved and are trying to reach out to the party. I hope the folks at Paritrana are aware of this, and that they provide people with the right channels to approach them and express their support and enthusiasm. At the same time, it worries me slightly that despite my stating clearly that I am not involved with Paritrana, a lot of the commenters seemed not to notice and implicitly assumed otherwise. I just hope that people will try to read the fine print, slow down and digest the facts, and not blindly jump on the bandwagon.


  1. diwaker

    Those things are ok, but have several limitations:

    # You can only track HTML pages. There is no way of tracking non-HTML content like images, tarballs, CSS files etc.
    # You can’t track the bandwidth consumed, which is important in a shared hosting environment.
    # You can’t track HTTP error codes. So let’s say 30% of the hits you get result in a 404 (not found) error: you would want to know which resource that is and fix the problem. Similarly for 403 (access denied).
    # You can’t easily track the different bots that might scrape your site.

    Tools like Awstats have a significant advantage in that they work directly with server logs. Script-based tools such as StatCounter can only use HTTP headers for their stats.
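
To make the log-based approach concrete, here is a minimal sketch of what a log analyzer can tally that a script-based counter cannot. It is not how Awstats itself is implemented. It assumes the server writes Apache's "combined" log format, and the sample lines, the `summarize` function and the `BOT_HINTS` list are all made up for illustration.

```python
import re
from collections import Counter

# Assumed input: Apache "combined" log format, i.e.
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, very rough bot detection: substrings in the User-Agent.
BOT_HINTS = ("bot", "crawler", "spider", "slurp")


def summarize(lines):
    """Tally status codes, bytes sent, 404 paths and bot user agents."""
    statuses = Counter()
    not_found = Counter()
    bots = Counter()
    total_bytes = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        status = match.group("status")
        statuses[status] += 1
        if status == "404":
            not_found[match.group("path")] += 1
        size = match.group("size")
        if size != "-":  # "-" means no response body was sent
            total_bytes += int(size)
        agent = match.group("agent")
        if any(hint in agent.lower() for hint in BOT_HINTS):
            bots[agent] += 1
    return {
        "statuses": statuses,
        "not_found": not_found,
        "bots": bots,
        "total_bytes": total_bytes,
    }


# Made-up sample lines; a real run would read the server's access log.
SAMPLE = [
    '192.0.2.1 - - [10/Feb/2006:13:55:36 -0800] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux) Firefox/1.5"',
    '192.0.2.2 - - [10/Feb/2006:13:56:01 -0800] "GET /files/missing.tar.gz HTTP/1.1" 404 312 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.3 - - [10/Feb/2006:13:57:12 -0800] "GET /style.css HTTP/1.1" 200 2048 "http://example.org/blog/" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.4 - - [10/Feb/2006:13:58:40 -0800] "GET /private/ HTTP/1.1" 403 - "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    report = summarize(SAMPLE)
    print("status codes:", dict(report["statuses"]))
    print("bytes sent:", report["total_bytes"])
    print("404 paths:", dict(report["not_found"]))
    print("bots:", dict(report["bots"]))
```

Every one of the four limitations listed above shows up here: the CSS file and the tarball appear in the log even though they are not HTML pages, the byte counts give bandwidth, the status field gives the 404 and 403 hits, and the User-Agent field gives at least a rough view of bots.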
