Happy new year y’all!
The past couple of days my feed reader has been chock full of posts about one of the following: the year in review, predictions for 2008, reflections and introspections. So much so that I got tired of reading about the “new year” and never got around to writing MY end of the year post, but I’m sure the world didn’t miss much. But I did run into an interesting problem as I was thinking about what could have been my end of the year post: exactly what all did I do last year?
So I started by writing down all the months, the idea being that I would put down all the significant events that happened in any given month next to it. The hope was that there wouldn’t be that many of them, so the list would be fairly manageable. Now, I have always known that my memory is not that great, and that is why I tend to rely on tools to do the dirty bookkeeping for me: calendars, to-do lists, reminders, etc. But it was still a little shocking when I couldn’t immediately recall what I did in, let’s say, May of last year. Of course I did remember things once I thought about it a little bit, often relying on context (what happened before May, after May, etc.).
The bottom line is that it wasn’t as easy as I thought it would be. For some months, I actually had to go back to my email inbox and other digital archives to figure out the salient happenings. This got me thinking about **personal information analysis and visualization**. And the more I thought about it, the more excited I became.
I was actually surprised to find so little information on the web about this. With our increasing information overload, cheap storage, and tons of archived data (online and offline), I think this space has tremendous potential for both academic and commercial ventures. For instance, here’s a really simple thing I want to be able to do: for a given time period (say 2007), I want to analyze and visualize all of my emails so that I can quickly figure out:
* who did I communicate with the most?
* what were the main topics I wrote about?
I couldn’t find any open source tool to do even this. And my initial Googling hasn’t turned up much in the way of commercial offerings either. The closest thing I could find was a project called [[http://alumni.media.mit.edu/~fviegas/projects/themail/study/index.htm|themail]] from the MIT Media Lab, but there’s no code that I can download. Then there is [[http://carohorn.de/anymails/|Anymails]], but it seems to be just a cool visualization without a lot of information (especially the kind I want).
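For what it’s worth, a very rough first pass at those two questions could probably be hacked together with nothing but Python’s standard library. The sketch below is only that, a sketch: it assumes the mail has been exported to an mbox file, it treats word counts over subject lines as a crude stand-in for “topics”, and the function names, the stopword list, the file path and the email address are all made up for illustration.

```python
import mailbox
import re
from collections import Counter
from email.message import Message
from email.utils import getaddresses, parsedate_to_datetime

# Tiny, hand-picked stopword list (an assumption; a real tool would use a fuller one).
STOPWORDS = {"the", "and", "for", "fwd", "you", "your", "with", "from", "this", "that"}


def in_year(msg: Message, year: int) -> bool:
    """True if the message's Date header parses and falls in the given year."""
    raw = msg.get("Date")
    if raw is None:
        return False
    try:
        return parsedate_to_datetime(str(raw)).year == year
    except (TypeError, ValueError):
        return False  # unparseable date: skip the message


def top_correspondents(messages, me: str, n: int = 5):
    """Count every address seen in From/To/Cc, ignoring my own."""
    counts = Counter()
    for msg in messages:
        fields = []
        for header in ("From", "To", "Cc"):
            fields += [str(v) for v in msg.get_all(header, [])]
        for _name, addr in getaddresses(fields):
            addr = addr.lower()
            if addr and addr != me.lower():
                counts[addr] += 1
    return counts.most_common(n)


def top_subject_words(messages, n: int = 5):
    """Crude topic proxy: most frequent non-stopword words in subject lines."""
    counts = Counter()
    for msg in messages:
        words = re.findall(r"[a-z']+", str(msg.get("Subject", "")).lower())
        counts.update(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return counts.most_common(n)


def analyze_mbox(path: str, me: str, year: int):
    """Run both counts over one year of an mbox archive (path is hypothetical)."""
    box = mailbox.mbox(path, create=False)
    msgs = [m for m in box if in_year(m, year)]
    return top_correspondents(msgs, me), top_subject_words(msgs)
```

Something like `analyze_mbox("archive.mbox", "me@example.com", 2007)` would then return the two ranked lists. It does nothing on the visualization side, and counting subject words is a long way from real topic analysis, but it shows how little is needed to get a first answer.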
If you know about any free or paid tools that can do this kind of analysis, please drop a line in the comments. And while you are at it, try to think about what YOU did all of last year :-)