November 14th, 2007

BitTorrent and the rise of streaming media

**DISCLAIMER**: I am //not// advocating piracy. I am //not// suggesting that I indulge in illegal content sharing, or that you should. I am merely making observations.

We are all greedy by nature, and despite the best intentions of the MPAA and RIAA, we will always try to get by without paying any money (if you missed the sarcasm, shame on you). We would like to watch our movies for free, and get all our music and all our TV shows without having to shell out a single penny from our pockets. OK, maybe not all of us, but let’s say a non-trivial fraction.

Back in the day, IRC was the king of sharing such content. A very small number of very adventurous people just put up MP3s and videos on their public web servers. While there is still a fairly large underground IRC community for file sharing, just putting up copyrighted content on a public web server is no longer feasible, for a variety of social and technical reasons. For one, you will surely hear from the friendly folk at the MPAA/RIAA very soon. And if you don’t, then you will quickly go bankrupt paying your hosting/bandwidth bills.

For a while after that, Napster and friends (Kazaa, LimeWire, DC++, eDonkey) ruled the stage. The idea was simple: you scratch my back and I’ll scratch yours. End users donated bandwidth as well as content. However, most of these systems were plagued by leechers (people who never uploaded anything), simply because there was no incentive for a user to share.

Enter [[wp>Bittorrent|BitTorrent]], the poster child of peer-to-peer technology. What started out as an experiment in content distribution soon became the most widely deployed technology for sharing illegal content. In all fairness, BitTorrent is also used for distributing a variety of legitimate content (Linux ISOs, for instance). However, BitTorrent traffic due to pirated content probably exceeds traffic due to legal content by a huge margin. In any case, recent years have seen an [[http://www.readwriteweb.com/archives/p2p_growth_trend_watch.php|explosion]] in [[http://torrentfreak.com/bittorrent-dominates-internet-traffic-070901/|P2P traffic]] to the point that [[http://torrentfreak.com/comcast-throttles-bittorrent-traffic-seeding-impossible/|ISPs are actually taking measures]] to curb BitTorrent traffic in their networks.

But things have changed a lot since then. A lot more people have broadband connections, so bandwidth is less of a constraint for home users. [[http://youtube.com|YouTube]] and friends have revolutionized the way we think about video as a medium. People are a lot more comfortable with watching streaming videos online than they were a few years back.

Which makes me think: why would I ever want, or even //need//, to download anything (using BitTorrent or anything else) ever again? Quite apart from the fact that I spare myself a lot of potential mental and financial stress by abstaining from such activities, it is simply so much easier and quicker to access content online and stream it home.

If you haven’t already, check out some of these sites and you will see what I mean:
* [[http://southparkzone.com|Southpark, online]]
* [[http://alluc.org|everything, online]]
* [[http://stage6.divx.com|High quality videos. Coupling. Hustle.]]
* [[http://arresteddevelopment.msn.com|Arrested Development, online. **officially**]]

I have read scattered reports of streaming functionality being incorporated in the BitTorrent protocol, but honestly I don’t see the point. What do you think?

February 21st, 2006

FeedTree is half the solution

I mentioned [[http://feedtree.net|FeedTree]] yesterday — I’ve been thinking about it on and off since then, trying to judge the magnitude of the problem, and the value of the solution.

Let me step back a second. For those of you who don’t know about FeedTree, it’s basically a tool to help the distribution of RSS feeds (what they call “micronews”). The premise is that these days there are a large number of blogs, an equally large number of people who read these blogs, and a significant portion of those readers come via RSS feeds.

In fact, this is an oversimplification of the problem, for RSS feeds are becoming ubiquitous these days. Pretty much any kind of content can have an associated RSS feed: newspapers, magazines, calendars, project management tools, software releases, email. Things are further complicated because RSS feeds are trivially easy to embed in other content (thanks to the large number of tools and libraries available for almost all languages). So you can see my del.icio.us tags on my home page, fetched via RSS. Ditto with Flickr.

So the basic problem FeedTree is trying to address is that of efficient utilization of network bandwidth and efficient distribution of content. This is useful for publishers (I pay for my bandwidth, so I want to get the best value out of it); useful for ISPs (if RSS feeds end up eating a substantial chunk of the backbone traffic, like BitTorrent does right now, you can be sure ISPs will be interested); and useful for end users (I don’t have to wait for my aggregator to “pull” feeds every so often, news will be pushed to me instantly).

The key idea is that a large chunk of this traffic is redundant (say 50% of the participants read Slashdot) — so instead of everyone pulling data from Slashdot directly, we can share the data among each other in a peer-to-peer fashion. The technology is not new, but the idea is. You can read a lot more about the [[http://trac.feedtree.net/project/wiki/FeedTree|gory details and some pretty pictures]] on FeedTree’s website.
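To get a feel for how much of that traffic is redundant, here is a back-of-the-envelope sketch in Python. It is a deliberately simplified model and not FeedTree’s actual protocol: the function names, the fixed fan-out, and the idea that the publisher only pushes each update to a handful of peers (who forward it on) are all my own assumptions for illustration.

```python
def direct_polling_requests(subscribers, polls_per_hour):
    """Requests the publisher serves per hour when every reader polls it directly."""
    return subscribers * polls_per_hour


def tree_push_messages(updates_per_hour, fanout):
    """Messages the publisher sends per hour when it only feeds `fanout` peers,
    and those peers forward each update to the rest of the readers."""
    return updates_per_hour * fanout


def tree_depth(subscribers, fanout):
    """Number of forwarding hops before a tree with the given fan-out
    has reached every subscriber (hypothetical, perfectly balanced tree)."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    depth, reached, level_size = 0, 0, 1
    while reached < subscribers:
        level_size *= fanout   # peers at the next level of the tree
        reached += level_size  # total readers covered so far
        depth += 1
    return depth


if __name__ == "__main__":
    # Made-up numbers: 10,000 readers polling twice an hour,
    # versus 3 real updates an hour pushed down a tree with fan-out 4.
    print(direct_polling_requests(10_000, 2))
    print(tree_push_messages(3, 4))
    print(tree_depth(10_000, 4))
```

The point of the toy model is only that the publisher’s load stops depending on the number of readers; the cost moves to the peers, and each update takes a few extra hops to arrive.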

So if everyone starts using FeedTree, will all our problems disappear? Leaving the social aspects of adoption aside, I think there will still be problems.

First of all, the distribution model for feeds is not as simple as everyone pulling feeds from the publisher. Good examples of this are the various [[http://www.planetplanet.org/|Planets]]. Technorati just released Favorites. Then there is Memeorandum and other meme-trackers. So some kind of network aggregation is already happening, though it’s hardly anything as organized as what FeedTree proposes.

Secondly, these feeds are growing richer in content. Podcasts are just the beginning. Imagine that all feeds had some rich content (audio/video) embedded in them, so that feeds grow to much more than just a few KB. At that point a P2P distribution system will not necessarily solve the problem (we have already seen the problems that BitTorrent has run into, precisely because it’s a **highly effective** P2P content distribution mechanism).

Thirdly, I would expect content distribution giants such as Akamai to step into this business soon (if they haven’t already). If NYT and CNN can use Akamai for delivering their audio/video content, they most certainly can use them for RSS feeds. Akamai has a huge, robust content distribution infrastructure that can simply be plugged in.

And finally, as an end user, my biggest gripe with RSS feeds is not the ‘pulling’ aspect. It’s the way they are aggregated and presented. It’s the way metadata is maintained. I don’t want to have to statically maintain my OPML; I want it to evolve dynamically, learning from my reading patterns and automatically picking up stuff that I might find interesting. I don’t want to have to read feeds in order of their arrival, or categorized by some tags; I want Google News-like clustering across my feeds. When I am overwhelmed by the number of feeds, I want to see the largest clusters only and zoom in as I feel like reading more.
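To make the clustering wish a bit more concrete, here is a rough sketch of what I have in mind. It is a toy: it groups item titles greedily by word overlap (Jaccard similarity) and lists the largest clusters first. The function names and the 0.3 threshold are arbitrary choices of mine, and a real aggregator would surely need something smarter than comparing titles.

```python
import re


def tokens(title):
    """Lowercased set of words in a title, ignoring very short words."""
    return {w for w in re.findall(r"[a-z0-9]+", title.lower()) if len(w) > 2}


def jaccard(a, b):
    """Overlap between two word sets: |intersection| / |union|."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_titles(titles, threshold=0.3):
    """Greedily assign each title to the most similar existing cluster,
    or start a new cluster if nothing is similar enough.
    Returns lists of titles, largest cluster first."""
    clusters = []  # each cluster: {"words": set of words, "titles": list of titles}
    for title in titles:
        words = tokens(title)
        best, best_score = None, threshold
        for cluster in clusters:
            score = jaccard(words, cluster["words"])
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append({"words": set(words), "titles": [title]})
        else:
            best["titles"].append(title)
            best["words"] |= words  # let the cluster vocabulary grow
    return sorted((c["titles"] for c in clusters), key=len, reverse=True)


if __name__ == "__main__":
    items = [
        "Apple releases new iPod",
        "FeedTree paper published",
        "New iPod released by Apple",
    ]
    for group in cluster_titles(items):
        print(len(group), group)
```

Showing only the first few groups gives the “largest clusters only” view, and expanding a group is the zooming in.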

I’m looking forward to further results from FeedTree!