Thursday, October 23, 2008

BitTorrent + HTML = free bandwidth!

Here I am, trying to write something in English to reach as many people as I can.
I realized that I have neither the opportunity nor the ability to code everything I want, so I decided to write down some of my ideas on this blog, hoping that some of them will interest somebody who might, one day, turn them into action.

I'll start with an idea I have been thinking about for the last few days.
I thought about the rush hour on the release date of Firefox 3, and how there were millions of people downloading at the same time from the same site (which was a Drupal site, I think, by the way).
At the same time I am trying to code a new site of my own, and I have seen what a pain it can be to be the webmaster of a site hosted on one of the free hosting services you can find on the web.

I have found a great free hosting service with PHP and MySQL with InnoDB enabled, and it is totally free except for some banners. The problem is that even they cannot work miracles: the bandwidth is limited, and the CPU resources are even more so.

Then I thought about a protocol that works especially well in rush hours. It has been a long time since the Red Hat Linux distribution started using BitTorrent to distribute its software.

So I asked myself: what about improving HTML somehow, adding the possibility to specify the download path in a torrent way?
As you know, by specifying the URL of a resource (like an image) the webmaster forces the browser to go and download that image, or whatever static content the URL points to, separately. Normally the path given by the URL is a relative path to static content placed on the same server, so it consumes the same bandwidth as the page itself (the same server bandwidth and the same client bandwidth).

With this BitTorrent extension to HTML you, the webmaster, could have the first client download the whole package of images from you, since you run the tracker of the BitTorrent system. Then, supposing that we are in rush hour, the second client will try to download the same things, and the third one, and the fourth one, and so on.
Since this is rush hour, all of the requests will be made at almost the same time, so it is quite probable that the whole package of images that the first client downloaded a moment ago is still in its cache while the user is just reading the page, not using any bandwidth.
The second and the third (and so on) clients could simply contact the first one to have the static content delivered, without overloading the server.
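
To get a feeling for the scenario above, here is a minimal simulation sketch. It is only a toy model under my own assumptions: the function name simulate_rush_hour and the parameter cache_hit_prob (the chance that some earlier visitor still has the package cached and can serve it) are invented for illustration, and the cost of the torrent file and of the tracker is ignored here.

```python
import random


def simulate_rush_hour(num_clients, package_kb, cache_hit_prob, seed=0):
    """Return (kB served by the origin server, kB served by other clients).

    Toy model: every client needs the same package of static content.
    The first client always downloads from the server. Each later client
    is served by a peer with probability cache_hit_prob (an assumption
    standing in for "an earlier visitor still has it cached"), and by
    the server otherwise.
    """
    rng = random.Random(seed)
    server_kb = 0
    peer_kb = 0
    for i in range(num_clients):
        peer_available = i > 0 and rng.random() < cache_hit_prob
        if peer_available:
            peer_kb += package_kb
        else:
            server_kb += package_kb
    return server_kb, peer_kb
```

With cache_hit_prob set to 1.0 the server sends the package exactly once, whatever the number of clients; with 0.0 the result is the same as plain HTTP, where the server sends everything to everybody.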

I don't think this should be too hard to implement, because I saw that a plugin turning Firefox into a BitTorrent client was already made for version 2.x, and on the other hand there are plenty of trackers written in PHP and MySQL.
The work needed to implement all this on the client side is to:
  • automatically intercept the URL in the HTML code
  • translate the new kind of URL into the address of the torrent file
  • start downloading the content.
Then, when the content has been downloaded, just display it as the DOM structure says.
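
The first two client-side steps could be sketched like this. Everything specific here is my assumption, not an existing standard: I invent a hypothetical "bt://" URL form such as bt://example.org/static.torrent#images/logo.png, where the part before "#" locates the torrent file and the part after it names one file inside the package. The third step (the actual download) is left out, since it would need a real BitTorrent client.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

BT_SCHEME = "bt"  # hypothetical URL scheme, invented for this sketch


def translate(bt_url):
    """Turn a hypothetical bt:// URL into (torrent file URL, file in package)."""
    parts = urlsplit(bt_url)
    torrent_url = f"http://{parts.netloc}{parts.path}"
    return torrent_url, parts.fragment


class TorrentLinkFinder(HTMLParser):
    """Collects every src attribute that uses the hypothetical bt:// scheme."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "src" and value and urlsplit(value).scheme == BT_SCHEME:
                self.found.append(translate(value))


def find_torrent_resources(html):
    """Step 1 and 2: intercept the URLs and translate them."""
    finder = TorrentLinkFinder()
    finder.feed(html)
    return finder.found
```

Ordinary relative URLs such as "plain.png" have no scheme, so they are simply ignored and would be loaded over HTTP as usual.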

On the server side you just have to put all the static content in a directory shared over the BitTorrent protocol, and then run a simple tracker of the clients that download all that content from you.
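
Just to show how little bookkeeping the tracker idea needs, here is a toy in-memory sketch. It is not the real BitTorrent tracker protocol (a real tracker answers HTTP announce requests); the class name ToyTracker, the method announce and the content_id key are all hypothetical names of mine.

```python
from collections import defaultdict


class ToyTracker:
    """Remembers which clients have asked for which package of content."""

    def __init__(self):
        # content_id -> list of peer addresses, in order of arrival
        self._swarms = defaultdict(list)

    def announce(self, content_id, peer_address):
        """Register a client and return the other clients known so far.

        An empty list means nobody else has the content yet, so the
        client has to download it from the server itself.
        """
        swarm = self._swarms[content_id]
        others = [peer for peer in swarm if peer != peer_address]
        if peer_address not in swarm:
            swarm.append(peer_address)
        return others
```

The first client that announces gets an empty list back, and every later client gets the addresses of those who came before. A real tracker would also have to forget the clients that have left the page.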

So even if I have neither the time nor the skills necessary to do all this on my own, I think it should take little effort for somebody who is already an expert on both the client and server sides of BitTorrent, and maybe for somebody who already knows how to program Firefox plug-ins.

On the other hand, this could be useless if it turns out that the bandwidth used by the tracker is greater than the bandwidth spent in the simple case of normal HTML and HTTP transactions.
Since I am no BitTorrent expert, I simply tried downloading a torrent file from a well-known tracker, just to see what its size was.
It was a torrent with 20 current seeders, and the size of the torrent file was 20 kB.

It seems that if you use my method to distribute less than 20 kB of static content, it is not worth it at all, since the torrent file alone is that big. I don't know whether the tracker has more work to do after the torrent file has been downloaded (probably it does), but then I did this little test:
I went to the Yahoo main page just to see what the size of an average image on that page was.
In the center of the page there was a small image of just 156×117 pixels, weighing 27 kB, and there were four of them.

The Yahoo page was just the first thing that came to my mind, and already there the model would be usable to reduce the bandwidth use of the main page: those four images add up to 108 kB, well above the 20 kB of a torrent file.
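
The comparison can be written down as a small calculation. This is a best-case sketch under assumptions of mine: the torrent file for the package weighs the same 20 kB I measured, only the first visitor downloads the images from the server, every visitor downloads the torrent file from the server, and any tracker traffic after that is ignored (I don't know how much it is). The function names are made up for the example.

```python
def server_kb_plain_http(num_clients, content_kb):
    """Plain HTTP: the server sends the content to every client."""
    return num_clients * content_kb


def server_kb_with_torrent(num_clients, content_kb, torrent_file_kb):
    """Best case with torrents: the content leaves the server once,
    but every client still fetches the torrent file from it."""
    return content_kb + num_clients * torrent_file_kb


IMAGES_KB = 4 * 27       # the four 27 kB images measured on the page
TORRENT_FILE_KB = 20     # size of the torrent file I downloaded

plain = server_kb_plain_http(100, IMAGES_KB)
torrent = server_kb_with_torrent(100, IMAGES_KB, TORRENT_FILE_KB)
print(plain, torrent)
```

For 100 visitors this gives 10800 kB with plain HTTP against 2108 kB in the best torrent case. With a package smaller than the torrent file the result flips: 15 kB of content costs 1500 kB over plain HTTP but 2015 kB the torrent way.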

I wrote this blog entry because I don't want anybody to copyright this idea (I know that is almost impossible, but you never know); instead, I would be very pleased if somebody carried this idea forward and implemented it. I'd like it to be open source.

Let's see if somebody is interested in it.