Category: Software

Experiences with Google App Engine


I’ve been playing around with [[http://appengine.google.com|Google AppEngine]] for the past two weeks, and the experience has been mixed so far. First, the good:

* really easy to build something simple and get started.
* no need to worry about scaling, backup, replication etc. I haven’t verified this, obviously, but at least that’s the claim.
* the integration with Google accounts is nice.
* good documentation, lots of sample code available.
* dev server really helps with most of the development.
* the sort of restrictive resource usage limits (see below) forced us to think carefully about our code and heavily optimize certain operations to make them work on GAE.

{{ http://code.google.com/appengine/images/appengine_lowres.jpg}}

And now, the bad:
* too many limits: 1 million is their favorite number. No files over 1MB, no request should take more than 1 million CPU cycles (whatever that means) and who knows what other limits they impose internally. While developing, this was the biggest barrier for us. Things would randomly fail, and then our application would be disabled for several hours.
* The dev server doesn’t replicate the constraints in production. So everything would run fine and dandy locally, and the minute we upload, it would fail. Since we can only debug in production, and our application exceeded the quota every time we ran it, debugging was extremely slow and painful.
* local data store is excruciatingly slow. But this is not that critical, since it is only for testing anyways.
* even the remote data store is very flaky and slow at times. Any query involving more than a few hundred elements exceeds the quota.
* the bulk uploader is very useful, but again it is really really slow. If you want to upload anything in “bulk”, you’ll have a hard time. The parameters have to be chosen carefully as well. Even for very simple data models involving 3-5 fields (mostly strings), we had to reduce the batch size to 2-4 to make it work. And despite that we got a few HTTP 500 errors while uploading.
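
The batch-size tuning described above can be sketched in a few lines of Python. This is only an illustration of the idea (small batches, retry on failure), not how the bulk uploader itself works: send_batch is a hypothetical stand-in for whatever actually posts a batch to the server, and the default batch size of 3 simply mirrors the 2-4 range mentioned above.

```python
import time


def chunked(records, batch_size):
    """Yield successive lists of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def upload_in_batches(records, send_batch, batch_size=3, max_retries=3, delay=1.0):
    """Upload records in small batches, retrying a batch when send_batch raises.

    send_batch is a hypothetical callable (an assumption for this sketch);
    it is not part of any App Engine API. Returns the list of batches that
    still failed after max_retries attempts.
    """
    failed = []
    for batch in chunked(records, batch_size):
        for attempt in range(1, max_retries + 1):
            try:
                send_batch(batch)
                break
            except IOError:
                # Wait a little longer after each failed attempt.
                time.sleep(delay * attempt)
        else:
            # The retry loop ended without a successful send.
            failed.append(batch)
    return failed
```

Keeping the failed batches in a list means a long upload can carry on past an occasional HTTP 500 and the leftovers can be retried afterwards.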

But it’s been fun so far. Hopefully most of these issues will get ironed out moving forward. As for what we are building? That will have to wait for another post ;-)

Still around


I was shocked to see how infrequently I’ve blogged since the [[http://floatingsun.net/blog/archives/|beginning of 2007]]. Maybe it’s good in a way: I had enough things to keep me busy in my real life that I didn’t have too much time for my online self. The past few weeks I was swamped with some paper deadlines. Now that those are done, I’m trying to catch up with the rest of my life, and I finally got around to poking at my blog again. A lot of other things have happened in the meantime and I’ll try to gradually cover them in subsequent posts.

Meanwhile, I have started hacking together my own theme on top of [[http://getk2.com|K2]]. Right now I’m maintaining it as a set of [[http://www.selenic.com/mercurial/wiki/|Mercurial]] [[http://www.selenic.com/mercurial/wiki/index.cgi/MqExtension|mq]] patches on top of the SVN trunk of K2. So far it’s working out alright, we’ll see how it goes moving forward. The only major changes I’ve made so far are the new [[http://floatingsun.net/blog|Home]] page and some typographical tweaks. The Home page has two columns, and they support [[http://widgets.wordpress.com|widgets]] and K2 modules, so you can pretty much put arbitrary content in the columns. I also whipped up my own text widget, which applies the content filters to the text; the default text widget outputs raw text, which I didn’t like. I want to be able to use my [[http://floatingsun.net/blog/code/wp-dokuwiki/|WP-Dokuwiki]] plugin wherever possible.

I’m still trying to figure out fonts and color schemes. I’m so bad at color schemes, really! I’m really digging Adobe’s new online tool though — [[http://kuler.adobe.com|Kuler]]. I’ll also beg/borrow/steal the features I like from some of the other really neat WordPress themes out there, like [[http://cutline.tubetorial.com|Cutline]], [[http://squible.com|Squible]], [[http://deanjrobinson.com|Redoable]] etc.

Finally, I know that I’m probably spending too much time on the look and feel of the blog. I’m not really a designer, so this is my attempt to fool myself into believing that I can at least try. But I will start focusing on content very soon now.

What’s with __MACOSX in Zip files?

More and more people are using Macs for development these days. As an example, a lot of the core developers of some of the leading web frameworks use a Mac as their primary development platform. Several plugin and theme authors for WordPress also develop on a Mac. While this is a good thing, there is one particular side effect of this development that annoys me beyond belief.

It seems that the easiest way to archive something on a Mac is to right click on your directory of choice in Finder and select “Archive as…”. This creates a Zip file, which the developer can then distribute to users. The problem is that Apple, like many other software giants, tends to bend the user’s will and interpret what the user wants to mean something else. In this case, the natural thing for the OS to do is pack up that directory, and ONLY that directory, in a Zip file. But no sir, how can that be? How else would Apple “transparently” embed some metadata in the Zip file, so that some other Mac user who opens it in Finder can benefit from it?

Apple does this by creating another folder suspiciously named ”%%__MACOSX%%” at the root of your Zip archive. Here’s an example (it’s the Cutline theme):

0 02-02-07 12:37 Cutline 1.1/
12292 01-31-07 17:16 Cutline 1.1/.DS_Store
0 02-02-07 12:38 __MACOSX/
0 02-02-07 12:38 __MACOSX/Cutline 1.1/
82 01-31-07 17:16 __MACOSX/Cutline 1.1/._.DS_Store
82 01-31-07 00:12 __MACOSX/Cutline 1.1/._ie6.css
238 01-30-07 23:59 Cutline 1.1/ie7.css
82 01-30-07 23:59 __MACOSX/Cutline 1.1/._ie7.css
0 09-13-06 17:30 Cutline 1.1/images/
12292 09-13-06 17:30 Cutline 1.1/images/.DS_Store
0 02-02-07 12:38 __MACOSX/Cutline 1.1/images/
82 09-13-06 17:30 __MACOSX/Cutline 1.1/images/._.DS_Store
65705 09-11-06 15:55 Cutline 1.1/images/header_1.jpg
34365 09-11-06 15:55 __MACOSX/Cutline 1.1/images/._header_1.jpg
62867 09-11-06 15:59 Cutline 1.1/images/header_2.jpg
33224 09-11-06 15:59 __MACOSX/Cutline 1.1/images/._header_2.jpg
82708 09-11-06 16:01 Cutline 1.1/images/header_3.jpg
34855 09-11-06 16:01 __MACOSX/Cutline 1.1/images/._header_3.jpg
59780 09-11-06 16:03 Cutline 1.1/images/header_4.jpg
33555 09-11-06 16:03 __MACOSX/Cutline 1.1/images/._header_4.jpg

This folder contains, among other things, thumbnails for images in the original archive. Now, this kind of unwanted, undesirable outcome really, really annoys me. But I’ll try to keep my cool and present a systematic analysis of why what Mac OS X does is not only wrong, but also stupid and unnecessary:

  • No surprises: As a user, I don’t like surprises, especially of the bad kind. If I request to archive a directory into a Zip file, that’s exactly what I want. If I later unarchive that Zip file, I should get my original directory back. Nothing more, nothing less. Any kind of unintended behavior is BAD.
  • We are not stupid: If I wanted you to stick in an extra folder named ”%%__MACOSX%%” in my archive, I’d let you know. Your users are a smart group, don’t insult them like this.
  • I hate clutter: In my WordPress themes directory, I unzip Cutline. If each theme starts creating its own ”%%__MACOSX%%” folder, then my themes directory would soon get cluttered with needless garbage.
  • It breaks things: If Mac OS X did something harmless, like embed some metadata (like the Zip file creator) into the Zip file itself, I might have been OK. But creating an entire tree structure in the archive just breaks things, in more ways than one. As an example, if, like Cutline, each WordPress theme started creating ”%%__MACOSX%%” folders in the root of the archive, then later if I install another theme, I’ll get lots of errors and file name collisions because the new theme will also try to extract into the ”%%__MACOSX%%” folder. Not only this, some programs (like Gallery and WordPress) have the ability to load plugins/images directly from Zip files. As a result, I’ll end up with unwanted images, themes and plugins in my setup. Worse, it might actually break your installation. Since you did not create the ”%%__MACOSX%%” folder yourself, you don’t know what is in it, and it might not always obey the expectations of the software.
  • Security: Again, you did not explicitly create that folder. What if someone creates a virus that just modifies the default zip program on the Mac to sneak a malicious payload into the ”%%__MACOSX%%” folders of any new Zip archives you create? Apart from the security risk, it’s a time sink. Why should I go around cleaning up a mess that I did not create? Software is supposed to make my life easier, not harder.
  • Redundant: From the looks of it, it seems that all of the data inside the ”%%__MACOSX%%” folder is created from the original directory. No external information is used/needed. If that’s the case, why oh why would anyone EVER need this stupid new folder? If some metadata is needed, it can always be reconstructed from the original on demand. This seems downright stupid to me.

Would someone, anyone, please explain Apple’s intent and motivation behind this “feature”? What are the benefits (if any)?
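
In the meantime, the clutter is easy to strip after the fact. The sketch below uses the zipfile module from Python’s standard library to copy an archive while skipping the %%__MACOSX%% tree, .DS_Store files and ._ entries. The function names are made up for this example, and it assumes that nothing in those entries is needed on a non-Mac system.

```python
import zipfile


def is_mac_metadata(name):
    """Return True for __MACOSX/ trees, ._ files and .DS_Store entries."""
    parts = name.rstrip("/").split("/")
    if parts[0] == "__MACOSX":
        return True
    basename = parts[-1]
    return basename == ".DS_Store" or basename.startswith("._")


def strip_mac_metadata(src, dst):
    """Copy the archive src to dst, skipping Mac metadata entries.

    src and dst may be file names or file-like objects.
    Returns the list of entry names that were left out.
    """
    skipped = []
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            if is_mac_metadata(info.filename):
                skipped.append(info.filename)
            else:
                # Reusing the ZipInfo keeps the entry's timestamp and compression.
                zout.writestr(info, zin.read(info.filename))
    return skipped
```

For example, strip_mac_metadata("cutline.zip", "cutline-clean.zip") would write a cleaned copy and return the names it left out (both file names are placeholders).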

Tools I use: beamer

This is largely a rip-off of my original article. I figured I should repost it for posterity and it fits in line with my tools theme.

Being a grad student (for that matter, in almost any profession these days), I frequently need to give talks or present some material. I have finally settled on Latex Beamer as my preferred presentation tool, and this article describes why.

Introduction

Presentation is one of the most effective means of communication for a small audience with diverse backgrounds. In both industry and academia, it is becoming increasingly important to create effective and compelling presentations. Not surprisingly then, the presentation tool you use becomes very important in the workplace.

The de facto tool for presentation out there is Microsoft Powerpoint. For more reasons than one, I prefer not to use it. I have tried several alternatives, and finally decided to use Latex Beamer for my presentations. Here I try to describe why I made this choice. I must mention here that the beamer web page looks deceptively simple and naive; don’t be fooled by it. Beamer is one of the most sophisticated and extensively documented (user manual has more than 300 pages of professionally written documentation) presentation tools I have come across. Take a look at one of the sample slides to get a feel for what beamer can do.

Things I dislike about other presentation tools

While I’m not talking about any one tool in particular, the general flavor is of tools belonging to the Powerpoint family (this includes OpenOffice.org’s Impress, KOffice’s KPresenter etc).

  • I have to worry about layout
  • Font sizes are a function of amount of content
  • Changing parts of a “theme” is hard
  • Powerpoint slides won’t run nicely on Impress or KOffice. The latter two won’t run at all on Powerpoint. Why do I need something as bulky as Powerpoint just to do the presentation? I can understand that making slides might need significant software complexity, but can’t we have something more lightweight for presenting?

Things I like about beamer

  • It’s LaTeX: latex and friends have stood the test of time, and for more than two decades people have been using TeX-derived technologies to typeset their writing. With latex, beamer makes it easier than ever to put mathematical formulae and all kinds of symbols in your presentations, embed images, make tables and do everything else that you can do with latex. Since many of us already use latex, it means there is one less tool to learn: I can make my presentations in a language that I’m already familiar with! And I don’t need any bulky tool to manipulate my presentation, just a text editor is enough, thank you.
  • It’s PDF: We all know what PDF stands for: Portable Document Format. That’s it! Portable! Latex runs on all major operating systems and architectures out there. Once you get a PDF from Latex, you can display it using any regular PDF viewer. Imagine how easy it now becomes to move your presentation around. You don’t have to worry if your laptop breaks down and the other laptops in the room don’t have the right version of Powerpoint installed. Put your PDF on a USB key and stop worrying about it!
  • Takes care of layout
  • Themes are endlessly customizable: beamer comes with dozens of pre-packaged themes, and it’s very easy to modify an existing theme. Same thing with fonts and colors (you can even do alpha transparency!)
  • Notes and handouts made the way you want them
  • Organize your presentation in a logical manner: beamer sort of follows the MVC philosophy. In each presentation, there is a content structure, which determines how your content flows through (just like a regular article with sections and subsections). Then there is a slide structure, which determines how this content fits onto your slides. The content structure controls the generation of navigation and table of contents. The slide structure controls the slides and the control flow between them.
  • Amazing documentation: The beamer user manual is over 200 pages long, and it’s all good solid documentation. It is amazingly well written considering that it’s mostly done by a single person. It starts off with a nice tutorial, followed by detailed references and examples.
  • Accompanying packages: Just check out the documentation for xcolor and pgf. It is just as comprehensive as beamer’s own, and these packages make it easy and fun to do fancy stuff with beamer, like drawing pretty pictures and doing some basic animation. Again, all with the comfort of latex.
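
To make the split between content structure and slide structure concrete, here is a small Python sketch that turns an outline into beamer source. The function name beamer_source and the outline format are invented for this example, the Warsaw theme is just one of the stock themes, and no LaTeX escaping is done, so treat it as an illustration rather than a finished tool. The \section commands carry the content structure (and feed the table of contents), while each frame environment is one slide.

```python
def beamer_source(title, outline, theme="Warsaw"):
    r"""Render an outline as beamer source text.

    outline is a list of (section_title, frames) pairs, where frames is a
    list of (frame_title, bullet_points) pairs. This format is an
    assumption made for the sketch, not something beamer defines.
    """
    lines = [
        r"\documentclass{beamer}",
        r"\usetheme{%s}" % theme,
        r"\title{%s}" % title,
        r"\begin{document}",
        r"\frame{\titlepage}",
        # The table of contents is built from the \section commands below.
        r"\frame{\tableofcontents}",
    ]
    for section_title, frames in outline:
        # Content structure: sections, as in a regular article.
        lines.append(r"\section{%s}" % section_title)
        for frame_title, bullets in frames:
            # Slide structure: one frame environment per slide.
            lines.append(r"\begin{frame}{%s}" % frame_title)
            if bullets:
                lines.append(r"  \begin{itemize}")
                for bullet in bullets:
                    lines.append(r"    \item %s" % bullet)
                lines.append(r"  \end{itemize}")
            lines.append(r"\end{frame}")
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"
```

Writing the returned string to a .tex file and running it through pdflatex should give a PDF with a title slide, a table of contents and one slide per frame.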

But nothing is perfect

  • No knowledge of projectors or screens — the user has to deal with that (or the operating system)
  • Animation is still hard.
  • In general, multimedia is hard: embedding audio and video clips may not work reliably on all platforms.

I highly recommend beamer to anyone who wants to try an alternative to Powerpoint, and if you write a lot of technical papers in latex, you’ll immediately love beamer. Check out my NSDI talk for a sample of what beamer can do for you.

WP-Dokuwiki 0.2


This is a major release, with some new features and bug fixes.

New Features:
* Support for multiple wiki tags.
* Enabled JavaScript-based toggling of ToC, as in Dokuwiki.
* Added answers.com interwiki link.
* Added acronym for SCM.
* Removed the template directory — this cuts down significantly on the size.
* Support for formatting of ‘urlextern’ links.

Get the tarball [[http://floatingsun.net/code/wp-dokuwiki-0.2.tar.bz2|here]].

Further details on installation, usage and older versions can be found on the [[http://floatingsun.net/blog/code/wp-dokuwiki|WP-Dokuwiki page]].