Tagged: Tools

March 21st, 2006

Tools I use: screen

The [[http://www.gnu.org/software/screen/|screen website]] introduces screen thus:

//Screen is a full-screen window manager that multiplexes a physical terminal between several processes, typically interactive shells.//

But I don’t think that does justice to the utility of screen. Let me try to simplify the introduction a little and motivate the usefulness of screen with some examples.

Let’s say you started some long-running simulation in your lab (for instance, a shell script that calls your main routine with different parameters, then collects/collates the data and also plots it). Now you leave it running and go home, hoping that when you come in the next morning everything will be done.

But if something goes wrong in between, how do you find out? One cheap way to do this is to use ''nohup'', redirect the output of the process to some log file, and inspect the log file from home. But this is not neat. Ideally, you want to be able to interact with the process as if it were running locally (keystrokes and everything).

More importantly, how do you recover from failures? Say your script hangs somewhere and needs some input to proceed. How do you provide it from home? You can of course use remote desktop solutions such as VNC, but they typically have high bandwidth requirements and are almost unusable over slow networks (the kind found in most homes; yes, that includes DSL and cable).

Basically, screen lets you run multiple shells and “detach” them. Later on, you can simply SSH to your machine, “reattach” your screen session, and have your console workspace restored just as it was. You can monitor and interact with your jobs as if they were running locally. In some sense, then, screen is a poor man’s version of full-fledged remote desktop protocols, since it basically does the same thing for consoles.

But screen can be used for a lot of other things as well. As the introduction says, screen is a window manager. What that means is that you can have multiple “windows” within a single screen (think of them as buffers in an editor, or just multiple shells that you can switch between). You can split up the screen to display multiple windows at once. This is very convenient if you are editing your script in one window and executing it in another while tailing a log file in a third window, for instance.

Another popular use for screen is collaborative editing. Screen sessions can be shared across multiple users, so multiple parties can communicate using a single screen. You can edit a report together with your project partners, or just keep an eye on what your friends are typing ;-).

Finally, screen is infinitely configurable. Do ''man screenrc'' for details. Here is a basic configuration file to get you started; save it as ''~/.screenrc'':

  # no annoying audible bell, please
  vbell on
  # detach on hangup
  autodetach on
  # don't display the copyright page
  startup_message off
  # caption
  caption always "%w"
  # hardstatus
  hardstatus alwayslastline "%H %C%a %M %d, %Y Load: %l "

February 17th, 2006

Turbogears rocks

So I’ve finally started working on my first serious (serious might be too strong a word, but certainly non-toy) web app using [[http://turbogears.org|Turbogears]]. I don’t want to make this yet another Turbogears vs. X,Y,Z ([[http://rubyonrails.com|Rails]], [[http://www.djangoproject.com/|Django]], put-your-favorite-framework) post — after much reading (and a fairly low signal-to-noise ratio in a lot of such comparisons), I’ve decided that it doesn’t really matter. Functionally, almost all frameworks are more or less equal. Personally, I feel that for small projects, it boils down to personal taste.

For me, Rails was out from the start because I’m a Python guy. I coded extensively in Ruby at one point, but I didn’t enjoy it as much as I enjoy coding in Python, and that’s that. Besides, I’m heavily tied into some tools (such as matplotlib) that just don’t have comparable counterparts in Ruby (yet). To be honest I haven’t looked at Django in any depth, so I can’t make a fair comparison. On the other hand, I //have// been following Turbogears pretty much from its beginnings, although I haven’t tried it in any great depth either.

What it all boiled down to was that I had to pick one of them, any one, and dive in: stick with it till I got to the meat, because only then would I figure out the strengths and weaknesses. I chose Turbogears because I really like the philosophy behind the project. Each of the component projects is itself quite mature. I’ve been tracking the TG mailing list, and the community is great! There are a lot of smart people who evidently know what they’re doing.

I spent some hours with TG today and I’m really enjoying it. Here are some of the things that I thought were great, but which are not really highlighted as TG features. I think they should be emphasized a bit more, because they’re all pretty significant for any project:

* Extending Jeff Watkins’ [[http://nerd.newburyportion.com/2005/11/updated-identity-framework|identity framework]] was really easy and intuitive. I just derived my own User class from TG_User and that’s it! I was all set for using identity!
* I really **loved** [[http://checkandshare.com/catwalk/|Catwalk]]. I cannot stress enough how useful it was for me, and how well it has been done. It is the perfect tool when you’re bootstrapping your database and testing your application. The interface is beautiful and intuitive, and it even supports SQLObject inheritance and joins!
* The entire [[http://turbogears.org/docs/toolbox/|toolbox]] is actually very useful, especially the WidgetBrowser. The things that I didn’t find of much use were the web-based Python console and [[http://www.checkandshare.com/modelDesigner/|ModelDesigner]]. I think the latter is actually useful; I just haven’t gotten around to using it yet.
* Widgets: the new widgets are a great way of quickly putting together interactive elements on a page and even displaying data programmatically.

So far I haven’t touched [[http://mochikit.com/|Mochikit]] at all, and have barely scratched the surface of Kid and CherryPy, but given my experiences so far, I’m really looking forward to digging deeper. A comforting thought is that TG is highly flexible in terms of its components: people have used SQLAlchemy instead of SQLObject, and there are [[http://www.blueskyonmars.com/2006/01/06/template-plugins-for-everyone/|template plugins]] for Cheetah, Stan etc. So if I don’t enjoy this particular combination for some reason, I can at least hope that my efforts in learning TG will not go to waste; something or the other will work out :-)

February 7th, 2006

Tools I use: matplotlib

I get a lot of questions along the lines of “Hey Diwaker, what do you use for blah?” (insert your requirement there). Apparently I have a talent for finding “smart” tools that people like using. So I figured I should blog about some of the tools I use. Maybe others can benefit.

I’ll skip a couple of obvious ones here: my editor of choice is [[http://floatingsun.net/blog/tags/vim|vim]], and I used [[http://floatingsun.net/blog/tags/wordpress|wordpress]] for my blogs.

Let me instead come to something that //every// grad student ends up doing a lot of: making graphs (well, almost every; my friends in theory hardly draw graphs). And frequently, even people outside of research need to make pretty-looking graphs and plots. Within academia and among researchers, [[http://www.gnuplot.info|gnuplot]] has been the de facto plotting tool for as long as I can remember (I’m pretty sure it goes back at least a decade, if not more). Outside the research community, most people tend to use the plotting tools that come with Office software: M$ Excel, Powerpoint and the like.

Don’t even get me started on the Excel/Powerpoint crap. Maybe they are good enough for quick and dirty work. But for anything more than that, for doing any //real// analysis/visualization, they are pretty much useless. For me, a good tool must meet the following requirements:

* It must be scriptable: I don’t want to have to open a bloated GUI, click 100 buttons and drag-and-select columns to get a plot out. When dealing with large amounts of data stored in myriads of files, it is **critical** to be able to script/automate the process.
* It must support multiple output formats: EPS, PS, PDF, PNG, JPG, SVG are the ones I usually need. (E)PS/PDF for embedding in papers. PNG/JPG for viewing/emailing. SVG is just cool :-)
* It must support a variety of graph types: bar charts, pie charts, histograms, error bars.
* It should be **highly** customizable: tick size, label fonts, colors, line styles, thicknesses, positioning, subplots, grids, log scales, transparency, marker styles — EVERYTHING.
* Easy things should be **really** easy, and complicated things must be possible.

GNUPlot has served us quite well over the years. At least in CSE, I can confidently say that close to 80% of all graphs in papers are done using GNUPlot. In rare cases it’s OpenOffice/Excel. But GNUPlot is showing its age now: it can only deal with very simplistic input formats, it’s not very customizable, and it supports a very limited number of graph types (AFAIK, it //still// doesn’t support bar charts natively). But for me, the biggest gripe is that it forces me to break my data analysis into two phases: in the first phase I write some scripts (typically in Python) to process the raw data into a form that can be consumed by GNUPlot; in the second phase I write another script (in GNUPlot) to do the actual plotting.

Enter [[http://matplotlib.sourceforge.net/|Matplotlib]]: this is easily the **best** plotting library I have ever used. Endlessly customizable, Matplotlib can do almost [[http://matplotlib.sourceforge.net/screenshots.html|any kind of plot you can imagine]] and then some. Apart from the traditional object-oriented interface, Matplotlib also gives a very simple MATLAB(R)-like interface for easy plotting. The API maintains a high degree of compatibility with the MATLAB API, so MATLAB users will feel right at home.

Furthermore, since it is written in Python, I can unify my data analysis, and the possibilities are endless. I can feed all kinds of data directly to Matplotlib. I can process, analyze and plot in the same script. I can make my scripts highly generic (since they are in Python, I can pass command line parameters and what not; none of this is possible with GNUPlot).

Here’s a code fragment to make a really simple plot (stolen from [[http://matplotlib.sourceforge.net/tutorial.html|the excellent tutorial]]):

  from pylab import *
  plot([1,2,3])
  xlabel('time')
  ylabel('volts')
  title('A line')
  show()
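To make the "process, analyze and plot in the same script" point a bit more concrete, here is a rough sketch of what such a script might look like. It is only an illustration under some assumptions: the function names (''summarize'', ''plot_summary'', ''parse_args''), the axis labels and the default formats are all made up for this example, and it uses the present-day ''matplotlib.pyplot'' interface rather than the ''pylab'' import shown above.

```python
import argparse
import statistics


def summarize(samples):
    """Group (x, y) samples by x; return sorted xs, mean ys and stdev ys."""
    groups = {}
    for x, y in samples:
        groups.setdefault(x, []).append(y)
    xs = sorted(groups)
    means = [statistics.mean(groups[x]) for x in xs]
    # stdev needs at least two points, so fall back to 0.0 for singletons
    errs = [statistics.stdev(groups[x]) if len(groups[x]) > 1 else 0.0
            for x in xs]
    return xs, means, errs


def plot_summary(xs, means, errs, basename, formats=("png", "pdf")):
    """Draw an error-bar plot and save it once per requested format."""
    # Imported here so the data processing above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.errorbar(xs, means, yerr=errs, marker="o")
    ax.set_xlabel("parameter")      # hypothetical axis labels
    ax.set_ylabel("mean result")
    written = []
    for fmt in formats:
        path = f"{basename}.{fmt}"  # savefig picks the format from the extension
        fig.savefig(path)
        written.append(path)
    plt.close(fig)
    return written


def parse_args(argv):
    """Command line options: output basename and a list of formats."""
    parser = argparse.ArgumentParser(description="Summarize and plot samples")
    parser.add_argument("basename")
    parser.add_argument("--formats", nargs="+", default=["png", "pdf"])
    return parser.parse_args(argv)
```

With something like this, ''plot_summary(*summarize(samples), "run1", formats=["png", "svg"])'' should write ''run1.png'' and ''run1.svg'' from the same script that crunched the numbers, with no intermediate data file in between.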

And, best of all, you get a fabulous, interactive user interface for free! Yes, I know GNUPlot has an interactive mode too, but this is beyond comparison. You can pan, zoom, go back and forward in the view history, save and what not. Here’s a screenshot:
{{ http://matplotlib.sourceforge.net/tut/navcontrols2.png?300x200|Toolbar2}}

Finally, since it’s Python, it’s very portable. It supports a variety of [[http://matplotlib.sourceforge.net/backends.html|backends]], [[http://matplotlib.sourceforge.net/interactive.html|interfaces with ipython]] and is under active development (at version 0.86 currently). So next time you have to do a plot, consider doing it in style: do it with matplotlib!