Sunday, June 7, 2009

Google Wave - not just for distributed collaboration

The buzz around Google Wave seems to be focused on how it blends collaboration styles from email, IM, wikis, microblogging, and so on, and unifies them under a common model. That by itself is very cool. If you want a good overview from that perspective, I recommend this one at Mashable written by Ben Parr.

However, there's more to the story. Google Wave is also a generic platform for building distributed applications with data that can be represented in XML. That's a very broad domain, even broader than distributed collaboration.

The federation protocol reminds me of a line-oriented editor, an analogy which admittedly dates me. For any younger folks out there, this is what we had before text editors showed you the contents of the file you were editing. You had to navigate through the document by typing line numbers at the command prompt, or move forward and back through the document by a specified number of lines. Then you could insert, replace, or delete lines at the current location. If you're on a UNIX or Linux machine, see the manual page for "ed", a standard line-oriented text editor. One format the UNIX "patch" tool accepts is a set of ed commands. This is essentially what the server-to-server protocol sends, except the commands operate on XML documents.
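To make the analogy concrete, here is a minimal Python sketch of a line-oriented edit script. The command format (an action, a 1-based line number, and some text) is something I made up for illustration. It is not real ed syntax, and it is certainly not the Wave wire format.

```python
def apply_commands(lines, commands):
    """Apply a list of (action, line_number, text) commands to a list of lines.

    Line numbers are 1-based, as in a line-oriented editor. The command
    format is hypothetical and only illustrates the idea.
    """
    result = list(lines)  # work on a copy so the original is untouched
    for action, lineno, text in commands:
        idx = lineno - 1
        if action == "insert":      # insert a new line before line `lineno`
            result.insert(idx, text)
        elif action == "replace":   # overwrite line `lineno`
            result[idx] = text
        elif action == "delete":    # remove line `lineno`; text is ignored
            del result[idx]
        else:
            raise ValueError(f"unknown action: {action}")
    return result


doc = ["alpha", "beta", "gamma"]
edits = [
    ("replace", 2, "BETA"),
    ("insert", 1, "intro"),
    ("delete", 4, None),
]
print(apply_commands(doc, edits))
```

The point is that the script carries only the changes, never the whole document. Anyone holding the same starting document and the same commands ends up with the same result.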

When a server is sharing changes with another server, the changes are expressed as edits made to the XML document representing a "wave" (or a "wavelet", although that terminology always makes me think of compression algorithms). You can insert characters and elements, delete elements, or split and merge elements. Once you fully grok this, the "playback" feature of the Google Wave user interface becomes a complete no-brainer.
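Here is a rough sketch of why playback falls out so naturally. It assumes a made-up operation format (tuples naming an action, a path of child indices, and a payload) and made-up element names, and it uses Python's standard xml.etree.ElementTree. The real protocol's operations look different, and split and merge are left out. Playback is just a matter of applying the first N operations to an empty document.

```python
import xml.etree.ElementTree as ET


def find(root, path):
    """Follow a list of child indices from the root down to an element."""
    node = root
    for i in path:
        node = node[i]
    return node


def apply_op(root, op):
    """Apply one edit operation in place. The tuple format is hypothetical."""
    kind = op[0]
    if kind == "insert_element":
        _, parent_path, index, tag = op
        find(root, parent_path).insert(index, ET.Element(tag))
    elif kind == "insert_chars":
        _, path, offset, chars = op
        node = find(root, path)
        text = node.text or ""
        node.text = text[:offset] + chars + text[offset:]
    elif kind == "delete_element":
        _, parent_path, index = op
        parent = find(root, parent_path)
        parent.remove(parent[index])
    else:
        raise ValueError(f"unknown operation: {kind}")


def playback(ops, upto):
    """Rebuild the document as it looked after the first `upto` operations."""
    root = ET.Element("wavelet")  # element name chosen for illustration
    for op in ops[:upto]:
        apply_op(root, op)
    return ET.tostring(root, encoding="unicode")


ops = [
    ("insert_element", [], 0, "line"),
    ("insert_chars", [0], 0, "Hello"),
    ("insert_chars", [0], 5, " world"),
    ("insert_element", [], 1, "line"),
    ("delete_element", [], 1),
]

# Step through the history one operation at a time.
for step in range(len(ops) + 1):
    print(step, playback(ops, step))
```

If the history of a document is the ordered list of operations, then any earlier state is only a replay away.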

Some of the example gadgets hint at the variety of possible applications. There's a chess game, which I suppose could also be considered a kind of collaboration. Two people alter the state of a board according to a set of rules. Why not have a process engine gadget that alters the state of a running process (according to a set of rules) and a different gadget for business application monitoring? Why not have an air traffic control gadget with robots that update the positions and vectors of airplanes? OK, bad example. Maybe an air traffic control simulator game.

Some applications would be more of a challenge to build on the Google Wave model. Data models with lots of non-hierarchical relationships between objects would probably indicate a poor fit.

If you've missed the buzz and want to hear the news straight from Google, start from the beginning here. The original launch video from Google I/O is 80 minutes long. If you know where I can find a five-minute demo to share with people, please let me know. Or if I could get sandbox access, I'd love to make one. Hint hint.

If you're interested in an introduction to operational transformation, you might want to watch "Issues and Experiences in Designing Real-time Collaborative Editing Systems". This one is an hour long.

Google Wave is sure to see more than its share of hype, but there's some elegant design behind it. I'll be watching it closely to see if it can live up to its potential.

Wednesday, March 25, 2009

Bit Rot in the Cloud

From Wikipedia:
Bit rot, or bit decay, is a colloquial computing term used either to describe gradual decay of storage media or to facetiously describe the spontaneous degradation of a software program over time. The latter use of the term implies that software can literally wear out or rust like a physical tool.
One way this manifests in the cloud is with VM images that worked fine just a few months ago but have problems today. It's not unique to the cloud, but it happens that I've been experiencing this with some EC2 images. Specifically, these are images I use for demonstrating how to automate distributed testing with multiple browsers, triggered by a continuous integration build.

A taste of some things that can go wrong:
  • REST URLs for APIs become deprecated and no longer supported
  • Services and servers move and are decommissioned
  • Strong password security policies cause expiration of passwords, prevent reuse of old passwords, and lock out users after too many retries (especially bad if it's the admin user)
  • Xauth cookies expire and prevent access to the display
There are a couple of ways to guard against this type of bit rot. One is to identify everything that depends on time or external services, then have the instance make the appropriate adjustments and diagnostic checks on startup. Another approach is to only use vanilla VM images, and do all installation and configuration through something like Puppet.
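As a rough idea of what the startup checks in the first approach might look like, here is a small Python sketch. Everything in it is an assumption for illustration: the check names, the dates, and the 14-day warning window are invented, and a real instance would read expiry dates from wherever its credentials actually live.

```python
from datetime import date


def check_not_expiring(name, expires_on, today, warn_days=14):
    """Classify one time-dependent item as OK, WARN, or FAIL.

    `warn_days` is an arbitrary threshold chosen for this sketch.
    """
    remaining = (expires_on - today).days
    if remaining < 0:
        return (name, "FAIL", f"expired {-remaining} days ago")
    if remaining <= warn_days:
        return (name, "WARN", f"expires in {remaining} days")
    return (name, "OK", f"{remaining} days left")


def run_startup_checks(checks):
    """Run every check and return only the results that need attention."""
    results = [check() for check in checks]
    return [r for r in results if r[1] != "OK"]


# Hypothetical items and dates; "today" is passed in so the run is repeatable.
today = date(2009, 3, 25)
checks = [
    lambda: check_not_expiring("admin password", date(2009, 3, 1), today),
    lambda: check_not_expiring("xauth cookie", date(2009, 4, 1), today),
    lambda: check_not_expiring("API credentials", date(2010, 1, 1), today),
]

for name, status, detail in run_startup_checks(checks):
    print(f"{status}: {name} ({detail})")
```

Checks for external services (is that REST URL still there?) would slot into the same list. The hard part is remembering everything that belongs in it.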

Maybe there's also some value in using continuous integration tools to regularly exercise the VMs and their configuration, especially systems of associated nodes.

Has anybody else run into this? I'd love to hear what approaches you've taken to mitigate this sort of thing, and how they've worked for you.