pmuellr: 2009-01-04

Saturday, January 10, 2009

spring break in canada?

My wife has this bizarre problem. She likes snow. Apparently, it's genetic, as my children suffer from this as well. Despite the fact that we sometimes get deluged with snow (per the included image), most of the time, we get nothing, or not enough to count.

I've suggested to my wife several times that we should take a vacation in Canada some time; I hear they got lots of the white stuff up there. She's taking me seriously now; desperation is setting in, it appears.

The current thought on time frame is spring break, which for us falls at the beginning of April. Seems pretty dicey to me as to whether there'll still be any snow around by then.

I've been to Mont Tremblant and the big log cabin at Montebello, for "work related meetings", both of which are the kind of thing I'm thinking about, though I'm guessing more expensive than I'd like.

Any suggestions from the lazy web?

Friday, January 09, 2009

JavaScript modules

When I see code like this, an example I pulled from YUI, I simply want to cry. Things need not be so ugly. I don't mean to call out Yahoo here; it's just an example I found. Most of the non-trivial 'package-ized' JavaScript I see is like this.

Issues:

Every one of these files is a single anonymous top-level function that is invoked at the bottom of the file. Icky. Why do we need this?

This is done because the way you composite function into your application is by including the source of that function in the big, single namespace known as your JavaScript environment. To keep the source you are compositing into your app from "infecting" it with additional global variables, you use the trick of putting all your code in a function body and executing it. The function can be (and should be) anonymous. No infection, or controlled infection, as long as you use var on all your variables (argh), as the variables at the top level of the function are local to the function, and not the "environment" (basically, globals).
We have functions defined in functions defined in functions here, two levels of which are asynchronous callbacks.

I don't have a big beef with nested functions, except when it gets silly. Like, in this case. One of the big offenders is definition of the loader function, whose purpose is to load the code, pre-reqs, etc, defined as callbacks presumably because the loading of such files isn't necessarily provided as a synchronous capability from the browser.
I bet the folks that wrote this had editor macros set for the text YAHOO.example.app

Frankly, there's no defense for this; the code should probably be using "shortcut" variables for the "package names", and even just some of the functions, like YAHOO.log.

I assume there is some kind of taboo over using shortcut variables here; or are people depending on fully qualified function names for code completion or other source analysis tools? Yikes.
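To make the wrapper trick and the shortcut-variable idea concrete, here is a minimal sketch. The YAHOO object below is a hand-rolled stand-in, not the real YUI library, and greet, greeting, and messages are made-up names for illustration.

```javascript
// Stand-in for the library's global namespace object; NOT the real YUI.
var YAHOO = { example: { app: {} }, messages: [] };
YAHOO.log = function (message) { YAHOO.messages.push(message); };

(function () {
    // Shortcut variables: local to this function, so they add no globals.
    var app = YAHOO.example.app;
    var log = YAHOO.log;

    // 'var' keeps this value private to the wrapper.
    var greeting = "hello";

    // Only what is attached to the namespace object escapes the wrapper.
    app.greet = function (name) {
        log("greeting " + name);
        return greeting + ", " + name;
    };
})(); // invoked immediately, at the bottom of the "file"

var result = YAHOO.example.app.greet("world");
```

The shortcuts cost two lines at the top of the wrapper and remove the fully qualified names from everything below them, without leaking app, log, or greeting into the global namespace.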

How can we make this nicer looking?

I think we need a better way of packaging up the pieces of our composite applications.

Packages. Modules. Bundles. Whatever.

Google Gears WorkerPools, again

I've previously blogged about using Google Gears Worker Pools to build service frameworks. For certain types of functionality, this makes a lot of sense. It certainly has the characteristic of compositing function into your application in a clean, infection-free manner. But it also has the following characteristics:

All message sends are asynchronous. While this sort of programming style might be comfortable to some people, and in fact might be the best way to program in the end, it's not terribly friendly for most programmers who have been using mostly/only synchronous function calls their entire programming life.
Out of the box, you're always going to be 'sending messages' instead of 'calling functions'. There's technically not much of a distinction between sending a message and a function invocation, you might say, besides the invocation style of the two. But again, for most programmers, function invocation is the norm. And probably requires less syntax per invocation. Shorter programs == good.

So while I don't have any problem with WorkerPools per se, and in fact, I think they are a great pattern for handling asynchronous, parallel work, they also aren't really going to provide the best pattern for modularity.

But I really love the cleanliness aspect.
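The difference in invocation style can be sketched as follows. The makeWorker() helper is a hand-rolled stand-in that queues messages in-process; it is not the Gears WorkerPool API, and the names send and deliver are assumptions made for this example.

```javascript
// A plain function call: the result comes straight back.
function add(a, b) {
    return a + b;
}
const direct = add(2, 3);

// Hand-rolled stand-in for a worker; NOT the Gears WorkerPool API.
// Messages go into a queue and replies are handed back later.
function makeWorker(handler) {
    const inbox = [];
    return {
        send(message, onReply) {
            inbox.push({ message, onReply });
        },
        // Stand-in for the event loop: process everything queued so far.
        deliver() {
            while (inbox.length > 0) {
                const { message, onReply } = inbox.shift();
                onReply(handler(message));
            }
        },
    };
}

const adder = makeWorker((message) => message.a + message.b);

let viaMessage; // no value yet; it only arrives in the callback
adder.send({ a: 2, b: 3 }, (reply) => { viaMessage = reply; });
const beforeDelivery = viaMessage; // still undefined at this point
adder.deliver();
```

Same work, but the message version needs a message object, a reply callback, and a variable to catch the answer, and the caller can't use the answer on the next line.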

So here is what I want

Python modules.

And here's the thing. I think we can add support for this fairly unobtrusively.

The basic idea is to define a new function, say loadModule(), which is used to 'reference' another module by passing its name as a parameter (URI to the name, prolly). A module is just a JavaScript file. Only instead of working the way <script src=""> does, it actually defines a new separate, empty namespace and loads the JavaScript into that namespace (just like the way Google Gears WorkerPools do). The first time loadModule() is run on a module, the JavaScript source is executed. The object returned is a 'module' object, whose properties include all the global variables in the module's private namespace. For loadModule() calls with the same module beyond the first, the code is not executed again, but the same 'module' object is returned.

Or other varieties thereof. It's fairly simple to play with this kind of stuff from within Rhino, though your brain will be hurting after getting all the prototype and context linkages set up right. I assume you can do this sort of multi-environment stuff in other JavaScript implementations.
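As one illustration of the multi-environment idea outside Rhino, here is a minimal sketch that assumes Node.js and its built-in vm module. It is not the Rhino implementation described in this post. The sources table is a hypothetical in-memory stand-in for files fetched by URI, and output, cache, and executions are made-up names.

```javascript
const vm = require("node:vm");

// Hypothetical in-memory "files"; a real loader would fetch these by URI.
const sources = {
    "sayer.js": "function say(message) { output.push(message) }",
    "abc.js": "sayer = loadModule('sayer.js')\n" +
              "function sayHello() { sayer.say('hello') }",
    "def.js": "sayer = loadModule('sayer.js')\n" +
              "function sayHello() { sayer.say('world') }",
};

const output = [];      // shared sink so results are easy to inspect
const cache = {};       // module name -> module object
const executions = {};  // module name -> number of times its source ran

function loadModule(name) {
    // Loads beyond the first return the same module object.
    if (name in cache) return cache[name];

    // A fresh object becomes the module's private global namespace.
    const namespace = { loadModule, output, __FILE__: name };
    cache[name] = namespace;
    executions[name] = (executions[name] || 0) + 1;

    // Run the source with 'namespace' as its global object; top-level
    // functions and undeclared assignments become its properties.
    vm.createContext(namespace);
    vm.runInContext(sources[name], namespace, { filename: name });
    return namespace;
}

const abc = loadModule("abc.js");
const def = loadModule("def.js");
abc.sayHello();
def.sayHello();
```

Both modules define sayHello() without colliding, and executions["sayer.js"] stays at 1 even though two modules load it.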

I want it in the browser.

This isn't the kind of code you can write in userland JavaScript, because JavaScript doesn't give you low-level access to its innards. Needs to be yet another function the browser injects into the JavaScript environment, probably not even implemented in JavaScript, but in C, C++, Java, etc.

What changes

This makes individual JavaScript files a little cleaner by:

Not requiring the anonymous function wrapper.
Letting you get away with making your namespace a mess without worrying about infecting someone else's namespace.
Letting you use shorter names, because imports beyond the first are crazy cheap, so every module would probably just import everything they needed as one-level modules.

Sure would be nice if we could make that loadModule() function synchronous, otherwise loadModule() would really have to be a function which took the URI to the module and a callback, and invoked the callback after the module load. Back into some ickys-ville. Is <script src=""> synchronous?

It's not a lot. But it's a start.
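For comparison, the callback-taking variant mentioned above might look like the sketch below. Everything here is hypothetical: loadModuleAsync(), the modules table, and deliverLoads(), which stands in for the browser's event loop finishing the loads.

```javascript
// Hypothetical module table standing in for files fetched over the network.
const modules = {
    "abc.js": { sayHello: () => "hello" },
    "def.js": { sayHello: () => "world" },
};

const pending = []; // callbacks waiting for their "load" to complete

// Callback style: the module is delivered later, never returned.
function loadModuleAsync(uri, callback) {
    pending.push(() => callback(modules[uri]));
}

// Stand-in for the event loop delivering completed loads.
function deliverLoads() {
    while (pending.length > 0) pending.shift()();
}

const said = [];

// Every dependency adds a level of nesting.
loadModuleAsync("abc.js", function (abc) {
    loadModuleAsync("def.js", function (def) {
        said.push(abc.sayHello(), def.sayHello());
    });
});

// Nothing has run yet; the code above only registered callbacks.
const saidBeforeDelivery = said.length;
deliverLoads();
```

With a synchronous loadModule() the same program is three flat lines; with callbacks, the body of the program ends up inside the innermost function.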

Additional advantages

It's easy to imagine that the process of reloading a module which has changed (you just edited it while you were debugging) would be a little more straightforward; largely only the module itself is affected, though presumably there are some imported object references that would also need to be fixed up (using shortcut variables causes issues here - is that one of the reasons the Yahoo example used fully-qualified names?). Some lower-level VM help could get even those references fixed up, I'm thinking.
Better than eval(). Yeah, you could code something up to do this using eval(), I suppose. Or get close. The problem with eval() is the code becomes disassociated from its source location. This makes it difficult/impossible to debug. Or save, after I make my changes in the debugger (some day). With an import story, the original location of the source can be associated with the code, just like all the files <script src="">'d into your page get associated with their source location.
You could imagine keeping byte- or machine-code versions of those modules, in their pre-executed state, cached in memory for future interpreter invocations that imported the module. And cached on disk.
As a simple function, you can imagine having embellished versions that handled things like version numbering, pre-reqs, etc.
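The source-location point can be seen in stack traces. This is a minimal sketch assuming Node.js and its built-in vm module, whose filename option names the source in stack traces. The source string and the stackFrom() helper are made up for illustration.

```javascript
const vm = require("node:vm");

const source = "function say(message) {\n" +
               "    throw new Error('cannot say ' + message)\n" +
               "}\n" +
               "say('hello')";

// Capture the stack trace produced when 'run' throws.
function stackFrom(run) {
    try {
        run();
    } catch (error) {
        return String(error.stack);
    }
    return "";
}

// eval(): the code has no file of its own to report.
const evalStack = stackFrom(() => eval(source));

// vm with a 'filename': frames are attributed to the named source.
const moduleStack = stackFrom(() =>
    vm.runInNewContext(source, {}, { filename: "sayer.js" }));
```

Printing the two strings side by side shows the difference: the second names sayer.js and a line number a debugger could open, while the first can only point back at the eval() call.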

Example

I coded up an implementation of loadModule() for Rhino tonight, along with a simple example that uses four modules:

main.js:


print("loading module: " + __FILE__)

abc = loadModule("abc.js")
def = loadModule("def.js")

abc.sayHello()
def.sayHello()

abc.js:


print("loading module: " + __FILE__)

sayer = loadModule("sayer.js")

function sayHello() {
    sayer.say("hello")
}

def.js:


print("loading module: " + __FILE__)

sayer = loadModule("sayer.js")

function sayHello() {
    sayer.say("world")
}

sayer.js:


print("loading module: " + __FILE__)

function say(message) {
    print(message)
}

Each module prints a line indicating it's being loaded; the environment I set up defines the __FILE__ variable containing the module source file name (my C roots are showing), and a print() function which prints a line to stdout.

main.js loads two modules, abc.js and def.js. It then calls the sayHello() function in the abc module, followed by the sayHello() function in the def module.

abc.js and def.js are identical, except for the message printed from the sayHello() function at the bottom of the file. Both modules load the sayer.js module. They also both define a function with the same name - sayHello() - but that's ok because they live in separate namespaces and can be accessed separately by code that imports them, like main.js does above.

sayer.js defines a say() function that prints a string (my REXX roots are showing).

Here's the output of running the main.js module:

loading module: main.js
loading module: abc.js
loading module: sayer.js
loading module: def.js
hello
world

The output shows that the code in sayer.js is only executed once, like with Python modules; subsequent imports just return the module reference which was built during the first import.

The source for the Java code to run this is available here: http://muellerware.org/hg/org.muellerware.jsml/; it's an Eclipse project stored in an hg repository.

There are really only two Java files used for this, if you just want to peek at the code: Main.java and ModuleLoader.java.

Why don't we have something like this in the browser?

Frankly, after spending a very small amount of time implementing the basic functionality, I have to wonder why we don't have something like this in the web browsers today? We have XMLHttpRequest to programmatically fetch data, why don't we have a way of programmatically fetching and executing code? <script src=""> is a sorry excuse of a version of this. Let me code it, dammit!

Tuesday, January 06, 2009

gpx and exif

Using a GPS

We got our first GPS last Christmas, in preparation for our trip to Ireland in the spring (awesome trip, BTW). That was a Garmin nüvi 270, which is the basic hardware device preloaded with maps of the US and Europe. Buying a device without the Europe maps, and then adding them back would have been a little more expensive. The device was quite useful on the trip.

As I'm a man, I've had more need for the device than my wife, and I had been leaving it in my car. So for my birthday this year, my wife got me a basic device, the Garmin nüvi 205. She wanted 'hers' back. When I started hiking a bit more this fall, I took it with me on the hikes, because it sucks to get lost. I could also kinda figure out where I was based on the shape of the track the device was generating, compared to maps showing trails.

The big problem with taking a nüvi hiking is that the battery only lasts 4-5 hours. It never ran out, but came close a few times. Another problem people may have with older devices is that they don't seem to have the nice tracking function that is really what you want in the device, to show you visibly where you've been on the map. Our one-year-old nüvi 270 doesn't do the tracking thing, near as I can tell. Lastly, it's not terribly convenient to slip into your pocket; it has a very sensitive touch screen and an easily flipped on/off switch at the top. I found an old Palm Pilot leather case, with a hard 'front side' to prevent accidental touches through the case, that ended up being a perfect fit (saving $20 or so on the Garmin case; I'm a pack-rat), but still a tight fit for the pocket, and you have to slip it in and out of the case just so.

For Christmas, my wife ended up getting me a Garmin eTrex Venture HC, which is the basic GPS hiking model. The maps, compared to the nüvi's, suck, but that's ok; even the default Garmin maps don't include enough detail for hiking. This device handles track data much better than the nüvi, in that you can pre-load a bunch of tracks into the device and then display them while you're hiking. I've got a bit of a long-winded procedure to generate tracks from existing trail maps and Google Earth (see below), which then shows me something close to the actual trails while I'm hiking.

Besides being used for live tracking, the other thing I've been wanting to do is to correlate the pictures I've been taking while hiking with the GPS, so that I can associate a fairly precise location with the pictures. So that's how I spent a bit of my xmas break; writing that program.

What is GPX?

GPX is an XML file that your Garmin device will poop out, giving you a braindump of what it knows: "favorites" you've set up, track logs for where you've been, etc. The file format is described pretty well on this site. The only thing I couldn't quickly figure out was the units for the elevation; they're meters.

The GPX file will contain a list of points, where each point has the following properties - latitude, longitude, elevation, and time - which it collects every so often (you can configure how often this happens). Here's the GPX file from my most recent hike to White Pines Nature Preserve - http://muellerware.org/kml/White-Pines-Nature-Preserve.gpx.
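To show what those points look like, here is a minimal sketch that pulls them out of a tiny track in GPX 1.1 layout. The coordinates and times are made up, not taken from the White Pines file, and parseTrackPoints() is a hypothetical helper. The regex is only good enough for a sketch; a real program should use an XML parser.

```javascript
// A made-up two-point track in GPX 1.1 layout; real files carry more.
const gpx = `
<gpx version="1.1" creator="example">
  <trk><trkseg>
    <trkpt lat="35.7000" lon="-79.0000">
      <ele>120.5</ele><time>2009-01-03T15:00:00Z</time>
    </trkpt>
    <trkpt lat="35.7010" lon="-79.0020">
      <ele>123.0</ele><time>2009-01-03T15:01:00Z</time>
    </trkpt>
  </trkseg></trk>
</gpx>`;

// Attribute order and optional elements vary between files, so this
// pattern only handles the exact layout used in the sample above.
function parseTrackPoints(text) {
    const pattern = /<trkpt lat="([^"]+)" lon="([^"]+)">\s*<ele>([^<]+)<\/ele>\s*<time>([^<]+)<\/time>/g;
    const points = [];
    for (const match of text.matchAll(pattern)) {
        points.push({
            lat: Number(match[1]),       // degrees
            lon: Number(match[2]),       // degrees
            ele: Number(match[3]),       // meters
            time: Date.parse(match[4]),  // milliseconds since the epoch, UTC
        });
    }
    return points;
}

const points = parseTrackPoints(gpx);
```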

Actually, getting that GPX file can be a little tricky. You'll need to connect your GPS device to your computer, and for Garmin use the software they provide on a CD to pull the GPX file out, or for the Mac use RoadTrip. For RoadTrip, I always create a new folder for each GPX file I want to create, copy just the stuff I want from the "most recent import" (or whatever), and then export that folder, which exports it to a GPX file with the same name as the folder. A bit non-intuitive, but you'll figure it out.

Once you have the GPX file, you can open it directly in Google Earth. Google Maps doesn't appear to directly eat GPX files, but will eat KML files, and you can easily convert a GPX file to a KML file using the gpsbabel program.

What is EXIF?

EXIF is a standard for metadata embedded in image files. The site http://www.exif.org/ explains all, I guess. The spec is a bit dry. All sorts of metadata can get added to images by your camera, including all the camera settings used when the picture was taken, model information, and for this purpose, GPS information.

Here is an example of the sorts of information that gets stored as EXIF data for a photo.

Two great tastes ...

So, now that we have a bunch of images, and a GPX file, it's a SMOP to get the time of the photo, calculate the GPS coordinates given that time, and then stamp them back into the photo.

It looked to be a difficult slog to deal with the EXIF data myself, so some reading quickly led me to the exiftool program which can do all manner of slicing and dicing of EXIF data for your images.

The program I wrote reads in the GPX file, and then for every image pulls out the time the photo was taken with exiftool, and calculates the GPS coordinates for that photo, stamping that data back into the image with exiftool.
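The "calculate the GPS coordinates" step could be done in several ways. The sketch below uses linear interpolation between the two track points on either side of the photo's time, which is an assumption here and not necessarily what gpx2exif does. The track data, exifTimeToUtc(), locate(), and the UTC-5 camera offset are all made up for illustration.

```javascript
// Hypothetical track points (times in ms since the epoch, UTC).
const track = [
    { lat: 35.7000, lon: -79.0000, time: Date.parse("2009-01-03T15:00:00Z") },
    { lat: 35.7010, lon: -79.0020, time: Date.parse("2009-01-03T15:01:00Z") },
    { lat: 35.7030, lon: -79.0030, time: Date.parse("2009-01-03T15:03:00Z") },
];

// EXIF timestamps look like "2009:01:03 10:00:30" with no time zone, so
// the camera's offset from UTC (in hours) has to be supplied by the caller.
function exifTimeToUtc(exifTime, utcOffsetHours) {
    const [y, mo, d, h, mi, s] = exifTime.split(/[: ]/).map(Number);
    return Date.UTC(y, mo - 1, d, h, mi, s) - utcOffsetHours * 3600 * 1000;
}

// Linear interpolation between the two track points bracketing 'time'.
// Returns null when the photo was taken outside the recorded track.
function locate(points, time) {
    for (let i = 1; i < points.length; i++) {
        const before = points[i - 1];
        const after = points[i];
        if (time >= before.time && time <= after.time) {
            const span = after.time - before.time;
            const fraction = span === 0 ? 0 : (time - before.time) / span;
            return {
                lat: before.lat + fraction * (after.lat - before.lat),
                lon: before.lon + fraction * (after.lon - before.lon),
            };
        }
    }
    return null;
}

// Camera clock assumed to be on US Eastern Standard Time (UTC-5).
const photoTime = exifTimeToUtc("2009:01:03 10:00:30", -5);
const position = locate(track, photoTime);
```

The photo time lands halfway between the first two points, so the result is the midpoint of that segment. A photo taken before the track starts or after it ends gets null rather than a guess.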

The program, gpx2exif, is housed here: http://muellerware.org/hg/gpx2exif/. It is written in Python, may require version 2.5 or above, and also requires that you have exiftool installed.

In addition to stamping the images with GPS data (actually, creating new copies of the images with the GPS EXIF data), it also creates a KML file you can load into Google Earth to 'test' the locations that got stamped. In case your camera's clock is not synchronized to the GPS (hint, hint). If your times are off, read the exiftool help, there's a way to adjust the times of your photos in one swell foop.

Once you've got the GPS data stamped in your images, sites like Flickr and Picasa will show you "map" versions of your sets, and do other stuff with the geo data. The map view for my White Pines set at Flickr is here and the map view for the same set at Picasa is here.

What's next

Turns out you can do all sorts of interesting analysis of the data in the GPX file, like:

calculate distance travelled
calculate speed
figure out when you stopped for a break
plot data onto Google Maps or Google Earth
generate elevation maps

You should be able to do all this stuff in a web browser, in fact, by writing the analysis code in JavaScript. Given that you can access Flickr cross-site via their 'JSONP'-ish support, associating photos with the GPS data is something you can also probably do in the browser. We'll see. I'm a little worried that the number of data points and expensive math required will be a bit much for normal JavaScript processing; I may need to use a Google Gears worker to offload some of that processing.
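As a rough sketch of the first two items in that list, distance and speed can be computed from consecutive points with the haversine formula. The sample points are made up, the Earth radius is the usual mean-radius approximation, and elevation is ignored; the function names are hypothetical.

```javascript
const EARTH_RADIUS_METERS = 6371000; // mean radius; an approximation

function toRadians(degrees) {
    return degrees * Math.PI / 180;
}

// Great-circle distance between two points, via the haversine formula.
function distanceMeters(a, b) {
    const dLat = toRadians(b.lat - a.lat);
    const dLon = toRadians(b.lon - a.lon);
    const h = Math.sin(dLat / 2) ** 2 +
              Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) *
              Math.sin(dLon / 2) ** 2;
    return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(h));
}

// Total length of a track: the sum of its point-to-point distances.
function trackLengthMeters(track) {
    let total = 0;
    for (let i = 1; i < track.length; i++) {
        total += distanceMeters(track[i - 1], track[i]);
    }
    return total;
}

// Average speed in meters per second between the first and last point.
function averageSpeed(track) {
    const seconds = (track[track.length - 1].time - track[0].time) / 1000;
    return seconds > 0 ? trackLengthMeters(track) / seconds : 0;
}

// Made-up points: one degree of latitude apart, one hour apart.
const sample = [
    { lat: 35, lon: -79, time: Date.parse("2009-01-03T15:00:00Z") },
    { lat: 36, lon: -79, time: Date.parse("2009-01-03T16:00:00Z") },
];
const lengthMeters = trackLengthMeters(sample);
const speed = averageSpeed(sample);
```

One degree of latitude comes out at roughly 111 km, which is a handy sanity check. Break detection could build on the same pieces by looking for stretches where the per-segment speed stays near zero.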

Notes

exiftool rocks; I was happy to not have to deal with reading/writing EXIF data myself.
My camera stores times in "local" format. Would have been nice if it used UTC. Do any cameras do this? I made an assumption that the camera, and computer you are running gpx2exif on, are running at the same local time. Again, use exiftool to "fix" this, if it's wrong.
I still can't figure out the secret to the findall() method in ElementTree. Bugger. Seems like a great API, I just can't use it. The XML processing wasn't that complex, so minidom, which I'm very familiar with, was fine.
Neither Flickr nor Picasa will do anything with your EXIF GPS data unless you specifically tell them to; presumably for privacy reasons. For Flickr, the setting is here; for Picasa, the setting is here.
The resulting map views from Flickr and Picasa aren't terribly pleasing to me; in fact, the KML "test" file I produce from gpx2exif is way more interesting. I think because you can see the actual trail, but also the markers I used (default ones) work better than thumbnails that Picasa uses, and the markers used by Flickr can't be disambiguated when they're too close, like they can in Google Earth.
On the Garmin eTrex device, if you "save" a track that you've made (hiked), it will strip the time values out. Make sure you export the track off the device before saving; the time values are (obviously) critical to determining the locations for your photos.
To pre-load a set of trails for a park onto the device, I do the following. Get a version of the trail maps (prolly from a PDF from the park site) and convert to a JPG file. Bring up Google Earth, find the park, and add an image overlay for the image file you created; set the transparency down so you can see the trails and Google Earth detail. Hopefully there's enough detail in the image and in Google Earth that you can move/resize the image overlay close enough. Then create some new line segments in Google Earth, tracing over the trails. I couldn't figure out how to export those line segments directly out of Google Earth, but if you "mail" the folder they are in to yourself, you will get a KMZ file, which is just a zip file containing a KML file. Garmin tools like RoadTrip don't eat KML, but you can convert the KML to a GPX using gpsbabel, and then import that. Voilà; trails to follow on my device.
When traveling long distances now, I've become completely dependent on the GPS, and it's very nice to have when you're not on the interstate. In fact, I've been actively avoiding interstates as much as I can now; traveling back roads through small towns is much more fun. You basically don't have to keep track of where you are, what roads you're on, where to turn, etc. As long as you got the destination plugged in right. And then I find myself racing against the ETA the GPS displays prominently. I called my wife at one point when I was coming home from a hiking trip and the conversation went something like this:

wife: So, where are you?

me: I have no idea.

wife: Well, which way are you coming home?

me: I have no idea.

wife: I don't suppose you know when you'll be getting home?

me: 4:37

