pmuellr is Patrick Mueller


Wednesday, April 11, 2007

twit-growl woops

I would be remiss in not pointing out a huge problem I recently found with my twit-growl program that I previously blogged about.

The problem is that the program is a fairly simple Python script that uses some 'command line' programs to do all its heavy lifting: curl and growl-notify. I was invoking these with os.system(), building up a big command line as the parameter. As the programs take parameters that come from the data being downloaded, I was trying to be careful to escape the data when constructing the command lines.

Unfortunately, I'm an idiot, and didn't understand the escaping rules. In particular, I wasn't taking into account backquote substitution.

So, I was really surprised one day when I got a message displayed in growl that was humongous, and the tail end of the message was a wad of html. Going to the twitter site to see the message there, I immediately realized what had happened. The twitter message included some text like this:

... `curl`


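The shell ran the backquoted command and substituted its output into the command line I had built. Got that fixed up pretty quickly, by avoiding the use of os.system() and using an os.spawn*() variant instead, which (hopefully) avoids the shell completely.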
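For illustration, here is a minimal sketch of the "avoid the shell" idea. It is not the actual twit-growl code: it uses subprocess.run() with an argument list rather than the os.spawn*() variant mentioned above, and a child Python process stands in for growl-notify so the example runs anywhere. The function name and the tweet text are made up.

```python
import subprocess
import sys

def run_without_shell(argv):
    """Run a program directly from an argument list; no shell parses the arguments."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout

# Hypothetical tweet text containing backquotes, as in the incident above.
tweet = "look at this `curl` trick"

# Stand-in for the notifier: a child process that prints its first argument.
echo_program = "import sys; print(sys.argv[1])"
echoed = run_without_shell([sys.executable, "-c", echo_program, tweet])
```

Because each list element is handed to the child as one argument, the backquotes arrive as literal characters and nothing is substituted.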

BTW, I continue to use my twit-growl to get 'popups' for incoming tweets, and generally use the wonderful tweetbar Firefox extension for posting, and reviewing tweets en masse.

json schema

James Clark has a great post on XML and JSON. But I have one point to pick at.

"JSON's primitive datatype support is weak."

First, the number issue. If you assume JSON datatypes map onto JavaScript ones, I believe the JavaScript number type does in fact have a very specific representation: a 64-bit floating point number, even for integers. I'd love to point to a reference for this, but I don't have one handy, and the one at mozdev is insufficient.

Note that outside of JavaScript, for languages that treat integers and floating point as separate types, a parser can obviously distinguish between integers and floating points and 'do the right thing' in terms of mapping the JSON value into the appropriate type.
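As a quick sketch of both points, Python's standard json module is one parser that keeps integers and floating point values apart, and a plain float shows what a 64-bit floating point number does to a large integer. The variable names are only for illustration.

```python
import json

# A parser in a language with separate numeric types can tell these apart.
whole = json.loads("42")      # parsed as an int
real = json.loads("42.0")     # parsed as a float

# A 64-bit float has a 53-bit significand, so 2**53 + 1 cannot be
# represented exactly; it rounds to 2**53.
big = 2**53 + 1
as_double = float(big)

# Python's parser keeps the integer exact instead of going through a float.
round_tripped = json.loads(str(big))
```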

Next, "the set of primitive datatypes is not extensible". But this is true for XML itself. Until you mix in XML schema. It's schema that allows you to interpret a string in XML as some sort of other data type. JSON has no standard schema. Yet. It's easy to imagine though. Defined in JSON, of course.
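To make the idea concrete, here is a purely hypothetical sketch of a schema written in JSON that tells a reader how to interpret string values. The schema layout, the type names ("string", "integer", "date") and the apply_schema() helper are all invented for this example; they are not any standard.

```python
import json
from datetime import date

# Hypothetical converters, keyed by made-up type names.
CONVERTERS = {
    "string": str,
    "integer": int,
    "date": date.fromisoformat,
}

def apply_schema(schema_text, data_text):
    """Interpret the values in a JSON object using a schema that is itself JSON."""
    schema = json.loads(schema_text)
    data = json.loads(data_text)
    typed = {}
    for key, value in data.items():
        type_name = schema.get(key)
        # Keys the schema doesn't mention are passed through untouched.
        typed[key] = CONVERTERS[type_name](value) if type_name else value
    return typed

schema_text = '{"name": "string", "posted": "date"}'
data_text = '{"name": "twit-growl woops", "posted": "2007-04-11", "tags": ["python"]}'
post = apply_schema(schema_text, data_text)
```

Here the "posted" string becomes a real date object, which is the same job XML schema does for strings in XML.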

json array exploit

Unless you've been living in a cave for the last week or so, you are familiar with the JSON exploit publicized by Fortify Software. I believe this is the same exploit I first read about on Robert Yates' blog. Original reference was from Joe Walker.

I'm not going to claim to be an expert on this particular exploit, but if I understand the situation correctly, I have a few issues with the current concerns.

  1. It's clearly not just JSON data which is suspect here. Anything which appears to be a valid (or maybe even somewhat valid) chunk of JavaScript, which someone could access via <script src=> is suspect. How many resources like this might be available at your site? How 'valid' (in terms of valid JavaScript) does it have to be? Only folks who are intimately familiar with the code of the JavaScript interpreters we use, can say for sure. Let's also not forget E4X; could your XML data be suspect?

  2. This appears to be a problem today just for Mozilla / Firefox. And I believe the reason is the advanced functionality it provides in JavaScript with coolio getter/setter capabilities. (BTW, ActionScript 3 also supports getter/setter capabilities). Is this actually a bug? Doesn't seem like it. Is there some chance this functionality could be locked down during script/src inclusion? Seems unlikely (slippery slope; what else would you lock down?). Could the enhanced functionality be turned off for user-land scripts? Probably; I don't suspect too many people are actually making use of the functionality. There's an open bug @ Mozilla on the issue that has additional thoughts.

  3. I'm clearly showing my n00biness here, but why aren't even simpler exploits possible, on all the browsers? Before doing a script/src, redefine a well-known object/function with one that does some hijacking? The security model for JavaScript is complex; so complex; too complex.

  4. Ted Husted pointed out that one way to fix this problem is to enclose your JSON in comment characters, and then strip the comments before parsing the JSON. That's great, unless I actually want to include comments inside my JSON (and assuming that JavaScript comments can't be 'recursive', which isn't specified in moz's js doc). Why would I want comments in my JSON? Why not? If you can have them in XML, I'm not sure why we won't someday want them in JSON. Note that I don't consider RFC 4627 to be the final statement on JSON syntax.

  5. The fact that JSON is also a syntactically valid chunk of JavaScript is smelling worse and worse all the time. I can't imagine a situation where I'd eval() JSON data in a production app; it's just so much safer to parse the data yourself. So, why don't we use this opportunity to come up with YASDN (yet another simple data notation)? JSON is great because it's readable, but I've already pointed out how we could make it even MORE readable.

  6. Isn't it about time we got a little better modularity available in JavaScript, beyond the script/src story? That story is more or less the same thing as building a C program in one big monolithic file that pulls in functionality with #include. So 1970s ...
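To make the comment-wrapping idea from item 4 concrete, here is a minimal sketch that also parses the data rather than eval()ing it, as item 5 suggests. The wrap() and unwrap() names and the sample addresses are made up for illustration; a real client would do the unwrapping in JavaScript.

```python
import json

PREFIX, SUFFIX = "/*", "*/"

def wrap(payload):
    """Server side: serialize, then enclose the JSON in a JavaScript block comment."""
    return PREFIX + json.dumps(payload) + SUFFIX

def unwrap(text):
    """Client side: strip the comment markers, then parse (never eval) the JSON."""
    text = text.strip()
    if not (text.startswith(PREFIX) and text.endswith(SUFFIX)):
        raise ValueError("response is not comment-wrapped")
    return json.loads(text[len(PREFIX):-len(SUFFIX)])

# Hypothetical response body; pulled in through script/src it is just a comment.
wrapped = wrap(["alice@example.com", "bob@example.com"])
contacts = unwrap(wrapped)
```

One caveat with this sketch: JavaScript block comments end at the first */, so a string value containing those two characters would close the wrapper early unless the server escapes it.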