
pmuellr is Patrick Mueller

other pmuellr thangs: home page, twitter, flickr, github

Tuesday, January 22, 2019

reboot

After 3.5 years at NodeSource, I retired at the end of last week. I really loved working there - great people, fun products, challenging problems to solve. Alas, my work history shows that I tend to not stick around an organization more than 3-4 years, even with great projects. I kinda enjoy mixing things up every few years, and as I get older ... there realistically won't be that many more for me, before I really retire.

So, shakin' things up!

Taking a few months off, but here's my current rough list of TODOs:

In order to kill more birds with less stones, there's a particular app I'd like to build to display weather information, something I've been wanting to build for a while. So I guess I can try building that with the client bits, deploying that a bunch of different ways, and blog about it. A Max standalone app for it is a stretch goal :-)

Sunday, September 10, 2017

moar profiling!

I'm a long-time fan of using CPU profilers to analyze programs.

CPU profilers are great because they provide insight into your programs that you can't easily get any other way: how fast - or, more likely, how slow - the functions in your code are. I've learned with profilers that there's no point in guessing which code you think is fast or slow, because:

  • you're likely wrong, or will find surprises
  • it's so easy to profile some code, just do it!

For Node.js, there are some useful profilers available:

Of course I'm partial to N|Solid since I work on it, <wink/>.

And yet ... I've grown a little restless with our current crop of profiling tools.

The de facto visualization for profiles is the flame graph. But I have issues:

  • the interesting dimension to look at is the width of the boxes, but my eye is often drawn to the heights

  • they don't provide much room for textual information, since many of the boxes can be very thin

  • usually, all I care about when looking at a profile is the most expensive functions, but visually sorting box widths across a two-dimensional graph of boxes doesn't make a lot of sense

The age-old profile visualization is the table/tree, where functions are displayed as rows in a table, but can be expanded to show the callers or callees of the individual functions. While a grand idea - merging tables and trees into a single UI widget - it's also as tedious as you can imagine it might be. Lots of clicking to expand/contract the tree bits, and then the whole thing gets very noisy.

Both of these visualizations are serviceable, and do provide a lot of value, but ... I feel like we can do better.

And so I've started playing with some new tooling:

  • moar-profiler - a command-line tool to generate profiles for Node.js processes, that adds even moar information to the profile than what v8 provides (eg, source, package info, process metadata). These profiles are compatible with every v8 .cpuprofile-digesting tool; they just extend the data provided in the "standard" profile.

  • moar-profile-viewer - a web app to view v8 .cpuprofile files generated from Node.js profiling tools, that provides even moar goodies when displaying profiles generated from moar-profiler

Here's what the profile viewer looks like with a profile loaded:

moar profile viewer

The viewer is at Minimum Viable Product state. There are no graphs. There are no call stacks. There are only lists of functions, and the ability to get more information on those functions - callers and callees - and a view of the source code of the function. Gotta start somewhere!

Some of the fun stuff the profile viewer does have is:

  • displaying the package and version a function came from, along with a link to the package info at npmjs.org

  • annotated source code

  • ability to toggle hiding "system" functions vs "user" functions

Generating a profile with moar-profiler is pretty easy:

  • run your program with the --inspect option (requires Node.js >= version 6):

    node --inspect myProgram.js arg1 arg2 ...

  • in a separate terminal, run moar-profiler, which will generate the profile to stdout:

    moar-profiler --duration 5 9229 > myProgram.cpuprofile

In this case, a 5 second profile will be collected, by connecting to the Node.js process using the inspect port 9229 (the default inspect port).

Now that you have a .cpuprofile file, head over to the moar profile viewer web app, and load the file (or drop it on the page).

If you would like to test drive the web app without generating a .cpuprofile file first, you can download this file, which is a profile of npm update, and then load it or drop it in the moar profile viewer web app.

If you don't feel comfortable sending a .cpuprofile file over the intertubes, you can clone the moar-profile-viewer repo and open the docs/index.html page on your own machine - it's a single page app in an .html file - no server required.

There's a bunch of cleanup to do with the viewer, but I'd like for it to serve as a basis for further investigation. The cool thing with both moar-profiler and the moar profile viewer is that they work with existing v8 profile tools. If you really need a flame chart from a profile generated with moar-profiler, just load it into your flavorite profile visualizer (eg, Chrome DevTools). Likewise, the moar profile viewer can read profiles generated with other Node.js profiling tools, but you won't have as nice of an experience compared to profiles generated with moar-profiler.

I do want to provide the ability to see call stacks, somehow. Current thought is Graphviz visualizations of the call stacks involving a selected function. Those might be small enough to provide useful visualization. Previous experiments rendering the entire profile call graph as a single Graphviz graph were ... disastrous, as most extremely large and complex Graphviz graphs tend to be.

This is all open source, of course, so feel free to chime in and help out. I'm obviously interested in bug reports, but also new feature requests. Dream on!

Monday, March 14, 2016

watching your projects

I'm a big fan of having an "automated build step" as part of the development workflow for a project. Figured I'd share what I'm currently doing with my Node.js projects.

What is an "automated build step"? For me, this means some "shell" code that gets run when I save files while developing. It may not actually be "building" anything - often I'm just running tests. But the idea is that as I save files in my editor, things are run against the code that I just saved.

So there's basically two concepts here:

  • file watchers - watching for updates to files
  • code runners - when files are updated, run some code

file watchers

I have some history with file watchers - wr, jbuild, cakex. The problem with file watching is that it's not easy. The file watching bits in jbuild and cakex are also bundled with other build-y things, which is good and bad - basically bad because you have to buy in to using jbuild and cakex exclusively. wr is a little better in that it's just a file watcher/runner, but I ended up finding a replacement for that with nodemon. The advantage to using nodemon is that it's already an accepted tool, and I don't have to support it. :-) Thanks Remy!

code runners

There are many options for code runners: make, cake, grunt, gulp, npm scripts, etc. I'm currently using npm scripts, because they're the easiest thing to do for non-complex "build steps".

For example, here are the scripts defined in the package.json of the nsolid-statsd package.

"scripts": {
  "start": "node daemon",
  "utest": "node test/index.js | tap-spec",
  "test": "npm run utest && standard",
  "watch": "nodemon --exec 'npm test' --ext js,json"
}

Everything's kicked off by the watch script. It watches for file changes - in this case, changes to .js and .json files - and runs npm test when file changes occur. npm test is defined to run the test script, and for this package, that means running npm run utest and then standard. utest is the final leg of the scripts here, which runs my tape tests rooted at test/index.js, piping them through tap-spec for colorization, etc.

So the basic flow will be this, when running the watch script:

  • run node test/index.js, piping through tap-spec
  • if that passes, run standard
  • wait for a .js or .json file to change
  • when one of those files changes, start over from the beginning

Just run the watch script in a dedicated terminal window, perhaps beside your editor, and start editing and saving files.

Note that nodemon, standard and tap-spec are available in the local node_modules directory, because they are devDependencies in the package itself (tape is used when running test/index.js):

"devDependencies": {
  "nodemon": "~1.8.1",
  "tape": "~4.2.0",
  "tap-spec": "~4.1.1",
  "standard": "~5.4.1"
}

As such, I can use their 'binaries' directly in the npm scripts.

command line usage

You would then kick this all off by running npm run watch. But since it turns out I've been adding a watch script to projects for a long time, all of which do basically the same thing (maybe with different tools), I instead wrote a bash script a while back called watch:

#!/bin/sh
npm run watch

So now I just always run watch in the project directory to start my automated build step.

Here's what the complete workflow looks like:

  • open a terminal in the project directory
  • open my editor (Atom)
  • run watch at the command-line
  • see the tests run successfully
  • make a change by adding a blank line that standard will provide a diagnostic for
  • save file
  • see tests run, see message from standard
  • remove blank line, save file
  • see tests run, no message from standard

It's basically an IDE using the command-line and your flavorite editor, and I love it.

As projects get more complex, I'll start using some additional tooling like make, cake, etc, or just invoke stand-alone bash or node scripts available in a tools directory. Those would just get added as new npm scripts, and injected into the script flow with nodemon. For instance, you might do something like this:

"scripts": {
  "start": "node daemon",
  "utest": "node test/index.js | tap-spec",
  "test": "npm run utest && standard",
  "build": "tools/build.sh",
  "build-test": "npm run build && npm test",
  "watch": "nodemon --exec 'npm run build-test' --ext js,json"
}

Now when I watch, tools/build.sh will be run before the tests. You can also run each of these steps individually, via npm run build, npm test, npm run utest, etc.

Monday, September 28, 2015

getting started with N|Solid at the command line

Last week, NodeSource (where I work) announced a new product, N|Solid. N|Solid is a platform built on Node.js that provides a number of enhancements to improve troubleshooting, debugging, managing, monitoring and securing your Node.js applications.

N|Solid provides a gorgeous web-based console to monitor/introspect your applications, but also allows you to introspect your Node.js applications, in the same way, at ye olde command line.

Let's explore that command line thing!

installing N|Solid Runtime

In order to introspect your Node.js applications, you'll run them with the N|Solid Runtime, which is shaped similarly to a typical Node.js runtime, but provides some additional executables.

To install N|Solid Runtime, download and unpack an N|Solid Runtime tarball (.tar.gz file) from the N|Solid download site. For the purposes of this blog post, you'll only need to download N|Solid Runtime; the additional downloads N|Solid Hub and N|Solid Console are not required.

On a Mac, you can alternatively download the native installer .pkg file. If using the native installer, download the .pkg file, and then double-click the downloaded file in Finder to start the installation. It will walk you through the process of installing N|Solid Runtime in the usual Node.js installation location, /usr/local/bin.

If you just want to take a peek at N|Solid, the easiest thing is to download a tarball and unpack it. On my Mac, I downloaded the "Mac OS .tar.gz" for "N|Solid Runtime", and then double-clicked on the .tar.gz file in Finder to unpack it. This created the directory nsolid-v1.0.1-darwin-x64. Rename that directory to nsolid, start a terminal session, cd into that directory, and prepend its bin subdirectory to the PATH environment variable:

$ cd Downloads/nsolid
$ PATH=./bin:$PATH
$ nsolid -v
v4.1.1
$

In the snippet above, I also ran nsolid -v to print the version of Node.js that the N|Solid Runtime is built on.

This will make the following executables available on the PATH, for this shell session:

  • nsolid is the binary executable version of Node.js that N|Solid ships
  • node is a symlink to nsolid
  • npm is a symlink into lib/node_modules/npm/bin/npm-cli.js, as it is with typical Node.js installs
  • nsolid-cli is a command-line interface to the N|Solid Agent, explained later in this blog post

Let's write a hello.js program and run it:

$ echo 'console.log("Hello, World!")' > hello.js
$ nsolid hello
Hello, World!
$

Success!

the extra goodies

N|Solid Runtime version 1.0.1 provides the same Node.js runtime as Node.js 4.1.1, with some extra goodies. Anything that can run in Node.js 4.1.1, can run in N|Solid 1.0.1. NodeSource will release new versions of N|Solid as new releases of Node.js become available.

So what makes N|Solid different from regular Node.js?

If you run nsolid --help, you'll see a listing of additional options and environment variables at the end:

$ nsolid --help
...
{usual Node.js help here}
...
N|Solid Options:
  --policies file       provide an NSolid application policies file

N|Solid Environment variables:
NSOLID_HUB              Provide the location of the NSolid Hub
NSOLID_SOCKET           Provide a specific socket for the NSolid Agent listener
NSOLID_APPNAME          Set a name for this application in the NSolid Console
$

N|Solid policies allow you to harden your application in various ways. For example, you can have all native memory allocations zero-filled by N|Solid, by using the zeroFillAllocations policy. By default, Node.js does not zero-fill memory it allocates from the operating system, for performance reasons.

For more information on policies, see the N|Solid Policies documentation.

Besides policies, the other extra goody that N|Solid provides is an agent that you can enable to allow introspection of your Node.js processes. To enable the N|Solid Agent, you'll use the environment variables listed in the help text above.

For the purposes of the rest of this blog post, we'll just focus on interacting with a single Node.js process, and will just use the NSOLID_SOCKET environment variable. The NSOLID_HUB and NSOLID_APPNAME environment variables are used when interacting with multiple Node.js processes, via the N|Solid Hub.

The N|Solid Agent is enabled if the NSOLID_SOCKET environment variable is set, and is not enabled if the environment variable is not set.

Let's start a Node.js REPL with the N|Solid Agent enabled:

$ NSOLID_SOCKET=5000 nsolid
> 1+1 // just show that things are working
2
>

This command starts up the typical Node.js REPL, with the N|Solid Agent listening on port 5000. When the N|Solid Agent is enabled, you can interact with it using N|Solid Command Line Interface (CLI), implemented as the nsolid-cli executable.

running nsolid-cli commands

Let's start with a ping command. Leave the REPL running, start a new terminal window, cd into your nsolid directory again, and set the PATH environment variable:

$ cd Downloads/nsolid
$ PATH=./bin:$PATH
$

Now let's send the ping command to the N|Solid Agent running in the REPL:

$ nsolid-cli --socket 5000 ping
"PONG"
$

In this case, we passed the --socket option on the command line, which indicates the N|Solid Agent port to connect to. And we told it to run the ping command. The response was the string "PONG".

The ping command just validates that the N|Solid Agent is actually running.

Let's try the system_stats command, with the REPL still running in the other window:

$ nsolid-cli --socket 5000 system_stats
{"freemem":2135748608,"uptime":2414371,"load_1m":1.17431640625,"load_5m":1.345703125,"load_15m":1.3447265625,"cpu_speed":2500}
$

The system_stats command provides some system-level statistics, such as amount of free memory (in bytes), system uptime, and load averages.

The output is a single line of JSON. To make the output more readable, you can pipe the output through the json command, available at npm:

$ nsolid-cli --socket 5000 system_stats | json
{
  "freemem": 1970876416,
  "uptime": 2414810,
  "load_1m": 1.34765625,
  "load_5m": 1.26611328125,
  "load_15m": 1.29052734375,
  "cpu_speed": 2500
}
$

Another nsolid-cli command is process_stats, which provides some process-level statistics:

$ nsolid-cli --socket 5000 process_stats | json
{
  "uptime": 2225.215,
  "rss": 25767936,
  "heapTotal": 9296640,
  "heapUsed": 6144552,
  "active_requests": 0,
  "active_handles": 4,
  "user": "pmuellr",
  "title": "nsolid",
  "cpu": 0
}
$

The full list of commands you can use with nsolid-cli is available at the doc page N|Solid Command Line Interface (CLI).

generating a CPU profile

Let's try one more thing - generating a CPU profile. Here's a link to a sample program to run, that will keep your CPU busy: busy-web.js

This program is an HTTP server that issues an HTTP request to itself, every 10 milliseconds. It makes use of some of the new ES6 features available in Node.js 4.0, like template strings and arrow functions. Since the N|Solid Runtime is using the latest version of Node.js, you can make use of those features with N|Solid as well.
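The real busy-web.js is linked above; a rough sketch of the kind of thing it does - the port and console messages here are just placeholders - might look like this:

var http = require("http")

var sent = 0
var recv = 0

// an HTTP server that sends a request to itself every 10 milliseconds
var server = http.createServer(function(request, response) {
  recv++
  if (recv % 100 === 0) console.log(`recv: ${recv} requests`)
  response.end("hello")
})

server.listen(0, function() {
  var port = server.address().port
  console.log(`server listening at http://localhost:${port}`)

  setInterval(function() {
    sent++
    if (sent % 100 === 0) console.log(`send: ${sent} requests`)
    http.get(`http://localhost:${port}/`, (response) => response.resume())
  }, 10)
})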

Let's run it with the agent enabled:

$ NSOLID_SOCKET=5000 nsolid busy-web
server listing at http://localhost:53011
send: 100 requests
recv: 100 requests
...

In another terminal window, run the profile_start command, wait a few seconds and run the profile_stop command, redirecting the output to the file busy-web.cpuprofile:

$ nsolid-cli --socket 5000 profile_start
{"started":1443108818350,"collecting":true}
... wait a few seconds ...
$ nsolid-cli --socket 5000 profile_stop > busy-web.cpuprofile

The file busy-web.cpuprofile can then be loaded into Chrome Dev Tools for analysis:

  • in Chrome, select the menu item View / Developer / Developer Tools
  • in the Developer Tools window, select the Profiles tab
  • click the "Load" button
  • select the busy-web.cpuprofile file
  • in the CPU PROFILES list on the left, select "busy-web"

For more information on using Chrome Dev Tools to analyze a CPU profile, see Google's Speed Up JavaScript Execution page.

Note that we didn't have to instrument our program with any special profiling packages - access to the V8 CPU profiler is baked right into N|Solid! About time someone did that, eh?

You can easily write a script to automate the creation of a CPU profile, where you add a sleep command to wait some number of seconds between the profile_start and profile_stop commands.

#!/bin/sh

echo "starting CPU profile"
nsolid-cli --socket 5000 profile_start

echo "waiting 5 seconds"
sleep 5

echo "writing profile to busy-web.cpuprofile"
nsolid-cli --socket 5000 profile_stop > busy-web.cpuprofile

Or instead of sleeping, if your app is an HTTP server, you can drive some traffic to it with Apache Bench (ab), by running something like this instead of the sleep command:

ab -n 1000 -c 100 http://localhost:3000/

generating heap snapshots

You can use the same technique to capture heap snapshots, using the snapshot command. The snapshot command produces output which should be redirected to a file with a .heapsnapshot extension:

$ nsolid-cli --socket 5000 snapshot > busy-web.heapsnapshot

You can then load those files in Chrome Dev Tools for analysis, the same way the CPU profiles are loaded.

For more information on using Chrome Dev Tools to analyze a heap snapshot, see Google's How to Record Heap Snapshots page.

more info

The full list of commands you can use with nsolid-cli is available at the doc page N|Solid Command Line Interface (CLI).

All of the documentation for N|Solid is available at the doc site N|Solid Documentation.

If you have any questions about N|Solid, feel free to post them at Stack Overflow, and add a tag nsolid.

Wednesday, May 13, 2015

resuscitating a 2006 MacBook

One bad assumption I made about leaving IBM is that I'd be able to get a new MacBook Pro quickly. Nope. 2-3 weeks. NO LAPTOP! HOW AM I SUPPOSED TO TAKE TIME OFF AND RELAX?

Poking through my shelves of computer junk, I spied my two old MacBooks. I forgot I had two of them.

The first was a G4 (PPC) iBook. That was the laptop I bought in 2003 to kick the tires on Apple hardware and OS X 10.2. I was hooked.

The second was an Intel Core Duo (x86) MacBook: two 2GHz x86 cores, 2GB RAM, 120GB drive. I bought that in 2006, and remember it being a productive machine. Eventually replaced that with a line of MacBook Pros IBM provided me, up until this week.

Hmmm, is that MacBook still usable? It powered up fine. Poking around, it seemed like the hardware and OS constraints meant the best I could do was get this box from OS X 10.5 to 10.6. Also - this being a 32-bit machine - some apps wouldn't run. Eg, Chrome.

Luckily I still had an OS X 10.6 DVD lying around in my piles of old stuff. Upgraded easily. So now I can run PathFinder, Sublime, iTerm, Firefox, and Skype. Can't run Chrome, Atom, or Twitter. Or io.js. Again, HOW AM I SUPPOSED TO TAKE TIME OFF AND RELAX?

The build requirements for io.js looked like something that I might be able to meet. Rolled up my sleeves.

First I wanted to update git. Current version was at 1.6. It was already complaining about git clone operations on occasion. So, first installed homebrew, then installed git 2.4.

That was easy. Now to install gcc/g++ - they were not already on the machine, and the latest stable at brew is 4.9.2. After a long time, it installed fine. But it doesn't override the built-in gcc/g++. Instead, it provides gcc-4.9 and similarly named tools for that version of the gcc toolchain.

To get the iojs Makefile to use these instead of the built-in gcc tools, I set env vars in my .bash_profile:

export CC=gcc-4.9
export CXX=g++-4.9
export AR=gcc-ar-4.9
export NM=gcc-nm-4.9
export RANLIB=gcc-ranlib-4.9

Ready to build iojs! Ran ./configure - it completed successfully but was a little complain-y about some unrelated looking things. Ran make. And then fell asleep. Woke up, it had completed successfully, so ran a little test and ... WORKED! Finally ran sudo make install and NOW IT IS EVERYWHERE.

Got some test failures, but they may be environmental. Tried some typical workflow stuff and things seem fine.

Not a dream machine, by any means. Slow. Constantly watching memory usage and quitting apps as appropriate. And the fan is quite noisy. And the display seems like it doesn't have long to live (display adapter problems are my lot in life as a MacBook owner).

But kinda fun.

leaving IBM, taking a break, going somewhere else

This last Monday - May 11, 2015 - was my last day as an IBMer.

I had a great run at IBM. Worked on a lot of great, diverse projects with great people.

But I'm ready for a change. I've been planning on retiring this year, for a while now, but it kinda snuck up on me. Mid-life crisis? Maybe. Anyway, it's time!

First, a breather.

Then, looking forward to starting my next adventure(s) in software development.

I'm talking to some potential employers right now, and would like to talk to more. Contact info is in my resume, linked to below.

In terms of what I'm looking for, I'd like to continue working in the node.js environment. Still having a lot of fun there. My favorite subject matter is working on tools for developers. But I'm handy - and interested - in lots of stuff.

I live in the Raleigh/NC area, and can't relocate for a few years. I'm quite comfortable working remote, or local.

My resume is available here:

http://muellerware.org/resume/Patrick-Mueller-Resume.html

Thursday, March 26, 2015

wiki: something old, something new

Back in 1995, Ward Cunningham - a fellow BoilerMaker - created a program called WikiWikiWeb, which created a new species of web content we now know as the "wiki".

My colleague Rick DeNatale showed it to me, and I was hooked. So hooked, I wrote a clone in REXX for OS/2, which I've sadly been unable to find. It's funny to see the names of some other friends on the list of initial visitors.

Since then, I've always had some manner of wiki in my life. We still use wikis to document project information in IBM, I use Wikipedia all the time for ... whatever ..., and occasionally use the free wiki service that GitHub provides for projects there. I think some credit also goes to Ward for helping push the "simplified HTML markup" story (eg, Markdown), to make editing and creating web content more approachable to humans.

Ward started a new project a few years back, Smallest Federated Wiki, to start exploring some interesting aspects of the wiki space - federating wikis, providing plugin support, multi-page views, rich change history. It's fascinating.

I've had in the back of my mind to get Federated Wiki to run on Bluemix for a while, and it seemed like an appropriate time to make something publishable. So, I created the cf-fed-wiki project, which will let you easily deploy a version of Federated Wiki on Cloud Foundry (and thus Bluemix), and is also set up to use with the new easy-to-deploy Bluemix Button (below).

Deploy to Bluemix

There are still some rough spots, but it seems workable, or at least something to play with.

The best way to learn about Federated Wiki is to watch Ward's videos.

Enjoy, and thanks for the wiki, Ward!

Friday, February 13, 2015

having your cake and cakex'ing it too

As a software developer, I care deeply about my build tools. I strive to provide easy-to-use and easy-to-understand build scripts with my projects, so that other folks can rebuild a project I'm working on, as simply as possible. Often one of those other folks is me, coming back to a project after being away from it for months, or years.

I cut my teeth on make. So awesome, for its time. Even today, all kinds of wonderful things about it. But there are problems. Biggest is ensuring Makefile compatibility across all the versions of make you might encounter. I've used make on OS/2, Windows, AIX, Linux, Mac, and other Unix-y platforms. The old version of Windows nmake was very squirrelly to deal with, and if you had to build a Makefile that could run on Windows AND anywhere else, well best of luck, mate! make is also a bit perl-ish with all the crazy syntax-y things you can do. It's driven me to near-insanity at times.

I was also a big user of Ant. I don't do Java anymore, and would prefer to not do XML either, but I do still build weinre with Ant.

Since I'm primarily doing node.js stuff these days, it makes a lot of sense to use a build tool implemented on node.js; I've already got the runtime for that installed, guaranteed. In the past, I've used jake with the Apache Cordova project, and have flirted with the newest build darlings of the node.js world, grunt and gulp.

I tend towards simpler tools, and so am happier at the jake level of simple-ness compared to the whole new way of thinking required to use grunt or gulp, which also have their own sub-ecosystem of plugins to gaze upon.

Of course, there are never enough build tools out there, so I've also built some myself: jbuild. I've been using jbuild for projects since I built it, and have been quite happy with it. But, it's my own tool, I don't really want to own a tool like this. The interesting thing about jbuild was the additional base-level functionality it provided to the actual code you'd write in your build scripts, and not the way tasks were defined and what not.

As a little experiment, I've pulled that functionality out of jbuild and packaged it up as something you can use easily with cake. cake is one of the simplest tools out there, in terms of what it provides, and lets me write in CoffeeScript, which is closer to the "shell scripting" experience with make (which is awesome) compared to most other build tools.

Those extensions are in the cakex package available via npm.

why cakex

  • shelljs functions built in as globals. So I can do things like

    mkdir "-p", "tmp"
    rm   "-rf", "tmp"
    

    (that's CoffeeScript) right in my Cakefile. Compare to how you'd do that in Ant. heh.

  • scripts in node_modules/.bin added as global functions that invoke those scripts. Hat tip, npm run. So I can do things like

    opts = """
      --outfile    tmp/#{oBase}
      --standalone ragents
      --entry      lib/ragents.js
      --debug
    """
    
    opts = opts.trim().split(/\s+/).join(" ")
    
    log "running browserify ..."
    browserify opts
    
  • functions acting as watchers, and server recyclers. I always fashion build scripts so that they have a watch task, which does a build, runs tests, restarts the server that's implemented in the project, etc. So that when I save a file in my text editor, the build/test/server restart happens every time. These are hard little things to get right; I know, I've been trying for years to get them right. Here's an example usage:

    taskWatch = ->
      watchIter()          # run watchIter when starting
    
      watch
        files: sourceFiles # watch sourceFiles for changes
        run:   watchIter   # run watchIter on changes
    
    watchIter = ->
      taskBuild()          # run the build
      taskServe()          # run the server
    
    taskServe = ->
      log "restarting server at #{new Date()}"
    
      # starts / restarts the server, whichever is needed
      daemon.start "test server", "node", ["lib/server"]
    

The cakex npm page - https://www.npmjs.com/package/cakex - includes a complete script that is the kind of thing I have in all my projects, so you can take in the complete experience. I love it.

Coupla last things:

  • yup, globals freaking everywhere. It's awesome.

  • I assume this would be useful with other node.js-based build tools, but wouldn't surprise me if the "globals everywhere" strategy causes problems with other tools.

  • I'm using gaze to watch files, but it appears to have a bug where single file patterns end up matching too many things; hence having to do extra checks when you're watching a single file.

  • I've wrestled with the demons that are the daemon functions in cakex for a long time. Never completely happy with any story there, but it's usually possible to add enough hacks to keep things limping. Wouldn't be surprised if I have to re-architect the innards there, again, but hopefully the API can remain the same.

  • Please also check the section in the README titled "integration with npm start", for what I believe to be a best practice of including all your build tools as dependencies in your package, instead of relying on globally installed tools. For node.js build tools anyway.

Thursday, January 01, 2015

ragents against the machine

A few years back, I developed a tool called web inspector remote - aka weinre. It's an interesting hack where I took the WebKit Web Inspector UI code, ran it in a plain old (WebKit-based) browser, and hooked things up so it could debug a web browser session running somewhere else. I was specifically targeting mobile web browsers and applications using web browser views, but there wasn't much (any?) mobile-specific code in weinre.

(btw, I currently run a publicly accessible weinre server at Bluemix)

One of the interesting aspects of weinre is the way the browser running the debugger user interface (the client) communicates with the browser being debugged (the target). The WebKit Web Inspector code was originally designed so that the client would connect to the target in a traditional client/server pattern. But on mobile devices, or within a web browser environment, it's very difficult to run any kind of a traditional "server" - tcp, http, or otherwise.

So the trick to make this work with weinre, is to have both the client and target connect as HTTP clients to an HTTP server that shuttles messages between the two. That HTTP server is the weinre server. It's basically just a message switchboard between debug clients and targets.

Photograph of Women Working at a Bell System Telephone Switchboard

This switchboard server pattern is neat, because it allows two programs to interact with each other, where neither has to be an actual "server" of any kind (tcp, http, etc).
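As a toy sketch of the pattern (nothing like the actual weinre server, and assuming the ws package as the transport), a switchboard can be as simple as relaying every agent's messages to all of the other connected agents:

var WebSocket = require("ws")

var server = new WebSocket.Server({ port: 8080 })
var agents = new Set()

server.on("connection", function(ws) {
  agents.add(ws)
  ws.on("close", function() { agents.delete(ws) })

  // shuttle every message on to every other connected agent
  ws.on("message", function(message) {
    agents.forEach(function(agent) {
      if (agent !== ws && agent.readyState === WebSocket.OPEN) agent.send(message)
    })
  })
})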

And this turns out to be a useful property in other environments as well. For instance, dealing with server farms running "in the cloud". Imagine you're running some kind of web application server, and decide to run 5 instances of your server to handle the load. And now you want to "debug" the server. How do you connect to it? Presumably the 5 instances are running behind one ip address - which server instance are you going to connect to? How could you connect to all of them at once?

Instead of using the typical client/server pattern for debugging in this environment, you can use the switchboard server pattern, and have each web application server instance connect to the switchboard server itself, and then have a single debug client which can communicate with all of the web application server instances.

I've wanted to extract the switchboard server pattern code out of weinre for a while now, and ... now's the time. I anticipate being able to define a generic switchboard server and messaging protocol that can be used to allow multiple, independent, orthogonal tools to communicate with each other, where the tools are running in various environments. The focus here isn't specifically traditional graphical step debugging, but diagnostic tooling in general. Think loggers, REPLs, and other sorts of development tools.

One change I've made as part of the extraction is terminology. In weinre, there were "clients" and "targets" that had different capabilities. But there really doesn't need to be a distinction between the two, in terms of naming or functionality. Instead, I'm referring to these programs as "agents".

And sometimes agents will be communicating with agents running on other computers - remote agents - "ragents" - hey, that would be a cool name for a project!

Another thing I'm copying from the Web Inspector work is the messaging protocol. At the highest level, there are two types of messages that agents can send - request/response pairs, and events.

  • A request message can be sent from one agent to another, and then a response message is sent in the opposite direction. Like RPC, or an HTTP request/response.

  • An event message is like a pub/sub message; multiple agents can listen for events, and then when an agent posts an event message, it's sent to every agent that is listening.

You can see concrete examples of these request/response and event messages in the V8 debugger protocol and the Chrome Remote Debugging Protocol.

This pattern makes it very easy to support things like having multiple debugger clients connect to a single debug target; all state change in the target gets emitted as events so every connected client will see the change. But you can also send specific requests from the client to the target, and only the client will see the responses.

For example, one debugger client might send a "step" request to have the debugger execute the next statement and then pause. The response for this might be just a success/failure indicator. A set of events would end up being posted indicating that the program started, and then stopped again (on the next statement). That way, every connected debugger client would see the effects of the "step" request.
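In the Chrome protocol style, those messages are just small JSON objects - a sketch, with made-up method names ("step", "started", "stopped") purely for illustration:

// a request sent from one agent to another; the id ties the response back
var stepRequest  = { id: 42, method: "step" }

// the response, sent back in the opposite direction
var stepResponse = { id: 42, result: { ok: true } }

// events, delivered to every agent listening for them
var startedEvent = { method: "started" }
var stoppedEvent = { method: "stopped", params: { reason: "step" } }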

Turns out, you really need to support multiple clients connected simultaneously if your clients are running in a web browser. Because if you only support a single client connection to your program, what happens when the user can't find their browser tab running the client? The web doesn't want to work with singleton connections.

I'm going to be working on the base libraries and server first, and then build some diagnostic tools that make use of them.

Stay tuned!

Wednesday, September 10, 2014

keeping secrets secret

If you're building a web app, you probably have secrets you have to deal with:

  • database credentials
  • session keys
  • etc

So, where do you keep these secrets? Typical ways are:

  • hard-code them into your source
  • require them to be passed on the command-line of your program
  • get them from a config file
  • get them from environment variables

Folks using Cloud Foundry based systems have another option:

  • get them from user-provided services

This blog post will go over the advantages and disadvantages of these approaches. Examples are provided for node.js, but are applicable to any language.

secrets via hard-coding

The documentation for the express-session package shows the following example of hard-coding your secrets into your code:
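A minimal sketch of that kind of hard-coding (the "keyboard cat" value and the port here are just placeholders):

var express = require("express")
var session = require("express-session")

var app = express()

// the secret is hard-coded right in the source, for all the world to see
app.use(session({ secret: "keyboard cat" }))

app.listen(3000)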

This is awful:

  • If you need to change the secret, you need to change the code; apply some separation of concerns, and keep your code separate from your configuration.

  • If you happen to check this code into a source code management (SCM) system, like GitHub, then everyone with access to that SCM will have access to your password. That might be literally everyone.

Please, DO NOT DO THIS!!

Don't be one of these people. Use one of the techniques below, instead.

secrets via config files

Here is an example using require() to get a secret from a JSON file:
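A minimal sketch of this approach, assuming a secret-config.json file sitting next to the code:

// secret-config.json looks like: { "sessionSecret": "keyboard cat" }
var config = require("./secret-config.json")

var sessionSecret = config.sessionSecret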

This example takes advantage of the node.js feature of being able to load a JSON file and get the parsed object as a result.

If you're going to go this route, you should do the following:

  • Do NOT store the config file in your SCM, because otherwise you may still be making your secret available to everyone who has access to your SCM.

  • To keep the config file from being stored, add the file to your .gitignore file (or equivalent for your SCM).

  • Create an example config file, say secret-config-sample.json, which folks can copy to the actual secret-config.json file, and use as an example.

  • Document the example config file usage.

You now have an issue of how to "manage" or save this file, since it's not being stored in an SCM.

secrets via command-line arguments

Here is an example using the nopt package to get a secret from a command-line argument:
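A minimal sketch, assuming the program is named secret-arg.js as in the commands below:

// secret-arg.js
var nopt = require("nopt")

var knownOpts  = { sessionSecret: String }
var shortHands = { s: ["--sessionSecret"] }

// parse process.argv, skipping "node" and the script name
var parsed = nopt(knownOpts, shortHands, process.argv, 2)

var sessionSecret = parsed.sessionSecret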

You can then invoke your program using either of these commands:

node secret-arg.js --sessionSecret "keyboard cat"
node secret-arg.js -s "keyboard cat"

This is certainly nicer than having secrets hard-coded in your app, but it also means you will be typing the secrets a lot. If you decide to "script" the command invocation, keep in mind your script now has your secrets in it. Use the "example file" pattern described above in "secrets via config files" to keep the secret out of your SCM.

secrets via environment variables

Here is an example using process.env to get a secret from an environment variable:
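A minimal sketch, assuming the program is named secret-env.js as in the command below:

// secret-env.js
var sessionSecret = process.env.SESSION_SECRET

if (!sessionSecret) {
  console.error("expecting a session secret in the SESSION_SECRET environment variable")
  process.exit(1)
}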

You can then invoke your program using the following command:

SESSION_SECRET="keyboard cat" node secret-env.js

Like using command-line arguments, if you decide to script this, keep in mind your secret will be in the script.

You likely have other ways of setting environment variables when you run your program. For instance, in Cloud Foundry, you can set environment variables via a manifest.yml file or with the cf set-env command.

If you decide to set the environment variable in your manifest.yml file, keep in mind your secret will be in the manifest. Use the "example file" pattern described above in "secrets via config files" to keep the secret out of your SCM. Eg, put manifest.yml in your .gitignore file, and ship a manifest-sample.yml file instead.

secrets via Cloud Foundry user-provided services

Here is an example using the cfenv package to get a secret from a user-provided service:
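A minimal sketch, using cfenv's getServiceCreds() to look the service up by name with a regular expression:

var cfenv = require("cfenv")

var appEnv = cfenv.getAppEnv()

// find a bound service whose name matches /session-secret/, and read
// the `secret` property out of its credentials
var creds = appEnv.getServiceCreds(/session-secret/) || {}

var sessionSecret = creds.secret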

This is my favorite way to store secrets for Cloud Foundry. In the example above, the code is expecting a service whose name matches the regular expression /session-secret/ to contain the secret in the credentials property named secret. You can create the user-provided service with the cf cups command:

cf cups a-session-secret-of-mine -p secret

This will prompt you for the value of the property secret, and then create a new service named a-session-secret-of-mine. You will need to cf bind the service to your application to get access to it.

There are a number of advantages to storing your secrets in user-provided services:

  • A service can be bound to multiple applications; this is a great way to store secrets that need to be shared by "micro-services", if you're into that kind of thing.

  • Once created, these values are persistent until you delete the service or use the new cf uups command to update them.

  • These values are only visible to users who have the appropriate access to the service.

  • Using regular expression matching for services makes it easy to switch services by having multiple services with regexp matchable names, and binding only the one you want. See my project bluemix-service-switcher for an example of doing this.

secrets via multiple methods

Of course, for your all singing, all dancing wunder-app, you'll want to allow folks to configure secrets in a variety of ways. Here's an example that uses all of the techniques above - including hard-coding an undefined value in the code! That should be the only value you ever hard-code. :-)

The example uses the defaults() function from underscore to apply precedence for obtaining a secret from multiple techniques.
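A sketch of what that might look like - pulling from the command line, the environment, an optional config file, a user-provided service, and finally a hard-coded undefined - with one possible precedence order:

var _     = require("underscore")
var nopt  = require("nopt")
var cfenv = require("cfenv")

// command-line argument
var parsed = nopt({ sessionSecret: String }, { s: ["--sessionSecret"] }, process.argv, 2)

// optional config file
var fileConfig = {}
try { fileConfig = require("./secret-config.json") } catch (err) {}

// user-provided service
var creds = cfenv.getAppEnv().getServiceCreds(/session-secret/) || {}

// defaults() fills in undefined properties from left to right, so the
// first technique that produced a value wins
var config = _.defaults(
  { sessionSecret: parsed.sessionSecret },        // command-line
  { sessionSecret: process.env.SESSION_SECRET },  // environment variable
  { sessionSecret: fileConfig.sessionSecret },    // config file
  { sessionSecret: creds.secret },                // user-provided service
  { sessionSecret: undefined }                    // hard-coded - the only value to hard-code!
)

var sessionSecret = config.sessionSecret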