pmuellr: 2015

Monday, September 28, 2015

getting started with N|Solid at the command line

Last week, NodeSource (where I work) announced a new product, N|Solid. N|Solid is a platform built on Node.js that provides a number of enhancements to improve troubleshooting, debugging, managing, monitoring and securing your Node.js applications.

N|Solid provides a gorgeous web-based console to monitor/introspect your applications, but also allows you to introspect your Node.js applications, in the same way, at ye olde command line.

Let's explore that command line thing!

installing N|Solid Runtime

In order to introspect your Node.js applications, you'll run them with the N|Solid Runtime, which is shaped similarly to a typical Node.js runtime, but provides some additional executables.

On a Mac, you can alternatively download the native installer .pkg file. If using the native installer, download the .pkg file, and then double-click the downloaded file in Finder to start the installation. It will walk you through the process of installing N|Solid Runtime in the usual Node.js installation location, /usr/local/bin.

If you just want to take a peek at N|Solid, the easiest thing is to download a tarball and unpack it. On my Mac, I downloaded the "Mac OS .tar.gz" for "N|Solid Runtime", and then double-clicked on the .tar.gz file in Finder to unpack it. This created the directory nsolid-v1.0.1-darwin-x64. Rename that directory to nsolid, start a terminal session, cd into that directory, and prepend its bin subdirectory to the PATH environment variable:

$ cd Downloads/nsolid
$ PATH=./bin:$PATH
$ nsolid -v
v4.1.1
$

In the snippet above, I also ran nsolid -v to print the version of Node.js that the N|Solid Runtime is built on.

This will make the following executables available on the PATH, for this shell session:

nsolid is the binary executable version of Node.js that N|Solid ships
node is a symlink to nsolid
npm is a symlink into lib/node_modules/npm/bin/npm-cli.js, as it is with typical Node.js installs
nsolid-cli is a command-line interface to the N|Solid Agent, explained later in this blog post

Let's write a hello.js program and run it:

$ echo 'console.log("Hello, World!")' > hello.js
$ nsolid hello
Hello, World!
$

Success!

the extra goodies

N|Solid Runtime version 1.0.1 provides the same Node.js runtime as Node.js 4.1.1, with some extra goodies. Anything that can run in Node.js 4.1.1, can run in N|Solid 1.0.1. NodeSource will release new versions of N|Solid as new releases of Node.js become available.

So what makes N|Solid different from regular Node.js?

If you run nsolid --help, you'll see a listing of additional options and environment variables at the end:

$ nsolid --help
...
{usual Node.js help here}
...
N|Solid Options:
  --policies file       provide an NSolid application policies file

N|Solid Environment variables:
NSOLID_HUB              Provide the location of the NSolid Hub
NSOLID_SOCKET           Provide a specific socket for the NSolid Agent listener
NSOLID_APPNAME          Set a name for this application in the NSolid Console
$

N|Solid policies allow you to harden your application in various ways. For example, you can have all native memory allocations zero-filled by N|Solid, by using the zeroFillAllocations policy. By default, Node.js does not zero-fill memory it allocates from the operating system, for performance reasons.
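
To get a feel for what "zero-filled" means, here's a small sketch in plain Node.js. It is an illustration of the concept only, not of the N|Solid policy itself: it uses Buffer.alloc and Buffer.allocUnsafe, which were added to Node.js after the 4.1.1 release, and the isAllZero helper is a name made up for this example.

```javascript
// Illustration only: compares zero-filled and non-zero-filled allocations
// using Buffer APIs added to Node.js after the 4.1.1 release.
'use strict';

const SIZE = 1024;

// Buffer.alloc() always returns zero-filled memory.
const zeroed = Buffer.alloc(SIZE);

// Buffer.allocUnsafe() skips the fill step, so the buffer may contain
// whatever bytes were previously in that memory.
const unfilled = Buffer.allocUnsafe(SIZE);

// Hypothetical helper: true when every byte in the buffer is zero.
function isAllZero(buf) {
  return buf.every((byte) => byte === 0);
}

console.log('Buffer.alloc is zero-filled:      ', isAllZero(zeroed));
console.log('Buffer.allocUnsafe is zero-filled:', isAllZero(unfilled), '(not guaranteed)');

// Filling by hand is the work a zero-fill policy would do for you.
unfilled.fill(0);
console.log('after fill(0):                    ', isAllZero(unfilled));
```

The second line of output can legitimately print either true or false, which is exactly the point: without zero-filling you can't rely on the contents.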

For more information on policies, see the N|Solid Policies documentation.

Besides policies, the other extra goody that N|Solid provides is an agent that you can enable to allow introspection of your Node.js processes. To enable the N|Solid Agent, you'll use the environment variables listed in the help text above.

For the purposes of the rest of this blog post, we'll just focus on interacting with a single Node.js process, and will just use the NSOLID_SOCKET environment variable. The NSOLID_HUB and NSOLID_APPNAME environment variables are used when interacting with multiple Node.js processes, via the N|Solid Hub.

The N|Solid Agent is enabled if the NSOLID_SOCKET environment variable is set, and is not enabled if the environment variable is not set.

Let's start a Node.js REPL with the N|Solid Agent enabled:

$ NSOLID_SOCKET=5000 nsolid
> 1+1 // just show that things are working
2
>

This command starts up the typical Node.js REPL, with the N|Solid Agent listening on port 5000. When the N|Solid Agent is enabled, you can interact with it using the N|Solid Command Line Interface (CLI), implemented as the nsolid-cli executable.

running `nsolid-cli` commands

Let's start with a ping command. Leave the REPL running, start a new terminal window, cd into your nsolid directory again, and set the PATH environment variable:

$ cd Downloads/nsolid
$ PATH=./bin:$PATH
$

Now let's send the ping command to the N|Solid Agent running in the REPL:

$ nsolid-cli --socket 5000 ping
"PONG"
$

In this case, we passed the --socket option on the command line, which indicates the N|Solid Agent port to connect to. And we told it to run the ping command. The response was the string "PONG".

The ping command just validates that the N|Solid Agent is actually running.

Let's try the system_stats command, with the REPL still running in the other window:

$ nsolid-cli --socket 5000 system_stats
{"freemem":2135748608,"uptime":2414371,"load_1m":1.17431640625,"load_5m":1.345703125,"load_15m":1.3447265625,"cpu_speed":2500}
$

The system_stats command provides some system-level statistics, such as amount of free memory (in bytes), system uptime, and load averages.

The output is a single line of JSON. To make the output more readable, you can pipe the output through the json command, available at npm:

$ nsolid-cli --socket 5000 system_stats | json
{
  "freemem": 1970876416,
  "uptime": 2414810,
  "load_1m": 1.34765625,
  "load_5m": 1.26611328125,
  "load_15m": 1.29052734375,
  "cpu_speed": 2500
}
$
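
If you'd rather post-process the output in Node.js than pipe it through json, the same thing can be sketched in a few lines. The input line is copied from the system_stats example above; the derived units assume that uptime is reported in seconds (as Node's os.uptime() does), which the output itself doesn't state.

```javascript
'use strict';

// One line of system_stats output, copied from the example above.
const line = '{"freemem":2135748608,"uptime":2414371,"load_1m":1.17431640625,"load_5m":1.345703125,"load_15m":1.3447265625,"cpu_speed":2500}';

const stats = JSON.parse(line);

// Pretty-print with two-space indentation, similar to the json command.
const pretty = JSON.stringify(stats, null, 2);
console.log(pretty);

// Derive friendlier units. Assumption: uptime is in seconds.
const summary = {
  freeMemMiB: Math.round(stats.freemem / (1024 * 1024)),
  uptimeDays: Math.floor(stats.uptime / (60 * 60 * 24)),
};
console.log(summary);
```

For real use you'd read the line from stdin or from a child process instead of hard-coding it.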

Another nsolid-cli command is process_stats, which provides some process-level statistics:

$ nsolid-cli --socket 5000 process_stats | json
{
  "uptime": 2225.215,
  "rss": 25767936,
  "heapTotal": 9296640,
  "heapUsed": 6144552,
  "active_requests": 0,
  "active_handles": 4,
  "user": "pmuellr",
  "title": "nsolid",
  "cpu": 0
}
$

The full list of commands you can use with nsolid-cli is available at the doc page N|Solid Command Line Interface (CLI).

generating a CPU profile

Let's try one more thing - generating a CPU profile. Here's a link to a sample program to run, that will keep your CPU busy: busy-web.js

This program is an HTTP server that issues an HTTP request to itself, every 10 milliseconds. It makes use of some of the new ES6 features available in Node.js 4.0, like template strings and arrow functions. Since the N|Solid Runtime is using the latest version of Node.js, you can make use of those features with N|Solid as well.

Let's run it with the agent enabled:

$ NSOLID_SOCKET=5000 nsolid busy-web
server listing at http://localhost:53011
send: 100 requests
recv: 100 requests
...

In another terminal window, run the profile_start command, wait a few seconds and run the profile_stop command, redirecting the output to the file busy-web.cpuprofile:

$ nsolid-cli --socket 5000 profile_start
{"started":1443108818350,"collecting":true}
... wait a few seconds ...
$ nsolid-cli --socket 5000 profile_stop > busy-web.cpuprofile

The file busy-web.cpuprofile can then be loaded into Chrome Dev Tools for analysis:

in Chrome, select the menu item View / Developer / Developer Tools
in the Developer Tools window, select the Profiles tab
click the "Load" button
select the busy-web.cpuprofile file
in the CPU PROFILES list on the left, select "busy-web"

For more information on using Chrome Dev Tools to analyze a CPU profile, see Google's Speed Up JavaScript Execution page.
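
You can also poke at a profile without Chrome. The sketch below is a rough example that totals sample hits per function name. It assumes the file is JSON in the tree-shaped layout that Chrome Dev Tools loaded at the time: a top-level head node, where each node has functionName, hitCount and children. Newer profile formats are laid out differently, so check your file first. The hitsByFunction name and the sample data are made up for illustration.

```javascript
'use strict';

// Walk the call tree and total the sample hits per function name.
function hitsByFunction(profile) {
  const totals = new Map();
  const pending = [profile.head];
  while (pending.length > 0) {
    const node = pending.pop();
    const name = node.functionName || '(anonymous)';
    totals.set(name, (totals.get(name) || 0) + (node.hitCount || 0));
    for (const child of node.children || []) pending.push(child);
  }
  // Sort descending by hit count.
  return Array.from(totals.entries()).sort((a, b) => b[1] - a[1]);
}

// Tiny hand-made profile (made-up data) in the assumed shape.
const sample = {
  head: {
    functionName: '(root)',
    hitCount: 0,
    children: [
      {
        functionName: 'onRequest',
        hitCount: 7,
        children: [{ functionName: 'sendRequest', hitCount: 12, children: [] }],
      },
      { functionName: '(idle)', hitCount: 3, children: [] },
    ],
  },
};

for (const [name, hits] of hitsByFunction(sample)) {
  console.log(`${hits}\t${name}`);
}
```

To try it on a real capture, replace sample with JSON.parse(require('fs').readFileSync('busy-web.cpuprofile', 'utf8')).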

Note that we didn't have to instrument our program with any special profiling packages - access to the V8 CPU profiler is baked right into N|Solid! About time someone did that, eh?

You can easily write a script to automate the creation of a CPU profile, where you add a sleep command to wait some number of seconds between the profile_start and profile_stop commands.

#!/bin/sh

echo "starting CPU profile"
nsolid-cli --socket 5000 profile_start

echo "waiting 5 seconds"
sleep 5

echo "writing profile to busy-web.cpuprofile"
nsolid-cli --socket 5000 profile_stop > busy-web.cpuprofile

Or instead of sleeping, if your app is an HTTP server, you can drive some traffic to it with Apache Bench (ab), by running something like this instead of the sleep command:

ab -n 1000 -c 100 http://localhost:3000/

generating heap snapshots

You can use the same technique to capture heap snapshots, using the snapshot command. The snapshot command produces output which should be redirected to a file with a .heapsnapshot extension:

$ nsolid-cli --socket 5000 snapshot > busy-web.heapsnapshot

You can then load those files in Chrome Dev Tools for analysis, the same way the CPU profiles are loaded.

For more information on using Chrome Dev Tools to analyze a heap snapshot, see Google's How to Record Heap Snapshots page.

more info

The full list of commands you can use with nsolid-cli is available at the doc page N|Solid Command Line Interface (CLI).

All of the documentation for N|Solid is available at the doc site N|Solid Documentation.

If you have any questions about N|Solid, feel free to post them at Stack Overflow, and add a tag nsolid.

Wednesday, May 13, 2015

resuscitating a 2006 MacBook

One bad assumption I made about leaving IBM is that I'd be able to get a new MacBook Pro quickly. Nope. 2-3 weeks. NO LAPTOP! HOW AM I SUPPOSED TO TAKE TIME OFF AND RELAX?

Poking through my shelves of computer junk, I spied my two old MacBooks. I forgot I had two of them.

The first was a G4 (PPC) iBook. That was the laptop I bought in 2003 to kick the tires on Apple hardware and OS X 10.2. I was hooked.

The second was an Intel Core Duo (x86) MacBook. Two 2GHz x86 cores, 2GB RAM, 120GB drive. I bought that in 2006, and remember it being a productive machine. Eventually replaced that with a line of MacBook Pros IBM provided me, up until this week.

Hmmm, is that MacBook still usable? Powered it up fine. Poking around, it seemed like constraints on hardware / OS levels were that I should try to get this box from OS X 10.5 to 10.6. Also - this being a 32-bit machine - some number of apps wouldn't run. Eg, Chrome.

Luckily I still had an OS X 10.6 DVD lying around in my piles of old stuff. Upgraded easily. So now I can run PathFinder, Sublime, iTerm, Firefox, and Skype. Can't run Chrome, Atom, or Twitter. Or io.js. Again, HOW AM I SUPPOSED TO TAKE TIME OFF AND RELAX?

The build requirements for io.js looked like something that I might be able to meet. Rolled up my sleeves.

First I wanted to update git. Current version was at 1.6. It was already complaining about git clone operations on occasion. So, first installed homebrew, then installed git 2.4.

That was easy. Now to install gcc/g++ - they were not already on the machine, and the latest stable at brew is 4.9.2. After a long time, it installed that fine. But it doesn't override the built-in gcc/g++. Instead, it provides gcc-4.9 and similar named tools for that version of the gcc toolchain.

To get the iojs Makefile to use these instead of the built-in gcc tools, I set env vars in my .bash_profile:

export CC=gcc-4.9
export CXX=g++-4.9
export AR=gcc-ar-4.9
export NM=gcc-nm-4.9
export RANLIB=gcc-ranlib-4.9

Ready to build iojs! Ran ./configure - it completed successfully but was a little complain-y about some unrelated looking things. Ran make. And then fell asleep. Woke up, it had completed successfully, so ran a little test and ... WORKED! Finally ran sudo make install and NOW IT IS EVERYWHERE.

Got some test failures, but they may be environmental. Tried some typical workflow stuff and things seem fine.

Not a dream machine, by any means. Slow. Constantly watching memory usage and quitting apps as appropriate. And the fan is quite noisy. And the display seems like it doesn't have long to live (display adapter problems are my lot in life as a MacBook owner).

But kinda fun.

leaving IBM, taking a break, going somewhere else

This last Monday - May 11, 2015 - was my last day as an IBMer.

I had a great run at IBM. Worked on a lot of great, diverse projects with great people.

But I'm ready for a change. I've been planning on retiring this year, for a while now, but it kinda snuck up on me. Mid-life crisis? Maybe. Anyway, it's time!

First, a breather.

Then, looking forward to starting my next adventure(s) in software development.

I'm talking to some potential employers right now, and would like to talk to more. Contact info is in my resume, linked to below.

In terms of what I'm looking for, I'd like to continue working in the node.js environment. Still having a lot of fun there. My favorite subject matter is working on tools for developers. But I'm handy - and interested - in lots of stuff.

I live in the Raleigh/NC area, and can't relocate for a few years. I'm quite comfortable working remote, or local.

My resume is available here:

http://muellerware.org/resume/Patrick-Mueller-Resume.html

Thursday, March 26, 2015

wiki: something old, something new

Back in 1995, Ward Cunningham - a fellow BoilerMaker - created a program called WikiWikiWeb, which created a new species of web content we now know as the "wiki".

My colleague Rick DeNatale showed it to me, and I was hooked. So hooked, I wrote a clone in REXX for OS/2, which I've sadly been unable to find. It's funny to see the names of some other friends on the list of initial visitors.

Since then, I've always had some manner of wiki in my life. We still use wikis to document project information in IBM, I use Wikipedia all the time for ... whatever ..., and occasionally use the free wiki service that GitHub provides for projects there. I think some credit also goes to Ward for helping push the "simplified html markup" story (eg, Markdown), to make editing and creating web content more approachable to humans.

Ward started a new project a few years back, Smallest Federated Wiki, to start exploring some interesting aspects of the wiki space - federating wikis, providing plugin support, multi-page views, rich change history. It's fascinating.

I've had in the back of my mind to get Federated Wiki to run on Bluemix for a while, and it seemed like an appropriate time to make something publishable. So, I created the cf-fed-wiki project, which will let you easily deploy a version of Federated Wiki on Cloud Foundry (and thus Bluemix), and is also set up to use with the new easy-to-deploy Bluemix Button (below).

There are still some rough spots, but it seems workable, or at least something to play with.

The best way to learn about Federated Wiki is to watch Ward's videos.

Enjoy, and thanks for the wiki, Ward!

Friday, February 13, 2015

having your cake and cakex'ing it too

As a software developer, I care deeply about my build tools. I strive to provide build scripts that are easy to use and understand with my projects, so that other folks can rebuild a project I'm working on, as simply as possible. Often one of those other folks is me, coming back to a project after being away from it for months, or years.

I cut my teeth on make. So awesome, for its time. Even today, all kinds of wonderful things about it. But there are problems. Biggest is ensuring Makefile compatibility across all the versions of make you might encounter. I've used make on OS/2, Windows, AIX, Linux, Mac, and other Unix-y platforms. The old version of Windows nmake was very squirrelly to deal with, and if you had to build a Makefile that could run on Windows AND anywhere else, well best of luck, mate! make is also a bit perl-ish with all the crazy syntax-y things you can do. It's driven me to near-insanity at times.

I was also a big user of Ant. I don't do Java anymore, and would prefer to not do XML either, but I do still build weinre with Ant.

Since I'm primarily doing node.js stuff these days, it makes a lot of sense to use a build tool implemented on node.js; I've already got the runtime for that installed, guarantee. In the past, I've used jake with the Apache Cordova project, and have flirted with the newest build darlings of the node.js world, grunt and gulp.

I tend towards simpler tools, and so am happier at the jake level of simple-ness compared to the whole new way of thinking required to use grunt or gulp, which also have their own sub-ecosystem of plugins to gaze upon.

Of course, there are never enough build tools out there, so I've also built some myself: jbuild. I've been using jbuild for projects since I built it, and have been quite happy with it. But, it's my own tool, and I don't really want to own a tool like this. The interesting thing about jbuild was the additional base level functionality it provided to the actual code you'd write in your build scripts, and not the way tasks were defined and what not.

As a little experiment, I've pulled that functionality out of jbuild and packaged it up as something you can use easily with cake. cake is one of the simplest tools out there, in terms of what it provides, and lets me write in CoffeeScript, which is closer to the "shell scripting" experience with make (which is awesome) compared to most other build tools.

Those extensions are in the cakex package available via npm.

why cakex

shelljs functions built in as globals. So I can do things like
```
mkdir "-p", "tmp"
rm   "-rf", "tmp"
```
(that's CoffeeScript) right in my Cakefile. Compare to how you'd do that in Ant. heh.

scripts in node_modules/.bin added as global functions that invoke those scripts. Hat tip, npm run. So I can do things like

opts = """
  --outfile    tmp/#{oBase}
  --standalone ragents
  --entry      lib/ragents.js
  --debug
"""

opts = opts.trim().split(/\s+/).join(" ")

log "running browserify ..."
browserify opts

functions acting as watchers, and server recyclers. I always fashion build scripts so that they have a watch task, which does a build, runs tests, restarts the server that's implemented in the project, etc. So that when I save a file in my text editor, the build/test/server restart happens all the time. These are hard little things to get right; I know, I've been trying for years to get them right. Here's an example usage:

taskWatch = ->
  watchIter()          # run watchIter when starting

  watch
    files: sourceFiles # watch sourceFiles for changes
    run:   watchIter   # run watchIter on changes

watchIter = ->
  taskBuild()          # run the build
  taskServe()          # run the server

taskServe = ->
  log "restarting server at #{new Date()}"

  # starts / restarts the server, whichever is needed
  daemon.start "test server", "node", ["lib/server"]
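
To give a feel for what a "server recycler" has to do, here's a minimal sketch in plain JavaScript. It is not the cakex implementation; createDaemon and its spawnFn parameter are hypothetical names invented for this example. The idea is just to remember one child process per name, and kill the old one before starting a new one.

```javascript
'use strict';

const { spawn } = require('child_process');

// Hypothetical helper, not the cakex implementation: remembers one child
// process per name and replaces it on every start() call.
// spawnFn is injectable so the logic can be exercised without real processes.
function createDaemon(spawnFn = spawn) {
  const running = new Map(); // daemon name -> child process

  return {
    start(name, command, args) {
      const previous = running.get(name);
      if (previous) {
        console.log(`stopping ${name}`);
        previous.kill(); // sends SIGTERM by default
      }
      console.log(`starting ${name}`);
      const child = spawnFn(command, args, { stdio: 'inherit' });
      running.set(name, child);
      return child;
    },
  };
}

// Usage, mirroring the Cakefile call above:
//   const daemon = createDaemon();
//   daemon.start('test server', 'node', ['lib/server']);
```

Note what this sketch skips: it doesn't wait for the old process to actually exit before starting the new one, so a server that binds a fixed port could fail to restart. Handling cases like that is a big part of why these are hard to get right.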

The cakex npm page - https://www.npmjs.com/package/cakex - includes a complete script that is the kind of thing I have in all my projects, so you can take in the complete experience. I love it.

Coupla last things:

yup, globals freaking everywhere. It's awesome.
I assume this would be useful with other node.js-based build tools, but wouldn't surprise me if the "globals everywhere" strategy causes problems with other tools.
I'm using gaze to watch files, but it appears to have a bug where single file patterns end up matching too many things; hence having to do extra checks when you're watching a single file.
I've wrestled with the demon that are the daemon functions in cakex for a long time. Never completely happy with any story there, but it's usually possible to add enough hacks to keep things limping. Wouldn't be surprised if I have to re-architect the innards there, again, but hopefully the API can remain the same.
Please also check the section in the README titled "integration with npm start", for what I believe to be a best practice of including all your build tools as dependencies in your package, instead of relying on globally installed tools. For node.js build tools anyway.

Thursday, January 01, 2015

ragents against the machine

A few years back, I developed a tool called web inspector remote - aka weinre. It's an interesting hack where I took the WebKit Web Inspector UI code, ran it in a plain old (WebKit-based) browser, and hooked things up so it could debug a web browser session running somewhere else. I was specifically targeting mobile web browsers and applications using web browser views, but there wasn't much (any?) mobile-specific code in weinre.

(btw, I currently run a publicly accessible weinre server at Bluemix)

One of the interesting aspects of weinre, is the way the browser running the debugger user interface (the client) communicates with the browser being debugged (the target). The WebKit Web Inspector code was originally designed so that the client would connect to the target in a traditional client/server pattern. But on mobile devices, or within a web browser environment, it's very difficult to run any kind of a traditional "server" - tcp, http, or otherwise.

So the trick to make this work with weinre, is to have both the client and target connect as HTTP clients to an HTTP server that shuttles messages between the two. That HTTP server is the weinre server. It's basically just a message switchboard between debug clients and targets.

This switchboard server pattern is neat, because it allows two programs to interact with each other, where neither has to be an actual "server" of any kind (tcp, http, etc).

And this turns out to be a useful property in other environments as well. For instance, dealing with server farms running "in the cloud". Imagine you're running some kind of web application server, and decide to run 5 instances of your server to handle the load. And now you want to "debug" the server. How do you connect to it? Presumably the 5 instances are running behind one ip address - which server instance are you going to connect to? How could you connect to all of them at once?

Instead of using the typical client/server pattern for debugging in this environment, you can use the switchboard server pattern, and have each web application server instance connect to the switchboard server itself, and then have a single debug client which can communicate with all of the web application server instances.
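
Here's a small in-memory sketch of that switchboard idea, to make the shape concrete. It's a toy under stated assumptions, not weinre's code: the Switchboard class, the agent names, and the "report" message are all invented for this example, and a real switchboard would sit behind HTTP or WebSocket connections rather than function calls.

```javascript
'use strict';

// In-memory stand-in for a switchboard server. In a real deployment each
// connect() call would be an outbound HTTP or WebSocket connection.
class Switchboard {
  constructor() {
    this.agents = new Map(); // agent name -> message handler
  }

  // An agent "dials in" and gets back a function for sending messages.
  connect(name, onMessage) {
    this.agents.set(name, onMessage);
    return (to, body) => this.route(name, to, body);
  }

  // Forward a message to one agent, or to everyone else when to === '*'.
  route(from, to, body) {
    for (const [name, handler] of this.agents) {
      if (name === from) continue;
      if (to === '*' || to === name) handler({ from, body });
    }
  }
}

const switchboard = new Switchboard();
const replies = [];

// Five application server instances; none of them listens for debug connections.
for (let i = 1; i <= 5; i++) {
  const name = `app-${i}`;
  const send = switchboard.connect(name, (msg) => {
    if (msg.body === 'report') send(msg.from, `${name} is alive`);
  });
}

// One debug client, which is also just a client of the switchboard.
const sendFromDebugger = switchboard.connect('debugger', (msg) => replies.push(msg.body));

sendFromDebugger('*', 'report');
console.log(replies);
```

Every participant only ever connects outward to the switchboard, which is why the single debug client can reach all five instances at once.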

I've wanted to extract the switchboard server pattern code out of weinre for a while now, and ... now's the time. I anticipate being able to define a generic switchboard server, and messaging protocol that can be used to allow multiple, independent, orthogonal tools to communicate with each other, where the tools are running in various environments. The focus here isn't specifically traditional graphical step debugging, but diagnostic tooling in general. Think loggers, REPLs, and other sorts of development tools.

One change I've made as part of the extraction is terminology. In weinre, there were "clients" and "targets" that had different capabilities. But there really doesn't need to be a distinction between the two, in terms of naming or functionality. Instead, I'm referring to these programs as "agents".

And sometimes agents will be communicating with agents running on other computers - remote agents - "ragents" - hey, that would be a cool name for a project!

Another thing I'm copying from the Web Inspector work is the messaging protocol. At the highest level, there are two types of messages that agents can send - request/response pairs, and events.

A request message can be sent from one agent to another, and then a response message is sent the opposite direction. Like RPC, or an HTTP request/response.
An event message is like a pub/sub message; multiple agents can listen for events, and then when an agent posts an event message, it's sent to every agent that is listening.

You can see concrete examples of these request/response and event messages in the V8 debugger protocol and the Chrome Remote Debugging Protocol.

This pattern makes it very easy to support things like having multiple debugger clients connect to a single debug target; all state change in the target gets emitted as events so every connected client will see the change. But you can also send specific requests from the client to the target, and only the client will see the responses.

For example, one debugger client might send a request "step" to have the debugger execute the next statement then pause. The response for this might be just a success/failure indicator. A set of events would end up being posted that the program started, and then stopped again (on the next statement). That way, every debugger client connected would see the effects of the "step" request.
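
The "step" scenario can be sketched in a few dozen lines of in-memory JavaScript. To be clear, this is a hypothetical illustration and not the ragents API or wire format: the Hub and Agent classes, the message field names, the "resumed" and "paused" event names, and line 42 are all made up for the example.

```javascript
'use strict';

// Hypothetical message shapes, not an actual wire format:
//   { type: 'request',  id, from, to, name, body }
//   { type: 'response', id, from, to, body }
//   { type: 'event',    from, name, body }
class Hub {
  constructor() {
    this.agents = new Map(); // agent name -> Agent
    this.nextId = 1;         // source of request ids
  }

  createAgent(name) {
    const agent = new Agent(this, name);
    this.agents.set(name, agent);
    return agent;
  }

  deliver(message) {
    if (message.type === 'event') {
      // Events are offered to every agent; only listeners react.
      for (const agent of this.agents.values()) agent.receiveEvent(message);
    } else {
      // Requests and responses go to exactly one agent.
      this.agents.get(message.to).receive(message);
    }
  }
}

class Agent {
  constructor(hub, name) {
    this.hub = hub;
    this.name = name;
    this.handlers = new Map();  // request name -> handler
    this.listeners = new Map(); // event name -> listener
    this.pending = new Map();   // request id -> response callback
  }

  onRequest(name, handler) { this.handlers.set(name, handler); }
  onEvent(name, listener) { this.listeners.set(name, listener); }

  request(to, name, body, callback) {
    const id = this.hub.nextId++;
    this.pending.set(id, callback);
    this.hub.deliver({ type: 'request', id, from: this.name, to, name, body });
  }

  emit(name, body) {
    this.hub.deliver({ type: 'event', from: this.name, name, body });
  }

  receive(message) {
    if (message.type === 'request') {
      // Run the handler, then send its result back to the requester only.
      const body = this.handlers.get(message.name)(message.body);
      this.hub.deliver({ type: 'response', id: message.id, from: this.name, to: message.from, body });
    } else {
      // A response: hand it to whoever made the matching request.
      const callback = this.pending.get(message.id);
      this.pending.delete(message.id);
      callback(message.body);
    }
  }

  receiveEvent(message) {
    const listener = this.listeners.get(message.name);
    if (listener) listener(message.body);
  }
}

const hub = new Hub();
const target = hub.createAgent('target');
const clientA = hub.createAgent('client-a');
const clientB = hub.createAgent('client-b');
const log = [];

// The target handles "step" by emitting state-change events, then replying.
target.onRequest('step', () => {
  target.emit('resumed', {});
  target.emit('paused', { line: 42 });
  return { success: true };
});

for (const client of [clientA, clientB]) {
  client.onEvent('resumed', () => log.push(`${client.name} saw resumed`));
  client.onEvent('paused', (body) => log.push(`${client.name} saw paused at line ${body.line}`));
}

clientA.request('target', 'step', {}, (body) => log.push(`client-a got response success=${body.success}`));
console.log(log.join('\n'));
```

Only client-a gets the response, but both clients see the resumed and paused events, so their views of the target stay in sync.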

Turns out, you really need to support multiple clients connected simultaneously if your clients are running in a web browser. Because if you only support a single client connection to your program, what happens when the user can't find their browser tab running the client? The web doesn't want to work with singleton connections.

I'm going to be working on the base libraries and server first, and then build some diagnostic tools that make use of them.

Stay tuned!
