01 Jun 2010

Asynchronous code in node.js

Writing an async library has almost become a rite of passage for node developers and I really don't want to add to this myriad of already great modules. Really. However (you saw that coming!), when I'm using node, I want to stay fairly close to the vanilla async implementation. When writing my own modules I try to follow the convention of using a single callback with an optional error as the first argument.

For people not familiar with the various async implementations, below is an example function implemented in a few of the options available.

Single callback

As used in the standard node modules:

// defining a function
var async_function = function(val, callback){
    process.nextTick(function(){
        callback(val);
    });
};

// using the function
async_function(true, function(val){
    // val == true
});
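
The example above passes only a value to keep things short. With the optional error as the first argument, the same convention might look like the following minimal sketch. Note that safe_divide is a made-up function for illustration, not something provided by node:

```javascript
// Hypothetical example function: reports failure through the first
// callback argument and success through the second.
var safe_divide = function(a, b, callback){
    process.nextTick(function(){
        if(b === 0){
            // failure: pass an Error as the first argument
            return callback(new Error('division by zero'));
        }
        // success: the error slot is null, the result follows
        callback(null, a / b);
    });
};

// using the function
safe_divide(10, 2, function(err, result){
    if(err) return console.error(err.message);
    // result == 5
});
```

Checking the error first and returning early keeps the success path unindented, which is why the error sits in the first position rather than the last.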

Continuations

As provided by the node-continuables module by bentomas:

var continuables = require('continuables');

// defining a function
var async_function = function(val){
    var continuable = continuables.create();
    process.nextTick(function(){
        continuable.fulfill(val);
    });
    return continuable;
};

// using the function
async_function(true)
(function(val) {
    // val == true
});

Promises

As provided by the node-promise module by kriszyp:

var Promise = require("promise").Promise;

// defining a function
var async_function = function(val){
    var promise = new Promise();
    process.nextTick(function(){
        promise.resolve(val);
    });
    return promise;
};

// using the function
var promise = async_function(true);
promise.then(function(val){
    // val == true
},
function(error){
    // executed if the promise fails
});

While these new styles provide some interesting capabilities, sticking to convention makes the API easier to understand, and allows people to wrap the module with other methods of handling async code if they so wish. Because of this, I've avoided using these existing async modules in favour of the single callback style used throughout node. However, I've found myself repeating a number of patterns, so I've decided to abstract the more common ones into a separate module. And so, the async module was born! Aha! I hear you say. Now you too are implementing a new way of doing async! …Well, not quite.

What I've ended up with are a few higher-order functions that operate on async code using the convention of a single callback. This includes the usual 'functional' suspects (map, reduce, filter, forEach…) as well as some common patterns for running blocks of async code (parallel, series, waterfall, auto…). This is not an entirely new idea. Perhaps the closest existing module is Do by creationix. However, Do operates on continuations, not on functions using the standard callbacks. This means you have to wrap any functions using the conventional style before using them with Do:

var Do = require('do');
// Convert `readFile` from fs to use continuable style.
var fs = Do.convert(require('fs'), ['readFile']);
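
To give a rough idea of what such a conversion involves, here is a minimal sketch of a wrapper that turns a conventional callback-style function into one returning a continuable. This is not Do's actual implementation: to_continuable and double_later are hypothetical names, and the (callback, errback) calling shape of the continuable is an assumption made for the example.

```javascript
// Hypothetical helper, not part of Do: wraps a function whose last
// argument is a callback(err, result) so it returns a continuable.
var to_continuable = function(fn){
    return function(){
        var args = Array.prototype.slice.call(arguments);
        // the continuable: call it with a callback and an optional errback
        return function(callback, errback){
            fn.apply(null, args.concat(function(err, result){
                if(err){
                    if(errback) errback(err);
                    return;
                }
                callback(result);
            }));
        };
    };
};

// a made-up function using the conventional single callback style
var double_later = function(n, callback){
    process.nextTick(function(){
        callback(null, n * 2);
    });
};

// using the converted function
var double_c = to_continuable(double_later);
double_c(21)(function(result){
    // result == 42
});
```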

So, in case you didn't get it already: the async module is not an attempt to replace the standard callback mechanism in node. It is designed to work as seamlessly as possible with the existing node modules by working with functions using the single callback style.

I think I have now compiled the most comprehensive set of features for working with async functions in node. It includes many of the ideas from other modules, such as Do, and wouldn't have been possible without the great work that's gone into them.

If you're interested, you can check out the code on GitHub. Alternatively, see below for example usage and a quick explanation of the functions available.

Example

Just for fun, here is an example of using one of the higher-order async functions. This example tests a list of filenames and reports any files that already exist.

With the async module

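var async = require('async'),
    path = require('path'),
    sys = require('sys');

var files = ['file1', 'file2', 'file3'];

async.filter(files, path.exists, function(results){
    // an empty array is truthy, so test its length instead
    if(results.length) sys.puts('The following files already exist: ' + results);
});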

Without the async module

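var path = require('path'),
    sys = require('sys');

var files = ['file1', 'file2', 'file3'],
    results = [],
    completed = 0;

files.forEach(function(f){
    path.exists(f, function(exists){
        if(exists) results.push(f);
        completed++;
        // only report once every check has called back
        if(completed == files.length){
            // an empty array is truthy, so test its length instead
            if(results.length){
                sys.puts('The following files already exist: ' + results);
            }
        }
    });
});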

API

Collections

  • forEach (forEachSeries) - Applies an async iterator to each item in an array.
  • map (mapSeries) - Produces a new array of values by mapping each value in the given array through an async iterator function.
  • filter (filterSeries) - Returns a new array of all the values which pass an async truth test.
  • reduce - Reduces a list of values into a single value using an async iterator to return each successive step.
  • some - Returns true if at least one element in the array satisfies an async test.
  • every - Returns true if every element in the array satisfies an async test.
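
To make the collection functions a little more concrete, a parallel map in the single callback style could be sketched roughly as follows. This is a stand-in written for illustration, not the module's source: map_sketch and double_later are hypothetical names, and the iterator(item, callback) and callback(err, value) signatures are assumptions.

```javascript
// Hypothetical stand-in for a parallel async map. Assumes the iterator is
// called as iterator(item, callback) and reports via callback(err, value).
var map_sketch = function(arr, iterator, callback){
    var results = [],
        completed = 0,
        failed = false;

    // nothing to do for an empty array
    if(!arr.length) return callback(null, results);

    arr.forEach(function(item, i){
        iterator(item, function(err, value){
            if(failed) return;      // an error was already reported
            if(err){
                failed = true;
                return callback(err);
            }
            results[i] = value;     // store by index to keep the input order
            completed++;
            if(completed === arr.length) callback(null, results);
        });
    });
};

// a made-up async iterator
var double_later = function(n, callback){
    process.nextTick(function(){
        callback(null, n * 2);
    });
};

// using the sketch
map_sketch([1, 2, 3], double_later, function(err, results){
    // results is [2, 4, 6]
});
```

Storing each value by index rather than pushing it means the output order matches the input order even when the iterators finish out of order.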

Flow Control

  • series - Runs an array of functions in series, each one running once the previous function has completed.
  • parallel - Runs an array of functions in parallel, without waiting until the previous function has completed.
  • waterfall - Runs an array of functions in series, each passing their results to the next in the array.
  • auto - Determines the best order for running functions based on their requirements.
  • iterator - Creates an iterator function which calls the next function in the array, returning a continuation to call the next one after that.
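
As an illustration of the flow control idea, here is a minimal sketch of how a waterfall could work. It is a stand-in for explanation only, not the module's implementation: waterfall_sketch is a hypothetical name, and it assumes each task receives the previous task's results followed by a callback that takes an error first.

```javascript
// Hypothetical stand-in for waterfall. Assumes each task receives the
// previous task's results followed by a callback(err, result1, ...).
var waterfall_sketch = function(tasks, callback){
    var run = function(index, args){
        // all tasks done: hand the last results to the final callback
        if(index === tasks.length){
            return callback.apply(null, [null].concat(args));
        }
        var next = function(err){
            // stop at the first error and skip the remaining tasks
            if(err) return callback(err);
            // everything after err is passed to the next task
            run(index + 1, Array.prototype.slice.call(arguments, 1));
        };
        tasks[index].apply(null, args.concat(next));
    };
    run(0, []);
};

// using the sketch
waterfall_sketch([
    function(callback){
        process.nextTick(function(){ callback(null, 2, 3); });
    },
    function(a, b, callback){
        process.nextTick(function(){ callback(null, a * b); });
    }
], function(err, result){
    // result == 6
});
```

A series runner would follow the same shape, except that each task would receive only the callback and the results would be collected into an array instead of being forwarded.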

Get the code!