Improve Node.js Performance by Turning It into a Clusterfork

David Elton · JavaScript, Node.js, Technology Snapshot

One of the big reasons people are drawn to Node.js for running web servers is the simplicity you gain from a single-threaded paradigm, versus having to deal with the challenges of threaded programming – race conditions, deadlocks, etc.

However, one drawback of running a web server on a single thread is that it also runs on a single core. With nearly all production servers now operating on multiple cores, this can be a big waste of available resources. Node.js comes prepackaged with a clustering API that is very easy to use and helps take advantage of those extra cores.

When you turn your Node.js server into a cluster and spread it across multiple processes, you also gain some added reliability. With only a single process running, it is common practice to use a utility like forever to restart processes that have exited for unexpected reasons (uncaught errors, for example). While the application is crashing and restarting, your server is unable to serve requests. Clustering helps solve this problem: the “master” process knows when one of its child processes has failed, and will automatically route future requests to the other children while the failed process is restarted.

Adding Clustering to Your Existing Node.js Server

Fortunately, it is easy to add clustering to an existing Node.js server application. Let’s start with a very basic single process Express.js application:

var express = require('express');
var app = express();

app.get('/', function(req, res) {
  res.send('Hello World!');
});

var server = app.listen(3000, function() {
  console.log('Server started on port 3000');
});

The following commands can be used to start the server, assuming the above is named server.js:

npm install express
node server

Note: the npm install express command only needs to be run once, and Ctrl+C stops the server. Running node via sudo should not be necessary here: on *nix machines, only ports below 1024 require elevated privileges, and this server listens on port 3000.

This very basic sample starts a web server on port 3000 and responds with “Hello World!” upon visiting http://localhost:3000.

The idea behind clustering in Node.js is pretty simple. When the application is first started, it considers itself the master process. The master process then creates one or more child processes. Typically, the number of child processes started is equal to the number of CPUs installed in the system, but this is entirely up to you.

The master process itself ideally should not be a server – it should only be responsible for creating and maintaining the child processes. This is because if the master process fails, all of its children are also shut down.

In the following example, our original application is modified to start the same number of web servers as there are CPUs (no additional external npm packages are necessary for the rest of the examples):

var cluster = require('cluster');

if (cluster.isMaster) {
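  // the master does not serve requests itself; its only job here is
  // to fork one worker per logical CPU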
  var numCPUs = require('os').cpus().length;

  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  var express = require('express');
  var app = express();

  app.get('/', function(req, res) {
    res.send('Hello World!');
  });

  var server = app.listen(3000, function() {
    console.log('Server started on port 3000');
  });
}

Upon restarting your application, you should notice the console now says “Server started on port 3000” once for each CPU in your system. The cluster module distributes incoming connections among the workers for you (round-robin by default on most platforms). Neat!
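
If you would like to see that distribution for yourself, one option is to tag each response with the worker that produced it. Here is a minimal sketch (the /whoami route is my own addition, placed inside the worker branch of the example; cluster.worker is only defined inside worker processes):

  app.get('/whoami', function(req, res) {
    // cluster.worker is undefined in the master, so this route only
    // makes sense inside the worker branch
    res.send('Served by worker ' + cluster.worker.id +
      ' (pid ' + process.pid + ')');
  });

Hitting this route repeatedly should show requests being spread across different worker IDs.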

Handling Unexpected Errors and Other Bad Things

As much as we may try to handle all error scenarios, in large, real-world applications something is likely to explode. Let’s add a new route to our application that causes an uncaught exception:

var cluster = require('cluster');

if (cluster.isMaster) {
  var numCPUs = require('os').cpus().length;

  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  var express = require('express');
  var app = express();

  app.get('/', function(req, res) {
    res.send('Hello World!');
  });

  app.get('/explode', function(req, res) {
    setTimeout(function() {
      res.send(this.wont.go.over.well);
    }, 1);
  });

  var server = app.listen(3000, function() {
    console.log('Server started on port 3000');
  });
}

Now, visiting http://localhost:3000/explode will crash one of the child processes, and you will see a stack trace in your console. The application is still running, but if you load the explode page enough times, you will have killed all of the child processes, and the application will exit.

Note: If you’re wondering why the error-causing code is inside a setTimeout, it’s because Express.js wraps route handlers in a try/catch and gracefully recovers from synchronous errors. It cannot catch errors thrown inside asynchronous callbacks, though, because by the time the callback runs, the handler has already returned and that try/catch is no longer on the stack.
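
For contrast, the same error thrown synchronously would be caught by Express and turned into an HTTP 500 response instead of killing the worker. A quick sketch (the /explode-sync route is just for illustration):

  app.get('/explode-sync', function(req, res) {
    // Express wraps route handlers in try/catch, so this synchronous
    // throw becomes a 500 response rather than an uncaught exception
    throw new Error('caught and handled by Express');
  });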

It is very simple to detect failed processes and to restart them. To add this functionality, we only have to add a small snippet of code:

var cluster = require('cluster');

if (cluster.isMaster) {
  var numCPUs = require('os').cpus().length;

  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

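  // the 'exit' event fires whenever a worker dies; the handler also
  // receives (worker, code, signal) arguments if more detail is needed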
  cluster.on('exit', function() {
    console.log('A worker process died, restarting...');
    cluster.fork();
  });
} else {
  var express = require('express');
  var app = express();

  app.get('/', function(req, res) {
    res.send('Hello World!');
  });

  app.get('/explode', function(req, res) {
    setTimeout(function() {
      res.send(this.wont.go.over.well);
    }, 1);
  });

  var server = app.listen(3000, function() {
    console.log('Server started on port 3000');
  });
}

Now, no matter how many times you explode the application, it continues to run. Also neat!

Measuring Performance Gains

There are many tools out there for measuring server performance, but one very simple and free tool that I like to use is siege (builds are available for Windows as well).

The most obvious metric to measure is server response time, which is what I’ll measure here. During a critical deployment, though, it is probably also a good idea to monitor the increased demands on memory, file I/O, etc. that come with adding clustering to your solution.

Going back to the very first sample, which runs a single process: on my i7 with eight logical cores, running the command

siege -b -t20s http://localhost:3000

results in:

Transactions:                  44194 hits
Availability:                 100.00 %
Elapsed time:                  19.21 secs
Data transferred:               0.51 MB
Response time:                  0.01 secs
Transaction rate:            2300.33 trans/sec
Throughput:                     0.03 MB/sec
Concurrency:                   14.45
Successful transactions:       44194
Failed transactions:               0
Longest transaction:            0.11
Shortest transaction:           0.00

Running the same command on our second example with clustering added in gives us this:

Transactions:                  48046 hits
Availability:                 100.00 %
Elapsed time:                  19.37 secs
Data transferred:               0.55 MB
Response time:                  0.00 secs
Transaction rate:            2480.82 trans/sec
Throughput:                     0.03 MB/sec
Concurrency:                    5.73
Successful transactions:       48059
Failed transactions:               0
Longest transaction:            0.03
Shortest transaction:           0.00

A roughly 8.7% gain in performance. Not too impressive considering we jumped from one process to eight, but the application is so simple that it makes a very poor test case. The more CPU-bound our application becomes, the greater the benefit.

Let’s change our Hello World! route to contain some wasteful CPU work:

  app.get('/', function(req, res) {
    for (var a = 0; a < 999999; a++) {
      // this is pretty wasteful
    }
    res.send('Hello World!');
  });

Our siege command on a single process now results in:

Transactions:                  15932 hits
Availability:                 100.00 %
Elapsed time:                  19.41 secs
Data transferred:               0.18 MB
Response time:                  0.02 secs
Transaction rate:             820.64 trans/sec
Throughput:                     0.01 MB/sec
Concurrency:                   14.79
Successful transactions:       15932
Failed transactions:               0
Longest transaction:            0.03
Shortest transaction:           0.01

And run again on our clustered example:

Transactions:                  34479 hits
Availability:                 100.00 %
Elapsed time:                  19.38 secs
Data transferred:               0.39 MB
Response time:                  0.00 secs
Transaction rate:            1779.38 trans/sec
Throughput:                     0.02 MB/sec
Concurrency:                    7.93
Successful transactions:       34489
Failed transactions:               0
Longest transaction:            0.07
Shortest transaction:           0.00

A 116% increase in performance! Node.js is not an ideal choice for CPU-bound solutions (so this is not a very realistic test either), and real-world results should fall somewhere within the giant gulf between the two results presented here. If you have a lot of CPU-intensive work to do, a better solution might be to hand the work off to another physical machine via HTTP or redis pub/sub, but that’s a whole different topic.

One other note: when testing a real-world application, it is probably wise to run the siege command from multiple client machines against the server, instead of on the same machine that runs the server, as I have done.

Additional Information

The Node.js documentation on clustering is pretty thorough and rather easy to read. I would suggest reading through it if you are interested in clustering, as you will find that there are many more features available than I have presented. For example, it is possible for processes to communicate with each other via events provided by the cluster module.
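
As a small taste of that, here is a rough sketch of message passing between the master and a worker. The message shapes here are made up for illustration; worker.send(), worker.on('message'), process.send(), and process.on('message') are the actual cluster APIs:

var cluster = require('cluster');

if (cluster.isMaster) {
  var worker = cluster.fork();

  // the master can listen for messages from a specific worker...
  worker.on('message', function(msg) {
    console.log('Master received:', msg);
  });

  // ...and send messages down to it
  worker.send({ cmd: 'ping' });
} else {
  // inside a worker, process.send() and process.on('message')
  // communicate with the master
  process.on('message', function(msg) {
    if (msg.cmd === 'ping') {
      process.send({ reply: 'pong from worker ' + cluster.worker.id });
    }
  });
}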

One shortcoming of the examples I have provided is that while the processes do restart, they do not manage to send back an HTTP 500 response before exiting. One solution would be to handle errors via Node’s global uncaughtException handler; however, this is a discouraged practice. Instead, the Node documentation recommends another core module, domain.

Even though domains can be used as a catch-all, it is still recommended to exit your child process upon unexpected errors. Regardless of whether you choose to use domains, I would strongly encourage reading and understanding the reasoning behind the section of the domain documentation entitled “Warning: Don’t Ignore Errors!”
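
To give an idea of what that pattern looks like, here is a rough sketch (adapted from the approach shown in the Node documentation, and not production-ready) of Express middleware for the worker branch that wraps each request in a domain, attempts to send a 500, and then lets the master replace the worker. Registering it before the routes ensures every request is covered:

  var domain = require('domain');

  app.use(function(req, res, next) {
    var d = domain.create();
    d.on('error', function(err) {
      console.error('Request failed:', err.stack);
      // try to send a 500 before going down
      res.statusCode = 500;
      res.end('Internal server error');
      // stop accepting connections and exit gracefully; the master's
      // 'exit' handler from earlier will fork a replacement
      cluster.worker.disconnect();
    });
    // route req/res errors into this domain, then run the rest of
    // the middleware chain inside it
    d.add(req);
    d.add(res);
    d.run(next);
  });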

One Last Thing…

One thing I had to try, and was pleasantly surprised by, was whether changes to your server code would be picked up by subsequent calls to cluster.fork(). It turns out they are.

What this means is that if you want to upgrade your server’s code while maintaining 100% uptime, you could introduce a mechanism into your master process that shuts the workers down and brings them back up one by one with the new code. I haven’t actually implemented this myself, but it sounds cool in theory.
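
Here is a rough sketch of what such a mechanism might look like in the master (the restartWorkers name is my own; note that the auto-restart 'exit' handler from earlier would also fork a replacement, so real code would need to reconcile the two):

function restartWorkers() {
  var workerIds = Object.keys(cluster.workers);

  function restartNext(i) {
    if (i >= workerIds.length) return;

    var worker = cluster.workers[workerIds[i]];
    worker.on('exit', function() {
      // the replacement picks up the newly deployed code
      var replacement = cluster.fork();
      // wait until it is serving before recycling the next worker
      replacement.on('listening', function() {
        restartNext(i + 1);
      });
    });
    // let the worker finish in-flight requests, then exit
    worker.disconnect();
  }

  restartNext(0);
}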

Whether or not you think this is as cool as I do, it’s an important behavior to remember, because it can also cause problems if you aren’t careful. For example, suppose you have started deploying new code in anticipation of an update, and one of the worker processes fails and restarts with the new code before you intended it to go live. That could be a very bad thing.

Thanks for reading!

— Dave Elton, asktheteam@keyholesoftware.com


About the Author

David Elton loves cheesecake! https://hallmarkcheesecake.wordpress.com



Comments

  1. Zach Gardner

    Do you know if there’s anything built into that module to handle when the master crashes? I couldn’t find anything when I was looking, so I might have just missed it.

    Also, have you looked into a micro-service oriented architecture? Deploying the same stateless NodeJS code to discrete servers should provide the same level of concurrency in handling requests while keeping leaks (e.g. global variables, memory) isolated. I’ve been leaning towards having micro-services be the de facto method to solve concurrency and availability instead of forks on a server. Have you looked into that yet?

    1. David Elton (Post Author)

      Zach, I do not believe the cluster module itself has anything built in to handle when the master crashes. But, the domain module could be used to trap and respond to errors in the master process. Crashes in the master process could be pretty ugly to deal with though, and so that is why I suggested keeping the code in the master as lightweight as possible, handling only the starting and restarting of children.

      As far as taking a microservices approach, there are definitely some strong advantages to that architecture. One of the biggest, in my opinion, is that an entire machine may become unavailable due to hardware or network failure, and having the work distributed across machines instead of cores would be really helpful there. Depending on your application needs, it may even make sense to blend the two ideas: clustering your APIs on a single machine to ensure efficient resource management, and also across multiple machines to provide more redundancy.

      One thing that is nice about clustering is the ease with which processes may communicate with each other, if that is for some reason necessary (maybe a websocket needs to be notified of an event that occurred in another process). That said, redis pub/sub is something I have played with and think is really cool, and it would be a great way to handle interprocess communication across multiple machines.
