Distributed Systems With Node.js: Part 3 Scaling

Matthew MacFarquhar
6 min read · Apr 21, 2024

Introduction

In this series, I will be working through this book on Distributed Systems using Node.js. The book is quite large and in depth, but in these articles I will be distilling the big-picture pieces of the book into bite-size tutorials and walkthroughs.

In this section, we will be exploring some key concepts in creating a scalable service. First we will learn how to set up a reverse proxy and what a reverse proxy can do for us. Then we will load test a system using autocannon to determine how much traffic our service can handle before needing to scale up.

The code for this demo is found in this repo at the commit here.

Reverse Proxy

Reverse proxies sit in front of our server(s) and accept incoming traffic, which they then route to a backend worker server they pick to handle the request. Reverse proxies can also do extra things for us like compression, TLS termination, caching, and rate limiting.

Cluster Module

We start by exploring a very simple (not to be used in production) example of the behavior a reverse proxy provides: sitting in front of the service and routing requests.

#!/usr/bin/env node

const cluster = require('cluster');
console.log(`master pid=${process.pid}`);

cluster.setupMaster({
  exec: __dirname + '/cluster-fibonacci.js'
});
cluster.fork();
cluster.fork();

cluster
  .on('disconnect', (worker) => {
    console.log('disconnect', worker.id);
  })
  .on('exit', (worker, code, signal) => {
    console.log('exit', worker.id, code, signal);
    cluster.fork();
  })
  .on('listening', (worker, { address, port }) => {
    console.log('listening', worker.id, `${address}:${port}`);
  });

Here is our master fibonacci process which, when run, will spin up two cluster-fibonacci worker processes.

#!/usr/bin/env node

const server = require('fastify')();
const HOST = process.env.HOST || '127.0.0.1';
const PORT = process.env.PORT || 4000;

console.log(`worker pid=${process.pid}`);

server.get('/:limit', async (req, res) => {
  await sleep(10);
  return String(fibonacci(Number(req.params.limit)));
});

server.listen(PORT, HOST, () => {
  console.log(`Producer running at http://${HOST}:${PORT}`);
});

// Iteratively computes the limit-th Fibonacci number using BigInt
function fibonacci(limit) {
  let prev = 1n, next = 0n, swap;
  while (limit) {
    swap = prev;
    prev = prev + next;
    next = swap;
    limit--;
  }
  return next;
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

The master process accepts the incoming requests and sends the work to the worker processes using inter-process communication. This is a hacky example of what a reverse proxy can do; in production, the book recommends using an established tool to facilitate reverse proxy duties instead of relying on the cluster module.

HAProxy

HAProxy is a tool which makes setting up reverse proxy systems quick and easy.

Load Balancing

defaults
  mode http
  timeout connect 5000ms
  timeout client 50000ms
  timeout server 50000ms

frontend inbound
  bind localhost:3000
  default_backend web-api
  stats enable
  stats uri /admin?stats

backend web-api
  option httpchk GET /health
  server web-api-1 localhost:3001 check
  server web-api-2 localhost:3002 check

HAProxy allows us to define an inbound URI of localhost:3000 which will accept all incoming traffic. We point the inbound frontend to our web-api backend, which has two servers running on ports 3001 and 3002 (which we must spin up ourselves). HAProxy will also perform health checks on our services by hitting the newly created /health endpoint.

#!/usr/bin/env node

const server = require('fastify')();
const fetch = require('node-fetch');
const HOST = process.env.HOST || '127.0.0.1';
const PORT = process.env.PORT || 3000;
const TARGET = process.env.TARGET || 'localhost:4000';

server.get('/', async () => {
  const res = await fetch(`http://${TARGET}/recipes/42`);
  const producer_data = await res.json();

  return {
    consumer_pid: process.pid,
    producer_data: producer_data
  };
});

server.get('/health', async () => {
  console.log('health check');
  return 'OK';
});

server.listen(PORT, HOST, () => {
  console.log(`Consumer running at http://${HOST}:${PORT}`);
});

As we can see, the /health endpoint just returns a 200 OK.
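To see the load balancing in action, here is a small sketch (my own, not from the book) that hits the HAProxy frontend a few times and prints the consumer_pid from each response. With two servers in the web-api backend and HAProxy's default round-robin balancing, the pid should alternate between the two consumer instances (this assumes both consumers on ports 3001 and 3002, plus the producer they call on port 4000, are running).

// Sketch: observe round-robin balancing through the HAProxy frontend.
const fetch = require('node-fetch');

async function main() {
  for (let i = 0; i < 4; i++) {
    const res = await fetch('http://localhost:3000/');
    const body = await res.json();
    console.log(`request ${i} handled by consumer_pid=${body.consumer_pid}`);
  }
}

main().catch(console.error);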

HAProxy also enables a cool stats dashboard we can view by navigating to localhost:3000/admin?stats.

Compression, TLS & Rate Limiting

HAProxy can perform non-business-logic operations on our requests, like body compression and decompression.

defaults
  mode http
  timeout connect 5000ms
  timeout client 50000ms
  timeout server 50000ms

frontend inbound
  bind localhost:3000
  default_backend web-api

backend web-api
  compression offload
  compression algo gzip
  compression type application/json text/plain
  server web-api-1 localhost:3001

We can enable compression with three simple lines added to the .cfg file.
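As a quick sanity check (my own sketch, not from the book), we can make a request through the proxy and inspect the content-encoding response header. node-fetch advertises gzip support by default and transparently decompresses the body, so if HAProxy compressed the response, the header should read gzip.

// Sketch: verify that HAProxy is gzip-compressing responses.
// Assumes the compression config above is running on localhost:3000
// with a consumer instance behind it on port 3001.
const fetch = require('node-fetch');

fetch('http://localhost:3000/')
  .then(async (res) => {
    console.log('content-encoding:', res.headers.get('content-encoding'));
    console.log('body:', await res.json());
  })
  .catch(console.error);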

Next, we can use TLS termination to enable HTTPS.

defaults
  mode http
  timeout connect 5000ms
  timeout client 50000ms
  timeout server 50000ms

global
  tune.ssl.default-dh-param 2048

frontend inbound
  bind localhost:3000 ssl crt haproxy/combined.pem
  default_backend web-api

backend web-api
  server web-api-1 localhost:3001

We need to create a .pem file (which is just the certificate and its private key combined into one file) and then amend our frontend address binding to use ssl.
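Since the combined.pem in this demo is self-signed, Node will reject the certificate by default. Here is a small sketch (an assumption of mine, not from the book) that disables certificate verification just for this local test so we can confirm the proxy is terminating TLS.

// Sketch: hit the TLS-terminating frontend with a self-signed cert.
// rejectUnauthorized: false is for local testing only; never use it
// against a production endpoint.
const https = require('https');
const fetch = require('node-fetch');

const agent = new https.Agent({ rejectUnauthorized: false });

fetch('https://localhost:3000/', { agent })
  .then(res => res.json())
  .then(body => console.log('served over HTTPS:', body))
  .catch(console.error);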

We can also add rate limiting of requests to our backend services, which will increase client latency but reduce the chance that our system crashes because a single server is overwhelmed with too many requests.

defaults
  maxconn 8
  mode http

frontend inbound
  bind localhost:3010
  default_backend web-api

backend web-api
  option httpclose
  server web-api-1 localhost:3020 maxconn 2

This is done simply by specifying the maximum number of concurrent connections we allow. If a third request comes in while two are in flight, the caller will experience higher latency while it waits for one of the two ongoing requests to complete.
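To observe the queueing behavior, here is a small sketch (my own, using the ports assumed in the config above) that fires six requests at once against the rate-limited frontend and logs how long each one takes. With maxconn 2 on the server, the later requests should report noticeably higher latency because they sit in HAProxy's queue waiting for a free connection.

// Sketch: observe maxconn queueing through the rate-limited frontend.
// Assumes the config above is running on localhost:3010 with a backend
// service listening on port 3020.
const fetch = require('node-fetch');

async function timedRequest(i) {
  const start = Date.now();
  await fetch('http://localhost:3010/');
  console.log(`request ${i} finished after ${Date.now() - start}ms`);
}

Promise.all([0, 1, 2, 3, 4, 5].map(timedRequest)).catch(console.error);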

Summary

  • Reverse proxies allow one process to receive and route work to a set of worker processes
  • We should use established tools for creating our reverse proxy
  • Reverse proxies can perform tasks like load balancing, TLS termination, compression, and rate limiting with just a few config line changes

Load Testing

Load testing is a good way to stress test our system before it starts receiving production traffic. It helps us gauge how our service will perform under different traffic loads and when we would need to add more servers to serve the increased traffic.

Autocannon

The book uses autocannon for testing our services, which I found to be a very easy and helpful tool. To use autocannon, we simply point it at a URL and specify things like concurrent connections, how many requests to fire, and over what period of time to send them.

For example:

autocannon -d 60 -c 10 -R 20 http://localhost:3000

tells autocannon to send 20 requests per second using 10 concurrent connections over a period of 60 seconds to localhost:3000. Autocannon will spit out useful metrics like P99 latency and requests per second.
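Autocannon also exposes a programmatic API, which is handy for scripting repeated runs. A minimal sketch mirroring the command above (based on my reading of the autocannon docs, not on the book) might look like this:

// Sketch: the same load test as the CLI command, driven from Node.
// Mirrors: autocannon -d 60 -c 10 -R 20 http://localhost:3000
const autocannon = require('autocannon');

async function run() {
  const result = await autocannon({
    url: 'http://localhost:3000',
    connections: 10,   // -c 10
    duration: 60,      // -d 60 (seconds)
    overallRate: 20    // -R 20 (requests per second across all connections)
  });
  console.log('p99 latency (ms):', result.latency.p99);
  console.log('avg requests/sec:', result.requests.average);
}

run().catch(console.error);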

Coming up with an SLO

The book then goes through an exercise in coming up with an SLO (Service Level Objective) for a fibonacci service: it should handle traffic with a P99 latency of 20ms. We then play around with this cfg file

defaults
  mode http

frontend inbound
  bind localhost:5000
  default_backend fibonacci

backend fibonacci
  server fibonacci-1 localhost:5001
  server fibonacci-2 localhost:5002
  server fibonacci-3 localhost:5003

and use autocannon to stress test latency at different requests-per-second rates. We add servers to our backend to hit the 20 ms latency goal at higher and higher request rates, which gives us an idea of how we should scale up our service when we receive more traffic.

Summary

  • Autocannon is a great tool for load testing a service
  • To come up with a scaling plan and an SLO, we should load test our service and scale up by adding more backend services to account for the extra requests

Conclusion

In this section, we dug into two important concepts for scaling out our service: creating a reverse proxy to put more worker servers behind our service, and stress testing it using load-testing tools like autocannon. In our next section, we will investigate tools to help us capture information from our system when it is running in production.


Matthew MacFarquhar

I am a software engineer working for Amazon, living in SF/NYC.