Distributed Systems With Node.js: Part 2 Protocols

Matthew MacFarquhar
10 min read · Apr 20, 2024

Introduction

In this series, I will be working through this book on Distributed Systems using Node.js. The book is quite large and in depth, but in these articles I will be distilling the big-picture pieces of the book into bite-size tutorials and walkthroughs.

In this section, we will explore how to make HTTP better by compressing payloads using gzip and enabling HTTPS with TLS for added security. We will also explore the different protocols that services can use to communicate with each other: JSON over HTTP, GraphQL, and RPC.

The code for this demo is found in this repo at the commit here

HTTP

HTTP is built on top of TCP and sits at Layer 7 of the OSI model.

The book goes over the semantics of HTTP: methods, headers, status codes, and the body. All said and done, HTTP is just an agreed-upon way to lay out text that is sent and received over the network.
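Concretely, a minimal HTTP/1.1 exchange is nothing more than a few lines of text: a request (first block) and a response (second block), separated from their bodies by a blank line. The header values here are illustrative:

```
GET / HTTP/1.1
Host: localhost:1337
Accept-Encoding: gzip

HTTP/1.1 200 OK
Content-Type: text/plain
Content-Encoding: gzip

<body bytes>
```

Everything else HTTP gives us (caching, content negotiation, status semantics) is built out of conventions layered on this text format.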

Compression

One piece of HTTP is the body, which usually takes up most of the message size. To reduce the amount of data sent over the network, the consumer and producer can agree to compress the body. The flow goes like this…

1. The consumer sends a request to the producer with the header ‘Accept-Encoding: gzip’ to tell the server that it can send back the response with the body gzipped.

curl -H 'Accept-Encoding: gzip' http://localhost:1337 | gunzip

NOTE: the pipe to gunzip is so we can see the uncompressed result in our terminal

2. The producer receives this request, sees the header, and zips the body before sending it back along with the header ‘Content-Encoding: gzip’ to tell the consumer that the body is gzipped.

#!/usr/bin/env node

const zlib = require('zlib');
const http = require('http');
const fs = require('fs');

http.createServer((req, res) => {
  const raw = fs.createReadStream(__dirname + '/index.html');
  const acceptEncoding = req.headers['accept-encoding'] || '';
  console.log(acceptEncoding);

  res.setHeader('Content-Type', 'text/plain');

  if (acceptEncoding.includes('gzip')) {
    console.log('encoding with gzip');
    res.setHeader('Content-Encoding', 'gzip');
    raw.pipe(zlib.createGzip()).pipe(res);
  } else {
    console.log('no encoding');
    raw.pipe(res);
  }
}).listen(process.env.PORT || 1337);

In this example we are streaming over an HTML file, though the Content-Type is set to text/plain; note that text/html would be the more accurate media type here, and that Content-Encoding (the compression) is a separate concern from Content-Type.

HTTPS/TLS

The book goes into how we can set up TLS so that our services can run over HTTPS (which is really just HTTP that gets passed through TLS before our server processes it).

TLS and HTTPS allow us to send encrypted traffic over the network (unlike HTTP traffic, which is readable by any man-in-the-middle). To establish this encrypted message sending…

  1. The server creates a public and private key. The public key can encrypt messages, the private key can decrypt them.
  2. The client obtains the server’s public key, which is made publicly available
  3. The client then creates a random session key which can be used for encrypting and decrypting messages, and encrypts that session key using the public key
  4. The encrypted session key is sent to the server, and the server uses its private key to decrypt it. The client and server now have securely shared an encryption key
  5. All traffic is now encrypted and decrypted using the shared session key

The book goes over creating a private key and cert (public key) with the following command.

openssl req -nodes -new -x509 -keyout recipe-api/tls/basic-private-key.key -out shared/tls/basic-certificate.cert

Now our recipe service has the private key, and the certificate (containing the public key) is available to all.

#!/usr/bin/env node

const fs = require('fs');
const server = require('fastify')({
  https: {
    key: fs.readFileSync(__dirname + '/tls/basic-private-key.key'),
    cert: fs.readFileSync(__dirname + '/../shared/tls/basic-certificate.cert')
  }
});
const HOST = process.env.HOST || '127.0.0.1';
const PORT = process.env.PORT || 4000;

console.log(`worker pid=${process.pid}`);

server.get('/recipes/:id', async (req, res) => {
  console.log(`worker request pid=${process.pid}`);
  const id = Number(req.params.id);

  if (id !== 42) {
    res.statusCode = 404;
    return { error: 'not_found' };
  }

  return {
    producer_pid: process.pid,
    recipe: {
      id: id,
      name: "Chicken Tikka Masala",
      steps: "Throw it in a pot...",
      ingredients: [
        { id: 1, name: "Chicken", quantity: "1 lb" },
        { id: 2, name: "Sauce", quantity: "2 cups" }
      ]
    }
  };
});

server.listen(PORT, HOST, () => {
  console.log(`Producer running at https://${HOST}:${PORT}`);
});

Our only changes to the producer are to configure the server with our private and public keys and then update our console log to correctly indicate we are now accepting HTTPS traffic.

#!/usr/bin/env node

const server = require('fastify')();
const fetch = require('node-fetch');
const https = require('https');
const fs = require('fs');
const HOST = process.env.HOST || '127.0.0.1';
const PORT = process.env.PORT || 3000;
const TARGET = process.env.TARGET || 'localhost:4000';

const options = {
  agent: new https.Agent({
    ca: fs.readFileSync(__dirname + '/../shared/tls/basic-certificate.cert')
  })
};

server.get('/', async () => {
  const res = await fetch(`https://${TARGET}/recipes/42`, options);
  const producer_data = await res.json();

  return {
    consumer_pid: process.pid,
    producer_data: producer_data
  };
});

server.listen(PORT, HOST, () => {
  console.log(`Consumer running at http://${HOST}:${PORT}`);
});

Our only change to the consumer is to configure an HTTPS agent that trusts the producer’s certificate and use it when sending requests.

We still accept external traffic to the consumer using HTTP, but our consumer-producer communication is now done over HTTPS.

The book also goes over how to use root certs to sign our public keys, so that each client does not need to hold every HTTPS server’s public key and can instead rely on a widely distributed root cert that has a trust chain to the server’s cert.
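A minimal sketch of that signing flow with openssl might look like the following. The file names and -subj values are my own placeholders, not the book’s exact commands:

```shell
# Create our own root CA: a private key plus a self-signed certificate
openssl req -nodes -new -x509 -subj "/CN=My Root CA" \
  -keyout ca-private-key.key -out ca-certificate.cert

# Create the server's key and a certificate signing request (CSR)
openssl req -nodes -new -subj "/CN=localhost" \
  -keyout producer-private-key.key -out producer.csr

# Sign the server's CSR with the root CA, producing a certificate
# that chains back to the CA
openssl x509 -req -in producer.csr -CA ca-certificate.cert \
  -CAkey ca-private-key.key -CAcreateserial -out producer-certificate.cert
```

A client that trusts ca-certificate.cert can now verify producer-certificate.cert without ever having seen that particular server cert before.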

Summary

  • HTTP is just semantics on top of TCP written in plain text
  • HTTP bodies can be compressed so long as the server and client agree on a compression methodology communicated via headers
  • Public-Private keys can be used to encrypt traffic over the network using a TLS layer

Protocols

Above layer 7 of the OSI model, there is a notional layer 8 concerned with how data is communicated. Most of the internet uses JSON over HTTP at this layer. The book points out inconsistencies between services in how things like pagination are communicated with JSON over HTTP. It is pretty clear that layer 8 is still a bit of a wild west (and may remain that way forever). Two other layer 8 protocols are GraphQL and RPC, the next two topics in the book.

GraphQL

The book goes over GraphQL as the first alternative to JSON over HTTP. GraphQL is great at stitching together multiple resources into one facade API; for example, we can have a GraphQL API query one service for user info and then use another service to drill down into that user’s favorite recipes. I have previously written an article about the cool things GraphQL can do here: https://medium.com/dev-genius/creating-a-graphql-api-174487b0dd82.

For the purposes of the book, we will only be hydrating our Query using a single service.

type Query {
  recipe(id: ID): Recipe,
  pid: Int
}

type Recipe {
  id: ID!
  name: String!
  steps: String
  ingredients: [Ingredient]!
}

type Ingredient {
  id: ID!
  name: String!
  quantity: String
}

GraphQL uses a .gql schema file that the producer service reads in order to set up its server. In our API, callers can fetch the process id of the producer that returns the response and, when provided a recipe ID, drill down into the components of a Recipe.

#!/usr/bin/env node

const server = require('fastify')();
const graphql = require('fastify-gql');
const fs = require("fs");
const schema = fs.readFileSync(__dirname + '/../shared/graphql-schema.gql').toString();
const HOST = process.env.HOST || '127.0.0.1';
const PORT = process.env.PORT || 4000;

const resolvers = {
  Query: {
    pid: () => process.pid,
    recipe: async (_obj, { id }) => {
      if (id != 42) throw new Error(`recipe ${id} not found`);

      return {
        id: id,
        name: "Chicken Tikka Masala",
        steps: "Throw it in a pot..."
      };
    }
  },
  Recipe: {
    ingredients: async (obj) => {
      return (obj.id != 42) ? [] : [
        { id: 1, name: "Chicken", quantity: "1 lb" },
        { id: 2, name: "Sauce", quantity: "2 cups" }
      ];
    }
  }
};

server
  .register(graphql, { schema, resolvers, graphiql: true })
  .listen(PORT, HOST, () => {
    console.log(`Producer running at http://${HOST}:${PORT}/graphql`);
  });

The producer reads the schema file and defines resolvers, which tell the server how to generate the data a requester asks for. If the user requests the pid, we simply return the process id. If the user queries for a recipe, we throw an error when the id is not 42; when it is 42, we immediately populate the id, name and steps fields. Getting a Recipe’s ingredients requires a separate resolver, defined under the Recipe type, which just returns a hard-coded list of ingredients.

#!/usr/bin/env node

const server = require('fastify')();
const fetch = require('node-fetch');
const HOST = '127.0.0.1';
const PORT = process.env.PORT || 3000;
const TARGET = process.env.TARGET || 'localhost:4000';
const complex_query = `query kitchenSink ($id: ID) {
  recipe(id: $id) {
    id
    name
    ingredients {
      name
      quantity
    }
  }
  pid
}`;

server.get('/', async () => {
  const res = await fetch(`http://${TARGET}/graphql`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      query: complex_query,
      variables: { id: "42" }
    })
  });

  const producer_data = await res.json();

  return {
    consumer_pid: process.pid,
    producer_data: producer_data
  };
});

server.listen(PORT, HOST, () => {
  console.log(`Consumer running at http://${HOST}:${PORT}`);
});

The consumer can now make a query by sending a POST request to the /graphql endpoint of our producer. The body of the POST must contain a string under the query field and a JSON object for any variables used in the query. Our current query requests everything, but the cool thing about GraphQL is that anything not requested is not computed and not returned. For example, if we do not request ingredient information for our recipe, the ingredients resolver will not run, which may save us an extra database call.
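For instance, this narrower query against the same endpoint, with the same variables, would return only the recipe name and the pid, and the ingredients resolver would never run:

```graphql
query slim ($id: ID) {
  recipe(id: $id) {
    name
  }
  pid
}
```

The schema stays the same; only the client’s selection set changes, which is what makes GraphQL clients so flexible without server changes.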

RPC

RPC is essentially a way for one process to have precise calling ability into another process running separately. Unlike traditional HTTP, where a single POST request might both add a user to a system and send an email, RPC would expose one API per function (i.e. one call to add the user to the database and another to trigger the email send). RPC allows for more powerful interaction patterns than HTTP (like client streaming, server streaming or bi-directional streaming) and more fine-grained calling control, as explained above. I have written an entire article on RPC if you are interested: https://medium.com/dev-genius/building-powerful-client-server-communication-with-rpc-a-step-by-step-tutorial-with-demo-2e3d498ee6a3.

Like with our GraphQL demo, we will be doing a very trivial demo of RPC using a simple unary call.

syntax = "proto3";
package recipe;

service RecipeService {
  rpc GetRecipe(RecipeRequest) returns (Recipe) {}
  rpc GetMetaData(Empty) returns (Meta) {}
}

message Recipe {
  int32 id = 1;
  string name = 2;
  string steps = 3;
  repeated Ingredient ingredients = 4;
}

message Ingredient {
  int32 id = 1;
  string name = 2;
  string quantity = 3;
}

message RecipeRequest {
  int32 id = 1;
}

message Meta {
  int32 pid = 1;
}

message Empty {}

This .proto file describes the services, the RPC functions on those services, and the message structures passed between them. Clients and servers share this .proto file so they have a common understanding of what the services can do.

#!/usr/bin/env node

const grpc = require('@grpc/grpc-js');
const loader = require('@grpc/proto-loader');
const pkg_def = loader.loadSync(__dirname + '/../shared/grpc-recipe.proto');
const recipe = grpc.loadPackageDefinition(pkg_def).recipe;

const HOST = process.env.HOST || '127.0.0.1';
const PORT = process.env.PORT || 4000;
const server = new grpc.Server();

server.addService(recipe.RecipeService.service, {
  getMetaData: (_call, cb) => {
    cb(null, {
      pid: process.pid
    });
  },
  getRecipe: (call, cb) => {
    if (call.request.id !== 42) {
      return cb(new Error(`unknown recipe ${call.request.id}`));
    }
    cb(null, {
      id: 42,
      name: "Chicken Tikka Masala",
      steps: "Throw it in a pot...",
      ingredients: [
        { id: 1, name: "Chicken", quantity: "1 lb" },
        { id: 2, name: "Sauce", quantity: "2 cups" }
      ]
    });
  }
});

server.bindAsync(`${HOST}:${PORT}`, grpc.ServerCredentials.createInsecure(), (error, port) => {
  if (error) throw error;
  server.start();
  console.log(`Producer running at http://${HOST}:${port}`);
});

Our producer loads the proto file and adds services to the server. Each service object defines how to resolve the available RPC calls. For example, when we receive a getRecipe call, we verify the id is 42 and then return a JSON object matching the structure of the Recipe message defined in the .proto file.

#!/usr/bin/env node

const util = require('util');
const grpc = require('@grpc/grpc-js');
const server = require('fastify')();
const loader = require('@grpc/proto-loader');
const pkg_def = loader.loadSync(__dirname + '/../shared/grpc-recipe.proto');
const recipe = grpc.loadPackageDefinition(pkg_def).recipe;

const HOST = '127.0.0.1';
const PORT = process.env.PORT || 3000;
const TARGET = process.env.TARGET || 'localhost:4000';

const client = new recipe.RecipeService(
  TARGET,
  grpc.credentials.createInsecure()
);
const getMetaData = util.promisify(client.getMetaData.bind(client));
const getRecipe = util.promisify(client.getRecipe.bind(client));

server.get('/', async () => {
  const [meta, recipe] = await Promise.all([
    getMetaData({}),
    getRecipe({ id: 42 })
  ]);

  return {
    consumer_pid: process.pid,
    producer_data: meta,
    recipe: recipe
  };
});

server.listen(PORT, HOST, () => {
  console.log(`Consumer running at http://${HOST}:${PORT}`);
});

Our consumer loads the same .proto file and creates a client from the RecipeService defined there, pointing it at our producer’s URL. We can then call the methods exposed in the .proto file as if they were local functions.

Summary

  • Most of the web uses JSON over HTTP as its layer 8 protocol
  • GraphQL is an alternative to JSON over HTTP and works particularly well as a facade in front of multiple API resources
  • RPC allows for more fine-grained method calling on a remote server; it lets clients call server functions as if they were local functions

Conclusion

In this section we got to explore a few OSI layers and learned how HTTP can be optimized using gzip compression and secured using TLS. We also explored some layer 8 alternatives to JSON over HTTP, namely GraphQL and RPC, and talked about when each may be useful. In the next section, we will dive into making our applications scalable by exploring master-client systems, reverse proxies and load testing.
