
Backend (Node) with Kernel


I am currently exploring low-level networking and operating-system concepts, particularly how backend applications and the OS are inter-related.

While going through this, I was fascinated to read about how efficiently Node.js handles concurrency with the help of the kernel.

Is Node.js single-threaded, or does it use multi-threading or other mechanisms to manage concurrency?

To answer this question, let’s first review some basic concepts that are essential for understanding how a typical backend application works and how Node.js manages network connections and concurrency.

OK, now coming to the real question: is Node.js single-threaded or multi-threaded?

By default, Node.js runs your JavaScript on a single thread, which means it can process only one operation at a time. However, there are some exceptions: certain operations, such as crypto, file I/O, and DNS resolution, are handed off to a thread pool managed by libuv (the library Node.js uses for asynchronous I/O).

By default, this thread pool has 4 threads, but it can be configured up to a maximum of 1024 threads (the exact limit may depend on the libuv version and the system). The pool size is controlled via the UV_THREADPOOL_SIZE environment variable.
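For example, assuming your entry file is called app.js (the file name here is just a placeholder), you could start Node with a larger pool like this:

UV_THREADPOOL_SIZE=8 node app.js

The pool size is read when libuv first initializes the thread pool, so setting the variable before starting the process, as above, is the reliable way to change it.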

Example: Consider the following code

const { pbkdf2 } = require("crypto");

// UV_THREADPOOL_SIZE is only set if you exported it yourself; libuv defaults to 4 threads.
console.log(`thread pool size: ${process.env.UV_THREADPOOL_SIZE || 4}`);

const start = (task) => {
  const st = Date.now();
  // pbkdf2 is CPU-heavy, so libuv runs it on a thread-pool thread,
  // keeping the main JavaScript thread free.
  pbkdf2("SECRET", "salt", 1000000, 64, "sha512", (err) => {
    if (err) throw err;
    console.log(`Finished running task ${task} in ${(Date.now() - st) / 1000} seconds`);
  });
};

start(1);
start(2);
start(3);
start(4);


Output

From the output you can clearly see that the crypto work is being spread across libuv's thread pool: with the default pool size of 4, all four tasks finish at roughly the same time, rather than one after another. A fifth task would have to wait for a free thread.

This is not the case for network-related work such as database calls, fetch requests, and so on; those never touch the thread pool. So how does Node handle any number of network connections on a single thread?

Before going into that, below is a high-level overview of how a typical backend server handles a request and response:

-> Consider a backend application that listens for incoming connections on a specific IP address and port (e.g., 1.1.1.1:8080).

-> The server kernel creates a socket listening on that port (8080) and maintains two queues for it: the SYN queue and the accept queue.

-> A client sends an HTTP request to the server at that address on port 8080. This begins with a SYN packet from the client.

-> The server kernel receives the SYN, places it in the SYN queue, and responds with a SYN/ACK to the client.

-> The client replies with an ACK packet.

-> The server kernel matches the ACK with the corresponding entry in the SYN queue, completes the TCP handshake, and moves the connection from the SYN queue to the accept queue.

-> The backend application then retrieves the connection (via the accept() system call), which removes it from the accept queue.


The accepted connection is returned to the backend application as a socket descriptor (a file descriptor), which can be observed by running the following command on Linux:

lsof -n -i :8080

Subsequent read and write operations on this connection are handled by the operating system kernel; the backend application only needs to make system calls (e.g., accept(), read(), write()) to interact with it.
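To tie these steps back to Node, here is a minimal echo-server sketch (my own illustration; the port and backlog values are arbitrary). Node's net module performs the listen() and accept() calls for us and hands each accepted connection to the callback as a socket object wrapping the underlying file descriptor:

const net = require("net");

const server = net.createServer((socket) => {
  // By the time this callback runs, the kernel has completed the handshake
  // and Node has accept()ed the connection off the accept queue for us.
  console.log(`new connection from ${socket.remoteAddress}:${socket.remotePort}`);

  socket.on("data", (chunk) => {
    // Reads and writes here become read()/write() system calls on the
    // connection's file descriptor under the hood.
    socket.write(`echo: ${chunk}`);
  });
});

// The third argument is the backlog hint: roughly how many fully established
// connections may wait in the kernel's accept queue before we accept() them.
server.listen(8080, "127.0.0.1", 511, () => {
  console.log("listening on 127.0.0.1:8080");
});

While this server is running, the lsof command above will list the listening socket and every accepted connection as file descriptors of the Node process.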

How does Node.js utilize the kernel to handle concurrency?

Node.js uses a long-running event loop (essentially a while loop) to manage incoming requests and network connections. This loop runs for as long as the process is active, repeatedly asking the operating system (kernel) for updates on its connections, such as when data is available to read or when a connection is ready to be written to.
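To see the loop in action, here is a small self-contained script (a sketch of my own, not from the example above):

const { readFile } = require("fs");

console.log("synchronous code starts");

// Queued as a timer; the event loop will run it on a later iteration.
setTimeout(() => console.log("timer callback"), 0);

// The file read happens off the main thread; the event loop runs the callback
// once it is notified that the read has completed.
readFile(__filename, () => console.log("file read callback"));

console.log("synchronous code ends -- but the process stays alive");
// Both callbacks above print only after this line. The event loop keeps
// polling for completed work until nothing is pending, then the process exits.

The two callbacks always print after the two synchronous logs, because the loop only gets control once the current call stack is empty.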

This way, Node.js manages concurrency on a single thread by relying on the operating system's event-driven I/O mechanisms (epoll on Linux, kqueue on macOS/BSD), which allow it to handle thousands of concurrent network connections efficiently.
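As a rough demonstration (again a sketch of my own; the port, delay, and request count are arbitrary), the single-threaded server below "sleeps" two seconds per request, yet one hundred concurrent requests complete in roughly two seconds total, because the waiting happens in timers and kernel sockets rather than on the JavaScript thread:

const http = require("http");

// Each request is answered after a 2-second delay. The delay is just a timer
// in the event loop; no thread sits blocked while waiting.
const server = http.createServer((req, res) => {
  setTimeout(() => res.end("done\n"), 2000);
});

server.listen(8080, () => {
  const start = Date.now();
  let pending = 100;

  // Fire 100 concurrent requests from this same single-threaded process.
  for (let i = 0; i < 100; i++) {
    http.get("http://127.0.0.1:8080", { agent: false }, (res) => {
      res.resume();
      res.on("end", () => {
        if (--pending === 0) {
          console.log(`100 requests served in ${(Date.now() - start) / 1000} seconds`);
          server.close();
        }
      });
    });
  }
});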

This is just a drop in the ocean. There is much more to learn about the underlying mechanisms of how Node.js handles all this, and about the wonders the OS performs to make our lives easier. Some topics we can explore further include:

  1. Understanding file descriptors in Linux.

  2. How Node uses the epoll system call to handle asynchronous I/O.

  3. How the event loop keeps running after our synchronous code finishes.

  4. How epoll/kqueue-style mechanisms work at the kernel level.

...and much more
