Node Streams
Last updated: Apr 7, 2020
What are Streams and why do we need them?
On the modern internet, a lot of data moves from machine to machine. Several protocols exist to manage this process in different contexts. FTP is one, BitTorrent is another. I’m working on a project at the moment using Node.js to build a server that manages a database full of audio files, and since I want users to be able to download and upload these, I will be using Node’s implementation of Streams. Streams are used to “pipe” data from one place to another in Node, a chunk at a time. Without them, moving a 1GB file would mean holding all 1GB of it in memory at once.
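Here’s a minimal sketch of what that looks like in Node, serving a single audio file over HTTP. The file path and port are made-up placeholders, not details from the actual project:

```js
// Serve one audio file by streaming it, chunk by chunk, to the response.
// './audio/track.mp3' and port 3000 are hypothetical placeholders.
const fs = require('fs');
const http = require('http');

http.createServer((req, res) => {
  // fs.readFile() would load the whole file into memory before sending it.
  // A readable stream piped into the response (a writable stream) keeps
  // memory use roughly constant no matter how large the file is.
  fs.createReadStream('./audio/track.mp3').pipe(res);
}).listen(3000);
```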
How do Streams work?
Readable streams emit “data” events, which say “here’s a chunk of data,” and “end” events, which say “there’s no more data.” Writable streams emit “drain” events, which say “my buffer has emptied and I can take more data,” and “finish” events, which say “all the data has been flushed.” If you were reading from one stream and writing to another, you might listen for “data” events on the readable stream and write each chunk to the writable stream. You could also flip this around: pause the readable stream whenever the writable stream’s write() reports that its buffer is full, and only read() from the readable stream again once the writable stream emits “drain.”
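Here’s a rough sketch of that back-and-forth. In practice pipe() (or stream.pipeline()) handles all of this for you, and the file names below are made up, but spelling it out shows what the events are doing:

```js
// Copy one file to another by hand, reacting to stream events.
// './input.mp3' and './copy.mp3' are hypothetical placeholders.
const fs = require('fs');

const readable = fs.createReadStream('./input.mp3');
const writable = fs.createWriteStream('./copy.mp3');

readable.on('data', (chunk) => {
  // write() returns false when the writable stream's buffer is full,
  // which is the signal to stop reading for a moment.
  if (!writable.write(chunk)) {
    readable.pause();
  }
});

// 'drain' means the writable stream has emptied its buffer and can take
// more data, so it's safe to start reading again.
writable.on('drain', () => {
  readable.resume();
});

// 'end' means the readable stream has no more data; calling end() tells
// the writable stream to flush what's left and then emit 'finish'.
readable.on('end', () => {
  writable.end();
});

writable.on('finish', () => {
  console.log('All the data has been moved.');
});
```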
Streams have other events too (like the ones used for error handling), and there are other kinds of streams (like Duplex streams, which read and write), but this is sort of the general shape of it.
Who came up with Streams?
Ryan Dahl wrote most of Node.js, and very probably implemented Streams in Node. However, the idea of a Stream-like protocol is older than Node. In fact, standard streams are older than JavaScript. It’s a little tough to trace, because a few different things are called streams, but I think the first clear explanation of Streams as we know them is a 1984 paper by Dennis M. Ritchie, who worked at Bell Labs at the time.
I actually think it’s even older than this. Maybe the 70s? It’s hard to tell—articles don’t separate “Pipelines” from “Pipelines in UNIX” from the use of stdin and stdout.
Why are Streams interesting?
I’m interested in Streams (especially in Node) because they are a way of helping Internet-connected devices manage their resources. On the modern Internet, those devices can be… tiny phones and enormous servers, but also coffee mugs and traffic lights. These devices send messages back and forth over many different protocols, and Streams help them manage a specific and sometimes difficult task: moving a lot of data without holding all of it at once. It’s also a metaphor that was applied to a display and a keyboard in 1984 and that today underlies several streaming platforms.