The story of asynchronous programming

Published Mar 11, 2018. Last updated May 26, 2018.

When writing modern web or mobile applications, it is common knowledge that asynchronous programming (async) is preferable to synchronous programming (sync) when building high QPS (queries per second) systems like web servers. But why is that the case? When I answer questions about Node, or help people coming from Node to other languages, they always ask how to go about writing async code. But if I ask them why they need async, they are not always able to tell me. So today, we'll dive into the history of concurrent programming to learn why async is preferable, and in which cases it is not useful.

Let's go back in time about 25 years to around 1993. We could go back further, but while the history is very interesting, we don't really need those details in order to understand async. In 1993 threads were around, but they weren't as common as you might think, because most programmers only needed to work in a single-threaded world. Threads were typically only useful when you wanted to load things from a slow source like a hard drive or a network and process some of them while others were still coming in. They were also pretty useful when building GUIs, since you could have one thread managing UI interaction while other threads did useful stuff. The only optimization the underlying system needed to make for threads was to avoid scheduling threads that were waiting for I/O, so that it didn't waste much time context switching.

Let's fast forward a few years now, to the dot-com era. This is when the Internet was taking off, and high-throughput web and database servers became very useful. The threading model these servers used was very different - instead of spinning up a few relatively long-lived threads, they used many threads that were extremely short-lived. A thread spun up, received an incoming request, wrote a response, and then died.

When working with a low QPS system this isn't a huge problem, because the overhead of spinning threads up and down is small compared to the work each request does - especially with databases, which spend most of their time processing data. In the new web world, though, requests were very quick, typically just looking up data in an index and writing a response, so the overhead of spinning threads up and down became a large share of the cost of each request.
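To make the thread-per-request model concrete, here is a minimal sketch in Python. It is not how any particular server of that era was written; `handle_request` and `serve_thread_per_request` are names I made up for illustration, and the "request" is just an integer standing in for a real connection.

```python
import threading


def handle_request(request_id: int, responses: dict) -> None:
    # Stand-in for "look something up in an index and write a response".
    responses[request_id] = f"response to request {request_id}"


def serve_thread_per_request(request_ids):
    responses = {}
    threads = []
    for request_id in request_ids:
        # A brand-new thread is created for every single request...
        thread = threading.Thread(target=handle_request, args=(request_id, responses))
        thread.start()
        threads.append(thread)
    for thread in threads:
        # ...and it is gone as soon as its one response has been written.
        thread.join()
    return responses, len(threads)


responses, threads_created = serve_thread_per_request(range(100))
print(threads_created)  # 100: one thread per request
```

The work inside each thread is tiny, so most of the cost here is creating and tearing down the 100 threads themselves.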

When there is an inefficiency or slowdown somewhere, smart engineers will sit down and think of a solution to make it faster. In this case, some of these smart engineers somewhere decided that instead of spinning threads up and down, they would just start a "pool" of threads that would sit around idle, and then distribute any available work as needed among the threads. Some applications didn't need too many threads and so would only have a few, but other applications such as high QPS servers might need a lot of threads and so they could spin up loads of them. This technique was unsurprisingly called "thread pooling".
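As a rough illustration of thread pooling, the sketch below uses `ThreadPoolExecutor` from Python's standard library. The pool size of 4 and the helper names `handle_request` and `serve_with_pool` are arbitrary choices of mine, not something any specific server uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id: int):
    # Report which pool thread ended up doing the work.
    return request_id, threading.current_thread().name


def serve_with_pool(request_ids, pool_size: int):
    # The pool reuses a fixed set of threads instead of creating one per request.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(handle_request, request_ids))
    worker_names = {name for _, name in results}
    return results, worker_names


results, worker_names = serve_with_pool(range(100), pool_size=4)
print(len(results))       # 100 requests handled
print(len(worker_names))  # at most 4 distinct threads did all of the work
```

All 100 requests are handled, but no more than four threads are ever created to handle them.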

This solution worked for a little while, but as is always the case with any technical choice, some shortcomings appeared. Once the system had too much to do and exhausted the pool of idle threads, new work (like incoming web connections) had to wait to be handled. In extreme cases new connections would time out, and the client could not connect at all. Another example is GUI applications, where the UI thread could get locked up, severely reducing the responsiveness of the application.
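Pool exhaustion is easy to reproduce on a small scale. In this sketch (again with hypothetical names, and with `time.sleep` standing in for a slow database or service), four requests arrive at a pool of two threads, and each request records how long it sat in the queue before a thread picked it up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_request(submitted_at: float, work_seconds: float) -> float:
    # How long this request waited in the queue before a thread was free.
    waited = time.monotonic() - submitted_at
    time.sleep(work_seconds)  # the thread is blocked, doing nothing useful
    return waited


def queue_waits(num_requests: int, pool_size: int, work_seconds: float):
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [
            pool.submit(slow_request, time.monotonic(), work_seconds)
            for _ in range(num_requests)
        ]
        return [future.result() for future in futures]


waits = queue_waits(num_requests=4, pool_size=2, work_seconds=0.2)
for index, waited in enumerate(waits):
    print(f"request {index} waited {waited:.2f}s for a thread")
```

The first two requests start almost immediately, while the last two have to wait roughly 0.2 seconds for a thread to free up, even though the busy threads are only sleeping. Scale the numbers up and that wait becomes a timeout.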

This is the problem that asynchronous programming solves. In many applications that use thread pools, the threads don't actually do much but wait for something else - a database, another web service, or some user input. If, instead of having the thread wait for the response, you register some code to be called when the response is available, you can return the thread to the pool to do other useful work in the meantime. This way, you can have more operations in progress than you have threads. The downside is that there may be no thread available when the response comes back, but if threads are being cycled quickly, one should free up fairly soon.
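Here is a small sketch of "more operations in progress than threads", using Python's `asyncio`. Note that this swaps in a single event-loop thread with coroutines rather than the callbacks-plus-thread-pool arrangement described above; the idea of giving the thread back while waiting is the same. The names and the 0.1 second delay are assumptions for illustration, and `asyncio.sleep` stands in for a real database or web service call.

```python
import asyncio
import threading
import time


async def handle_request(request_id: int, wait_seconds: float):
    # Awaiting hands the thread back to the event loop instead of blocking it.
    await asyncio.sleep(wait_seconds)
    return request_id, threading.get_ident()


async def serve(num_requests: int, wait_seconds: float):
    # Start every request at once and collect the results in order.
    return await asyncio.gather(
        *(handle_request(i, wait_seconds) for i in range(num_requests))
    )


def run_demo(num_requests: int = 100, wait_seconds: float = 0.1):
    started = time.monotonic()
    results = asyncio.run(serve(num_requests, wait_seconds))
    elapsed = time.monotonic() - started
    thread_ids = {thread_id for _, thread_id in results}
    return results, thread_ids, elapsed


results, thread_ids, elapsed = run_demo()
print(f"{len(results)} requests on {len(thread_ids)} thread(s) in {elapsed:.2f}s")
```

All 100 requests are in flight at the same time on one thread, so the whole batch takes about as long as a single wait rather than 100 waits back to back.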

So next time you are annoyed at having to deal with a complex chain of Promises (aka Futures) complete with arrays and failure handling, remember that all of that mess is due to threads taking a little while to start up and shut down.

Discover and read more posts from Rob Britton