Isaac Hung
Concurrency, the future of computing

July 12, 2022


Not so long ago, computer chips would get twice as fast every couple of years. Code programmers wrote two years ago would run almost twice as fast now, despite being completely untouched, just from hardware improvements alone. This well-known trend follows Moore’s Law, the 1965 observation that the number of transistors on an integrated circuit doubles roughly every two years, a prediction that has held with uncanny accuracy ever since.

Moore’s Law won’t last forever. AMD’s current chips are built on a minuscule 7 nm (nanometer) process; for comparison, a molecule of DNA is about 2 nanometers wide. At that scale, it becomes fundamentally hard for transistors to shrink much further, due to the way that transistors work.

Transistors and quantum tunneling

Transistors function by controlling the flow of electrons: like a switch, a small current or voltage at the control terminal governs a larger current. This gives a transistor a “state”, on or off, 1 or 0, so every transistor can store a “bit” of information, the building block of more advanced computations. This is how a computer works: by performing operations on these bits.

However, as transistors get even smaller, quantum effects begin to cause unusual behavior. Quantum tunneling is a phenomenon, observed only at the quantum scale, in which particles have a small probability of “tunneling” through a barrier that classical mechanics says they cannot cross. In short, this happens because of wave-particle duality: in the famous “double-slit” experiment, for example, light exhibits properties of both a particle and a wave, something classical mechanics cannot explain.

What does this mean? As transistors get smaller, effects like quantum tunneling cause problems in integrated circuits: electrons tunnel straight through the transistor, so the transistor can no longer reliably control the flow of electrons, leading to nondeterministic behavior. Imagine a universe in which 1 + 1 is only 2 half the time!

The multi-core era

Naturally, as it became harder to fit more transistors into a circuit, engineers started to think about an alternative approach to making processors faster. This led to the invention of multi-core processors, which utilize multiple processing units (“cores”) instead of a monolithic core to perform operations in parallel. In comparison with previous chips, these multi-core processors were generally faster and more energy efficient.

Since IBM introduced the first commercial multi-core processor, the POWER4, in 2001, multi-core processors have continued to grow in speed. Now they are ubiquitous: almost every desktop and server processor manufactured in recent years has multiple cores.

Until quantum computing becomes viable for general, non-research purposes, our current model of computation faces a hard per-core speed limit. This shifts the burden of speeding up computation from processor manufacturers to software developers.

Concurrency and parallelism

Concurrency and parallelism are two terms that are often used interchangeably, roughly to mean “doing many things at the same time”. However, there is a precise difference between the two. Concurrency refers to efficiently switching between many tasks so that all of them make progress, even on a single core, while parallelism refers to actually executing multiple tasks at the same instant, which requires a CPU with multiple cores that can each execute an instruction at once.

For example, a web server likely needs concurrency, because it must handle thousands of requests at once. Since web servers do a lot of I/O, they can perform some work upon receiving a request, then switch to another request while the current one is blocked on I/O, as the sketch below shows. In contrast, parallelism lets us run CPU-bound workloads more efficiently by partitioning the work into chunks and executing those chunks simultaneously; at a very high level, my 16-core laptop could process 16 different files at the same time by giving each file its own thread.
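To make the web-server case concrete, here is a minimal TypeScript sketch of I/O-bound concurrency on a single thread. The fetchFromDb and render helpers are invented stand-ins for real database I/O and CPU work:

```typescript
// Invented stand-ins for real database I/O and rendering work.
const fetchFromDb = (id: number): Promise<string> =>
  new Promise((resolve) => setTimeout(() => resolve(`row ${id}`), 100));
const render = (data: string): string => `<p>${data}</p>`;

// While one request is awaiting I/O, the event loop is free to make
// progress on the others: the requests interleave on a single thread.
async function handleRequest(id: number): Promise<string> {
  const data = await fetchFromDb(id); // yields here while blocked on I/O
  return render(data);
}

// Three requests overlap in time without using any extra cores.
Promise.all([1, 2, 3].map(handleRequest)).then(console.log);
```

Because the three awaits overlap, the whole batch takes roughly one 100 ms round trip rather than three, with no extra cores involved.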

Although concurrency and parallelism are different, they are often used in tandem. The web server can’t rely on parallelism alone to do its work: what if there are more requests than there are available cores? But it won’t rely on concurrency alone either. If enough cores are available, it will also spread its work across them to make processing faster, in addition to using concurrency for I/O.

Concurrency models

Unfortunately, developing software in a concurrent style requires different system designs, different patterns and techniques, new paradigms, and in general a different way of thinking. This makes it quite difficult to port our existing software to the “multi-core era”; much of the code already in use today is written in a sequential style rather than a concurrent one.

However, it is evident that we are gradually moving towards concurrent programming. Many languages now provide built-in constructs, or offer libraries, that allow programmers to exploit hardware improvements to make software faster. Doing so means adopting a concurrency model.

Async/await

The async/await pattern, popular in languages like JavaScript, is built on top of Promises and allows for an easier way to write asynchronous code. With async/await, you can write asynchronous code that looks and behaves like synchronous code.

An async function can contain one or many await expressions that pause the execution of the function until the awaited promise is resolved or rejected. This enables the event loop to handle other tasks, thus improving the overall responsiveness of the application.
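As a brief sketch in TypeScript (the URL and the User shape here are made up for illustration):

```typescript
type User = { id: number; name: string };

// Reads top to bottom like synchronous code, but each `await`
// suspends this function and hands control back to the event loop.
async function getUser(id: number): Promise<User> {
  const res = await fetch(`https://example.com/api/users/${id}`);
  if (!res.ok) throw new Error(`request failed: ${res.status}`);
  return (await res.json()) as User;
}

// A rejected promise surfaces like an ordinary exception, so the usual
// then/catch (or try/catch inside an async function) applies.
getUser(1)
  .then((user) => console.log(user.name))
  .catch((err) => console.error(err));
```

While getUser is suspended at either await, the event loop can run timers, handle other requests, or start more getUser calls.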

Actors

Actors are a concurrency model introduced by Carl Hewitt in the 1970s. An actor is a lightweight, isolated, and concurrently executing context with its own mailbox. Messages sent to an actor are queued in the mailbox, and the actor processes them one at a time.

The actor model simplifies concurrent programming by eliminating the need for explicit locks, shared memory, and other synchronization mechanisms. Elixir, a functional programming language, uses the actor model in the form of the GenServer and Agent behaviors, enabling developers to implement concurrent, fault-tolerant, and distributed systems.
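Elixir’s processes are the real thing, but the shape of the model fits in a few lines. Here is a minimal actor-like sketch in TypeScript; the CounterActor and its message types are invented for illustration:

```typescript
// The messages this actor understands; "get" carries a callback
// so the actor can send a result back.
type Message =
  | { kind: "increment" }
  | { kind: "get"; reply: (value: number) => void };

class CounterActor {
  private mailbox: Message[] = []; // queued messages
  private state = 0;               // isolated state: never shared directly
  private draining = false;

  // The only way to interact with the actor is to mail it a message.
  send(message: Message): void {
    this.mailbox.push(message);
    if (!this.draining) this.drain();
  }

  // Process messages strictly one at a time. Only this loop ever
  // touches `state`, so no locks are needed.
  private drain(): void {
    this.draining = true;
    let message: Message | undefined;
    while ((message = this.mailbox.shift()) !== undefined) {
      switch (message.kind) {
        case "increment": this.state += 1; break;
        case "get": message.reply(this.state); break;
      }
    }
    this.draining = false;
  }
}

const counter = new CounterActor();
counter.send({ kind: "increment" });
counter.send({ kind: "increment" });
counter.send({ kind: "get", reply: (n) => console.log(n) }); // 2
```

In Elixir, each actor runs in its own lightweight process and messages arrive from genuinely parallel senders; this single-threaded sketch only captures the mailbox-plus-isolated-state structure of the model.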

Communicating sequential processes

CSP, introduced by Tony Hoare in the 1970s, is a theoretical framework describing concurrent processes through the use of channels and synchronous message passing. CSP has inspired various concurrency models, including Go’s goroutines and channels.

In Go, goroutines are lightweight threads managed by the Go runtime. Channels are used for inter-goroutine communication and synchronization. By using goroutines and channels, you can build concurrent applications with ease.
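Go provides channels natively, so the idiomatic examples live there. As a rough single-threaded analogue, here is a toy channel in TypeScript; note that unlike Go’s unbuffered channels, this one never blocks the sender, so it sketches the communication pattern rather than CSP’s synchronous rendezvous:

```typescript
// A toy channel: an async FIFO standing in for Go's `chan T`.
class Chan<T> {
  private buffer: T[] = [];
  private waiters: ((value: T) => void)[] = [];

  send(value: T): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter(value);    // hand the value to a waiting receiver
    else this.buffer.push(value); // otherwise queue it
  }

  receive(): Promise<T> {
    if (this.buffer.length > 0) return Promise.resolve(this.buffer.shift()!);
    return new Promise((resolve) => this.waiters.push(resolve));
  }
}

// Two "processes" that share no memory and communicate only via the channel.
async function producer(ch: Chan<number>): Promise<void> {
  for (let i = 1; i <= 3; i++) ch.send(i * i);
}

async function consumer(ch: Chan<number>): Promise<void> {
  for (let i = 0; i < 3; i++) console.log(await ch.receive());
}

const ch = new Chan<number>();
producer(ch);
consumer(ch); // prints 1, 4, 9
```

The guiding principle, in Go’s own phrasing, is to share memory by communicating rather than communicate by sharing memory.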

Conclusion

As Moore’s Law approaches its limits and transistors can no longer shrink in size, concurrency and parallelism will become increasingly critical for software development. Choosing the right concurrency model will depend on the specific use case and performance requirements. Embracing concurrent programming and mastering these models will help us continue to build high-performing and efficient software in the multi-core era.

Banner image attribution: TrustedReviews (CC BY-NC-ND 4.0)