programming world is shaken by a new language, and the vast majority of developers, and the corporations they work for, move en mass to the new playground. Let’s take a look at the two technology shifts that have preceded this one, so that we can better understand what is happening right now.
What drove C++? It was the emergence of the object-oriented programming paradigm, the emergence of the PC and Microsoft Windows, and support from academic institutions. With hindsight such large-scale trends are easy to identify. The same can be done for Java. In that case, the idea of the software virtual machine, the introduction of garbage collection – a language feature lacking in C++ that offers far higher programmer productivity, and first wave of internet mania. Java, backed by Sun Microsystems, became the language of the internet, and many large corporate networked systems today run on Java. Microsoft can be included in the “Java” wave, in the sense the Microsoft’s proprietary competitive offering, C#, is really Java with the bad bits taken out.
Despite the easily recognizable nature of these two prior waves, one feature that both share is that neither wave led to a true monoculture. The C++ wave was splintered by operating systems, the Java wave by competing virtual languages such as C#. Nonetheless, the key drivers, the key elements of each paradigm shift, created a decade- long island of stability in the technology storm.
The C10K problem is this: how can you service 10000 concurrent clients on one machine? The idea is that you have 10000 web browsers, or 10000 mobile phones, all asking the same single machine to provide a bank balance or process an e-commerce transaction. That’s quite a heavy load. Java solves this by using threads, which are way to simulate parallel processing on a single physical machine. Threads have been the workhorse of high capacity web servers for the last ten years, and a technique known as “thread pooling” is considered to be industry best practice. But threads are not suitable for high capacity servers. Each thread consumes memory and processing power, and there’s only so much of that to go round. Further threads introduce complex programming programs, including a particularly nasty one known as “deadlock”. Deadlock happens when two threads wait for each other. They are both jammed and cannot move forward, like Dr. Seuss’s South-going Zax and North-going Zax. When this happens, the client is caught in the middle and waits, forever. The website, or cloud service, is effectively down.
There is a solution to the this problem – event-based programming. Unlike threads, events are light-weight constructs. Instead of assigning resources in advance, the system triggers code to execute only when there is data available. This is much more efficient. It is a different style of programming, one that has not been quite as fashionable as threads. The event-based approach is well suited to the cost structure of cloud computing – it is resource efficient, and enables one to build C10K-capable
systems on cheap commodity hardware.
Threads also lead to a style of programming that is known as synchronous blocking code. For example, when a thread has to get data from a database, it hangs around (blocks) waiting for the data to be returned. If multiple database queries have to run to build a web page (to get the user’s cart, and then the product details, and finally the current special offers), then these have to happen one after other, in other words in a synchronous fashion. You can see that this leads to a lot of threads alive at the same time in one machine, which eventually runs out of resources.
The event based model is different. In this case, the code does not wait for the database. Instead it asks to be notified when the database responds, hence it is known as non-blocking code. Multiple activities do not need to wait on each other, so the code can be asynchronous, and not one step after another (synchronous). This leads to highly efficient code that can meet the C10K challenge.