May 29, 2012
(Original Japanese article translated on September 11, 2012)
The IIJ-II Research Laboratory began development of a Web server called Mighttpd (pronounced "mighty") in the fall of 2009, and has released it as open source. Through its implementation we arrived at an architecture that enhances multi-core performance while keeping the code simple. Here we look at the relevant server architectures one at a time.
Traditional servers use a technique called thread programming. This architecture involves handling each connection using a single process or native thread.
This approach can be further divided by when the processes or native threads are created. With a pool, multiple processes or threads are created in advance; Apache's prefork mode is an example. Otherwise, a process or thread is created each time a connection is accepted. This architecture has the advantage of allowing clear code to be written, because exclusive control is possible. Also, because the kernel assigns processes and native threads to available cores, the load can be balanced evenly across them. Its shortcoming is that a large number of kernel and process context switches occur when switching between processes or threads, decreasing performance.
* Thread programming results in clear code because exclusive control is possible.
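To make the thread-per-connection style concrete, here is a minimal sketch in Haskell (it assumes port 8080 and a canned response, and is only an illustration, not the code of Apache or Mighttpd). forkOS binds each handler to its own OS thread, approximating the native-thread-per-connection model, and the handler reads as plain sequential code.

-- A minimal sketch of thread-per-connection handling (assumed port 8080 and a
-- canned response; not the code of any real server). forkOS binds each handler
-- to its own OS thread, approximating the native-thread-per-connection model.
-- Compile with -threaded.
import Control.Concurrent (forkOS)
import Control.Monad (forever, void)
import Network.Socket
import Network.Socket.ByteString (recv, sendAll)
import qualified Data.ByteString.Char8 as B

main :: IO ()
main = do
    sock <- socket AF_INET Stream defaultProtocol
    setSocketOption sock ReuseAddr 1
    bind sock (SockAddrInet 8080 0)
    listen sock 1024
    forever $ do
        (conn, _peer) <- accept sock
        -- One thread per connection: the handler is straight-line code.
        void $ forkOS $ do
            _req <- recv conn 4096
            sendAll conn (B.pack "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")
            close conn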
Recently, it is said that event-driven programming is required to implement high-speed servers. This architecture handles multiple connections using a single process. One Web server using this architecture is lighttpd (Mighttpd's name was taken from lighttpd).
Since there is no need to switch processes, fewer context switches occur and performance improves. This is its chief advantage. However, it has two shortcomings. The first is that only one core can be utilized, because there is only a single process. The second is that it requires asynchronous programming, so code is fragmented into event handlers. Asynchronous programming also prevents the conventional use of exception handling (although there are no exceptions in C).
* Event-driven programming results in code with less clarity because it is split into chunks.
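The fragmentation can be seen even in a toy example. The sketch below simulates an event loop in plain Haskell (Event, Handler, and register are made-up names standing in for the readiness notifications and callbacks of a real event loop such as one built on epoll); a single logical exchange of reading a request and then writing a response has to be split across two separate handlers.

-- A toy illustration of why event-driven code fragments; the "event loop" here
-- is simulated, and Event, Handler, and register are invented for this sketch.
import Data.IORef
import qualified Data.Map.Strict as Map

data Event = Readable Int | Writable Int   -- Int stands in for a file descriptor

type Handler = Event -> IO ()

-- Register the callback to run on the next event for a descriptor.
register :: IORef (Map.Map Int Handler) -> Int -> Handler -> IO ()
register ref fd h = modifyIORef' ref (Map.insert fd h)

-- One logical exchange (read the request, then write the response) is split
-- across two callbacks instead of reading as straight-line code.
onReadable :: IORef (Map.Map Int Handler) -> Event -> IO ()
onReadable ref (Readable fd) = do
    putStrLn ("read request on fd " ++ show fd)
    register ref fd (onWritable ref)   -- the rest of the logic lives elsewhere
onReadable _ _ = return ()

onWritable :: IORef (Map.Map Int Handler) -> Event -> IO ()
onWritable _ (Writable fd) = putStrLn ("write response on fd " ++ show fd)
onWritable _ _ = return ()

main :: IO ()
main = do
    handlers <- newIORef Map.empty
    register handlers 7 (onReadable handlers)
    -- A simulated event loop delivering two readiness events for descriptor 7.
    let fdOf (Readable fd) = fd
        fdOf (Writable fd) = fd
        dispatch ev = do
            hs <- readIORef handlers
            maybe (return ()) ($ ev) (Map.lookup (fdOf ev) hs)
    mapM_ dispatch [Readable 7, Writable 7]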
To utilize multi-core processors, many have hit upon the idea of creating as many event-driven processes as there are cores. Web server processes must share port 80, but with the prefork technique (opening the listening socket before forking, so that every process accepts on it) the port can be shared with only a slight modification to the code. I call this 1 core 1 process mapping.
One Web server that uses this architecture is nginx. Additionally, although node.js started out with a purely event-driven architecture, 1 core 1 process mapping has recently been implemented there as well.
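As a rough illustration of the prefork-style port sharing, here is a minimal sketch (the port number 8080 and the four workers are assumptions; this is not the code of nginx, node.js, or Mighttpd). The listening socket is opened once before forking, so every worker process accepts connections on the same port.

-- A minimal sketch of prefork-based port sharing (assumed port 8080 and four
-- workers; an illustration only). The socket is bound before forking, so all
-- worker processes inherit it and accept on the same port.
import Control.Monad (forever, replicateM_)
import Network.Socket
import System.Posix.Process (forkProcess)

worker :: Socket -> IO ()
worker sock = forever $ do
    (conn, _peer) <- accept sock   -- every worker accepts on the inherited socket
    -- a real worker would run its event loop (or handlers) here
    close conn

main :: IO ()
main = do
    sock <- socket AF_INET Stream defaultProtocol
    setSocketOption sock ReuseAddr 1
    bind sock (SockAddrInet 8080 0)           -- bind before forking
    listen sock 1024
    replicateM_ 4 (forkProcess (worker sock)) -- e.g. one process per core
    worker sock                               -- the parent serves as well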
The advantage of this architecture is that it utilizes all the cores of a multi-core processor, improving performance. However, it does not resolve the issue of code with poor clarity.
To resolve the issue of code with poor clarity, I used the lightweight threads provided by GHC (Glasgow Haskell Compiler), the main compiler for Haskell, a purely functional programming language. Lightweight threads are user-space threads implemented on top of an event-driven model. A modern computer can run 100,000 of these threads smoothly. Some languages and libraries have provided user-space threads in the past, but they are not commonly used now because they were not lightweight and were prone to problems. GHC takes advantage of the properties of Haskell as a purely functional language and succeeds in providing threads that really are lightweight.
Using lightweight threads makes it possible to write code with good clarity, just as with traditional thread programming. Additionally, since the GHC runtime handles the switching of lightweight threads, no kernel-level context switches occur.
* Lightweight threads result in clear code because exclusive control is possible.
I believe that an architecture that utilizes all cores by creating one process per core, gains performance from an event-driven model, and keeps the code simple through lightweight threads is currently the best server architecture, but there is one point to note.
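A minimal sketch of this combination follows (again assuming port 8080 and a canned response; it is not Mighttpd's actual code). One process per core shares the listening socket as in the prefork sketch above, and within each process every connection is served by a lightweight thread created with forkIO, so the handler remains straight-line code.

-- A minimal sketch of the combined architecture (assumed port 8080 and a
-- canned response; an illustration, not Mighttpd's actual code). One process
-- per core shares the listening socket, and inside each process every
-- connection is served by a lightweight thread created with forkIO, which the
-- GHC runtime multiplexes over non-blocking I/O.
import Control.Concurrent (forkIO)
import Control.Monad (forever, replicateM_, void)
import GHC.Conc (getNumProcessors)
import Network.Socket
import Network.Socket.ByteString (recv, sendAll)
import qualified Data.ByteString.Char8 as B
import System.Posix.Process (forkProcess)

serve :: Socket -> IO ()
serve sock = forever $ do
    (conn, _peer) <- accept sock
    -- One lightweight thread per connection; the handler stays sequential.
    void $ forkIO $ do
        _req <- recv conn 4096
        sendAll conn (B.pack "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")
        close conn

main :: IO ()
main = do
    sock <- socket AF_INET Stream defaultProtocol
    setSocketOption sock ReuseAddr 1
    bind sock (SockAddrInet 8080 0)                -- open the port once, before forking
    listen sock 1024
    n <- getNumProcessors                          -- one worker process per core
    replicateM_ (n - 1) (forkProcess (serve sock))
    serve sock                                     -- the original process serves too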
When a large number of system calls are issued, servers do not produce the expected performance. This is because issuing a system call causes a context switch to occur, allocating CPU time to the kernel and stopping all lightweight threads. For this reason Mighttpd temporarily caches system call results such as the stat() of a file, reducing the number of system calls issued.
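For illustration, here is a minimal sketch of such a cache (it is not Mighttpd's actual implementation; the ten-second lifetime, the cachedStat name, and the simple Map-based store are assumptions made for this example, and a real server would need a thread-safe store).

-- A minimal sketch of caching stat() results (not Mighttpd's actual cache; the
-- 10-second lifetime and the Map-based store are assumptions for illustration).
import Data.IORef
import qualified Data.Map.Strict as Map
import Data.Time.Clock (UTCTime, getCurrentTime, diffUTCTime)
import System.Posix.Files (FileStatus, fileSize, getFileStatus)

type StatCache = IORef (Map.Map FilePath (UTCTime, FileStatus))

-- Return a cached FileStatus while it is still fresh; otherwise issue one
-- stat() via getFileStatus and remember the result.
cachedStat :: StatCache -> FilePath -> IO FileStatus
cachedStat ref path = do
    now   <- getCurrentTime
    cache <- readIORef ref
    case Map.lookup path cache of
        Just (t, st) | diffUTCTime now t < 10 -> return st   -- no system call
        _ -> do
            st <- getFileStatus path                         -- one stat() call
            modifyIORef' ref (Map.insert path (now, st))
            return st

main :: IO ()
main = do
    cache <- newIORef Map.empty
    st1 <- cachedStat cache "/etc/hosts"   -- first lookup issues stat()
    st2 <- cachedStat cache "/etc/hosts"   -- second lookup hits the cache
    print (fileSize st1 == fileSize st2)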
Many mistakenly believe that functional languages are slow or impractical, but in our benchmarks Mighttpd produces performance on par with nginx.
Author Profile
Kazuhiko Yamamoto
Senior Chief Engineer, Research Laboratory, IIJ Innovation Institute Inc. (IIJ-II)
Mr. Yamamoto joined IIJ in 1998. Some of the open-source software he has developed includes Mew, Firemacs, and Mighttpd. He is the translator of "Programming in Haskell". He spends his days tackling Haskell at work, and his two sons at home.