The unintuitive latency over throughput problem

Since 15.01.2015, I am teaching a course about Elixir programming language. It is created by José Valim, who is Rails Core Team Member. He knew, that Erlang virtual machine easily solves problems, that are hard to solve in Ruby, mainly concurrency. During my first presentation, I had couple of slides, that showed, what is so great about Erlang VM, that José picked it for implementing the Ruby successor.

I will not go into details about all technical aspects, but there was one, that was particularly hard to understand: latency over throughput. To understand this problem better, you have to know one thing about scheduling. Erlang gives you very lightweight processes, but this feature is not unique. Go has goroutines, Scala has Akka library and other programming languages start to provide libraries to mimic this. But Erlang gives you also preemptive scheduling, which is really unique feature.

I tried to find something about preemptive scheduling in other languages. I’ve found articles about plans to add it in Go and Akka, but as far as I know, it is not quite there yet (correct me in comments, if I am wrong!).

But what is preemptive scheduling? Without going into details: it means, that a long running process can be paused to give other processes CPU time. Like in operating system, but using light processes and with much, much less overhead :)

Why is this detail important? Because this can reduce latency greatly, which makes user happy. :) How exactly does it work? To answer that problem, lets ask another question.

We have single core, no hyper-threading CPU. There is no parallelism involved. We are testing two web server implementations. We fire 1 000 000 requests and wait until web server returns all responses. First server has no preemptive scheduling. It returns last response after 60 seconds. All responses are correct. Next, we do the same to the server with preemptive scheduler. It finishes processing last request after 90 seconds. Again – all responses are correct.

Which webserver has higher throughput? Which one has lower latency? Which one would you choose?

Throughput question is easy: first one has 1 000 000 / 60 requests per second, which is 16 667 rps. Second one has 11 111 rps, which is worse.

Latency question: It might be tempting to say, that first server has lower latency. If processing all requests was faster, than avarage processing time must be lower, right? WRONG! And I will prove it to you, using counter example consisting of only two requests!


Lets say, there is one CPU intensive request, that will last for 5 time units and one quick one, which will last for 1 time unit. The longer is first in processing pipeline. In webserver without preemptive scheduling, it has to be processes from start to end. After that, we can get to the second one. We also count the time between next requests (one unit). Lets calculate the avarage response time. First response was sent after 5 tu, second one was sent after 7 tu. The average latency is 6 tu.

In preemptive web server, first request is processed only for one time unit and then it gets preempted, so that second one has a chance. It gets processed quickly and then, we come back to the firs one. Here, first request is finished after 8 tu and second one after 3 tu, giving avarage latency of 5.5 tu.

We have worse throughput and better latency at the same time! This is possible, because preemptive scheduling is fair. It minimises waiting time of requests and makes sure, that quick requests are served faster.

In real world, the longer request might be not 5 times, but 1000 times longer. Also context switch time is usually really small fraction of the minimal time spent on processing. This means, that you can get MUCH better results with preemptive scheduling.

Of course, it is not a silver bullet. If your application processes data and you need all data points for next step of computation, you will go with lower throughput. There is no need to optimise for latency in that case. But if you are building website, where you serve independent users and you want to be fair with them – check out the Erlang VM, you will not regret!

2 thoughts on “The unintuitive latency over throughput problem

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s