The unintuitive latency over throughput problem

Since 15.01.2015, I am teaching a course about Elixir programming language. It is created by José Valim, who is Rails Core Team Member. He knew, that Erlang virtual machine easily solves problems, that are hard to solve in Ruby, mainly concurrency. During my first presentation, I had couple of slides, that showed, what is so great about Erlang VM, that José picked it for implementing the Ruby successor.

I will not go into details about all technical aspects, but there was one, that was particularly hard to understand: latency over throughput. To understand this problem better, you have to know one thing about scheduling. Erlang gives you very lightweight processes, but this feature is not unique. Go has goroutines, Scala has Akka library and other programming languages start to provide libraries to mimic this. But Erlang gives you also preemptive scheduling, which is really unique feature.

I tried to find something about preemptive scheduling in other languages. I’ve found articles about plans to add it in Go and Akka, but as far as I know, it is not quite there yet (correct me in comments, if I am wrong!).

But what is preemptive scheduling? Without going into details: it means, that a long running process can be paused to give other processes CPU time. Like in operating system, but using light processes and with much, much less overhead :)

Why is this detail important? Because this can reduce latency greatly, which makes user happy. :) How exactly does it work? To answer that problem, lets ask another question.

We have single core, no hyper-threading CPU. There is no parallelism involved. We are testing two web server implementations. We fire 1 000 000 requests and wait until web server returns all responses. First server has no preemptive scheduling. It returns last response after 60 seconds. All responses are correct. Next, we do the same to the server with preemptive scheduler. It finishes processing last request after 90 seconds. Again – all responses are correct.

Which webserver has higher throughput? Which one has lower latency? Which one would you choose?

Throughput question is easy: first one has 1 000 000 / 60 requests per second, which is 16 667 rps. Second one has 11 111 rps, which is worse.

Latency question: It might be tempting to say, that first server has lower latency. If processing all requests was faster, than avarage processing time must be lower, right? WRONG! And I will prove it to you, using counter example consisting of only two requests!

latency-throughput

Lets say, there is one CPU intensive request, that will last for 5 time units and one quick one, which will last for 1 time unit. The longer is first in processing pipeline. In webserver without preemptive scheduling, it has to be processes from start to end. After that, we can get to the second one. We also count the time between next requests (one unit). Lets calculate the avarage response time. First response was sent after 5 tu, second one was sent after 7 tu. The average latency is 6 tu.

In preemptive web server, first request is processed only for one time unit and then it gets preempted, so that second one has a chance. It gets processed quickly and then, we come back to the firs one. Here, first request is finished after 8 tu and second one after 3 tu, giving avarage latency of 5.5 tu.

We have worse throughput and better latency at the same time! This is possible, because preemptive scheduling is fair. It minimises waiting time of requests and makes sure, that quick requests are served faster.

In real world, the longer request might be not 5 times, but 1000 times longer. Also context switch time is usually really small fraction of the minimal time spent on processing. This means, that you can get MUCH better results with preemptive scheduling.

Of course, it is not a silver bullet. If your application processes data and you need all data points for next step of computation, you will go with lower throughput. There is no need to optimise for latency in that case. But if you are building website, where you serve independent users and you want to be fair with them – check out the Erlang VM, you will not regret!

Managing documentation with GitHub and Jekyll

In my company, we have an open source project, that is running for some time. It is a chat server written in Erlang https://github.com/esl/MongooseIM. Recently, we stumbled on a problem. All of our documentation was on wiki pages on github. MongooseIM team puts great effort to keep it always up to date. But what if someone needed docs for old version of Mongoose?

The solutions seemed to be easy. Lets generate html version of docs for every new git tag!

Current docs used Markdown, so to have better versioning, we moved the docs to the repository. This way, when we update the code, we can update the docs in the same commit. If someone checks out some old tag, he will have matching doc in doc folder. Cool!

Now, the hard part! We would like to generate html docs from markdown ones. Why? They would be easier to read and we have greater control over presentation. We can also show docs for couple of recent version without the need to checkout the repo or switching to tags.

The static page generator choice was a no-brainer. Jekyll is not only easy to set up, generates static html from markdown, but also has native GitHub pages support. We used categories to indicate releases and changed permalinks to use the scheme /:categories/:title/ After prepending dates to file names, our docs automatically became posts. That was easy!

Then, we realised, that not everything is so cool…

1. Links between files were broken. In GitHub Flavoured Markdown, when you specify link [some title](chapter1.md) in file table-of-contents.md, GitHub searches for the file chapter1.md in the same directory as table-of-contents.md. But html generated by Jekyll has <a href=”chapter1.md”>some title</a>, which means, that instead of going from /MongooseIM/1.5/table-of-contents.md/ to /MongooseIM/1.5/chapter1.md, it will go to /MongooseIM/1.5/table-of-contents.md/chapter1.md. It will just append the href to the existing url, which does not make any sense.

We thought about changing the base_url, but it is one for entire page, so we would have problems with different docs versions, which was one of the main points of doing the whole html generation!

The easiest solution, we could think of, was going through all markdown files and changing (chapter1.md) to (../chapter1.md). We simply used sed for that. This looks more like an ugly hack, than a solution, though.

2. Erlang code quite often has code like this: {{atom, Something}, Options}. This is syntax for tuples nested in each other. Jekyll uses Liquid for its templating language and two curly braces are used to print variables. The problem is, we wanted to use unmodified markdown files. We don’t have any variables! We even don’t have the jekyll header with title date and other stuff. We were able to squeeze all that into _config.yml.

This lead us to another hack: we’ve added {% raw %} and {% endraw %} to every file before the generation step.

3. All the titles got capitalised. Instead of mod_mam, we now have Mod_mam. It is problematic, because Erlang is case sensitive, so those titles mean two different things…

We haven’t found easy solution to this problem yet. Lets call it a day, we will think about this tomorrow.

How do you manage documentation for your open source projects? Do you use some other tool? Maybe you know, how to make jekyll generate links in some clever way? Do you know, how to disable Liquid for all posts in category? Where do you configure Jekyll to stop capitalising post titles? If you know one of those things, please post a comment :)