Tuesday, December 1

How Competition and Cooperation benefits Linux

First I must explain what a scheduler is and outline some history. To avoid writing hundreds of pages and getting into too much technical detail, I have simplified things a somewhat and not included a lot of scheduler variants that have also played a part.

In short, if you have two applications running, like a word processor and a music player, you don't want your music to stop playing while you are cutting and pasting a paragraph in your word processor - you need a scheduler! The reason that your music keeps playing is due to the actions of a scheduler that allocates how much precious processor time each application has, possibly interleaving the processing of the two a thousand times a second.

In actual fact there are many tasks that need to be scheduled that most people are totally unaware of, and even highly technical people are glad they don't have to worry about them (normally!). Note that even a single application may be running several tasks that the scheduler has to manage. Now even if you are only running one application that has only a single task, there are other tasks that also need to be scheduled (like updating the clock display), plus system tasks you'll most likely never be aware of.

If you have a processor with multiple cores (as in most new computers bought recently), you still need a scheduler - as it also is responsible for sharing the work between the different cores.

When you are runing a modern operating system like Linux, the kernel (the heart of an operating system) must allocate processor time between different tasks. In the old days, there was a simple scheduler that worked moderately well for single core processors, but did not perform very well when you had 2 or more cores. Performance for 2 cores was not too bad, but it rapidly got worse for 4 or more cores. For ordinary desk top users, this was not really a problem, but for power users and heavy duty servers - there was an increasing mismatch with the performance people expected from multi-core systems and what was actually delivered.

So a bunch of people got together to develop a better scheduler, one that tended to deliver performance proportional to the number of processor cores available - provided, of course, that there were enough tasks that could be allocated between the cores. This was called the O(1) scheduler, as the elapsed time taken to schedule was independent of the number of processor cores. This had a few problems, such as some tasks were unfairly treated, which meant they tended to be starved of the processor time they needed.

So a new scheduler was developed, the CFS or Completely Fair Scheduler. A very gifted developer (Con Kolivas) designed and implemented his own scheduler that was far better in some ways, unfortunately he was very sick, communications between him and the developer of the CFS broke down. Key improvements were made to the CFS as a direct result of Con's work and he was given due recognition - unfortunately bad feelings developed. Developers are human, have emotions, and have bad days - like everyone else.

Just recently, Con developed a much simpler and much faster, for types of workloads on personal computers, refered to as the BFS. A developer of the software used to decode things like MP3 files so you can play music (for the technically inclined it is the multi media codec H.264) found Con's new scheduler did a lot better than the CFS.

So the H.264 developer contacted the CFS developers and they took on board the problem and modified the CFS to be much faster.

If you go to this blog entry, you will see how Competition and Cooperation can significantly improve the Linux kernel (it also shows that not all is perfect in the land of Linux). One of the keys is raising useful bug reports rather than just simply complaining. Remember, even non technical users can raise bug reports. The importance of raising concerns with developers is highlighted here (a test case is a simple way to reproduce the problem):
So I tentatively submitted a test case to the Linux kernel mailing list. I didn’t know what to expect; maybe more flames carrying over from the BFS debate? Instead, I got “Thanks a bunch for the nice repeatable testcase!” This is one of the few times I’ve seen this outside of what I attempt to do with x264: a developer happy to see someone report a bug with his code and apparently eager to jump to fixing it. Though it certainly sounded good so far, but would anything result from this?

Answer: yes: up to a 70% increase in performance, committed the next day. But the kernel devs weren’t done yet: a quick grep of Linux kernel mails over the next weeks showed x264 popping up in quite a few scheduler benchmarks: they had added it as a regular test case. And just recently we got another 10% performance.

No comments:

Post a Comment