Non-Deathmatch: Preempt v. Low-Latency Patch

← Back to Stories (view on slashdot.org)

Non-Deathmatch: Preempt v. Low-Latency Patch

Posted by timothy on Thursday March 21, 2002 @04:19PM from the floor-wax-and-dessert-topping dept.

LiquidPC writes: "In this whitepaper on Linux Scheduler Latency, Clark Williams of Red Hat compares the performance of two popular ways to improve kernel Linux preemption latency -- the preemption patch pioneered by MontaVista and the low-latency patch pioneered by Ingo Molnar -- and discovers that the best approach might be a combination of both."

4 of 178 comments (clear)

Min score:

Reason:

Sort:

The Linux kernel preemption project by BrianGa · 2002-03-21 16:26 · Score: 5, Informative

Check out this
comprehensive guide to Linux Latency.
One good way to reduce kernel latency.. by thesupraman · 2002-03-21 17:15 · Score: 5, Interesting

One thing not mentioned so far is that one of THE largest scheduler latency problems comes from the driver for a PS/2 mouse, a very common item to be found plugged into servers which have no need for it. By removing the PS/2 mouse (and driver..) a significant latency improvement can be gained!

It's a pity that most USB mice don't seem to provide quite the quality of use as the PS/2 items (although this is probably also a driver issue)

Loy latency can be an advantage, but it is important that the cost of the lower latency is not an increase in total load, as in reality the lower latency does not provide a large gain in performance for most desktop or server roles, but rather is a measure more often used in real time systems, which it can make the difference between a system working or not.

An example of this is in an ignition ECU for a V12 engine at 6000 RPM, a (pair of) plug is firing every 1/600th of a sceond (1.66ms), but the accuracy of the firing even must be in the order of 10us, which is not yet reachable be any 'standard' unix kernel, but quite easy to get on a much simpler ECU (I use an SH-2 at 24 MHz) than you would notmally find using a true real-time kernel.

With some developments is may be possibly for a form of linux to reach this level, which would be fantastic, as a LOT of time is spent in embedded development providing 'operating system' level functionality around the actual application code, and with embedded processors getting faster, and memory getting cheap, embedding *nix has become much more of a possibility.
Re:What's up with the degrading performance? by steveha · 2002-03-21 18:11 · Score: 5, Insightful

Well, to be precise, the worst-case value "degraded". And I'm not sure "degraded" is the correct word. With a huge torture load put on the kernel, during a 15-hour interval, at least once the latency value was 215.2 msec. This could mean that there is a possible latency condition that happens under torture load approximately once every 15 hours. It could also mean that after 15 hours, your chance goes up so it could happen much more often than that, but we don't know that. It could even mean that there is a possible 215 msec latency condition that happens under torture load approximately once every 30 hours, and it happened to occur during the first 15 hours.

Embedded systems must have a very high uptime, it's not acceptable to reboot the machine every day to maintain performance.

True that. Which is why the author of that article made the point that combining the two patches is the best way to go, since he ran the torture test for 15 hours and it didn't go over 1.5 msec even once.

Note that for many purposes, a worst-case latency of 1.5 msec is ample. I don't think there is any version of Windows that goes that low; I don't even think BeOS (legendary for low latency) goes that low. As the author noted, if you are driving a chemical processing factory or something like that, you need hard realtime and you should use something other than Linux kernel 2.4.x!

steveha

--
lf(1): it's like ls(1) but sorts filenames by extension, tersely
My take on the results and the future by sagei · 2002-03-22 02:02 · Score: 5, Informative

First, I wanted to give my view of the results - what they mean and what that means. Note there are multiple notions of latency performance. Average latency and worst-case latency, among others, but those are most important. This test measured worst-case latency. Both are important - for user experience average case is very important and for real-time applications worst-case is very important.

It is not a surprise the low-latency patches scored better, or that the ideal scenario was using both. The preemptive kernel patch is not capable of fixing most of the worst-case latencies. This is because, since we can not preempt while holding a lock, any long durations where locks are held now become our worst-case latencies. We have a tool, preempt-stats, that helps us find these. With the preempt-kernel, however, average case latency is incredibly low. Often measured around 0.5-1.5 ms. Worst-case depends on your workload, and varies under both patches.

Now, the results don't mention average case (which is fine), but keep in mind with preempt-kernel it is much lower. The good thing about these results are that it does indeed show that certain areas have long-held locks and the preempt-kernel does nothing about them. Thus a combination of both gives an excellent average latency while tackling some of the long-held locks. Note it is actually best to use my lock-break patch in lieu of low-latency in combination of with preempt-kernel, as they are designed and optimal for each other (lock-break is based on Andrew's low-latency).

So what is the future? preempt-kernel is now in 2.5 and, as has been mentioned, Andrew and I are working on the worst-case latencies that still exist. Despite what has been mentioned here, however, we are not going to adopt a low-latency/lock-break explicit schedule and lock-breaking approach. We are going to rewrite algorithms, improve lock semantics, etc. to lower lock-held times. That is the ease and cleanliness of the preemptive kernel approach: no more hackery and such to lower latency in problem areas. Now we can cleanly fix them and voila: preemption takes over and gives us perfect response. I did some lseek cleanup in 2.5 (removed the BKL from generic_file_llseek and pushed the responsibility for locking into the other lseek methods) and this reduced latency during lseek operations -- a good example.

So that is the plan ... it is going to be fun.

--

Robert Love