Non-Deathmatch: Preempt v. Low-Latency Patch
LiquidPC writes: "In this whitepaper on Linux Scheduler Latency, Clark Williams of Red Hat compares the performance of two popular ways to improve Linux kernel preemption latency -- the preemption patch pioneered by MontaVista and the low-latency patch pioneered by Ingo Molnar -- and discovers that the best approach might be a combination of both."
What's wrong with cooperative multitasking?
Check out this comprehensive guide to Linux Latency.
This is why I love open source software. Linux will push ahead as Windows is left in the dust.
Hmmm, let's see here. The Low Latency patch makes the slowest parts of the kernel faster or breaks them into smaller pieces. The Preempt Patch allows the kernel to be interrupted in lots of places. Exactly how could combining these NOT be a good idea?
Lump lingered last in line for brains, and the ones she got were sorta rotten and insane.
Lol!! Mod parent up for exhibiting L337 trolling skills!
.
The article doesn't mention this, but something some folks aren't aware of is that MontaVista is a serious Linux partner with IBM. If the technologies described in the white paper can be merged, the result could have a significant impact on the embedded application/PowerPC products from the World's 9th largest corporation.
ask...is it or is it not, a good read?
#
Don't miss the thrilling link to the debate on whether it is PreemptAble or PreemptIble...
...definitely sucks ass in Linux. I don't know if it is better or worse in other OSes, but we have several dual processor machines and two quad Xeon machines at work running Linux, and it seems as though if the first processor is running at max capacity, then you're just kind of shit outta luck if you try to start a process--it sometimes takes as long as 5-10 seconds before it'll even notice you've tried to run something and assign it to another processor that isn't doing anything.
Does anyone else have this problem? I'm hoping it's just some kind of dumbass configuration mistake on my part and not just the way it is...
Non-deathmatch, eh? I think I smell a Preempt vs. Low-Latency Quake 3 mod in the works..
slashdot!=valid HTML
Stupid, we all get the Sun. Those who don't leave the house notwithstanding.
You're an idiot.
Ingo Molnar's O(1) scheduler was integrated
into the development tree back around Linux 2.5.4
So it's already in there.
Preemption was integrated about the same time.
One thing not mentioned so far is that one of THE largest scheduler latency problems comes from the driver for a PS/2 mouse, a very common item to be found plugged into servers which have no need for it. By removing the PS/2 mouse (and driver..) a significant latency improvement can be gained!
It's a pity that most USB mice don't seem to provide quite the quality of use as the PS/2 items (although this is probably also a driver issue)
Low latency can be an advantage, but it is important that the cost of the lower latency is not an increase in total load. In reality, lower latency does not provide a large gain in performance for most desktop or server roles; it is a measure more often used in real-time systems, where it can make the difference between a system working or not.
An example of this is an ignition ECU for a V12 engine at 6000 RPM: a (pair of) plug is firing every 1/600th of a second (about 1.67 ms), but the accuracy of the firing event must be on the order of 10us, which is not yet reachable by any 'standard' unix kernel, but quite easy to get on a much simpler ECU (I use an SH-2 at 24 MHz) than you would normally find using a true real-time kernel.
With some development it may be possible for a form of Linux to reach this level, which would be fantastic, as a LOT of time is spent in embedded development providing 'operating system' level functionality around the actual application code, and with embedded processors getting faster, and memory getting cheap, embedding *nix has become much more of a possibility.
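For anyone who wants to check the ECU numbers above, here is a back-of-the-envelope sketch. It assumes a four-stroke engine (each cylinder fires once every two crankshaft revolutions), which the comment does not state; the function name is just for illustration.

```python
# Rough check of the spark-timing numbers in the comment above.
# Assumption (not stated in the comment): a four-stroke engine, so each
# cylinder fires once every two crankshaft revolutions.

def firing_interval_ms(cylinders, rpm):
    """Time between successive ignition events, in milliseconds."""
    revs_per_second = rpm / 60.0
    firings_per_second = revs_per_second * cylinders / 2.0
    return 1000.0 / firings_per_second


interval_ms = firing_interval_ms(cylinders=12, rpm=6000)
accuracy_us = 10.0  # required firing accuracy quoted in the comment

print(f"interval between firings: {interval_ms:.2f} ms")
print(f"allowed error as share of interval: {accuracy_us / (interval_ms * 1000.0):.2%}")
```

The point of the second line is that the timing error budget is well under one percent of the interval between firings, which is why scheduling jitter in the millisecond range is not usable here.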
Clearly, most RTOS designers have their priorities backwards.
Mmmm, donuts.
Search 2010 Gen Con events
And some wonder why you post at -1. Oh well, at least it's an improvement on the "Alan Thicke is dead" posting.
The only real OS is MVS, aka OS/390 aka z/OS. It is truly an operating system. It makes no concessions for mere mortals, it is made to run a machine. It hails from 1966. All bow down before it.
Microsoft - Where would you like to go today, Maybe Jail?
you are a fuckin cheat...you will never defeat me
So after only 12, the low-latency patch degraded by an ungodly amount (1.3 -> 215.2 ms)!! and even the combined patch had 25% degraded performance (1.2 -> 1.5 ms)!
Embedded systems must have a very high uptime, it's not acceptable to reboot the machine every day to maintain performance. Many embedded systems require a downtime of less than 5 minutes per year. That doesn't give you much time to reboot the machine just for performance issues.
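To put the "5 minutes per year" figure in perspective, here is a quick sketch of the arithmetic. It assumes a 365-day year and, hypothetically, one reboot per day; both helper names are made up for this example.

```python
# Downtime budget arithmetic for the "5 minutes per year" figure above.
# Assumptions: a 365-day year, and (hypothetically) one reboot per day.

MINUTES_PER_YEAR = 365 * 24 * 60


def availability(downtime_minutes_per_year):
    """Fraction of the year the system is up."""
    return 1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR


def per_reboot_budget_s(downtime_minutes_per_year, reboots_per_year):
    """Seconds each reboot may take if reboots use the whole budget."""
    return downtime_minutes_per_year * 60.0 / reboots_per_year


budget_min = 5.0
print(f"availability: {availability(budget_min):.5%}")
print(f"time allowed per daily reboot: {per_reboot_budget_s(budget_min, 365):.2f} s")
```

Under those assumptions a daily reboot would have to finish in under a second, which is the commenter's point.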
Looking for any old 8-bit Heathkit/Zenith software/hardware - http://heathkit.garlanger.com
Here's a question. How do you go about doing fine grained measurements of these latencies? Every time I've tried doing timings with Linux I've had problems being able to get accurate, fine grained results.
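For what it's worth, one common user-space approach is to ask for a short sleep and record how late the wakeup actually arrives. The sketch below does that with Python's standard `time` module. It is only an illustration of the idea, not necessarily the method used in the whitepaper, and the function names and default values are made up for this example.

```python
import statistics
import time


def measure_wakeup_overshoot(interval_s=0.001, samples=200):
    """Sleep repeatedly and return how late each wakeup was, in microseconds."""
    overshoots_us = []
    for _ in range(samples):
        start_ns = time.perf_counter_ns()
        time.sleep(interval_s)
        elapsed_ns = time.perf_counter_ns() - start_ns
        # Lateness = time actually slept minus time requested.
        overshoots_us.append((elapsed_ns - interval_s * 1e9) / 1000.0)
    return overshoots_us


def summarize(overshoots_us):
    """Average and worst case matter for different reasons, so report both."""
    return {
        "min_us": min(overshoots_us),
        "mean_us": statistics.mean(overshoots_us),
        "max_us": max(overshoots_us),
    }


results = summarize(measure_wakeup_overshoot())
print(f"min {results['min_us']:.1f} us, "
      f"mean {results['mean_us']:.1f} us, "
      f"max {results['max_us']:.1f} us")
```

Run it once on an idle machine and once under load (a big compile or a recursive copy) and compare the max values. The numbers include interpreter overhead, so treat them as relative rather than absolute; a C program doing the same loop would give tighter figures.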
We live in an age of multi GHz machines, yet we only have a timer that can go off 100 times per second on IA32. When can we join the real world (Alpha, Sparc) and get a 1024 HZ timer -- or even better -- get rid of HZ altogether. Realtime MIDI on Linux sucks unless it is the only process monopolising 100% of the CPU.
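To make the HZ complaint concrete, here is a small sketch of what a tick-driven timer implies for short sleeps. It uses a deliberately simplified model in which a sleep request is rounded up to a whole number of ticks; actual kernel behaviour may differ, and the function names are hypothetical.

```python
import math


def tick_period_ms(hz):
    """Length of one timer tick in milliseconds."""
    return 1000.0 / hz


def rounded_sleep_ms(request_ms, hz):
    """Simplified model: a sleep lasts a whole number of ticks, rounded up."""
    period = tick_period_ms(hz)
    return math.ceil(request_ms / period) * period


for hz in (100, 1024):
    print(f"HZ={hz:4d}: tick = {tick_period_ms(hz):.3f} ms, "
          f"a 1 ms sleep takes at least {rounded_sleep_ms(1.0, hz):.3f} ms")
```

In this model a 1 ms sleep costs a full 10 ms tick at HZ=100 but about 2 ms at HZ=1024, which is roughly the difference the MIDI complaint is about.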
Unfortunately, the preemptable kernel was not pioneered by MontaVista. The real development was done by the Real-Time Linux project. I believe there are several long-standing lawsuits between the two, all of which the RTL people have won.
>"13 yr old trolls who end up breaking their >necks trying to suck their own dicks"
"Do people actually try that??"
"Yup, some do"
"I tried once, I couldnt reach it."
"Sick bastard."
You can start by submitting patches to the Linux Kernel Mailing List of code indented to 5 spaces, as is the standard in south america. The kernel maintainers always appreciate well indented code, and Linus usually integrates such patches almost immediately. But I must warn you - patches involving fewer than 50 files are rarely accepted by Linus. He wants to see a lot of effort on your part and weed out those "code indenter wannabees". Fortune 500 companies are always on the look out for industrious, self-starting code (re)indenters and are willing to pay big $$$. Get indenting today!
I've found SMP on Linux to work very well. There are definitely no delays on my dual cpu workstation and dual cpu server. I've even done some extensive benchmarks which show that Linux scales very well. At least compared to NT4/W2K where the results were dreadful.
I'm sorry that I can't offer any help but I hope you find where the problem is.
So in an indirect fashion, we have IBM to thank.
Anyone know if Red Hat is planning on offering lower latency kernel RPMs for those of us who are loath to patch and recompile a kernel JUST to try something new out and see if we like it? It's kind of nice if I can drop in a quick RPM, decide whether I like it, and THEN compile a trimmed kernel properly if need be.
:-)
I'm just lazy.
--Nuintari
slashdot : where an opinion can be wrong.
...tency
heuristic algorithm seeks stochastic relationship
What I'm missing in Clark Williams' paper is how the patches influenced the OS overhead.
So many very good things are happening to the Linux kernel! I am impressed.
I would like to see similar response graphs for QNX or other RTOSes for comparison's sake.
Anyway, IMHO making a real assessment for any 'hard' realtime task is much too much effort for most of the readers here. =)
But here are more white papers than you can shake a stick at....
http://www.ece.umd.edu/serts/bib/index.shtml
Well, Windows CE 3.0 provides 50 ms latency response time running on a 166 MHz Pentium.
Slashdot = Sarcasm
Look carefully at the history of the postings. I used to think that a lot of these bad postings were from overzealous users, but I no longer think so. Slashdot used to be really interesting, but it is now controlled by trolls. I am really starting to believe that many of these trolls are paid people whose job is to destroy OSS, especially Linux. But other than that, thanks for the info. I get tired of all the FUD about who has what.
Personally, I only run and code on linux these days so I no longer am aware of who has what.
I never realized there was competition between the two. I did hear the low-latency crowd claim that it was lower risk due to its less invasive nature. However, that hardly says anything about the performance of either approach - or that they should be mutually exclusive.
Two wrongs doesn't make a right, and vice-versa (but two Wrights make an airplane).
- In X (KDE), I can move windows around, load programs, webpages etc. without my MP3-player ever beginning to skip.
- When doing massive file IO, the MP3-player begins to skip. tar cvzf file.tar.gz bla/ is still ok, but cp -R bla1 bla2 causes massive skipping.
- When I use the notebook as a samba server, things get worse. Still, massive skipping. Additionally, samba becomes dog-slow and even the mouse falls asleep.
- Often, after such phases of heavy load, the skipping and sound distortion remain! So I have to reboot the machine from time to time to enjoy music again. Closing the player and opening it again is not enough. Somehow, under heavy load things get messed up enough to make a recovery impossible.
I did use the preemptive patch before, but performance under heavy load was even worse and similar problems with rebooting occurred. I was using kernel 2.4.12 for preemptive and I am using kernel 2.4.17 currently. The machine is a Celeron 466 with 128 megs of ram. Still, the low-latency patch makes sense for machines that are primarily for playing MP3s and reading emails (that's what my notebook is), but not for desktops with a wider variety of usage patterns. It's just not ready for primetime yet, but it's promising and fun!
I see you've been studying the Trolling FAQ and you seem to have put many of the key principles into effect in your post. It's a little too obvious though.
Never mind, keep practicing.
Why the surprise? So many times I find that the best solution to a problem is a compromise between two or more extreme solutions.
To paraphrase the great philosopher Hobbs, Linux is the worst of all possible worlds.
Some very thoughtful analysis clearly went into this. It's well written up with explanations that hit the right balance of having the key technical details but focusing on the big picture of how to make applications run better under Linux. As a casual follower of kernel development, I now understand far more of the trade-off than I used to.
I always think that tests and write-ups like this are a great way that people can contribute to Linux development without having to hack the kernel directly. There's no substitute for thorough testing to help you improve your designs and theories.
Nice job!
Er... while some misinformed folks have in fact been arguing over "which approach is better," both Robert Love (preemption) and Andrew Morton (low latency), the authors of the patches, have agreed since before November that a hybrid approach is probably correct, and it seems to me (though I don't speak for them) that they're faintly embarrassed at the number of True Believers who have stepped up to champion one or the other's side in this nondeathmatch. They're attacking different sections of the same problem.
Hey, genius, your task needs to be SCHED_RT to get any advantage from the low latency patch.
Nope, that's not true.
Pushin' 'n dealin', shovin' 'n stealin'
I wrote an article about low-latency for audio
applications under Linux, you can read it here if interested:
http://linux.oreillynet.com/pub/a/linux/2000/11/17/low_latency.html
It's more of a hands-on article, tells you how
to do it yourself with Andrew Morton's patches.
First, I wanted to give my view of the results - what they mean and what that means. Note there are multiple notions of latency performance; average latency and worst-case latency are the most important among them. This test measured worst-case latency. Both are important - for user experience average case is very important and for real-time applications worst-case is very important.
... it is going to be fun.
It is not a surprise the low-latency patches scored better, or that the ideal scenario was using both. The preemptive kernel patch is not capable of fixing most of the worst-case latencies. This is because, since we can not preempt while holding a lock, any long durations where locks are held now become our worst-case latencies. We have a tool, preempt-stats, that helps us find these. With the preempt-kernel, however, average case latency is incredibly low. Often measured around 0.5-1.5 ms. Worst-case depends on your workload, and varies under both patches.
Now, the results don't mention average case (which is fine), but keep in mind with preempt-kernel it is much lower. The good thing about these results is that they do indeed show that certain areas have long-held locks and the preempt-kernel does nothing about them. Thus a combination of both gives an excellent average latency while tackling some of the long-held locks. Note it is actually best to use my lock-break patch in lieu of low-latency in combination with preempt-kernel, as they are designed and optimal for each other (lock-break is based on Andrew's low-latency).
So what is the future? preempt-kernel is now in 2.5 and, as has been mentioned, Andrew and I are working on the worst-case latencies that still exist. Despite what has been mentioned here, however, we are not going to adopt a low-latency/lock-break explicit schedule and lock-breaking approach. We are going to rewrite algorithms, improve lock semantics, etc. to lower lock-held times. That is the ease and cleanliness of the preemptive kernel approach: no more hackery and such to lower latency in problem areas. Now we can cleanly fix them and voila: preemption takes over and gives us perfect response. I did some lseek cleanup in 2.5 (removed the BKL from generic_file_llseek and pushed the responsibility for locking into the other lseek methods) and this reduced latency during lseek operations -- a good example.
So that is the plan
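The relationship described above between preemption, lock hold times, and average versus worst-case latency can be illustrated with a toy model. To be clear, this is a sketch and not kernel code: the path layout, the durations, and the function names are all invented for the example, and it ignores everything except "can the scheduler run right now or not".

```python
# Toy model, not kernel code. A "path" is a list of (duration_us, lock_held)
# segments executed back to back inside the kernel. A wakeup can arrive at
# any microsecond; we ask how long it waits before the scheduler may run.

def latency_profile(path, preemptible):
    """Return (average_us, worst_us) over every possible arrival time."""
    path_end = sum(duration for duration, _ in path)
    waits = []
    seg_start = 0
    for duration, lock_held in path:
        seg_end = seg_start + duration
        for t in range(seg_start, seg_end):
            if not preemptible:
                waits.append(path_end - t)  # must wait for the whole path
            elif lock_held:
                waits.append(seg_end - t)   # must wait for the lock to drop
            else:
                waits.append(0)             # preempted immediately
        seg_start = seg_end
    return sum(waits) / len(waits), max(waits)


def break_locks(path, chunk_us):
    """Split each lock-held segment into chunks, dropping the lock between them."""
    result = []
    for duration, lock_held in path:
        if not lock_held:
            result.append((duration, lock_held))
            continue
        while duration > 0:
            piece = min(chunk_us, duration)
            result.append((piece, True))
            duration -= piece
    return result


path = [(4000, False), (1000, True), (500, False)]  # made-up durations

stock = latency_profile(path, preemptible=False)
preempt = latency_profile(path, preemptible=True)
combined = latency_profile(break_locks(path, 250), preemptible=True)

for name, (avg, worst) in [("stock", stock),
                           ("preempt", preempt),
                           ("preempt + lock-break", combined)]:
    print(f"{name:22s} average {avg:8.1f} us   worst {worst:5d} us")
```

In this model preemption alone cuts the average sharply, but the worst case stays pinned at the longest lock hold time. Breaking the lock into chunks lowers the worst case too. Rewriting an algorithm so the lock is held for less time would correspond to simply shrinking the locked segment instead of splitting it.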
according to the IBM (Immense Blue Monolith) dictionary...
Sectors, we don't need no steenking sectors... your files should all be contiguous cylinders!
According to the article, low latency patches allow manufacturers to create low quality LoseModems, as the article indicates:
'Another example is companies that are implementing DSL modems with minimal hardware. To reduce the cost of the device they want to do away with the Digital Signal Processors (DSPs) that are used to process the analog signals on a phone line into DSL cells or frames. They do this by offloading the signal processing algorithms to the main processor (similar to the things that the infamous WinModems do). To successfully implement this sort of scheme requires that the interrupt dispatch latency and thread scheduling latency be minimized in the kernel.'
WE CANNOT ALLOW THIS TO HAPPEN! A high quality operating system should NEVER allow or encourage hardware manufacturers to dumb it down to Windoze levels of hardware mediocrity!! We MUST STAND AGAINST LOSEMODEMS ON LINUX!!!
It pisses me off when I hear brain-dead schemes like this. Offloading tasks to dedicated hardware is plain good engineering. It increases system performance. Give me a dedicated DSP any day. It's not like they are expensive. Less than 5 bucks for a high performance DSP. And the payback is amortized across thousands of hours of use.
Low latency, pre-emptive. All nice and good. However, what I really want is to get a super-fast connection between my database server and my application server. How much will the lower latency patches affect the throughput, given that I issue many small queries? (No way around it, at the moment. So please don't flame (too hard))
Will Ethernet devices, TCP/IP stacks and the lot become more responsive? Will MySQL/PostgreSQL/SapDB/Oracle/DB2/Interbase be able to execute a small query even faster? How much?
Actually, I hope to measure this sometime not too far into the future!
Stop the brainwash
When will this become stable enough for major distros to start using it?
I don't think anyone doubts that this is a good approach. But, both patches are still being worked on right now. And while the preempt patch has already been merged with the 2.5 kernel, the low-latency patch is still nowhere to be seen.
I certainly think that this would indeed have a great impact on Linux Multimedia, but not until a company like RedHat or SUSE is willing to include it at least as an optional kernel. The reason is, a vendor doesn't have to support patches until they include it in one of their pre-compiled kernels.
This might not mean much to home users, but a company will not rely on an unsupported feature.
Like it or not, business still drives the industry.
It's the best shine you've ever tasted.
<karma hoard>
This is quite a good thing when doing ports, e.g. Wux applications from Unix to Windows PDF here. Particularly insightful is "Chapter 3.2.2 Operating Systems Differences". This document can also serve as Unix to Windows porting 101. I wonder if the Win 3.1 stuff they are talking about is still valid in the non-MSDOS WinME,NT,2000,XP ?
</karma hoard>
A caveman dreams of being us, the incalculable power and riches. We dream of being Q, then what?
How will these changes affect the performance of Beowulf clustering?
Linux needs to tread carefully here, lest it lose ground in the fast-ramping clustering market - especially with the added inertia HP now has to move the market towards windows.
When I tried a kernel patched by some arbitrary person (i.e., not -ac or -aa, but rather someone appearing just several times a month on lkml) so that it contains all kinds of new things like the new scheduler (which DOES help) and preempt, I found that the system was very stable after two days of desktop use. Then I decided to try `nice -n 30 make -j' PRBoom, and the system is still as responsive as ever. I was happy and posted on Slashdot about how slick it is just after the compilation ended successfully. Wanting to verify my results, I `make clean' and retried. Guess what? Solid lockup without any logs. I hit the reset button and tried the compilation again in console mode with only `top' running. 2 out of 3 times the system locked up solid, Alt-Sysrq just shows messages without doing real things.
I then rebooted and ran the kernel for another two days with constant fear that it might mysteriously lock up at any minute. Luckily it didn't, and I then replaced the kernel with a -aa one, without most of the new features, but at least it is rock solid and hasn't crashed since. I think I will add the preempt (or ll) patches only after Marcelo or A.C. or A.A. incorporate them, or when I decide to do some kernel hacking.
So my advice is to try arbitrarily patched kernels (I mean patches put together by yourself or by someone else who did not test them extensively) only in deep-kernel-hack-mode, just like when you try a kernel in which you modified several lines by yourself. After all, many of us have been spoilt by Linux's stability, and our nerves are not quite prepared for a lock-up when facing a screen with WindowMaker (rather than Win98) on it.
Is it true that STREAM gets lower RAM throughput results when the box has USB devices attached?
Wide pages make baby jesus laugh.