Anticipatory Scheduler in Kernel 2.5+ Benchmarked
gmuslera points to this article at KernelTrap comparing available benchmarks for schedulers available for the 2.5 kernel, with the 2.4's scheduler as a reference poin. "In some cases, the new Anticipatory Scheduler performs several times better than the others, doing a task in a few seconds instead minutes like the others."
Somehow, I just *knew* this was coming. ;)
I watched C-beams glitter in the dark near the Tannhauser gate.
The anticipation is killing me .
If only kerneltrap.org were running the new schedu... oh, never mind.
What has *science* done?!? -- Dr. Weird (ATHF)
Seems to be slashdotted already... Wonder if it saw that coming.
In some cases, the new Anticipatory Scheduler performs several times better than the others, doing a task in a few seconds instead minutes like the others.
The task in question was anticipating things, so the test might not be all that fair.
"Probably the toughest time in anyone's life is when you have to murder a loved one because they're the devil." -Philips
I have a multithreaded Perl app (yes, Perl has decent thread support in 5.6 and later) which does a lot of mixed write/read I/O to and from a database. With 2.4 and 40 threads I can hardly use the system (of course I don't have to abuse my computer with 80 threads, but I'm trying to prove a point here).
Switching to a new tabbed terminal in fluxbox takes ages to redraw, and switching between virtual desktops is an act of futility.
With 2.5 I get good interactive performance and don't see this effect much at all. This is surely also partly due to the new VM code.
Of course I would probably get the best interactivity with the SFQ scheduler, but that's secondary in this case. At least xmms doesn't skip with this during very heavy I/O. I do not use the new NPTL code, which would help further I suppose.
It reminds me of the old song and Heinz Ketchup commercial:
"Anticipation is making me wait"...
I'm still anticipating the "t" in "point" myself...
--
http://www.aikiweb.com - AikiWeb Aikido Information
Too many connections
Things you think are in the Constitution, but are not.
You know, you can ALMOST feel an admin over there just itching to type in "Fuck you Taco! And your site!" instead of the connection stuff...
Hate me!
Me: Computer, I would like to open Netscape
Computer: I have anticipated you would like to open IE and have already opened it for you.
Me: Ok, then I would like to go to the game review site to see what I want to buy.
Computer: I have already begun the download of the new Age of Empires game, your account has been charged.
Me: Can I at least go to the bathroom?
Computer: No.
"Probably the toughest time in anyone's life is when you have to murder a loved one because they're the devil." -Philips
It's a trap! We've got to give this website more time. Concentrate all firepower on the next article below!
/ackbar
... the Mozilla developers added a special "Slashdotted" plugin you know. So you could launch a special tab that would keep hammering away at a site in the background until it did bloodywell load ;-)
Code, Hardware, stuff like that.
Quoeth the driver:
* These chips are basically fucked by design, and getting this driver
* to work on every motherboard design that uses this screwed chip seems
* bloody well impossible. However, we're still trying.
Hilary Rosen's speech was about her love of money and her desire to roll around naked in a pile of money.
Given that KernelTrap has "Too many connections", I don't know if they have this link: http://www.cs.rice.edu/~ssiyer/r/antsched
It explains what anticipatory scheduling does.
(BTW, it seems that FreeBSD has had it for ages.)
What does it do for kernel performance? Or does it do anything for performance? ...I haven't run Linux on a desktop for almost a year. It hurts mommy.. please make the bad man go away!
- Jimbob
The blurb didn't mention that the article is comparing disk schedulers, not CPU schedulers.
I remember Ingo Molnar introducing this scheduler running in O(1) time months ago, sometime late in 2002... AFAIK it has been part of the 2.5 kernel for quite a long time, and at the time it was first tested there were some benchmarks. I vaguely remember something about "we tried to launch several hundreds of processes, w/o the scheduler: 15 minutes, w/ the scheduler: 2 seconds." So what is so new about some benchmarks being available?
Or am I completely off-topic? ;)
...of course, LWN readers knew about the anticipatory scheduler back in January. We also looked at the SFQ and CFQ I/O schedulers two weeks ago.
Jonathan Corbet, LWN.net
I second that (albeit only 2 months). I used to have many problems compiling 2.5, but the newer releases are truly wonderful, also the new module system kick ass, esp. now when I can compile ALSA in and not worry about a thing. 2.5 is WAY easier to get up and running than 2.4, I really recommend trying it out, but be sure to read Documentation/Changes ! (although I run Debian/Woody and didn't bother to upgrade anything except module-init-tools, works for me ;)
Oh, and while ur at it u might want to take a look at devfs too (I run it without devfsd and I didn't have any problems, u just need to change inittab from ttyX to vc/X, fstab and lilo/grub).
The only kinks I had was with xine which doesn't like v4l, but mplayer works fine... Hmm, oh and same FPS for Q3, I doubt that any of u will feel much difference anywhere (unless u didn't use the preempt patch for 2.4). But I haven't played with NPTL yet, does anybody have any experience with that?
what goes up must come down, ask any sysop / sig11
If you're really curious, you can check out the mailing list for more info. Try searching for "IO scheduler benchmarking" or "iosched". To save the mailing lists, here's a few interesting benchmarks:
Parallel streaming reads:
Here we see how well the scheduler can cope with multiple processes reading
multiple large files. We read ten well laid out 100 megabyte files in
parallel (ten readers):
for i in $(seq 0 9)
do
time cat 100-meg-file-$i > /dev/null &
done
2.4.21-pre4:
0.00s user 0.18s system 2% cpu 6.115 total
0.02s user 0.22s system 1% cpu 14.312 total
0.01s user 0.16s system 0% cpu 37.007 total
2.5.61+hacks:
0.01s user 0.16s system 0% cpu 2:12.00 total
0.01s user 0.15s system 0% cpu 2:12.12 total
0.01s user 0.19s system 0% cpu 2:13.51 total
2.5.61+CFQ:
0.01s user 0.16s system 0% cpu 50.778 total
0.01s user 0.16s system 0% cpu 51.067 total
0.01s user 0.18s system 0% cpu 1:32.34 total
2.5.61+AS:
0.01s user 0.17s system 0% cpu 27.995 total
0.01s user 0.18s system 0% cpu 30.550 total
0.01s user 0.16s system 0% cpu 34.832 total
Streaming write and interactivity:
It peeves me that if a machine is writing heavily, it takes *ages* to get a
login prompt.
Here we start a large streaming write, wait for that to reach steady state
and then see how long it takes to pop up an xterm from the machine under
test with
time ssh testbox xterm -e true
there is quite a lot of variability here.
2.4.21-4: 62 seconds
2.5.61+hacks: 14 seconds
2.5.61+CFQ: 11 seconds
2.5.61+AS: 12 seconds
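A minimal sketch of the write-load test above, for anyone who wants to try something similar. The helper name time_under_write, the scratch path and the sizes are made up for illustration. It generates the write load with dd on the same machine and times whatever command it is given, in whole seconds, so it only approximates the ssh-from-another-box setup described in the post.

```shell
#!/usr/bin/env bash
# Time a command while a streaming write runs in the background.
# Arguments: scratch file, size in MiB, warm-up seconds, then the command.
# (Hypothetical helper, not part of the original benchmark.)
time_under_write() {
    local scratch=$1 mib=$2 warmup=$3
    shift 3

    # Background load: one large sequential write.
    dd if=/dev/zero of="$scratch" bs=1048576 count="$mib" 2>/dev/null &
    local writer=$!

    # Give the write time to reach steady state.
    sleep "$warmup"

    # SECONDS is a bash builtin counter with one-second resolution.
    local start=$SECONDS
    "$@"
    local elapsed=$(( SECONDS - start ))

    # Stop the load (it may already have finished) and clean up.
    kill "$writer" 2>/dev/null || true
    wait "$writer" 2>/dev/null || true
    rm -f "$scratch"

    echo "$elapsed"
}
```

Something like `time_under_write /data/scratch.bin 4096 30 xterm -e true` would write up to 4 GiB, wait 30 seconds, then print how many seconds the xterm took. The scratch file needs to live on the disk being tested, and anything the timed command prints to stdout will appear before the number.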
Streaming reads and interactivity:
Similarly, start a large streaming read on the test box and see how long it
then takes to pop up an x client running on that box with
time ssh testbox xterm -e true
2.4.21-4: 45 seconds
2.5.61+hacks: 5 seconds
2.5.61+CFQ: 8 seconds
2.5.61+AS: 9 seconds
Copy many small files:
This test is very approximately the "busy web server" workload. We set up a
number of processes each of which are reading many small files from different
parts of the disk.
Set up six separate copies of the 2.4.19 kernel tree, and then run, in
parallel, six processes which are reading them:
for i in 1 2 3 4 5 6
do
time (find kernel-tree-$i -type f | xargs cat > /dev/null ) &
done
With this test we have six read requests in the queue all the time. It's
what the anticipatory scheduler was designed for.
2.4.21-pre4:
6m57.537s
6m57.916s
2.5.61+hacks:
3m40.188s
3m56.791s
2.5.61+CFQ:
5m15.932s
5m50.602s
2.5.61+AS:
0m44.573s
0m53.087s
This was a little unfair to 2.4 because three of the trees were laid out by
the pre-Orlov ext2. So I reran the test with 2.4.21-pre4 when all six trees
were laid out by 2.5's Orlov allocator:
6m12.767s
6m13.085s
Not much difference there, although Orlov is worth a 4x speedup in this test
when there is only a single reader (or multiple readers + anticipatory
scheduler)
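To put a number on the small-file results above, here is a rough helper that converts the reported wall-clock times to seconds and divides them. The function names to_seconds and speedup are illustrative only, and it assumes the NmS.SSSs format used in that table.

```shell
#!/usr/bin/env bash
# Convert a time like "6m57.537s" to seconds (hypothetical helper).
to_seconds() {
    LC_ALL=C awk -v t="$1" 'BEGIN {
        split(t, p, "m")        # p[1] = minutes, p[2] = seconds plus "s"
        sub(/s$/, "", p[2])     # drop the trailing "s"
        printf "%.3f\n", p[1] * 60 + p[2]
    }'
}

# Ratio of a slower time to a faster time, to one decimal place.
speedup() {
    LC_ALL=C awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", slow / fast }'
}
```

Taking the first figure listed for each kernel, `speedup 6m57.537s 0m44.573s` gives 9.4 for 2.4.21-pre4 against 2.5.61+AS, and `speedup 5m15.932s 0m44.573s` gives 7.1 for CFQ against AS.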
GCC (and I assume others) can do this. Basically, you compile with -fprofile-arcs, run the executable enough to generate sufficient data, then compile with -fbranch-probabilities. This will try to order basic blocks so that the CPU predicts branches correctly most often.
I have never done it, but it is supposed to work. Unfortunately, it is pretty much limited to static analysis -- it doesn't allow for programs whose usage patterns change with time. For that you need some kind of dynamic recompilation, such as provided by HP's Dynamo, Transmeta's code morphing, or perhaps some Java JITs (I don't know if any of them implement this).
Personally, I think profile directed optimization done by a static compiler is a waste of time. All optimizations should be done at the best place, and for many optimizations, that is the static compiler, but many others can be better done by run time optimizers, or the CPU, and this is one of them.
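For reference, the three-step GCC workflow described above looks roughly like this. It is only a sketch: the helper name pgo_steps and the file names in the usage note are placeholders, and it prints the commands rather than running them so they can be inspected first. Both flags are real GCC options; newer GCC releases also offer -fprofile-generate and -fprofile-use as a more convenient pair.

```shell
#!/usr/bin/env bash
# Print the profile-guided build steps for one source file.
# Arguments: source file, output binary, then the training command line.
# (Hypothetical helper; it only echoes the commands.)
pgo_steps() {
    local src=$1 bin=$2
    shift 2

    # 1. Instrumented build: -fprofile-arcs makes the program record how
    #    often each branch is taken, saved to a profile file when it exits.
    echo "gcc -O2 -fprofile-arcs -o $bin $src"

    # 2. Training run: exercise the program on representative input.
    echo "$*"

    # 3. Optimized rebuild: -fbranch-probabilities reads the recorded counts
    #    and uses them when laying out the code.
    echo "gcc -O2 -fbranch-probabilities -o $bin $src"
}
```

For example, `pgo_steps app.c app ./app training-input.txt` prints the three commands in order; piping that output to `sh` would run them. The result is only as good as the training input, which is the limitation the comment above is pointing at.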
here is the clickable link
mirror of story to be found here
http://www.stuwo.net/download/ktrap.html
The snippet you quote is from the cmd640 driver, which covers only the chipset by the same name. Subsequent CMD chips, including the 649, use the cmd64x driver and are not fucked.
Also, I doubt that one could alter the I/O scheduler (let alone install an alternative) in the win* operating systems.
The AS I/O scheduler is very very interesting. I hope some kind soul would backport it to 2.4.
Although I _suspect_ they will run fine on 2.5, I don't want to risk it. It's still a little too bleeding edge for me. They call it bleeding edge for a reason, because you _will_ bleed and get hurt from time to time.
I guess I am a big fat ninny when it comes to bleeding edge stuff (although I do lust for all the new toys, the waiting just increases my contentment when such cool stuff becomes part of the stable stuff) :-)
Speaking of avoiding the bleeding edge, it would be sooo cool if this IO scheduler was backported to 2.4.
When I first started hacking on Linux, I was working with a seasoned Linux kernel hacker whom my company hired as a consultant. He helped us with some I/O issues and such, did some other tweaks and gave us a ton of inspiration to go get after it ourselves. (You'd be amazed at how many people are afraid to just start making changes to kernel code.) He is a wickedly cool individual, and as someone who's had a lot of schooling and experience, it was one of the best learning experiences I can remember.
The first thing I started dorking with after that experience was the scheduler because I, like all other hackers, know how to schedule stuff. At the time (early 2.x), the scheduler was also a fairly easy-to-digest piece of code that could have a great impact on the system.
Well, all my stuff got bit bucketed. I called up our consultant guy, who was my friend by now: "What's the deal? Linus doesn't like my stuff. How do you mail him stuff?" And his answer was that pretty much everybody wants to tweak the scheduler, and everybody sends stuff in. Linus is sage in his wisdom: schedulers are freaking hard because there is always a pedantic worst case that sucks and actually shows up in the real world. Linus has always done fairly simple things that aren't best but certainly aren't worst. So 2.0 had pretty straight round robin. In 2.2 and 2.4 they started to add queuing schedulers with niceness. In 2.5 we're going to get a pretty killer scheduler that has taken a ton of effort to tweak, and there are still discussions about exposing parameters to the user via /proc or something, because you can find cases where it doesn't perform as well.
Now this IO scheduler is opening up a whole new can of worms, it's a new chunk of code called "scheduler" and all hackers know scheduling. In the past it has been fairly simple. It should be fun to watch and the kernel is going to kick mucho ass in the end. There will be a lot of talk and debate about this stuff. It's also distilled down to the trusted set that Linus will let play with things called "scheduler"
I'm currently enrolled in CS 162 (OSs) at UC Berkeley; today's lecture was on different flavors of schedulers, and this scheduler was mentioned briefly. For more theoretical info see http://webcast.berkeley.edu/courses/archive.html?prog=116&group=52 for a webcast of the lecture or http://inst.eecs.berkeley.edu/~cs162/Lectures/L11.pdf for a PDF of the lecture notes.
The problem is that such I/O layers need to be implemented at least partially outside user-space in the case where the file is being simultaneously accessed to allow interprocess coordination. Also, to get best use, everything should use it.
The article mentions multiple simultaneous writes and reads... Doing two tasks at once is much more expensive than doing them sequentially.
Why not use the download manager programs... for all file transferring?
My priorities:
1. User interface responds effectively in realtime.
2. CD writes don't fail.
3. Video doesn't skip.
4. Files transfer quickly.
I would actually like the ability to switch the mode of the file scheduler.
If I am not doing 2 or 3, then why not switch to something that makes 4 happen?
I saw something ridiculous, like a 10 second wait for a login prompt??!?!
The system should have that all ready ahead of time, and it should take no more than .5 second to get a login screen.
--I don't care about spelling enough to spend the vast quantity of time to get this to the spell checker.
Please use [ informative / summarizing ] SUBJECT LINES
Flame me here
Actually Mach was developed at CMU under Rick Rashid. Tevanian was a graduate student of his. Most of the Mach team is at Microsoft Research, including Rashid (who heads up Microsoft Research). They tried to convince Tevanian to come there, but he decided instead to go to NeXT Computer, which also was based on the Mach microkernel. When Apple acquired NeXT, it took most of their OS and development philosophy also.
X11, true, but that's not even what makes the timing so different. It is Windows taking care of its GUI (or at least much of it) in-process with the OS that really changes things. X11 has to go through system calls.
Nah, it handles certain start-up costs for complex applications better. This may or may not have anything to do with multithreading per se.
I don't run KDE, but I understand that it has had speed issues in the past because it uses a lot of interconnected C++ shared libraries, which really tax the dynamic loader. The Windows link scheme, by the way, is much more primitive (read: fast at runtime). Microsoft also uses a hack (disk layout profiling) to speed up load time further. (Not that "hack" is necessarily a bad thing - after all it does get the job done.)
A couple of years ago, Jakub Jelinek came up with a utility similar to IRIX Quickstart for ELF binaries / libraries, which does "prelinking" to dramatically reduce relocation overhead at runtime in the common cases (without sacrificing flexibility, for the uncommon cases). A side effect is reducing memory usage due to COW. I never heard what happened to that project - anyone know if it is considered production-quality yet, or if binutils / glibc will be shipping it any time soon? Apparently it helped KDE quite a bit.
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
Are you sure that you are not seeing the improved desktop interactivity from the kernel preemption and low latency patches in 2.5? I suspect that they would affect desktop interactivity more than this scheduler...
KDE has gained quite some speed with the last version changes. The gap is not as large as you remember.
Here is the explanation of what anticipatory scheduling is. From what I have understood (please correct me if I am wrong, I am not a kernel hacker), 'anticipatory scheduling' means the following:
The I/O subsystem (the part of the operating system that reads/writes to/from the hard disk) waits a little longer before servicing an I/O request from an application other than the current one; if the current application issues another I/O request while the I/O subsystem is waiting, the overall system throughput is higher because the hard disk's head moves less.
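A toy model can make that explanation concrete. The sketch below is not the kernel's algorithm; the function name total_seek and the fixed "batch" size are simplifications invented for illustration. Two processes each read consecutive blocks from different parts of the disk, and each issues its next read only after the previous one completes. With batch=1 the scheduler never waits, so it bounces between the two processes; with a larger batch it waits for the same process's next read and serves several in a row before switching.

```shell
#!/usr/bin/env bash
# Total head movement (in blocks) for two processes A and B, each reading
# `count` consecutive blocks starting at start_a and start_b.
# batch=1 : no anticipation, the head alternates between A and B.
# batch>1 : anticipation, up to `batch` reads from one process are served
#           before the disk is handed to the other.
# (Hypothetical model; the real scheduler uses timeouts, not fixed batches.)
total_seek() {
    local batch=$1 start_a=$2 start_b=$3 count=$4
    local head=0 total=0
    local next_a=$start_a next_b=$start_b
    local left_a=$count left_b=$count
    local turn=a served target dist

    while (( left_a > 0 || left_b > 0 )); do
        # Skip a process that has finished all its reads.
        if [[ $turn == a ]] && (( left_a == 0 )); then turn=b; fi
        if [[ $turn == b ]] && (( left_b == 0 )); then turn=a; fi

        served=0
        while (( served < batch )); do
            if [[ $turn == a ]]; then
                (( left_a > 0 )) || break
                target=$next_a; next_a=$((next_a + 1)); left_a=$((left_a - 1))
            else
                (( left_b > 0 )) || break
                target=$next_b; next_b=$((next_b + 1)); left_b=$((left_b - 1))
            fi
            # Seek distance is the absolute difference from the head position.
            dist=$(( target > head ? target - head : head - target ))
            total=$((total + dist))
            head=$target
            served=$((served + 1))
        done

        # Hand the disk to the other process.
        if [[ $turn == a ]]; then turn=b; else turn=a; fi
    done
    echo "$total"
}
```

With four reads per process starting at blocks 0 and 1000, `total_seek 1 0 1000 4` prints 6997 while `total_seek 4 0 1000 4` prints 1003: the head crosses the gap once instead of seven times. The model ignores the cost of the idle wait itself, which is the price the real scheduler pays for this saving.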