Slashdot Mirror


Anticipatory Scheduler in Kernel 2.5+ Benchmarked

gmuslera points to this article at KernelTrap comparing available benchmarks for schedulers available for the 2.5 kernel, with the 2.4's scheduler as a reference poin. "In some cases, the new Anticipatory Scheduler performs several times better than the others, doing a task in a few seconds instead minutes like the others."

20 of 252 comments (clear)

  1. how it works by diegocgteleline.es · · Score: 5, Informative

    Given that kerneltrap has "Too many connections", i don't know if they have this link: http://www.cs.rice.edu/~ssiyer/r/antsched

    where it explains what anticipatory scheduling does.


    (btw, it seems that freebsd had it for ages)

  2. Disk I/O, not CPU schedulers by Wesley+Felter · · Score: 4, Informative

    The blurb didn't mention that the article is comparing disk schedulers, not CPU schedulers.

  3. Re:Ok.. I'll bite... what's a Scheduler? by Caoch93 · · Score: 5, Informative
    In this case, they're talking about an I/O Scheduler, which is in charge of planning I/O through some resource so that activities on that resource correctly complete as quickly as possible. To be specific, this scheduler is for the hard disk I/O Scheduler, which plans reads and writes from/to your hard disks.

    The Anticipatory Scheduler is designed to optimize your disk I/O based on the assumption that reads of related material tend to happen in short succession, while writes tend to be singular and larger. As a result, when the scheduler encounters a read, it anticipates that there will be other reads in short succession, so it waits and then checks for them and, if they're there, they move to the front of the line. The name comes from the tiny waiting period when it anticipates future reads.

    This is, of course, a condensed version of what I think I've learned from reading KernelTrap for the last few months. Someone's bound to tell you I'm talking arse.

  4. Good stuff, but... by Corbet · · Score: 5, Informative

    ...of course, LWN readers knew about the anticipatory scheduler back in January. We also looked at the SFQ and CFQ I/O schedulers two weeks ago.

    --
    Jonathan Corbet, LWN.net
  5. Check the mailing list for more info by Miles · · Score: 5, Informative

    If you're really curious, you can check out the mailing list for more info. Try searching for "IO scheduler benchmarking" or "iosched". To save the mailing lists, here's a few interesting benchmarks:

    Parallel streaming reads:
    Here we see how well the scheduler can cope with multiple processes reading
    multiple large files. We read ten well laid out 100 megabyte files in
    parallel (ten readers):

    for i in $(seq 0 9)
    do
    time cat 100-meg-file-$i > /dev/null &
    done

    2.4.21-pre4:

    0.00s user 0.18s system 2% cpu 6.115 total
    0.02s user 0.22s system 1% cpu 14.312 total ...(up to)
    0.01s user 0.16s system 0% cpu 37.007 total

    2.5.61+hacks:

    0.01s user 0.16s system 0% cpu 2:12.00 total
    0.01s user 0.15s system 0% cpu 2:12.12 total ...(up to)
    0.01s user 0.19s system 0% cpu 2:13.51 total

    2.5.61+CFQ:

    0.01s user 0.16s system 0% cpu 50.778 total
    0.01s user 0.16s system 0% cpu 51.067 total ...(up to)
    0.01s user 0.18s system 0% cpu 1:32.34 total

    2.5.61+AS

    0.01s user 0.17s system 0% cpu 27.995 total
    0.01s user 0.18s system 0% cpu 30.550 total ...(up to)
    0.01s user 0.16s system 0% cpu 34.832 total

    streaming write and interactivity:
    It peeves me that if a machine is writing heavily, it takes *ages* to get a
    login prompt.

    Here we start a large streaming write, wait for that to reach steady state
    and then see how long it takes to pop up an xterm from the machine under
    test with

    time ssh testbox xterm -e true

    there is quite a lot of variability here.

    2.4.21-4: 62 seconds
    2.5.61+hacks: 14 seconds
    2.5.61+CFQ: 11 seconds
    2.5.61+AS: 12 seconds

    Streaming reads and interactivity:
    Similarly, start a large streaming read on the test box and see how long it
    then takes to pop up an x client running on that box with

    time ssh testbox xterm -e true

    2.4.21-4: 45 seconds
    2.5.61+hacks: 5 seconds
    2.5.61+CFQ: 8 seconds
    2.5.61+AS: 9 seconds

    copy many small files:
    This test is very approximately the "busy web server" workload. We set up a
    number of processes each of which are reading many small files from different
    parts of the disk.

    Set up six separate copies of the 2.4.19 kernel tree, and then run, in
    parallel, six processes which are reading them:

    for i in 1 2 3 4 5 6
    do
    time (find kernel-tree-$i -type f | xargs cat > /dev/null ) &
    done

    With this test we have six read requests in the queue all the time. It's
    what the anticipatory scheduler was designed for.

    2.4.21-pre4:
    6m57.537s ...(up to)
    6m57.916s

    2.5.61+hacks:
    3m40.188s ...(up to)
    3m56.791s

    2.5.61+CFQ:
    5m15.932s ...(others)
    5m50.602s

    2.5.61+AS:
    0m44.573s ...(up to)
    0m53.087s

    This was a little unfair to 2.4 because three of the trees were laid out by
    the pre-Orlov ext2. So I reran the test with 2.4.21-pre4 when all six trees
    were laid out by 2.5's Orlov allocator:

    6m12.767s ...(up to)
    6m13.085s

    Not much difference there, although Orlov is worth a 4x speedup in this test
    when there is only a single reader (or multiple readers + anticipatory
    scheduler)

  6. Re:It would be neat... by norton_I · · Score: 4, Informative

    GCC (and I assume others) can do this. Basically, you compile with -fprofile-arcs, run the executable enough to generate sufficient data, then compile with -fbranch-probabilities. This will try to order basic blocks so that the CPU predicts branches correctly most often.

    I have never done it, but it is supposed to work. Unfortunately, it is pretty much limited to static analysis -- it doesn't allow for programs whose usage patterns change with time. For that you need some kind of dynamic recompilation, such as provided by HP's Dynamo, Transmeta's code morphing, or perhaps some Java JITs (I don't know if any of them implement this).

    Personally, I think profile directed optimization done by a static compiler is a waste of time. All optimizations should be done at the best place, and for many optimizations, that is the static compiler, but many others can be better done by run time optimizers, or the CPU, and this is one of them.

  7. Re:I've tried it and it rocks by bill_mcgonigle · · Score: 2, Informative

    I'm sorry to sound like an Trolling AC, but I can't imagine why you would need 80 threads. Unless you using threads instead of a queue. I love threads, but they are and over used and hard to control and debug. The most complex thread system I've needed had ~25 threads running on 5 computers.

    It's far easier to maintain any multiplicity of state machines using threads rather than queues. Think of multi-user servers using stateful protocols. You might have hundreds or thousands of threads.

    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
  8. cache / copy / mirror of page is here (mod up!) by Anonymous Coward · · Score: 1, Informative

    a valid copy and cache of the page is here to be found.

    mod up:

    http://www.stuwo.net/download/ktrap.html

    1. Re:cache / copy / mirror of page is here (mod up!) by Anonymous Coward · · Score: 5, Informative

      here is the clickable link

      mirror of story to be found here

      http://www.stuwo.net/download/ktrap.html

  9. Re:Great! by DeeKayWon · · Score: 4, Informative

    The snippet you quote is from the cmd640 driver, which covers only the chipset by the same name. Subsequent CMD chips, including the 649, use the cmd64x driver and are not fucked.

  10. Re:What about other OSen? (*BSD, Windows) by CoolVibe · · Score: 2, Informative
    Well, the AS scheduler started life as a FreeBSD implementation (BSD dead? yeah right... whatever) but doing those benchmarks on other OSen on similar hardware is like comparing Apples and Oranges. Hardly worth it.

    Also, I doubt that one could alter the I/O scheduler (let alone install an alternative) in the win* operating systems.

    The AS I/O scheduler is very very interesting. I hope some kind soul would backport it to 2.4.

  11. Re:It's about time... by Anonymous Coward · · Score: 2, Informative
    They did just threaten to sue some guy for using the term GOOGLE as a verb on his website.
    No, they didn't. And it wasn't a cease-and-desist letter. All they said was, "Hi, I'm a google lawyer (TM). We like our trademark, which for obvious reasons you'll understand. Could you please indicate that google is a trademark in your definition, or, if it's easier, just remove it? Thanks.".

    Unlike you, I'm not too lazy to type "google" into the Slashdot search box you probably see at the top-right of this page, click search, and click through to the article and from there the (NON-cease-and-desist) letter.
    It reads, in full:

    Dear Mr. McFedries:

    I am trademark counsel for Google. I have recently become aware of a
    definition of "google" on your website, www.wordspy.com. This definition
    implies that "google" is a verb synonymous with "search." Please note that
    Google is a trademark of Google Technology Inc. Our brand is very
    important to us, and as I'm sure you'll understand, we want to make sure
    that when people use "Google," they are referring to the services our
    company provides and not to Internet searching in general. I attach a copy
    of a short, informative piece regarding the proper use of "Google" for your
    reference.

    We ask that you help us to protect our brand by deleting the definition of
    "google" found at wordspy.com or revising it to take into account the
    trademark status of Google.
    [basically saying: Note: Google (TM) is a registered trademark of Google Technologies].

    Now quit spreading FUD. I love Google.
  12. Re:Ok.. I'll bite... what's a Scheduler? by RainbowSix · · Score: 5, Informative

    Other forms of disk scheduling are a little more simple. Assume that the disk is really slow, and lots of requests are coming in that are buffered somewhere. Obviously you want to handle requests that are close to where the disk head currently is since it is faster and you won't have your head going all over the place.

    FCFS (first come first serve) - easy stupid way. Take requests as they come. If you need front end front end, performance suffers because obviously you want to do front front end end.

    SSTF (shortest seek time first) - do the request that is shortest to the head first. The problem with this is if you keep asking for stuff at the front of the disk and have a lone request for the end of the disk, the lone request could get ignored for a long time (starved) since the scheduler does the stuff at the front since seek times are much lower.

    SCAN - the head starts at one end of the disk and goes to the end, servicing requests along the way, but never going back so that that lone request from the previous method does get serviced. When it gets to the end the head moves back toward the front, servicing requests along the way. It keeps going back and forth.

    C-SCAN -Variant of SCAN where it doesn't go back and forth. It goes from front to back servicing requests then goes all the way back to the front before it starts servicing again. It gives more uniform times because if you're using SCAN and your request at the beginning is just missed by the head, you have to wait until it goes all the way to the other end and comes all the way back. In this method it goes to the end and then you're the next request to be serviced if you are at the beginning.

    LOOK - The same as CSCAN and SCAN except it doesn't blindly go to the end of the disk; it stops and turns around when there aren't any more requests in the direction. Of course, if you show up right after the head changes direction, sucks to be you :)

    --
    --------
    It's OK to be social, just don't tell anyone about it.
  13. File I/O primitives by anonymous+cupboard · · Score: 3, Informative
    It seems that the main problem is that the file I/O primitives in Linux are a little too primitive. On some operating systems, there is another layer sitting between the file system and the user which provides record and block management. If I open a file for sequential read or write then the record/block manager knows that it could use read-ahead or write-behind.

    The problem is that such I/O layers need to be implemented at least partially outside user-space in the case where the file is being simultaneously accessed to allow interprocess coordination. Also, to get best use, everything should use it.

    1. Re:File I/O primitives by Wesley+Felter · · Score: 3, Informative

      On Linux the kernel detects sequential I/O and does the readahead automatically. So it's not really clear what Linux is missing.

    2. Re:File I/O primitives by anonymous+cupboard · · Score: 2, Informative

      Your idea is quite good for minimalistic databases like MySQL but not so good for the Oracles or whatever that do their own I/O management. They are aware of the diffeernce between tables and values and attempt to logical about the way that they manage them. Frequently they have their own LRU cache and read-ahead strategies independent of the underlying OS.

  14. Not exactly by the+ed+menace · · Score: 2, Informative

    Actually Mach was developed at CMU under Rick Rashid. Tevanian was a graduate student of his. Most of the Mach team is at Microsoft Research, including Rashid (who heads up Microsoft Research). They tried to convince Tevanian to come there, but he decided instead to go to NeXT Computer, which also was based on the Mach microkernel. When Apple acquired NeXT, it took most of their OS and development philosophy also.

  15. Preemption patches by Sits · · Score: 2, Informative

    Are you sure that you are not seeing the improved desktop interactivity from the kernel premption and low latency patches in 2.5? I suspect that they would affect desktop interactivity more than this scheduler...

  16. Re:Finally by Phosphan · · Score: 2, Informative
    Prelinking is already there. Have a look at this page for an introduction (well, with some gentoo-specific stuff).

    KDE has gained quite some speed with the last version changes. The gap is not as large as you remember.

  17. Anticipatory scheduling is: by master_p · · Score: 3, Informative

    Here is the explanation of what anticipatory scheduling is. From what I have understood (please correct me if I am wrong, I am not a kernel hacker), 'anticipatory scheduling' means the following:

    The I/O subsystem (the part of the operating system that reads/writes to/from the hard disk) waits a little longer before servicing an I/O request from an application other than the current one; if the current application issues another I/O request while the I/O subsystem is waiting, the overall system throughput is higher because the hard disk's head moves less.