Slashdot Mirror


The New Linux Speed Trick

Brainsur quotes a story saying "Linux kernel 2.6 introduces improved IO scheduling that can increase speed -- 'sometimes by 1,000 percent or more, [more] often by 2x' -- for standard desktop workloads, and by as much as 15 percent on many database workloads, according to Andrew Morton of Open Source Development Labs. This increased speed is accomplished by minimizing the disk head movement during concurrent reads."

27 of 426 comments

  1. I've noticed it... by Anonymous Coward · · Score: 5, Interesting

    I'm having trouble getting ACPI working on my laptop with the 2.6 kernel (it's a bad implementation on the part of my laptop). The 2.4 series used to work (sometimes), so I installed Mandrake's 2.4 and 2.6 kernels on my laptop. Using 2.4.x again was like switching to a horse and buggy from a sports car; KDE was that much faster with the 2.6.x kernel running the show.

  2. Cache? by Anonymous Coward · · Score: 4, Interesting

    Whatever happened to cache. If you can anticipate the head movement surely you have already read the data before and it should be in the cache????

    1. Re:Cache? by Erik+Hensema · · Score: 5, Informative

      Sure, and both Linux 2.4 and 2.6 do caching and read-ahead (reading more data than requested, hoping that the application will request the data in the future).

      The I/O scheduler however lies beneath the cache layer. When it's decided that data must be read from or written to disk, the request is placed in a queue. The scheduler may reorder the queue in order to minimize head movements.

      Also, 2.6 has the anticipatory I/O scheduler: after a read, the scheduler simply pauses for a (very) short period. This is done on the assumption that the application will request more data from the same general area on the disk. Even when other requests are in the I/O queue, requests to the area where the disk's heads are hovering will get priority.
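
      A toy model of that effect, as a rough sketch only: this is not kernel code, and `head_travel`, the block numbers, and the `max_batch` limit are all invented for illustration. It assumes each process issues its next read only after the previous one completes, and it measures how far the head travels in total.

```python
def head_travel(streams, anticipate, max_batch=8):
    """Total head movement for processes doing dependent reads.

    streams: one list of block numbers per process. A process submits
    its next read only after the previous one has completed.
    """
    pending = [list(s) for s in streams]
    head = 0
    travel = 0
    current = 0
    while any(pending):
        # Skip processes that have nothing left to read.
        while not pending[current]:
            current = (current + 1) % len(pending)
        # Without anticipation the scheduler serves one read and moves on,
        # because the same process's next read is not in the queue yet.
        # With anticipation it waits briefly and keeps serving that process,
        # up to max_batch reads, before switching to the next one.
        burst = max_batch if anticipate else 1
        for _ in range(burst):
            if not pending[current]:
                break
            block = pending[current].pop(0)
            travel += abs(block - head)
            head = block
        current = (current + 1) % len(pending)
    return travel


if __name__ == "__main__":
    # Two processes, each reading 16 consecutive blocks far apart on disk.
    streams = [list(range(16)), list(range(10_000, 10_016))]
    print("no anticipation:", head_travel(streams, anticipate=False))
    print("anticipation:   ", head_travel(streams, anticipate=True))
```

      The batch limit stands in for the bound a real scheduler needs so that the other process is not kept waiting forever.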

      While this increases latency (the time it takes for a request to be processed) a bit, throughput (the amount of data transferred in a time period) will also increase.

      It did take a fair amount of experimenting and tuning in order to make the I/O scheduler work as well as it does now. However there still may be some corner cases where the new scheduler is much slower than the old.

      --

      This is your sig. There are thousands more, but this one is yours.

  3. SCSI by Zo0ok · · Score: 4, Interesting

    Don't SCSI drives do this themselves?

    1. Re:SCSI by DuSTman31 · · Score: 5, Informative

      Yeah, I think so. IIRC it's called tagged command queueing - the drive can have multiple requests pending and instead of doing them first come first served, they're fulfilled in order of estimated latency to that point.
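
      Roughly, and only as a sketch: the snippet below uses block distance as a stand-in for "estimated latency" (a real drive would also account for rotational position), and `nearest_first` is a made-up name, not anything from a drive's firmware.

```python
def nearest_first(head, queued):
    """Order queued block requests by repeatedly picking the closest one."""
    remaining = list(queued)
    order = []
    while remaining:
        # Pick the pending request with the smallest estimated cost
        # (here: plain distance from the current head position).
        nxt = min(remaining, key=lambda block: abs(block - head))
        remaining.remove(nxt)
        order.append(nxt)
        head = nxt
    return order


if __name__ == "__main__":
    print(nearest_first(50, [90, 10, 55, 60, 20]))
```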

      I believe Western Digital's recent Raptor IDE drives have the same feature.

      The benefit of this seems contingent upon having multiple requests pending, which AFAIK is hard on linux as there's no non-blocking file IO. To me, this reads like a workaround for that.

    2. Re:SCSI by KagatoLNX · · Score: 4, Informative

      ATA is basically the SCSI protocol (the good part) over IDE. There's a reason why some SATA drives appear as SCSI adapters under Linux.

      Expensive, yes. Aging, no. Ten years ago people said SCSI was the future. Now everyone runs it, they just don't know it.

      IDE in its original form has never been able to keep up with a 10k RPM (or higher) disk.

      I think what the parent post is alluding to is Tagged Queueing. Tagged Queueing allows you to group blocks together and tell the drive to write them in some priority. That sort of thing is used to guarantee journaling and such. Interestingly, the lack of this mechanism is why many IDE drives torch journalled filesystems when they lose power during a write: they do buffering but without some sort of priority. You can imagine I was pretty torqued the first time I had to fsck an ext3 (or rebuild-tree on reiserfs) after a power failure.

      The reason that the kernel helps even with the above technology is that the drive queue is easily filled. Even when you have a multimegabyte drive cache and a fast drive, large amounts of data spread over the disk can take a while to write out.

      This scheduler is able to take into account Linux's entire internal disk cache (sometimes gigs of data in RAM) and schedule that before it hits the drives.

      --
      I think Mauve has the most RAM. --PHB (Dilbert Comic)
    3. Re:SCSI by jesup · · Score: 5, Insightful

      ATA is definitely not SCSI-over-IDE.
      ATAPI is SCSI-over-IDE however.

      I wrote the IDE/ATA drivers for the Amiga. The Amiga SCSI drivers accepted "SCSIDirect" commands from applications. Internally, all IO commands were converted to SCSIDirect commands for execution. To implement ATA, I added a SCSIDirect->ATA translator (which wasn't that hard - about 3 weeks from start to working, booting system - and I implemented just about all SCSI commands that were even semi-reasonable: all of CCS I think, plus quite a bit).

      Doing it this way made implementing support for ATAPI CDROMs (something I did as a contract after Commodore folded) Very Easy. :-)

  4. Cool by JaxWeb · · Score: 4, Informative

    It seems there are two IO modes you can choose from, at boot time.

    "The anticipatory scheduling is so named because it anticipates processes doing several dependent reads. In theory, this should minimize the disk head movement. Without anticipation, the heads may have to seek back and forth under several loads, and there is a small delay before the head returns for a seek to see if the process requests another read. "

    "The deadline scheduler has two additional scheduling queues that were not available to the 2.4 IO scheduler. The two new queues are a FIFO read queue and a FIFO write queue. This new multi-queue method allows for greater interactivity by giving the read requests a better deadline than write requests, thus ensuring that applications rarely will be delayed by read requests."

    Nice, but this is making things more complex. I admit I'll just keep all kernel settings wherever Mandrake sets them. Will other people play about and specialise their system for the task that it does?
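
    For anyone curious what the deadline description quoted above amounts to, here is a rough sketch of the queue structure. It is not the kernel's implementation: the class name, the tuple layout, and the expiry values are assumptions chosen only to show a block-sorted queue alongside FIFO read and write queues, with reads given the tighter deadline.

```python
from collections import deque

# Illustrative expiry times in seconds (assumed values, reads much tighter).
READ_EXPIRE = 0.5
WRITE_EXPIRE = 5.0


class DeadlineSketch:
    def __init__(self):
        self.sorted_queue = []  # all requests, kept in block order
        self.fifo = {"read": deque(), "write": deque()}  # arrival order

    def add(self, now, kind, block):
        expire = now + (READ_EXPIRE if kind == "read" else WRITE_EXPIRE)
        req = (expire, kind, block)
        self.fifo[kind].append(req)
        self.sorted_queue.append(req)
        self.sorted_queue.sort(key=lambda r: r[2])

    def dispatch(self, now):
        """Return the next request to send to the disk, or None."""
        if not self.sorted_queue:
            return None
        # An expired request at the head of a FIFO jumps the sorted order;
        # reads are checked before writes.
        for kind in ("read", "write"):
            queue = self.fifo[kind]
            if queue and queue[0][0] <= now:
                req = queue.popleft()
                self.sorted_queue.remove(req)
                return req
        # Otherwise serve in block order to keep seeks short.
        req = self.sorted_queue.pop(0)
        self.fifo[req[1]].remove(req)
        return req
```

    In this sketch a read that has waited past its deadline is served ahead of nearer writes, which is the interactivity gain the quote describes.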

    --
    - Jax
    1. Re:Cool by PyromanFO · · Score: 4, Informative

      This troll comes up in any thread that has anything to do with Linux at all. Who the hell said anything about asking people to choose? This is for developers and hackers to mess with. The distro you're using will choose for you, just like Microsoft chooses what Windows drivers you have loaded by default. Does every person who runs a Dell Windows machine have to decide what version of the driver to use? No, Dell installs it for them. However, power users can install newer/beta drivers if they want. Same thing here: power users can enable this if they want. If not, you'll never have to know about it or touch it.

      Sorry for biting on the troll but I felt like explaining it.

  5. Why not combine those two methods? by maxwell+demon · · Score: 4, Interesting

    Is there any reason why the prediction code (anticipatory scheduler) and the extra queues (deadline scheduler) couldn't be combined in a single scheduler to give us the best of both worlds?

    --
    The Tao of math: The numbers you can count are not the real numbers.
    1. Re:Why not combine those two methods? by mirko · · Score: 4, Insightful


      what would you have expected the kernel 2.8 to bring you ?
      </joke>

      Basically, I think this is like the windows system settings : you either privilegiate front end services (GUI) or back end services (apache, etc) but you cannot do both because some would be optimized for reactivity, the others to handle the workload... like a ferrari and a truck... this doesn't work nor excel in the same way.

      --
      Trolling using another account since 2005.
    2. Re:Why not combine those two methods? by Anonymous Coward · · Score: 5, Informative

      I believe that the anticipatory sched uses the model of the deadline sched. See "Linux Kernel Development" by Robert Love.

  6. Amiga Disks by tonywestonuk · · Score: 5, Interesting


    When I had an Amiga (around '91ish), even though it was fully multitasking, I learnt never to open any app while another was loading. If you did, you could hear the disk head moving back and forth between two sectors on disk every half second or so, slowing both app launches to a crawl. Waiting until one loaded, and then launching the second, was many times faster.

    I've always wondered why there wasn't something in the OS to force this behaviour, i.e. making sure that app 2's access to the disk is queued until app 1 has finished. Isn't this one of the reasons Windows takes ages to boot (many processes all competing for the one disk resource)?

    1. Re:Amiga Disks by jarran · · Score: 4, Informative

      Because it's a lot more complicated than you suggest. What happens if A gets in first, but is doing an extremely long disk-bound task? B will never get a chance to access the disk. It could even be that B would stop after a very short amount of disk access, in which case it will have to wait until A is done, even though interleaving the reads would have been the "right thing to do".

      Being multi-user complicates things even further. Sure, you are a single user on a desktop machine, and you double click on two programs in rapid succession, queuing them for loading one after the other may be the right thing to do. But what if those programs are actually being loaded by two different users? Can we completely lock out one user just because they started loading their program slightly later? Again, what if user A runs emacs, and a fraction of a second later, user B runs ls? Under your system, B effectively has to wait as long as it would take to load emacs, plus as long as would take to load ls?

      You can't even realistically separate the queues by user. In many situations, a single unix user may be running on behalf of many physical users (AKA human beings ;) ), e.g. in the case of any kind of server.

      I'm not saying that any of these problems are intractable (Linux is now doing a pretty fine job), just that they aren't even remotely as trivial as queuing loads one after another.

      Oh BTW, thanks for bringing back happy Amiga memories. Them were the days! :-)

  7. Re:1,000 percent? by gowen · · Score: 5, Informative
    >My guess is that it's a fairly specific, non-standard load that will garner a 1000x gain

    My guess is that you haven't spotted that 1,000% is not 1,000x. A 10-fold increase isn't completely implausible for a workload whose read pattern matches the assumptions built into the anticipatory scheduler.
    --
    Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
  8. Re:Linux Speed (Or Lack Thereof) by Eastree · · Score: 5, Funny

    >Try going outside. Find out about these things called "women".

    And this would help my computer how?

  9. Disk Transfer QoS by johnhennessy · · Score: 4, Interesting

    I think Solaris 10 (or maybe a later version, I can't remember) is supposed to support a concept of Quality of Service applied to disk accesses.

    Is anyone in the Linux world considering this?

    This is probably more applicable to the enterprise market, but surely any scheme of informing the scheduler about the expected disk transfer characteristics has to improve performance.

    On the other hand, it might be just Sun trying to re-invent uses of buzz words to sell their products.

    --
    [ Monday is a terrible way to spend one seventh of your life. ]
  10. Re:1,000 percent? by tonywestonuk · · Score: 5, Informative

    Isn't 1000%, 11x?
    15% = 1.15x
    100% = 2x
    200% = 3x
    300% = 4x ..
    900% = 10x
    1000% = 11x

    a % = (a+100)/100 x
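
    The same rule as a couple of lines of code (a sketch; the function name is made up, and it assumes the percentage is meant as an improvement on top of the baseline rather than as a ratio to it):

```python
def improvement_to_multiplier(percent):
    """Convert 'a% faster' into 'N times as fast': (a + 100) / 100."""
    return (percent + 100) / 100


if __name__ == "__main__":
    for percent in (15, 100, 900, 1000):
        print(f"{percent}% improvement = {improvement_to_multiplier(percent)}x")
```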

  11. Benchmark by zz99 · · Score: 5, Informative

    Here's an older benchmark made by Andrew Morton showing the anticipatory scheduler vs the previous one.

    The benchmark was made before 2.6.0, but I still think it shows the big difference from the 2.4 IO scheduler.

    Quote:
    Executive summary: the anticipatory scheduler is wiping the others off the map, and 2.4 is a disaster.

  12. Re:1,000 percent? by maxwell+demon · · Score: 4, Insightful

    1,000 percent is 10x, but 1,000% improvement, being improvement by 10x, is 11x as good.

    Just as 50% is half, but 50% improvement is three halves as good.

    --
    The Tao of math: The numbers you can count are not the real numbers.
  13. Retro is still cool ? by Anonymous Coward · · Score: 4, Insightful

    It's great watching the "modern" computer industry discover all the toys and optimisations that were essential engineering for the systems I used to use in the '70s & '80s.

    All the wonderful stuff like disk seek optimisation, interleaved memory (even the MMU came to the modern computer about 15 years after everyone else had it) were technologies that made systems stand out from each other.

    Because of the speed of things these days, lots of that tech has been largely ignored, until now when we're starting to hit hard performance barriers again. Now we have to invent the technology of the '70s all over again. It's nice to see all this stuff coming back though.

  14. Oh, come on... by mdb31 · · Score: 4, Interesting
    >...to achieve the O(1) timing, quite a leap forward that we had not even thought of!

    The NT scheduler has been O(1) like, eh, forever.

    >Our kernel produces far superior performance due to providing hooks for the COM layer

    Yeah, whatever. There is no COM anywhere near the NT kernel, and the latest and greatest from Microsoft, the .NET framework, isn't even based on COM anymore.

    Nice troll...

  15. You're misunderstanding something... by warrax_666 · · Score: 5, Informative

    AFAIK the "anticipation" bit is not so much about predicting head movement, but is more about reducing head movement. Reads cause processes to block while waiting for the data (and can thus stall processes for long amounts of time if not scheduled appropriately), whereas writes are typically fire-and-forget. This last bit means that you can usually just queue them up, return control to the user program, and perform the actual write at some more convenient time, i.e. later. Since reads (by the same process) are usually also heavily interdependent, it is also a win to schedule them early from that POV.

    That's my understanding of it.

    --
    HAND.
  16. Re:I've found the opposite by bflong · · Score: 5, Informative

    Make sure that you set X's "nice" value to 0. Some distros set it to something like -10 so that X is not disturbed by other procs. Under 2.4, this was a good thing. However, under 2.6, with its superior scheduler, the kernel will keep interrupting X and you will see lagging performance. Google for it to get a better explanation.

    --
    Why is it so hot? Where am I going? What am I doing in this handbasket?
  17. [ot] by mirko · · Score: 4, Funny

    Thanks but my father is Croatian and my Mom's French :o)
    Anyway, you found out that I indeed am not a native English speaker, hence the neologistications.

    --
    Trolling using another account since 2005.
  18. Re:what's old is new again by Yokaze · · Score: 5, Informative

    Elevator seeking means looking at the current request queue and bundling requests which are close together to minimise head movement. This is indeed old. IIRC, Linux has had it since 2.2 something.
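
    A minimal sketch of the elevator idea, with invented function names and a made-up request queue (the real kernel code also merges adjacent requests and bounds how long any one request can wait):

```python
def elevator_order(head, queued):
    """Sweep upward from the head position, then come back down."""
    ahead = sorted(b for b in queued if b >= head)
    behind = sorted((b for b in queued if b < head), reverse=True)
    return ahead + behind


def travel(head, order):
    """Total distance the head moves when serving blocks in this order."""
    total = 0
    for block in order:
        total += abs(block - head)
        head = block
    return total


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]
    print("arrival order: ", travel(53, queue))
    print("elevator order:", travel(53, elevator_order(53, queue)))
```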

    The anticipatory scheduler tries to anticipate future requests (who would have guessed that?), and is relatively new

    --
    "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
  19. Re:Anti-MS Patent by TheNetAvenger · · Score: 4, Interesting

    >ok, i know this is evil and all - but lets say MS decide to implement this as a concept (so without "stealing" code)... the linux community will have given them something and received (probably) nothing in return.

    Not to burst your bubble, but the NT scheduler already implements predictive disk I/O concepts.

    Nice that Linux is finally catching up though...