Slashdot Mirror


Andrew Morton And The Low-Latency Kernel Patch

An Anonymous Coward writes: "KernelTrap has interviewed Linux kernel hacker Andrew Morton, author of the low-latency patch. Though his patch has received less attention than Robert Love's preemptible kernel patch (recently merged into the 2.5 kernel), it results in quite significantly lower latencies. The interview is quite interesting, delving into the low-latency patch, explaining how it works and the differences between it and the preempt patch. He also talks about his ext3 work, porting that journaling filesystem from the older stable 2.2 kernel to the current stable 2.4 kernel."

20 of 151 comments

  1. Botched Fixes by Henry+V+.009 · · Score: 5, Funny
    This part was funny: "One hot tip: if you spot a bug which is being ignored, send a completely botched fix to the mailing list. This causes thousands of kernel developers to rally to the cause. Nobody knows why this happens. (I really have deliberately done this several times. It works)."

    A day in the life of a kernel hacker.

    1. Re:Botched Fixes by Tony.Tang · · Score: 4, Insightful

      This is quite funny from a social psychology perspective. Geeks have a superiority complex, as is often seen here on /. Sometimes you'll see a thread that goes 60 deep, and it's just two guys arguing back and forth. We geeks have a tendency to rail on and on about obscure things, showing off, telling each other we're wrong, etc. We do that because it makes us feel smarter. It's not very funny when you're in the midst of it, but when you step back, it's kind of amusing, really.

  2. Re:realtime? by Zenki · · Score: 5, Informative

    A realtime OS, which usually has low latency, is not defined by how short its latency is, but by a guarantee on that latency.

    For example, suppose you send a packet off into the internet: a realtime OS would guarantee that the packet was sent within x nanoseconds. A realtime OS would maintain this guarantee regardless of the load on the system, the size of the packet, etc.

  3. Process scheduling by lupetto · · Score: 5, Interesting

    I've been waiting for years for Linux to have finer control of process scheduling.

    I hope someday that Linux will use a method similar to Irix, where you can specify a process's priority from 0 to 255, modify its timeslice, and make it realtime or timeshared. This was one of the best things about Irix, and something I could really use for Linux.

    1. Re:Process scheduling by rtaylor · · Score: 3, Interesting

      Yes, and no...

      It'll waste CPU cycles all right. But if it makes the network, disk and interface more responsive, odds are the CPU will have more information to do processing with.

      There are very few CPU-constrained jobs a computer does anymore. The ones that are (graphics rendering, key cracking) either have the budget to add an extra machine per 100 to get back the 1%, or are already working on a timeframe where the time lost doesn't really matter.

      If you wait 3 months for something, what's an extra 12 hours?

      That said, I don't know how much this actually slows a congested machine down. But one of the large benefits of Solaris on Sun hardware is that you can get it up to a load of about 1000 before it starts to choke (become choppy). Sure, no task is moving quickly, but they're all moving.

      FreeBSD I find gets slammed around 150, and Linux (last I tried was 2.0.x) was around 60.

      It's the type of stuff that makes big iron worth the money.

      DISCLAIMER: Load numbers are by my own independent testing on varying hardware. It was a large Sun box, but not an order of magnitude above the Linux / BSD one. Test consisted of FTP connections downloading varying sized files at varying speeds.

      --
      Rod Taylor
    2. Re:Process scheduling by r6144 · · Score: 3, Funny

      I'm now running 2.4.18pre9mjc2 with preempt & O(1) patches. Now I'm running a crazy prime-factoring program that forks a new process to do one division. It is now niced to 19. The system is running quite smoothly. (X is niced to -10)

      `uptime`:
      4:06pm up 1:44, 6 users, load average: 337.62, 241.84, 115.30

      My box is a plain-old PII/233.

      The only problem is that now any unniced process that does real cpu-intensive work (as opposed to interactive ones) can get only about 20% of cpu. It is just blatantly unfair to let one unniced process compete with 500+ others, even though they are niced to 19.

      Of course, the programs I'm running do not take too much memory. When one runs out of memory (like make -j), the system will swap like crazy, and then it IS unresponsive.

    3. Re:Process scheduling by captaineo · · Score: 5, Informative

      Linux has been able to do what you describe (many priority levels, selectable real-time policies) for a long time. What Irix does have over Linux currently is scheduling of resources other than the CPU - disk I/O being the most important one.
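For readers who want to see the policies mentioned above, here is a minimal sketch that queries them through Python's `os` module on a present-day Linux system. It is not something from this thread: the helper names `realtime_priority_ranges` and `try_make_realtime` are made up for illustration, and the `os.sched_*` calls are assumed to be available (they are Linux-specific and absent on some other platforms).

```python
import os

def realtime_priority_ranges():
    """Return {policy_name: (min_priority, max_priority)} for common policies."""
    policies = {
        "SCHED_OTHER": os.SCHED_OTHER,  # the default time-sharing policy
        "SCHED_FIFO": os.SCHED_FIFO,    # real-time, runs until it blocks or yields
        "SCHED_RR": os.SCHED_RR,        # real-time, round-robin with a timeslice
    }
    return {
        name: (os.sched_get_priority_min(policy), os.sched_get_priority_max(policy))
        for name, policy in policies.items()
    }

def try_make_realtime(priority=10):
    """Try to move the calling process to SCHED_FIFO; return True on success.

    This normally needs root (or an equivalent privilege), so a failure is
    reported rather than raised.
    """
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except PermissionError:
        return False

if __name__ == "__main__":
    for name, (low, high) in realtime_priority_ranges().items():
        print(f"{name}: priorities {low}..{high}")
```

Note that none of this touches disk I/O, which is exactly the gap described in this comment.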

      On Linux, a low-priority process won't take much CPU away from a high-priority process... But if the low-priority process does a lot of disk I/O, it can cause significant delays in the high-priority process's own disk I/O. i.e. the notion of priority does not carry over to disk I/O. Whereas on Irix, you can set up a process to get a guaranteed level of disk bandwidth...

      Look for this feature to appear in Linux soon though. The newly-introduced I/O elevator should make it easier to implement prioritization for disk I/O.

    4. Re:Process scheduling by TimMD909 · · Score: 3, Insightful

      I used to have a severe problem with my machine becoming unresponsive and pausing for 10 seconds at a time while the buffers were synced. Then one day I was inspired to type

      hdparm -t /dev/hda; hdparm -d 1 /dev/hda; hdparm -t /dev/hda
      Suddenly a dim bulb brightened and I saw the light :) (Went from 2- MB/sec to 27+ MB/sec)

      It's even more hilarious if you knew how long I was unaware that the DMA/32-bit I/O/etc. settings would not be saved across a reboot, and how I never even thought about how slowly my hard disk was working when I knew that IDE can easily do 25+ MB/sec. I say it's hilarious! ...but I'm not saying how long that realization took ;)

    5. Re:Process scheduling by captaineo · · Score: 3, Informative

      Yep, sounds familiar =).

      Thankfully Andre Hedrick's IDE patch seems to find the optimal hdparm settings for a drive automatically - once I started using the patch, I got uniformly high transfer rates (20-30 MB/sec) without running hdparm manually.

  4. This is a great example of why I love Linux by Anonymous Coward · · Score: 4, Insightful

    I really like reading things like this.

    That's why Linux is so great -- even if you're not good enough to work on the kernel, you can read about some of the issues that pop up. If you use Linux for a while, and if you get to the point where you roll your own kernels and apply patches, you end up learning a lot about how the system works.

    The MS guys are smart, and they're making some good systems now, but you can spend your whole life with them and not have much of a clue about what's going on under the hood.

    If MS would open up their internal developer discussions to the public, it would take MS system administration to a whole new level. I understand why they can't do that, but it is a great example of what's nice about Linux.

  5. I like this sentence the best by Anonymous Coward · · Score: 5, Funny

    "With an internally preemptible kernel the explicit task yielding is not necessary, because the context switch is performed in the interrupt return path and via open-coded yields which are hidden in the unlock code. But you cannot preempt an in-kernel process while it holds locks, so all the unlock, relock and fixup code is needed in either approach."

    Try getting your head round that one when needing sleep :)

    1. Re:I like this sentence the best by frantzdb · · Score: 4, Informative
      Fortunately, some of that illegibility is due to poor punctuation. Try this:


      "With an internally preemptible kernel, the explicit task yielding is not necessary because the context switch is performed in the interrupt return path via open-coded yields, which are hidden in the unlock code. But you cannot preempt an in-kernel process while it holds locks, so all the unlock, relock and fixup code is needed in either approach."
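The "unlock, relock and fixup" pattern in that sentence can be sketched in user space. This is only an analogy in Python, not kernel code: `process_with_lock_breaks`, the `batch` parameter and the use of `threading.Lock` are assumptions made for illustration. It follows the explicit-yield style, where a long loop periodically drops its lock so something else can run, then retakes it and re-checks its state.

```python
import threading
import time

def process_with_lock_breaks(items, lock, batch=64):
    """Sum `items` while holding `lock`, dropping the lock every `batch` items.

    Returns (total, number_of_lock_breaks).
    """
    total = 0
    breaks = 0
    index = 0
    lock.acquire()
    try:
        while index < len(items):
            total += items[index]
            index += 1
            if index % batch == 0 and index < len(items):
                # Unlock: nobody else can get in while the lock is held,
                # so this is the only point where other work may run.
                lock.release()
                time.sleep(0)  # explicit yield point
                # Relock, then fix up: shared state may have changed while
                # the lock was dropped, so re-validate our position.
                lock.acquire()
                index = min(index, len(items))
                breaks += 1
    finally:
        lock.release()
    return total, breaks

if __name__ == "__main__":
    print(process_with_lock_breaks(list(range(200)), threading.Lock()))
```

In the preemptible-kernel style described in the quote, the `time.sleep(0)` line would disappear because the yield is hidden inside the unlock itself, but the release, reacquire and re-validation would all still be there.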

      --Ben

  6. Re:realtime? by Error27 · · Score: 5, Insightful

    The difference is that hard real time doesn't mean low latency; it just means that there is a _guaranteed_ maximum latency.

    Soft real time means that you can almost guarantee the latency. Generally, of course, you want these latencies to be pretty small. Soft real time is for when you check the "use real time where available" option on xmms and run it under sudo.

    I hear that Linux (probably with patches) is a little better than Windows and a little worse than OS X for latency.

  7. Re:realtime? by s390 · · Score: 5, Informative

    Is there a formal difference between low latency and a realtime OS?

    Yes. A realtime OS _guarantees_ that certain events trigger defined responses within specified times. A realtime OS is almost by definition an embedded OS, i.e., its hardware is rigorously specific and very tightly bound. A realtime OS also typically provides a very limited set of functions, as opposed to a general purpose OS. A low-latency OS, on the other hand, provides generalized structures for 1st-level/2nd-level interrupt handlers, real/virtual memory management, and facilities for locking, preemptive-priority dispatching, etc., but offers low latency on a merely best-efforts basis depending upon what all happens to be inflight at the moment. See the difference?

    Examples of realtime systems: automotive control systems including engine power/emissions management, suspension and braking management, even airbag controls; aircraft fly-by-wire systems that control aerodynamically unstable airframes.

    Examples of low-latency systems: mainframes - if you're a high-priority system task, you get _very_ low latencies - but exact timings aren't guaranteed in all situations.
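The distinction between a guarantee and a best-effort average can be shown with a few numbers. The sketch below is illustrative only: the function names, the sample values and the 1000 us deadline are all invented, not measurements of any real system.

```python
def summarize_latencies(samples_us, deadline_us):
    """Return (mean, worst, misses) for latency samples in microseconds."""
    mean = sum(samples_us) / len(samples_us)
    worst = max(samples_us)
    misses = sum(1 for sample in samples_us if sample > deadline_us)
    return mean, worst, misses

def meets_hard_deadline(samples_us, deadline_us):
    """True only if every observed sample is within the deadline."""
    return max(samples_us) <= deadline_us

if __name__ == "__main__":
    deadline = 1000  # hypothetical deadline in microseconds
    low_latency = [50] * 99 + [5000]  # fast on average, one bad outlier
    realtime = [800] * 100            # slower on average, never late
    print(summarize_latencies(low_latency, deadline))
    print(summarize_latencies(realtime, deadline))
    print(meets_hard_deadline(low_latency, deadline),
          meets_hard_deadline(realtime, deadline))
```

The first system has the far better average but misses the deadline once, so only the second would pass as hard real time here. Measurements like these can only disprove a guarantee; a real one has to come from the design of the system, not from a sample.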

  8. It's a baby step, so what's the big deal? by Kogun · · Score: 4, Insightful

    "The low-latency patch yields worst-case latencies of around 1.5 milliseconds at present. The preempt patch is around 80 milliseconds, but with the locking changes it should also yield 1-2 millisecond latencies."

    On what speed processor? 1.5 ms is way too long for any kind of processor being sold these days. Try 100 us maximum latency on a 133 MHz Pentium for starters and go down from there. And learn to use the term "deterministic" and I might raise an eyebrow. Make it POSIX 1003.1 compliant and someone will have a serious solution.

    Programmers either need deterministic response in their applications or they don't. If they do, then Linux is not their OS. If they don't, then these half-baked solutions to reduce context switching time and interrupt latency are probably going to be fun to play with, but will cause nightmares in the long run.

    1. Re:It's a baby step, so what's the big deal? by Spy+Hunter · · Score: 3, Insightful

      What are you talking about? It's a BIG step. I hear stock kernel (2.4.x) worst-case latencies are in the 100-300 ms range. While the low-latency patch isn't going to solve many "real time" computer science problems, it will let me play mp3s under load with no skips and a reasonably small buffering delay, and it will increase the responsiveness of my mouse pointer. It is a good thing for desktop Linux. That's all it needs to be. It doesn't need to guarantee 100us max latency to be useful.

      --
      main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
  9. Time for bed... by InsaneCreator · · Score: 3, Funny

    Andrew Morton And The Low-Latency Kernel Patch

    Sounds just like a title of a bedtime story. :)

    I also recommend you read "How CowboyNeal saved the world (with a little help from / and .)"

  10. Why not SoftUpdates for Linux iso Journalling? by redelm · · Score: 5, Interesting
    I've used Kirk McKusick's SoftUpdates for *BSD and been very impressed. Pulled the plug on four kernel compiles near the end. In three of the four cases, `make` just picked up the compile losing ~45 seconds. In the fourth, a `make clean` was necessary. In _all_ cases the fsck on reboot was minor. I've only lost power once in Linux during a kernel compile. I had to reinstall. It was too far gone for e2fsck.


    IMHO, SoftUpdates are better than Journalled File Systems. There's no journal file to maintain, just careful ordering of the writes. Why no discussion of it for Linux?

    1. Re:Why not SoftUpdates for Linux iso Journalling? by smnolde · · Score: 3

      I agree. I use FreeBSD and have had my computer lose power during a "make buildworld". Upon rebooting the fsck took a few minutes, but with softupdates I didn't lose much work. In fact, I issued the "make buildworld" command again and it completed without a hitch.

      For those of you that don't know, or aren't familiar with FreeBSD, you can build the entire OS from source with one command. It's not a port or package, but the entire base OS (kernel, filesystem utils, OpenSSH, OpenSSL, bind, sendmail, all the crypto, etc...).

      I do agree that softupdates would be preferable in most cases. McKusick had his shit in order when he wrote SU. Journaling had its place a year or two ago, but with today's more robust systems and affordable UPSs, why not invest more attention in a unified VM, or better system tools?

      For me, FreeBSD has a kick-ass VM and a rock-solid filesystem. Using SU in Linux wouldn't hurt, but you'd need to port over UFS to make it work. But that wouldn't be hard since BSD code is pretty much there for the taking. YMMV.

  11. Re:realtime? by Grab · · Score: 3, Interesting

    You want your autopilot to never have a task scheduler? Obviously you have no experience in embedded systems design at all, or you wouldn't say something so blatantly stupid. I'm sorry, but that's just rubbish.

    In EVERY embedded application, there are multiple layers of stuff happening, ranging from ultra-high priority interrupts that need microsecond-accuracy scheduling, down to background loop stuff that doesn't need to be done more often than every few seconds. Every embedded system uses this approach.

    A single loop running round is fine if your code needs to do nothing more complex than a Windows program, which any 16-year-old kiddie can write. The moment it breaks this complexity, you're screwed. For example, consider a car engine controller (which I design software for, BTW). Scheduling the start and stop times for injector and ignition pulses requires the processor to recalculate the times a fraction of a second before the pulse, to make sure the fuel and ignition pulses are accurate for the current conditions. And importantly, the number of times you need to do this changes with engine speed, since you need to update every engine rev. It is unacceptable to burden this ultra-fast processing with stuff which doesn't need to be run 7000 times a second, eg. toggling the indicators.

    So the solution is to go to a multi-rate system. Stuff which needs to run fast, runs fast; stuff which can run slow, runs slow. This frees up processing time for the fast stuff which can then handle more iterations per second. And in order to work this, you need something to tell all your functions when to run. Sometimes it's designed as part of your main application, sometimes it's a separate bit of object code bought-in, but it's always required. Even your autopilot will be doing this - as a minimum there'll be a fast loop controlling the aircraft, and a slow loop sending info back to the pilots.

    So there are many different task rates, all running in their own time frame. For example, in the Ford project I'm working on currently, there's a task that happens twice per rev to schedule fuel and spark, there's another task that happens once per cam, and there are time-based tasks at 10ms, 16ms, 32ms, 50ms and 100ms rates. And this allows us to allocate resources to the processing that needs it, such as critical tasks like keeping people alive.
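A toy simulation of the time-based part of such a multi-rate system might look like the sketch below. It is an illustration under stated assumptions, not how any real engine controller is written: `run_multirate`, the task names and the 1 ms tick are invented, only the five periods come from the comment, and the rev- and cam-synchronous tasks are left out because they are event-driven rather than time-driven.

```python
def run_multirate(tasks, duration_ms, tick_ms=1):
    """Simulate a time-triggered scheduler.

    `tasks` maps a task name to its period in milliseconds.
    Returns {name: number_of_times_the_task_was_run}.
    """
    counts = {name: 0 for name in tasks}
    # Shortest period first, so the fastest task goes first on a shared tick.
    by_rate = sorted(tasks.items(), key=lambda item: item[1])
    for now in range(0, duration_ms, tick_ms):
        for name, period in by_rate:
            if now % period == 0:
                counts[name] += 1  # a real system would call the task here
    return counts

if __name__ == "__main__":
    periods = {"t10": 10, "t16": 16, "t32": 32, "t50": 50, "t100": 100}
    print(run_multirate(periods, duration_ms=1600))
```

Over 1600 simulated milliseconds the 10ms task runs 160 times and the 100ms task only 16, which is the point being made: slow work stops competing with fast work for processor time.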

    Grab.