ArsTechnica Explains O(1) Scheduler

Ars' Piece by WarpFlyght · 2003-12-25 11:03 · Score: 5, Interesting

I read this piece yesterday, and while it did "dumb down" the basics (as the first poster noted), I thought it did a very good job of putting it all into a nutshell that those of us not as familiar with Big-O and schedulers in general might easily understand. For Linux.Ars' format, I thought it was of appropriate length, and had enough detail to "belong." I'm sure there are more detailed writeups on the O(1) scheduler in place in 2.6. Does anyone have any links?

--

"Aye, and if my grandmother had wheels, she'd be a wagon!" -- Montgomery Scott, ST:III

Re:Ars' Piece by Sloppy · 2003-12-25 15:45 · Score: 5, Interesting

Windows XP, for example, isn't all that great in this area, and often one process can too easily slow things down. This is even further emphasized when all the processes you're working with are actually the main "explorer.exe" process; eg, you do something in Windows Explorer that blocks (spin up a CD drive) and everything else (the "desktop", task bar, etc) all become unresponsive...

I actually saw this behavior once on a customer's machine. I closed the CD door, and the machine went unresponsive for several seconds. I was shocked. Then I realized: millions of people think this is normal. And those millions of people live in 2003, not 1983. It was like some sort of Lovecraftian revelation.

--
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.

initramfs by FrostedWheat · 2003-12-25 11:06 · Score: 2, Interesting

I setup a small computer today with 2.6. It boots from an initrd but I noticed during bootup the kernel will pause for a few seconds with:

checking if image is initramfs...it isn't (no cpio magic); looks like an initrd

I couldn't google much info on it. Anybody know more about it? Or how I can stop the kernel checking for it.

Re:initramfs by njdj · 2003-12-25 22:01 · Score: 2, Interesting

checking if image is initramfs...
I couldn't google much info on it. Anybody know more about it?

The short answer is that the purpose of this message is to help people familiar with the kernel diagnose problems which prevent completion of bootup. Your system works, so ignore it. Preventing the message is more trouble than it's worth.

If you are interested, there is a little more info here.

Hmm. by Anonymous Coward · 2003-12-25 11:11 · Score: 5, Interesting

I clicked on this article expecting to see an explanation of how it was that the O(1) scheduler worked, and by what tricks it was able to schedule in O(1) rather than having to spend extra time as extra processes are added, and what the real-life effects of such a situation are.

Instead the article was just "This is what O(1) is. This is what a scheduler is. This is what "preemptive" means. The 2.6 SMP can load-balance effectively across many processors. I used this really cool mp3 player this week." and just about nothing else.

Anyone want to explain exactly how it is that an O(1) scheduler is a difficult thing, and how exactly linux 2.6 achieves it?

-- Super Ugly Ultraman

Re:Hmm. by WarpFlyght · 2003-12-25 11:32 · Score: 2, Interesting

I'm sorry to hear that some of the article's technical details were edited out. I can understand why the editors would do so, though. Ars obviously isn't only targeting Slashdot readers, hehe.

Would you be willing (or able) to post your original version of the article somewhere for interested parties to read?

--

"Aye, and if my grandmother had wheels, she'd be a wagon!" -- Montgomery Scott, ST:III
Re:Hmm. by AmirPC · 2003-12-25 12:03 · Score: 2, Interesting

Balancing is never called O(1) in the article. I guess it was somewhat unclear but the scheduler itself is O(1) while load balancing is not constant. I guess it really depends on where you drawn the line between the scheduler and the rest of the kernel.
Re:Hmm. by Crypto+Gnome · 2003-12-25 15:06 · Score: 4, Interesting

Personally, I *really* like the idea of split-brained articles. (ie such a shame that wasn't done here)

By that I mean the 'default'/'main' article having the not-too-geeky/JoeSixPack-reads-slashdot version, AND includes a link to "for those who want the gory details".

That way the editors have an article which appeals to the vast majority of "reasonably intelligent" people, as well as catering to the hardcore techo-savvy crowd. (ie wrote a brainfuck program before breakfast)

--
Visit CryptoGnome in his home.
Re:Hmm. by Anonymous Coward · 2003-12-25 21:43 · Score: 1, Interesting

No. Big-Oh notation is the worst case (upper bound). The average case is Theta. The best case is Omega (lower bound). Part of the problem is that there's no good way to write Theta in plain text. But, suffice it to say, Quicksort is Theta(n*log(n)), but O(n^2).
Re:Hmm. by Anonymous Coward · 2003-12-25 23:22 · Score: 1, Interesting

Hmm.. I always thought Theta was more like "==". With Big-Oh, you guarantee that the running time is = (best case).

So, an algoritm which would compute the sum of all numbers in an array would therefore be Theta(n) as it always have to check exactly n elements. Sure, O(n) is also true, but can be misleading as it will always run in "worst case" (Omega(n) is also true of course). That is, if an algoritm is a member of both O(f(n)) and Omega(f(n)), then it is also a member of Theta(f(n)).

Care to back up your statement about Theta being average case?

Yay for Ars by be-fan · 2003-12-25 11:12 · Score: 3, Interesting

While Ars definately isn't targeted at the same audience as, say, KernelTrap, its nice to see there are a few technology websites/publications that aren't dumbed down. I remember when Byte magazine used to publish articles detailing the PowerPC architecture, down to the level of registers and the types of pipelines in the first set of implementations. Compare this to the ZD rags, which are a hair away from calling the CPU the "brain" of the computer!

--
A deep unwavering belief is a sure sign you're missing something...

Re:Multi core proc's by KrispyKringle · 2003-12-25 11:42 · Score: 2, Interesting

I'm not sure I understand how a hardware scheduler would work. It would need to keep track of the process's state, including virtual memory and registers, no? I don't know how the hardware would do that without the OS's support. You would need a software scheduler to manage these things; otherwise, it seems like you'd be limited to one process per CPU. Correct me if I'm wrong, though.

Deep algorithm analysis? by Anonymous Coward · 2003-12-25 12:32 · Score: 1, Interesting

While O(1) certainly might sound impressive to someone who's taken a first semester computer science course in university, anyone beyond that would obviously realize you've got to worry about the average case analysis.

To achieve O(1) you usually have to do some tricks that will get you quite a large constant. With an appropriately small number of processes, which I assume is the case with most computers, even an O(n^n) scheduler could be faster if the constant is very, very small.

Have any sites done an average case analysis to show at what point the O(1) scheduler is faster than the old 2.4 kernel? Certainly it isn't always faster or else the old one really did suck.

Re:Deep algorithm analysis? by Anonymous Coward · 2003-12-25 16:40 · Score: 1, Interesting

You forget that O() notation doesn't actually tell you how fast something is. It describes the rate of change with relation to the number of items. So, a scheduler that's O(1) and very complicated could take longer to run than an O(n) scheduler that's less complicated. Then, only for larger values of n is the O(1) better. That was the argument used on the mailing list against this new, much slower scheduler. It is slower for the vast majority of Linux users. The only system I've used it on where 2.6 is faster is one that runs several thousand Java threads in a misdesigned application. 2.6 is about 2% faster than 2.4.23. That's nice since most of our systems are running "on the edge" since our traffic is going up much faster than our budget, but for the others, they're stuck on 2.4.23 for now.

How is this different from the BSD/VMS schedulers? by Anonymous Coward · 2003-12-25 12:48 · Score: 2, Interesting

Is this new? BSD and before that VMS used the same mechanism, priority queues for twenty years. The VAX architecture even had instructions to do this in 28 instructions. Here is some BSD code:

static struct rq queues[32]; static u_int32_t queuebits; int qi; qi = (current->priority >> 2); SetBit(&queuebits, qi); InsertQueue(&queues[current->priority ], current); qi = FindFirstSet(queuebits); if (RemoveQueue(&queues[qi], &current) == Q_EMPTY) ClearBit(&queuebits, qi);

You might want to look at this article:
www.usenix.org/publications/library/proceedings/al s00/2000papers/papers/full_papers/sears/sears_html /

Technically, this is not O(1) by Anonymous Coward · 2003-12-25 12:49 · Score: 2, Interesting

Let me explain: the Ars article says

When it comes time to select a process, the scheduler finds the highest priority level containing available processes. It then selects a process from the desired priority level. Because the number of priority levels is fixed, the scheduler always takes a constant amount of time.

Yes, but it does need to inspect all the processes within a priority level, right??

As the total number of processes grows, the average number of processes within a priority level also grows, and so does scheduling time. So it's not O(1).

Unless choosing a process from a priority level is also constant time -- say, if it were a round-robin queue.

What am I missing?

Question by wolfb · 2003-12-25 13:34 · Score: 3, Interesting

The mechanism for recalculation of timeslices in previous Linux kernel's was very simple. When every process had its timeslice completely depleted (they were all 0) the kernel would simply go through every process and recalculate its timeslice and start execution again at the highest priority runnable process. While this is the most obvious solution it is also very inefficient, executing in O(n) time.

Ok, its easy to see why this is O(n).

The 2.6 scheduler uses a simple yet effective method for getting rid of this problem, it uses two priority arrays! One priority array is for processes that are runnable, and one priority array is for processes that are not runnable (they have depleted their timeslice). This way if when a process has depleted its timeslice the scheduler simply recalculates its timeslice, removes it from the active array, and inserts it into the expired array.

How is this not O(n)? The time slice calculation still occurs for each process, just not all at once for all processes. Each process still gets its time slice calcuated, it is removed from one queue, and inserted into another. Is there some other unmentioned trick that eliminates the calculations? Or was there something else that made the 2.4 scheduler O(n), such as finding the highest priority process?

So when all processes have depleted their timeslices there is no need to recalculate timeslices for every process, the two arrays are just switched (for the code oriented among us: they are accessed via pointers and the pointers are simply switched).

So the calculation is done per process as they finish their time slice, rather then at the end when all the processes are done. I still don't see why this would imply better efficiency. Am I missing something?

At any rate, thanks for the link, it was much more informative than the published article.

Re:Preemption and disk requests - educate me by robhancock · 2003-12-25 19:23 · Score: 2, Interesting

This is more of a virtual memory or disk scheduling issue than process scheduling. Even if a process has full access to the CPU, it still can't run if either it has code or data it needs that are not paged into RAM, or need to be loaded in initially from disk.

Especially in Windows, it seems like if a process is hammering the disk in the background, other processes can be starved from accessing the disk. Windows seems especially braindead at times when multiple processes fight over the disk, and the disk is wildly seeking all over the place and not accomplishing much - it would be much more efficient if it would just devote the disk for a longer time to each request.

I don't know what Windows has in it for disk I/O scheduling, but it's probably safe to say that Linux is a bit more intelligent in this regard..

Good programmers should know their Big-O... by javabandit · 2003-12-26 03:55 · Score: 2, Interesting

The article was okay, I guess. Very, very elementary. I would assert that most schedulers are O(1), though. Its really nothing new. This article was probably good for someone who never heard of a scheduler or worked with one.

Although the article was about scheduling, I would have like to have seen some more text about Big-O notation and why it is important to programmers. To me, any good programmer should have a good concept of Big-O notation, how it relates to traversing data structures (such as trees), et cetera.

The article did not make a point as to why good schedulers use priority queues. Which is simple, really. All priority queues are O(1) for getting the next item in the queue. It has nothing to do with scheduling, but it is the nature of the priority queue itself... as a data structure.

I would have liked to have seen the article in its whole form. I wonder if it went into more depth about Big-O, et cetera.

Slashdot Mirror

ArsTechnica Explains O(1) Scheduler

19 of 295 comments (clear)