ArsTechnica Explains O(1) Scheduler

Ars' Piece by WarpFlyght · 2003-12-25 11:03 · Score: 5, Interesting

I read this piece yesterday, and while it did "dumb down" the basics (as the first poster noted), I thought it did a very good job of putting it all into a nutshell that those of us not as familiar with Big-O and schedulers in general might easily understand. For Linux.Ars' format, I thought it was of appropriate length, and had enough detail to "belong." I'm sure there are more detailed writeups on the O(1) scheduler in place in 2.6. Does anyone have any links?

--

"Aye, and if my grandmother had wheels, she'd be a wagon!" -- Montgomery Scott, ST:III

Re:Ars' Piece by sfe_software · 2003-12-25 11:50 · Score: 5, Informative

Regarding "dumbing it down":

There is one exception to this load balancing rule -- some special processes may be fixed to a certain runqueue. This attribute is called "thread affinity;" a thread is simply another name for a process.

I think it over-simplified this a bit. Granted, Linux doesn't make all that much distinction between a "thread" and a "process" as compared to (say) Windows, but the distinction is still important.

Otherwise, it's a decent article, and gave me some insight to one of the major improvements in 2.6 (which I've yet to try out).

I remember when the "pre-empt" patch came out for 2.4, before it was integrated into the kernel. For a desktop (or otherwise interactive, like my media box) system, it really made a major perceived improvement in performance (while *actual* performance wasn't affected; it was just about priorities and scheduling, making it "feel" more responsive).

The O(1) scheduling algorithm likely improves this further, and is enough to tempt me to spend XMas evening upgrading... these things are important, especially on a desktop system.

Windows XP, for example, isn't all that great in this area, and often one process can too easily slow things down. This is even further emphasized when all the processes you're working with are actually the main "explorer.exe" process; eg, you do something in Windows Explorer that blocks (spin up a CD drive) and everything else (the "desktop", task bar, etc) all become unresponsive...

I enjoy seeing such improvements in the Linux kernel. The pre-empt patch (later integrated into the mainstream kernel) made a drastic usability improvement (especially when the box is also a web server, MySQL server, firewall/router, etc) and I couldn't see a Windows box handling similar tasks without becoming unusable (note: not unstable, just not practical to use when responsiveness is hindered).

I think the kernel team is on the right track with this, specifically for desktop applications (though it does help in many other applications as well; anything interactive, really, even over the network/Internet).

--
NGWave - Fast Sound Editor for Windows
Re:Ars' Piece by moonbender · 2003-12-25 12:28 · Score: 2, Insightful

For a desktop (or otherwise interactive, like my media box) system, it really made a major perceived improvement in performance (while *actual* performance wasn't affected; it was just about priorities and scheduling, making it "feel" more responsive).
It doesn't just "feel" more responsive, it is more responsive! It's a matter of definition, but I'd certainly call that an improvement of performance. Taking from the article (or comp.sci classes), while the throughput remains equal - I suppose it's even reduced - the latency is lower, too. Performance is both factors, not just the former.

--
Switch back to Slashdot's D1 system.
Re:Ars' Piece by sfe_software · 2003-12-25 14:53 · Score: 1

It doesn't just "feel" more responsive, it is more responsive!

I agree, though what I meant to say was that it feels faster. It certainly is more responsive, but it's not any faster.

I recall when the pre-empt patch first came out, it was tested on a variety of systems. On a desktop-like environment it made a huge difference (responsiveness) but in a server environment, it made very little difference.

So yes, it certainly is more responsive to any interactive application, but for sheer CPU loads it didn't make any difference.

--
NGWave - Fast Sound Editor for Windows
Re:Ars' Piece by Anonymous Coward · 2003-12-25 15:07 · Score: 1, Informative

Windows XP, for example, isn't all that great in this area, and often one process can too easily slow things down. This is even further emphasized when all the processes you're working with are actually the main "explorer.exe" process; eg, you do something in Windows Explorer that blocks (spin up a CD drive) and everything else (the "desktop", task bar, etc) all become unresponsive...

Did you every try it? ever heard of something called threads? we windows people have them since ages *eg*
Re:Ars' Piece by bolthole · 2003-12-25 15:28 · Score: 1

I agree, though what I meant to say was that it feels faster. It certainly is more responsive, but it's not any faster.
What you are probably trying to say by that, is that, while throughput is the same, latency has been decreased.
Re:Ars' Piece by chez69 · 2003-12-25 15:30 · Score: 1

my experience with windows is that things do slow down when a cd rom spins up. they don't always block, but they do slow down.

--
PHP is the solution of choice for relaying mysql errors to web users.
Re:Ars' Piece by Stween · 2003-12-25 15:37 · Score: 1

He did say that.

You're just trying to be a smartarse.
Re:Ars' Piece by Sloppy · 2003-12-25 15:45 · Score: 5, Interesting

Windows XP, for example, isn't all that great in this area, and often one process can too easily slow things down. This is even further emphasized when all the processes you're working with are actually the main "explorer.exe" process; eg, you do something in Windows Explorer that blocks (spin up a CD drive) and everything else (the "desktop", task bar, etc) all become unresponsive...

I actually saw this behavior once on a customer's machine. I closed the CD door, and the machine went unresponsive for several seconds. I was shocked. Then I realized: millions of people think this is normal. And those millions of people live in 2003, not 1983. It was like some sort of Lovecraftian revelation.

--
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
Re:Ars' Piece by Anonymous Coward · 2003-12-25 17:27 · Score: 0

all this did is prove that you're an idiot. And plus... if you hate the keyboard so much why the hell did you take all the time to type all of that up?!
Re:Ars' Piece by ssstraub · 2003-12-25 18:17 · Score: 1

Apparently you've never used windows. I'm using it right now and CD-ROMs definitely cause windows to slow down/lock up for some time.
Re:Ars' Piece by Anonymous Coward · 2003-12-25 18:51 · Score: 0

I concur, but have one nitpick. Where preemption only increases responsiveness by sacrificing some performance (throughput), an O(1) scheduler actually improves performance by increasing throughput. In particular, when there are many processes running, the O(1) scheduler will use less processor time than an O(n) scheduler.
Re:Ars' Piece by ikewillis · 2003-12-25 19:47 · Score: 3, Informative

I'm sure there are more detailed writeups on the O(1) scheduler in place in 2.6. Does anyone have any links? http://67.163.90.104/~jim/scheduler.html Unlike the Ars Technica article which seems to be nothing more than an overview of how schedulers work in general, this article goes into great technical detail about the actual implementation detauls of Linux's O(1) scheduler.
Re:Ars' Piece by Anonymous Coward · 2003-12-25 20:26 · Score: 0

The bus probably reset due to crappy hardware.

I had a Linux box that had the same behavior with a Panasonic "PD-CD" drive, so count me in with those who think it's normal, but sucky, behavior.
Re:Ars' Piece by spectecjr · 2003-12-25 22:02 · Score: 5, Insightful

Windows XP, for example, isn't all that great in this area, and often one process can too easily slow things down. This is even further emphasized when all the processes you're working with are actually the main "explorer.exe" process; eg, you do something in Windows Explorer that blocks (spin up a CD drive) and everything else (the "desktop", task bar, etc) all become unresponsive...

The kernel hasn't slowed down in that case; it's the GUI layer which has slowed down. It hit a bottleneck.

Interestingly, it would appear that Windows has used an O(1) scheduler in Windows NT and up since at least NT 4.0 (which came out in 1996).

It's also interesting to note that it has the same priority-boosting for IO threads that Linux has.

"Windows NT also uses additional priority adjustments based on other events, such as momentarily boosting thread priority when it returns from an I/O call"

Detailed explanation of the NT4.0 scheduler

Easier to read version of the above

Part 1 of a slightly more indepth (but in some places inaccurate)* article

Part 2 of the article

*The article makes it sound like it's O(N) in places; it's not. The scheduler is using a set of priority lists, which are kept in-order at insertion time when a thread exits its context, by the thread inserting itself at the end of the list for its priority.

--
Coming soon - pyrogyra
Re:Ars' Piece by Anonymous Coward · 2003-12-26 00:39 · Score: 0

Can a scheduler in an one-to-one threaded model OS make a distinction between a "thread" and a "process"?

Doesn't the scheduler just run in kernel space and handle all processes and the threads within processes as more processes?

I think I need less dumbed down article :P
Re:Ars' Piece by The+Infamous+Grimace · 2003-12-26 01:02 · Score: 1

Apparently, the subtley of the piece was lost on you.

Re-read it. If you still don't get it, then re-read it. If you still don't get it...

(tig)

--
Ignorance and prejudice and fear
Walk hand in hand
Re:Ars' Piece by jandrese · 2003-12-26 06:27 · Score: 1

What's worse is if you're running low on memory (swapping stuff out) and something in Windows decides to play a sound. Windows will literally freeze for 2-3 seconds while it swaps stuff out to swap in the sound to play. I've burnt coasters on a slow 4x burner because the otherwise unloaded 800Mhz Duron running 2k froze up waiting for the sound to load. It's extremely frustrating.

--

I read the internet for the articles.
Re:Ars' Piece by drsmithy · 2003-12-28 18:00 · Score: 1

Back in the day, when my main machine was just a P100 (o/ced to 110Mhz) with 40M of RAM and a 2G 5400rpm drive (and that was fastish), I used to burn CDs (admittedly with a SCSI burner at the time, although realistically an IDE burner on a dedicated channel should have been able to as well) at 4X in the background while I used to play Quakeworld. This was with NT4. If you've got an 800Mhz machine, undoubtedly with a couple of hundred megs of RAM burning coasters at 4X, it has serious configuration problems.
Re:Ars' Piece by Arthur+Dent · 2003-12-30 06:52 · Score: 1

Windows XP, for example, isn't all that great in this area, and often one process can too easily slow things down. This is even further emphasized when all the processes you're working with are actually the main "explorer.exe" process; eg, you do something in Windows Explorer that blocks (spin up a CD drive) and everything else (the "desktop", task bar, etc) all become unresponsive...

My machine at work becomes unresponsive when you create a directory from explorer. You can actually watch/hear the hard drive grinding away.

Multi core proc's by niko9 · 2003-12-25 11:05 · Score: 2, Insightful

With all the recent talk about multi core processors, will this sort of thing be relegated to hardware; or will there still be a need for software scheduler?

Re:Multi core proc's by be-fan · 2003-12-25 11:18 · Score: 2, Insightful

A multi-core processor appears as multiple processors to the software. Even SMT processors appear as multiple processors to the software. So the software scheduler is needed to juggle the available processors among the threads competing for them.

--
A deep unwavering belief is a sure sign you're missing something...
Re:Multi core proc's by autopr0n · 2003-12-25 11:23 · Score: 4, Informative

With all the recent talk about multi core processors, will this sort of thing be relegated to hardware; or will there still be a need for software scheduler?

Well, only if you want to run only as many threads as you have CPUs. The windows machine I'm using right now has 37 processes, some I'm sure with multiple threads. I suppose you could have a hardware scheduler, but I don't think it would be a good idea. Scheduling doesn't take much time (certainly not this O(1) jobby), so you would lose a lot of flexibility without a lot of gain.

--
autopr0n is like, down and stuff.
Re:Multi core proc's by KrispyKringle · 2003-12-25 11:42 · Score: 2, Interesting

I'm not sure I understand how a hardware scheduler would work. It would need to keep track of the process's state, including virtual memory and registers, no? I don't know how the hardware would do that without the OS's support. You would need a software scheduler to manage these things; otherwise, it seems like you'd be limited to one process per CPU. Correct me if I'm wrong, though.
Re:Multi core proc's by Anonymous Coward · 2003-12-25 14:36 · Score: 0

What A$$munch modded this flamebait? It seems like a decent question...
Re:Multi core proc's by Anonymous Coward · 2003-12-25 17:37 · Score: 0

I'm pretty sure we'd be able to solve all of those problems in NP if this were the case (find and check all possibilities in parallel, with some sort of magically-forking hardware).
Re:Multi core proc's by John+Courtland · 2003-12-25 17:59 · Score: 1

Well, the 80386s and on have some simple instructions to allow you to do fiddle with the current task state. IIRC, Windows does not use these anyhow because the processor was neither robust nor fast at task switching. It takes a ton of cycles (or at least it did, somewhere in the 1-2 hundereds) to flip to a different task state on a processor. At least, that's what my processor manual says.

And yes, it would still require OS support, because the processor would have no idea when the current task is copmleted. I'm also not sure if you could pre-program the processor (hardware) to task switch at a specific time interval, but I'm pretty sure that isn't available on IA-32 processors.

--
Slashdot is proof that Sturgeon's Law applies to mankind.
Re:Multi core proc's by cculianu · 2003-12-26 01:49 · Score: 1

Well, you can setup a timer event to interrupt the CPU, and then the timer expires your ISR can kick of the scheduler to do a task switch, but no, you cannot really preload all those values so that the CPU does the task switch faster.. :(
Re:Multi core proc's by KrispyKringle · 2003-12-26 05:28 · Score: 1

This would still just do context-switching at regular intervals, no? Even that would be useless without the OS still managing the rest of the scheduling, I would think (I don't know a whole lot about this, though).
Re:Multi core proc's by John+Courtland · 2003-12-26 16:17 · Score: 1

The only problem with that is that it would still require OS-level support (in the callback defined in the interrupt service handler, it would need to know the task ID of the next process to retrieve its task state). That and then you'd simply be doing round robin, which isn't the greatest form of multitasking. Some instructions on the 80X86 don't complete execution for a long time, becuase they can br preceeded by the REP (example REP STOSD). And you can't task switch during an instruction.

--
Slashdot is proof that Sturgeon's Law applies to mankind.

initramfs by FrostedWheat · 2003-12-25 11:06 · Score: 2, Interesting

I setup a small computer today with 2.6. It boots from an initrd but I noticed during bootup the kernel will pause for a few seconds with:

checking if image is initramfs...it isn't (no cpio magic); looks like an initrd

I couldn't google much info on it. Anybody know more about it? Or how I can stop the kernel checking for it.

Re:initramfs by adamy · 2003-12-25 11:31 · Score: 0, Offtopic

I wonder if this is a Lilo issue. Which bootloader are you using?

--
Open Source Identity Management: FreeIPA.org
Re:initramfs by jrockway · 2003-12-25 16:05 · Score: 0

Who cares? Windows probably checks for a lot of things and doesn't tell you about it. If you don't want to hear about it a) don't use linux or b) comment out the printk that makes that. Oh, that was hard.

--
My other car is first.
Re:initramfs by XipX · 2003-12-25 18:01 · Score: 1

Obviously he does. And while the question was a bit off topic, he deserves a valid answer, not some preprogramed BS reply that could be used in place of nearly any linux question.
Re:initramfs by mabinogi · 2003-12-25 21:25 · Score: 1

why does he deserve an answer?..what makes slashdot a Linux support forum?

If I blunder into the nearest gathering of people and ask an offtopic question, do I deserve an answer?

--
Advanced users are users too!
Re:initramfs by Anonymous Coward · 2003-12-25 21:46 · Score: 0

he deserves an answer because you took the time to comment on it. if you thought for one second that no answer would be needed then you wouldn't have made an attemp to comment.

seeing how you don't know the answer then maybe you should shut up, if someone does know the answer then they could answer it.

what makes you so high and mighty that you can post somthign so uninteligent? i know everyone has a rite to be an ass (like i am doing rite n ow and you wwere demonstrating before). lets make the world a better place and either watch you be constructive or comit suicide. (i would prefere the later and mioght even join you but you do it first and then i'l decide)

thank you and have a nice day.
Re:initramfs by njdj · 2003-12-25 22:01 · Score: 2, Interesting

checking if image is initramfs...
I couldn't google much info on it. Anybody know more about it?

The short answer is that the purpose of this message is to help people familiar with the kernel diagnose problems which prevent completion of bootup. Your system works, so ignore it. Preventing the message is more trouble than it's worth.

If you are interested, there is a little more info here.
Re:initramfs by Anonymous Coward · 2003-12-26 02:01 · Score: 0

Its farely obvious that youdonnot know anything. Ive been hacking the linux kernal for four years now and there are some tings that are just as planely obvious as the nose on your face and the original poster didn ot know what he was talking about because it was a obvious question which had an obvious answer and it didnt need to be said by any of us expert hackers reading. so shuttup
Re:initramfs by Anonymous Coward · 2003-12-26 02:50 · Score: 0

So you have been "hacking" at something you can't spell? "Kernal"
Re:initramfs by jrockway · 2003-12-26 22:46 · Score: 1

As they say, a stupid question gets a stupid answer. What's next? "Linux says my CPU is 2136.34MHz but the box says it's 2135MHz. Why is it overclocking and damaging my processor?!"

How do you answer that without a lart?

--
My other car is first.
Re:initramfs by Anonymous Coward · 2003-12-27 04:51 · Score: 0

Ive been hacking the linux kernal for four years now
::snicker:: Um, yeah, sure. And you're married to Morgan Fairchild, too.

A more detailed description of O() notation... by ctrl-alt-elite · 2003-12-25 11:07 · Score: 4, Informative

...can be found here (PDF file).

Re:A more detailed description of O() notation... by gordyf · 2003-12-25 12:08 · Score: 1

That's a set of slides from my class! Perkins would be proud.
Re:A more detailed description of O() notation... by HungWeiLo · 2003-12-25 12:40 · Score: 1

Ah. Good old CSE 143. A very good class to learn about divide and conquer - The class size halves after every midterm (2 or 3 of them in the quarter?)

--
There are a huge number of yeast infections in this county. Probably because we're downriver from the bread factory.
Re:A more detailed description of O() notation... by ctrl-alt-elite · 2003-12-25 13:27 · Score: 1

Good ol' Hal. I'll never forget his "one with the boolean" statement. I didn't know so many UDubbers were Slashdotters.
Re:A more detailed description of O() notation... by gordyf · 2003-12-25 13:36 · Score: 1

There are a loooot of slashdotters at UW. Almost everyone I work with in MGH reads (or at least knows of) slashdot...
Re:A more detailed description of O() notation... by PeteyG · 2003-12-26 11:21 · Score: 1

here's another UW CSE 143 veteran. Perkins was a great professor.

You know, I felt so smart when I read the article, already knowing what O(1) meant.

Oh, and if any of you made a 3d maze... fsck you. ;)

--
no thanks

that article sucked by Anonymous Coward · 2003-12-25 11:08 · Score: 1, Informative

Most of the article was just about scheduling definitions - the states of a process, what preemption is etc... What a waste of reading.

Re:that article sucked by AndroidCat · 2003-12-25 15:53 · Score: 1

It depends on the audience. For people that have taken OS courses and worked with scheduling, it's a basic refresher. For the rest, it's a good explaination of the issues so that they'll have a hope of understanding a technical article.
You have to sucker them in slowly before smashing their brains out with a small timeslice of Linux wrapped around a large gold brick.

--
One line blog. I hear that they're called Twitters now.

Hmm. by Anonymous Coward · 2003-12-25 11:11 · Score: 5, Interesting

I clicked on this article expecting to see an explanation of how it was that the O(1) scheduler worked, and by what tricks it was able to schedule in O(1) rather than having to spend extra time as extra processes are added, and what the real-life effects of such a situation are.

Instead the article was just "This is what O(1) is. This is what a scheduler is. This is what "preemptive" means. The 2.6 SMP can load-balance effectively across many processors. I used this really cool mp3 player this week." and just about nothing else.

Anyone want to explain exactly how it is that an O(1) scheduler is a difficult thing, and how exactly linux 2.6 achieves it?

-- Super Ugly Ultraman

Re:Hmm. by AmirPC · 2003-12-25 11:21 · Score: 5, Informative

I'm the author of that article..

The original article was MUCH MUCH more in depth it was toned down because of the "glaze over" effect people were getting. I didn't have much say in the editorial process in the matter.

I think this way atleast it appeals to a broader audience, not just the CS audience who would know what a binary min heap is if I were to use it.

To each his own I suppose.
Re:Hmm. by TwistedSquare · 2003-12-25 11:27 · Score: 1

That would be handy, I don't quite understand why O(1) is so hard, unless it is all related to priority scheduling checks. Anyone have links/explanation?
Re:Hmm. by WarpFlyght · 2003-12-25 11:32 · Score: 2, Interesting

I'm sorry to hear that some of the article's technical details were edited out. I can understand why the editors would do so, though. Ars obviously isn't only targeting Slashdot readers, hehe.

Would you be willing (or able) to post your original version of the article somewhere for interested parties to read?

--

"Aye, and if my grandmother had wheels, she'd be a wagon!" -- Montgomery Scott, ST:III
Re:Hmm. by GebsBeard · 2003-12-25 11:39 · Score: 1

There are a few specific data structures that operate O(1). In general they are really gnarly to implement. You may want to look into Fibonacci heaps as the majority of their operations are either O(1) or O(1) amortized. There are other heap structures that are perfect for scheduling but they tend to be variations on a theme: Pairing heaps, Trinomial heaps, etc.
This is simply off the top of my head, since I have no further knowledge of the inside of Linux 2.6 kernel. I am fairly certain given the pain in designing a good data structure, let alone one with O(1) characteristics, this is a good place to start.
Re:Hmm. by AmirPC · 2003-12-25 11:43 · Score: 1

Sure let me see if I have it around somewhere, the writing is of much lower quality (I'm a programmer after all) but it goes further in.
Re:Hmm. by KrispyKringle · 2003-12-25 11:46 · Score: 4, Informative

this guy says that the scheduling itself is O(1) but the balancing is O(n). I don't know this myself, but that makes a whole lot more sense; according to the ars article, the scheduler simply keeps two stacks of processes--which is can seemingly do in constant time--but of course doing the load balancing is proportionate to the number of processes (times some constant--you remove constant multipliers in big-O--to do the context switching).
Re:Hmm. by Anonymous Coward · 2003-12-25 11:47 · Score: 0

Shut up, you ars!
Re:Hmm. by Anonymous Coward · 2003-12-25 11:49 · Score: 1, Insightful

You have 100 processes. Each has a priority. If you check each process individually every time you do a scheduling, you go into O(n) where n is the number of processes.

In other words, the magic lies in selecting a process based on its priority, without checking each processes priority, while still not leaving low priority processes starved of resources.

Now, I don't know how they did it, I'm not a kernel hacker. But just thinking about it, O(1) is never easy to achieve when dealing with n items .
Re:Hmm. by AmirPC · 2003-12-25 12:01 · Score: 5, Informative

http://67.163.90.104/~jim/scheduler.html Thats the completely unedited first draft its all I could find. That has zero editing done, and is even missing a quote up at the top. It goes further in depth, but as you can see...my unedited writing isn't so great ;)
Re:Hmm. by windows · 2003-12-25 12:03 · Score: 4, Informative

I'm not particularly familiar with the O(1) scheduler, but I know enough about it to give an explanation that I believe is accurate enough.

There are two queues used in the Ingo scheduler. One contains processes that are ready to run. The other contains processes that have already run. The run queue is scanned, and when a process is found that can be run, it is automatically run without scanning the rest of the queue. When the process finishes its time quantum, it is put into the second queue.

When the first queue (the active one) is empty, the queues are switched. This is just done by swapping the pointers to the queues.

This scheduler is O(1) because it runs the first process it can run without scanning the entire queue as an O(n) scheduler would do. That's why it's O(1).

If you want a bit more information on it, search for the "Ingo scheduler" on Google or your other search engine of choice. I don't see any detailed explanations of what happens, but I suspect as the 2.6 kernel goes into wider use, those will come.
Re:Hmm. by AmirPC · 2003-12-25 12:03 · Score: 2, Interesting

Balancing is never called O(1) in the article. I guess it was somewhat unclear but the scheduler itself is O(1) while load balancing is not constant. I guess it really depends on where you drawn the line between the scheduler and the rest of the kernel.
Re:Hmm. by Jesus+2.0 · 2003-12-25 12:18 · Score: 1

I don't understand why it's so difficult to make an O(1) scheduler.

Trivial example (I'm sure it can be improved upon, it's just off the top of my head to show that O(1) is not difficult to make):

There are a fixed number of possible priorities. Keep an array of that many integers. At a particular index, store the number of processes with that particular priority.

Keep a linked list of the processes. When a new process is created, put it on the end of the list. When an old process dies, remove it from the list.

When it comes time to decide what process to give a time slice to:

(1) Loop through the array of processes at particular priorities. For a priority P, apply some O(1) function to the number of processes at that priority (basically, a weighting function - the higher the priority, the higher the weight). Sum over all priorities. Since there are a constant number of priorities, and since your function f is O(1), this operation is also O(1).

(2) Pop the top process off of the list (this is also O(1), of course). Check the priority of the process. Apply some O(1) function g to the priority and to the number that you arrived at in step (1). The result of function g is an amount of time. Basically, you're getting something like a percentage of total time that should be applied to this particular process, based on its priority and those of all the other processes.

(3) Let the process execute for that amount of time.

(4) Put the process on the tail of the list (also O(1), of course).

(5) Goto step (1).

Sorry about the heretical use of goto. But anyway, there you have an O(1) scheduler. What's the big deal?
Re:Hmm. by Otter · 2003-12-25 12:37 · Score: 1

That has zero editing done, and is even missing a quote up at the top.
OK, now you're targeting Slashdot readers!
Seriously, I never would have considered the Ars readership to be an intellectual step down from /..

--
What I'm listening to now on Pandora...
Re:Hmm. by SilentStrike · 2003-12-25 12:40 · Score: 3, Informative

A thread is not the same thing as a process. I hope that was the editors fault too.
Re:Hmm. by Anonymous Coward · 2003-12-25 12:53 · Score: 0

go to kernel.org, get the kernel source, and write your own. i'm sure linus would be up for a better scheduler.
Re:Hmm. by Jesus+2.0 · 2003-12-25 12:58 · Score: 1

I'm not saying that I just designed the world's greatest scheduler. Far from it, I'm sure.

I'm just saying, I don't get the concept that merely making an O(1) scheduler is so incredibly difficult, and it's so utterly amazing that they were able to do so.
Re:Hmm. by Anonymous Coward · 2003-12-25 13:09 · Score: 1, Informative

This scheduler is O(1) because it runs the first process it can run without scanning the entire queue as an O(n) scheduler would do. That's why it's O(1).

I don't follow your logic here.
Assume that you have a queue with n processes. In worst case when scanning through the queue, the first process that it can run is the last one; thus, it must check n processes. If the processes in the last are placed "in random", e.g. just put there, on average, the scanner would have to check n/2 processes before finding the correct one = O(n).

How can this be called O(1)? I'm not saying the scheduler in 2.6.0 isn't O(1) but I believe your explaination is flawed.
Re:Hmm. by Otter · 2003-12-25 13:19 · Score: 1

Maybe I misunderstand what you're doing in step 1 but isn't it O(N) for the number of processes?

--
What I'm listening to now on Pandora...
Re:Hmm. by Anonymous Coward · 2003-12-25 13:25 · Score: 0

I think this way atleast it appeals to a broader audience, not just the CS audience who would know what a binary min heap is if I were to use it.
All these years playing Counter Strike and now it makes sense...
Re:Hmm. by Lord+Kholdan · 2003-12-25 13:29 · Score: 1

Seriously, I never would have considered the Ars readership to be an intellectual step down from /..

And it is only so if you consider the knowledge about schedulers to be the ultimate benchmark for smarts.
Re:Hmm. by Saeed+al-Sahaf · 2003-12-25 13:32 · Score: 0

I think this way at least it appeals to a broader audience, not just the CS audience who would know what a binary min heap is if I were to use it.
Hate to tell ya, but the CS audience is probably the only audience interested in the inner workings of the 2.6 kernel and the O(1) Scheduler... "Hey pops, check out this article on the O(1) Scheduler... You know, in Linux?"

--
"Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
Re:Hmm. by Jesus+2.0 · 2003-12-25 13:34 · Score: 1

No. I am looping through, and operating upon, a table of priorities, not a table of processes.

There are a constant number of priorities. Hence, the step in question is O(1).
Re:Hmm. by jejones · 2003-12-25 13:45 · Score: 1

Well... lots of schedulers at least look at each process when context switches occur (e.g. to keep them sorted in priority order)--that's O(n).

The article appears to say that 2.6 divides processes into a fixed number of priority levels, rather than having what could in theory be a large number of possible priorities--and that's what lets you use constant time methods.

Once you set that fixed number, you can do things like keep an array of lists subscripted by priority level and then a single value whose bits correspond to whether each list is empty or not. Then the least significant bit tells you which list to grab a process from, and it's easy to quickly find the LSB, since lsb(x) = x ^ (x & (x - 1)).

Of course, I haven't touched a Linux kernel, so the preceding paragraph is just handwaving on my part. The thing to do is to "Use the Source, Luke."
Re:Hmm. by moocat2 · 2003-12-25 14:18 · Score: 1

> The run queue is scanned, and when a process is
> found that can be run, it is automatically run
> without scanning the rest of the queue

> This scheduler is O(1) because it runs the first
> process it can run without scanning the entire
> queue

It sounds to me like you are missing an important piece of information to make it O(1). If the run queue was truly in some random order, then on average, the scheduler would have to scan 1/2 the list before finding a process to run. Then it would continue to be an O(n).

If there was something that could guarantee that a runnable process would always be close to the head of the run queue, then that would make it O(1).

So the question I have is what makes sure the run queue is ordered properly and is itself O(1)?
Re:Hmm. by windows · 2003-12-25 14:19 · Score: 4, Informative

You raise a good question. Here's a link that might explain a little better than I can.

Your assumption is flawed, though. You seem to be assuming that only one process in the queue is runnable. We can't make that assumption that at any given time there's only one runnable process.

I believe you have a point about the worst case of the algorithm. That's not really relevant here, though. Consider the Quicksort algorithm. It has a nasty worst case - data that's already been sorted. As a matter of fact, the worst case is an O(n^2) and approaches the slowness that is the bubble sort. Despite this well-documented worst case, we say that quicksort is an O(n*log n) algorithm, which is a more typical case.

I believe the worst case of the Ingo scheduler is O(n). If only one process can be run, it indeed has to check n/2 processes on average, and therefore is O(n).

You raise a good question, and figuring out what order an algorithm is, is something I'm not particularly good at. Generally the average case is what matters, but I don't know how you determine what the average case is for the state of processes.

Anyone care to try to explain this? ;-)
Re:Hmm. by windows · 2003-12-25 14:27 · Score: 1

I believe you're discussing a worst case of the algorithm, not the average case. While I'm not sure of how to determine an average case for this sort of thing, I don't think yours is. You're suggesting there's only one process that would be "runnable" in the queue. I don't think this is an assumption you can make.

Again, I'm not sure what the average case for the state of the queue would be, but that's not it.
Re:Hmm. by Crypto+Gnome · 2003-12-25 15:06 · Score: 4, Interesting

Personally, I *really* like the idea of split-brained articles. (ie such a shame that wasn't done here)

By that I mean the 'default'/'main' article having the not-too-geeky/JoeSixPack-reads-slashdot version, AND includes a link to "for those who want the gory details".

That way the editors have an article which appeals to the vast majority of "reasonably intelligent" people, as well as catering to the hardcore techo-savvy crowd. (ie wrote a brainfuck program before breakfast)

--
Visit CryptoGnome in his home.
Re:Hmm. by BlueBiker · 2003-12-25 15:07 · Score: 1

Thank you, this draft was immensely more interesting than the Ars version.

I'm guessing that the NUMA extensions added some complexity to the load balancing as it moves threads between runqueues?
Re:Hmm. by Anonymous Coward · 2003-12-25 16:00 · Score: 0

Do you bite your thumb at alt.religion.scientology sir?
Re:Hmm. by vondo · 2003-12-25 16:01 · Score: 1

Often times ARS articles are written at a much higher level than the /. crowd would prefer. Their earlier articles on CPU technology sold me on the site, although I haven't seen much of that kind of reporting in the last year or two. Basically, they *have* been playing down to the slashdot level.
Re:Hmm. by jrockway · 2003-12-25 16:12 · Score: 0, Flamebait

Theirs is a good scheduler, while yours isn't. That's what it comes down to. Or more likely, theirs got implemented and yours didn't. Implement it and submit a patch, maybe (ha) yours is better.

--
My other car is first.
Re:Hmm. by Anonymous Coward · 2003-12-25 16:12 · Score: 0

Nope, you both missed it. What makes the scheduler O(1) is that rather than just take a process from the 'runnable' queue and plunk it in the 'waiting' queue, it also calculates the processes next time-slice, based on it's current priority. once the runnable queue is empty or full of processes waiting on I/O, it swaps the pointers for the 'runnable' and 'waiting' queue.

If you'd like a better explanation, got up about 10 comments and the original creator of the Linux.Ars writeup posted a link to the full gory detailed version that Hannibal edited for the Ars general populace.
Re:Hmm. by BlueBiker · 2003-12-25 16:59 · Score: 1

This solution has no notion of higher priority processes taking precedence over lower priority ones, which is the main point of scheduling. It adjusts the relative timeslice of each process scheduled as it round robins through the [solitary] linked list, but it makes no attempt to schedule higher priority ones first.
Re:Hmm. by LarryRiedel · 2003-12-25 17:55 · Score: 1

Anyone want to explain exactly how it is that an O(1) scheduler is a difficult thing

Recently I have seen a lot of Dodge(TM) commercials where they say something to the effect of "it's got a Hemi". I presume it is not supposed to be some sort of self-parody, which leads me to think they must have some evidence that some people will somehow think if "it's got a Hemi", that must be something cool. Like "Intel Inside" or "hyperthreading". From a technical perspective, I do not think there is anything particularly interesting or special about the "O(1) scheduler", but it sounds like it is something cool, like a "Hemi".

Larry
Re:Hmm. by Jesus+2.0 · 2003-12-25 18:01 · Score: 1

I fail to see why letting an individual process run for a constant length of time, but having a greater chance of switching to a higher priority process at any particular switch, somehow qualifies as "scheduling" in a way that having an equal chance of switching to each process at any particular switch, but letting the process run for a length of time dependent upon its priority, does not.
Re:Hmm. by Jesus+2.0 · 2003-12-25 18:25 · Score: 1

Yes, I am claiming that I have just designed, a few posts up, the most wonderful scheduler of all time. All other schedulers, including the one mentioned in the article, pale in comparison to the greatness of my scheduler. The world will be changed by my scheduler. Nations will fall, new ones will rise; the mighty shall be brought to their knees, and the righteous shall prevail, all thanks to my incredibly amazing scheduler.

Did you miss the part where I said "off the top of my head"? "Trivial example"? "I'm not saying that I just designed the world's greatest scheduler. Far from it, I'm sure"?

Did you miss the fact that the post you're replying to was a reply to a post that was essentially the same as your post? And that the post that you're replying to essentially said, "that's not what I'm saying"?
Re:Hmm. by BlueBiker · 2003-12-25 18:29 · Score: 1

Nobody suggested constant size timeslices. I just observed that while the algorithm above varied the timeslices, it didn't attempt to service higher priority processes sooner and therefore wasn't a viable solution.

If you claim that producing an O(1) solution is trivial, then the example you present must solve the problem under consideration!

FWIW, it wasn't clear to me how any of the methods presented here or at Ars would prevent thread starvation from lower priority threads that still appreciate being scheduled now and then. Maybe that's just an implementation detail of the O(1) function which calculates priorities.
Re:Hmm. by Jesus+2.0 · 2003-12-25 18:41 · Score: 1

Nobody suggested constant size timeslices.

Fine, remove that from my statement. I fail to see how deciding which process to switch to based upon its priority somehow qualifies as "scheduling" in a way that deciding how long a process will run when it is switched to will not.

it wasn't clear to me how any of the methods presented here or at Ars would prevent thread starvation from lower priority threads

I'm sure the real one resolves this problem, but my trivial example does, too, in a trivial way: Due to the round-robin nature of it, each process will be given a timeslice before any process is given two timeslices.
Re:Hmm. by Spy+Hunter · 2003-12-25 19:05 · Score: 2, Informative

Well, even if an algorithm such as quicksort takes time proportional to N*log(N) in the average case, and N^2 in the worst case, it is still incorrect to say it is O(n*log(n)). Big-O notation literally means "in the worst case" (though people often seem to forget this). So if the worst case of quicksort takes time proportional to N^2, then no matter how unlikely that worst case is, quicksort is still O(n^2). Actually the completely correct thing to say is "quicksort is a member of the set O(n^2)" because O(n^2) really denotes the set of algorithms whose worst case execution times are no worse than n^2. (As an aside, this set also includes algorithms whose worst-case execution times are *better* than n^2. So it is correct to say that merge sort is a member of the set O(n^2) or even O(n^99), even though it is also a member of the set O(n*log(n)).)
Therefore, if Ingo's scheduler really takes time proportional to the number of processes in the worst case, no matter how unlikely that case is, it is incorrect to say that it is O(1).

--
main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
Re:Hmm. by BlueBiker · 2003-12-25 19:07 · Score: 1

Sometimes a subsystem really wants attention NOW, not after another 172 of a total 187 threads have been given a chance to execute in the round-robin. Your solution essentially ignored the management of N threads and then claimed credit for the savings yielded by not managing those N threads.

Hell, I can build a 100mpg car if my car doesn't have to carry passengers [seats, sturdy chassis] or handle bad roads [full suspension] or accelerate [multi-gear transmission] or slow down [brakes] or turn [steering rack]...
Re:Hmm. by Jesus+2.0 · 2003-12-25 19:40 · Score: 1

And how, exactly, does a subsystem say that it really wants attention NOW, under your scheme, and why, exactly, does that solution not work under my scheme?
Re:Hmm. by BlueBiker · 2003-12-25 20:11 · Score: 1

My scheme? Try Linus and Bill et al.

Threads may represent daemons / drivers for devices which need immediate attention and run at high priorities or whose priorities are adjusted on the fly as needed by the OS.

You wouldn't want SETI or Folding@home or even animated .GIF's in your browser sucking up machine cycles needed to satisfy system devices or keep your MP3's playing without interruption. Under the previous 10ms 'jiffy' setting, it could take a long time to chug through a round-robin of a large set of threads.

I'm not saying it couldn't be made to work, you'd just have terrible latency and probably bad throughput too.
Re:Hmm. by Tokerat · 2003-12-25 20:25 · Score: 1

Simply for reading decscriptions here, it seems to me that this is the exact problem that O(1) solves; as each process is given its time slice, it is moved into the "already did that" list...therefore the kernel will never find a process in the list that hasn't run yet, it has already been moved out of that list and therefore is out of mind until the lists are swapped. The only challenge then is sorting based on priority, which at this point in my post-Christmas drunken-ness is probably beyond my comprehension. :-D

--
CAn'T CompreHend SARcaSm?
Re:Hmm. by BlueBiker · 2003-12-25 20:54 · Score: 1

The only challenge then is sorting based on priority, which at this point in my post-Christmas drunken-ness is probably beyond my comprehension. :-D

You're in luck: The priority directly indexes an array to find the appropriate linked list and no sorting is required. Party on!
Re:Hmm. by neopara · 2003-12-25 21:05 · Score: 1

It uses a hash table instead of a link list. It pretty much comes down to using more memory for better performce. Not really that hard of a thing to do. The preemptive scheduler is a little bit harder to do, but Robert Love mix alot of the SMP code to handle the inherited problems involved. A really sweet hack, if I may so myself. Hope that helped.

--
Nothing more, For me to say; About my life, A life of dreams....
Re:Hmm. by MechaStreisand · 2003-12-25 21:11 · Score: 1

It seems to me that it's just a matter of semantics. You can qualify the statement and say that an algorithm is O(n*log(n)) in the average case, and that statement is correct, even though it may be O(n^2) in the worst case. From what I recall of the algorithms course I took, they never said that O(f(n)) is defined specifically for the worst case of an algorithm. I think it's pretty much up to you how you mean it.

And after all, what matters is that people understand what you mean. If the person or people you're communicating with understand that you mean average case when you say O(f(n)) and don't specify average or worst case, then that would seem to me to be perfectly correct.

--
Disclaimer: IANAL. This post is, however, legal advice, and creates an attorney-client relationship.
Re:Hmm. by Anonymous Coward · 2003-12-25 21:21 · Score: 0

"glaze over" effect people were getting.

I was getting that glaze over effect too, because you spent the entire article describing ulti-tasking at not the O(1) scheduler. The information you gave applied to the 2.4 scheduler equally.

Thankyou for wasting my time.
Re:Hmm. by Anonymous Coward · 2003-12-25 21:43 · Score: 1, Interesting

No. Big-Oh notation is the worst case (upper bound). The average case is Theta. The best case is Omega (lower bound). Part of the problem is that there's no good way to write Theta in plain text. But, suffice it to say, Quicksort is Theta(n*log(n)), but O(n^2).
Re:Hmm. by fucksl4shd0t · 2003-12-25 22:22 · Score: 1

Right, and then you're surfing a linked list. Why not use a dynamic array? At least then you have a table of pointers (sorta) so you don't have to loop through the list.
The problem with looping through a list like you're doing is that that operation is O(n). I don't know exaclty the O notation for when you stop when you reach the task for which you're looking, though.

--
Like what I said? You might like my music
Re:Hmm. by Anonymous Coward · 2003-12-25 22:31 · Score: 0

Do you have trouble reading, or did you not notice the part where he points out the article you've just read was heavily edited which removed almost all the technical details? Did you spend so long posting your "witty" reply that you didn't notice the link to the original draft of the article which includes all the 2.6 specific technical details?

Thank you for wasting all of our time.
Re:Hmm. by fucksl4shd0t · 2003-12-25 22:34 · Score: 1

Hmm, it's like this:
O(1) Linux Scheduler : Linux 2.4::Hemi : Dodge's old standard issue V8.
The Hemi is a design that has all kinds of horsepower compared to a regular V8 and tends to meet emission standards and stuff. So you really do get more power with a Hemi. Therefore your comparison is quite valid. :) And your conclusion of "don't care" is still just as valid. You just missed the "act on conclusion part", which is to go away.

--
Like what I said? You might like my music
Re:Hmm. by Anonymous Coward · 2003-12-25 22:37 · Score: 0

Behold my O(1) scheduler!

void schedule( void ) { do{ /* Constant time operation */ } while( 0 ); pick_process(); /* Rest of the kernel; trivial */ }
Re:Hmm. by BlueBiker · 2003-12-25 22:48 · Score: 1

No, he's not surfing any lists. He's grabbing an element off the head and stuffing another at the tail -- both O(1) operations.
Re:Hmm. by Anonymous Coward · 2003-12-25 23:22 · Score: 1, Interesting

Hmm.. I always thought Theta was more like "==". With Big-Oh, you guarantee that the running time is = (best case).

So, an algoritm which would compute the sum of all numbers in an array would therefore be Theta(n) as it always have to check exactly n elements. Sure, O(n) is also true, but can be misleading as it will always run in "worst case" (Omega(n) is also true of course). That is, if an algoritm is a member of both O(f(n)) and Omega(f(n)), then it is also a member of Theta(f(n)).

Care to back up your statement about Theta being average case?
Re:Hmm. by tigga · 2003-12-25 23:49 · Score: 2, Insightful

I'm just saying, I don't get the concept that merely making an O(1) scheduler is so incredibly difficult, and it's so utterly amazing that they were able to do so.
Well, they (afaik Ingo Molnar) created universal scheduler which should work properly on user workstation and on server, on uni- and multiprocessor systems with small latencies and without performane loss. It may be easy to create a model in your head, but proper implementation takes a lot of effort and keyword is 'proper'.
Re:Hmm. by Jesus+2.0 · 2003-12-26 02:06 · Score: 1

Fine, fine, so just have multiple linked lists, one for each possible priority; an array of the count of timeslices you've given out to each priority; an array of the time you've given out to each priority. Loop through priorities (O(1)), decide the priority of the next process to give a timeslice to, and the amount of time that slice should be, using O(1) functions based on the time you've given to each priority and the number of slices you've given to each priority (O(1)), pop the head of the selected priority (O(1)), let it run (O(1)), push it back onto the tail of it's priority's queue (O(1)), rinse (O(1)), repeat (O(1)).

Again, as I have explained several times, I am not claiming that my scheme is so totally wonderful. I'm just saying that it's not difficult to satisfy the mere requirement "O(1) scheduler", so I don't see why so many posters are amazed by that mere fact. I'm sure that this latest version of mine is also not wonderful, but I'm not trying for a wonderful one, nor am I even claiming that if I did really try, mine would be even remotely as good as the real one described in the article. I'm just showing an O(1) scheduler, off the top of my head.

And to do that, all you need to do is make your decisions based on information that you've associated with priorities, rather than information that you've associated with processes. That is not a difficult concept; it's also not a "Eureka!" bolt out of the blue. Rather, it's probably the first idea that should occur to anybody when posed with the problem. And that's all I'm trying to convey. Maybe I should've just said that to begin with, instead of including a trivial example.
Re:Hmm. by Jesus+2.0 · 2003-12-26 02:11 · Score: 1

Right, and then you're surfing a linked list.

No I'm not. I'm popping the head, and pushing onto the tail. Both of these are O(1) operations.

I am never traversing the list.
Re:Hmm. by scrytch · 2003-12-26 03:41 · Score: 1

The original article was MUCH MUCH more in depth it was toned down because of the "glaze over" effect people were getting. I didn't have much say in the editorial process in the matter.

Good lord .. considered syndication then? Because that article was absolutely empty of useful content, and I'd like to get to the actual meat of the article if you did indeed put technical detail in there.

Seriously, demand a reprint. They are destroying your career over there by making you look about as knowledgable as a the average PC Week technology columnist. Or maybe helping it by sanding off the sharp edges of content from your style... seems no article can possibly stand technical detail, even one aimed at an audience of people interested in a damn operating system kernel.

Sorry, I get a little steamed when talked down to...

--
I've finally had it: until slashdot gets article moderation, I am not coming back.
Re:Hmm. by scrytch · 2003-12-26 03:46 · Score: 1

> The run queue is scanned

This by definition is not O(1), unless the queue is bounded. Then what happens, does it just not schedule anything?

I mean, I could consider the travelling salesman problem to be O(1) if my algorithm consists of the following steps:

1. Solve the problem.

Sorry to be flip here, but when I hear of queues, the best behavior I can think of is O(log n) (priority queue). I'm just not seeing the constant bound here. Maybe if the article actually explained how it worked...

--
I've finally had it: until slashdot gets article moderation, I am not coming back.
Re:Hmm. by Anonymous Coward · 2003-12-26 04:35 · Score: 0

If I remember correctly:

Big O is the worst case running time

Big Omega is the best case running time

Big Theta is a bound that is both worst case and best case at the same time (with one expression). (Sometimes referred to as a "tight bound" of the algorithm)

Keep in mind that the expressions don't take in lower polynominals or constants (for instance if the running time is 25n^2 + 5n + 7000 then the expression in any bounds is (n^2))

Keep in mind that big O doesn't mean "tightest worst case running time". As the previous poster indicated, an algorithm with worst case of O(n) is also O(n^2) and O(2^n).

Also, the mixture of constants and multipliers can be different between the upper bound and lower bound in theta.

The bound can be incorrect for some set of numbers less than a certain input number. However, the bound must hold true for all numbers above a certain "n". (This is true of algorithms that take a constant time until they hit a certain level, then they grow at a predictable rate from that point, but the constant time may be higher than the predictor would state it is for low values of n)

Finally, there isn't an "average case" bound that I'm aware of (I'm sure its out there, somewhere).

This is going back 5 or 6 years to my last algorithms analysis class from college, but I think its close to accurate. (Maybe a bit simplistic, but close enough)
Re:Hmm. by Anonymous Coward · 2003-12-26 06:18 · Score: 0

Aww, too bad. :(

The toning down actually made it *more* difficult to the layman. It became too broad and vague, ending up telling nothing specific. Even if a layman wouldn't understand the details, they would give various kinds of information on the context and the relations of things, which is crucial for even an elementary understanding... Ars articles do this too often. They hide all the grassroot level detail, and the editors don't seem to understand that it makes reading harder for laymen like me!
Re:Hmm. by LarryRiedel · 2003-12-26 06:57 · Score: 1

Hmm, it's like this: O(1) Linux Scheduler : Linux 2.4::Hemi : Dodge's old standard issue V8.

"O(1) Scheduler" : consumer/patron/customer :: "Hemi" : consumer/patron/customer

Larry
Re:Hmm. by Spy+Hunter · 2003-12-26 11:47 · Score: 1

Saying "the algorithm is O(n*log(n)) in the average case" is questionable. If you rigorously specify the conditions that an "average case" must satisfy, then you can say "the algorithm is O(n*log(n)) under these conditions." But if you don't define what an average case is, then saying something is O(n*log(n)) in the average case is just talking nonsense, because you can't talk about the "worst" average case. It's not well-defined because the concept of an average case is fuzzy. You can say that "on average, the algorithm takes time proportional to n*log(n)," but that's not the same thing as saying it is O(n*log(n)).
If your algorithms class didn't say that Big-O notation specifically indicated the worst possible case, it was a bad algorithms class. Big-O notation is *all about* the worst case. It definitely specifies absolute worst-case behavior.
Understanding isn't all that matters either. You can be understood by someone but give a bad impression at the same time. If you misuse math notation, that might give a bad impression to someone who really knows how it works (although Big-O notation is so widely abused that most likely nobody would care in this case). Also, if you're actually trying to do formal proofs of things using Big-O notation, all this stuff really does matter.

--
main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
Re:Hmm. by Hast · 2003-12-26 13:08 · Score: 1

That is pretty much what their scheduler does as far as selecting the process to run. (Though they use a trick to find the highest/lowest priority level faster.)

The trick is expanding it to scale properly onto multi processor systems, dynamic priorities and all that. And naturally still keep it O(1).

The expanded article sheds some light on it all.
Re:Hmm. by wedg · 2003-12-26 20:03 · Score: 1

HEY AMIR! Hahahaha. -- Jake.

Nice article. Props and what. W00t. And where the hell have you been mofo'? And how's Jimmah doing?

Back to smokin' and drinkin'.

--
Jake
Dating: while( 1 ){ call_girl(); get_rejected(); drink_40(); } return 0;
Re:Hmm. by Wolfrider · 2003-12-28 18:08 · Score: 1

--Care to explain the diff, or have a handy link?

--
.
== WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??

Yay for Ars by be-fan · 2003-12-25 11:12 · Score: 3, Interesting

While Ars definately isn't targeted at the same audience as, say, KernelTrap, its nice to see there are a few technology websites/publications that aren't dumbed down. I remember when Byte magazine used to publish articles detailing the PowerPC architecture, down to the level of registers and the types of pipelines in the first set of implementations. Compare this to the ZD rags, which are a hair away from calling the CPU the "brain" of the computer!

--
A deep unwavering belief is a sure sign you're missing something...

Re:Yay for Ars by Anonymous Coward · 2003-12-25 11:50 · Score: 0

Yeah, Byte magazine used to kick ass. I remember when they used to carry hardware hacking articles and print PCB solder side masks.

Computercraft/Microcomputer Journal used to be a good one too before they went belly up. Jan Axelson's articles were great.

I don't know of any current magazines that are like those two were.
Re:Yay for Ars by p3d0 · 2003-12-26 00:50 · Score: 1

While Ars definately isn't targeted at the same audience as, say, KernelTrap, its nice to see there are a few technology websites/publications that aren't dumbed down.
Are you joking? I guess you missed this.

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....

Happy BirthYear by Anonymous Coward · 2003-12-25 11:25 · Score: 0

The newer G3 will be the powerful chip of the year 2004 and it is very useful for the decent linux-2.4.23.

The AMD64 architecture is very, very bad because the long mode PAGE TRANSLATION MULTILEVEL (4 LEVELS) is an ASSHOLE versus legacy mode PAGE TRANSLATION BILEVEL (2 LEVELS) of the chips 32 bits.

open4free

Interesting by Gary+Whittles · 2003-12-25 11:31 · Score: 5, Informative

The scheduler is an important part of the kernel, so this is an important thing to understand. This article describes it well, but just to summarize a bit more technically, the benefits of the O(1) scheduler are:

Lean and mean (low overhead).
Scales well with the number of tasks (O(1)).
Scales well with the number of processors (O(1) for scheduling, O(N) for balancing).
Strong affinity avoids tasks bouncing needlessly between CPUs.
Initial affinity makes it likely that request/response-type tasks stay on the same CPU (i.e., good for LMbench lat_udp et al)

BTW, It's good to see that the starvation and affinity problems that plagued the early versions of the O(1) scheduler have been ironed out.

--

3 replies beneath your current threshold

What was the O value od the 2.4 scheduler? by pt99par · 2003-12-25 11:37 · Score: 0

see subject!

Re:What was the O value od the 2.4 scheduler? by Anonymous Coward · 2003-12-25 11:52 · Score: 0

n+1
Re:What was the O value od the 2.4 scheduler? by Anonymous Coward · 2003-12-25 11:54 · Score: 0

Linux 2.4 had a O(69) running time in it scheduler.
The Linux 2.6 O(1) scheduler is therefore better by a factor of 68.
Re:What was the O value od the 2.4 scheduler? by Anonymous Coward · 2003-12-25 12:00 · Score: 0

Actually, the O value of the 2.4 kernel's scheduler is inversely proportional to the square integral logarithmic equation of your dick's size.

Just kidding. The real O value of the 2.4 kernel is documented here
Re:What was the O value od the 2.4 scheduler? by Anonymous Coward · 2003-12-25 12:17 · Score: 0

Do you even know what an integral is?
Re:What was the O value od the 2.4 scheduler? by Anonymous Coward · 2003-12-25 12:22 · Score: 0

Why don't you shove it up your differential asshole, geek?
Re:What was the O value od the 2.4 scheduler? by Anonymous Coward · 2003-12-25 12:23 · Score: 0

Yes.

Do you? Or are you trying to leech knowledge for your calculus homework?
Re:What was the O value od the 2.4 scheduler? by LordBodak · 2003-12-26 01:29 · Score: 2, Insightful

This is actually a really good question and it's sad to see the trolls hijacked the thread.
Without explaining the O() of the 2.4 kernel, what good is an article explaining that 2.6 has a O(1)?

--
LordBodak's journal.

A question if anyone can answer by abhikhurana · 2003-12-25 11:46 · Score: 2, Insightful

That was kinda interesting though it doesn't really explain how they achieve the O(1)ness, and for example how did kernel 2.4 fare in comparison. Anyway, an O(1) scheduler is nice and all under low to moderate loads, but at high loads, when a lot of threads are running, the O(1) behaviour should enusre that the tasks are switching extremely fast, and as every context switch has some related overhead in terms of saving the context etc., that means that performance of the O(1) schedular should be worse than that of 2.4(where tasks are not switching as fast) in theory atleast. So is this really true?

--
What's under yellowstone?

Re:A question if anyone can answer by Tweaker_Phreaker · 2003-12-25 12:07 · Score: 1

The thing that makes it an O(1) scheduler is that it has a static overhead. In high load it should achieve the same throughput but have lower latency when reacting to important events like user input.
Re:A question if anyone can answer by abhikhurana · 2003-12-25 12:21 · Score: 1

Yeah, it has a static overhead per task. As the number of tasks increase, teh overhead will increase as well. It no doubt will be able to maintain a lower latency but if teh CPU load is 100%, then every task switch becomes expensive, so I am not sure about the throughput part in such a case.

--
What's under yellowstone?
Re:A question if anyone can answer by vadim_t · 2003-12-25 12:49 · Score: 1

No, the idea is that the O(1) scheduler has static overhead period. Not per task. It always takes the same time, independently of whether 10 or 10000 processes are running.

Simple example of an O(1) operation: Accessing a position in an array. Whether there is one or 1000 elements, accessing one of them takes the same time. This is simple to exploit as well.

Suppose that depending on the value of a byte (random, so that any value has the same probability of happening) you have to do some work. You can write a switch with 256 choices. This will raise the time needed to do all the required tests linearly.

Or, you can store pointers to the functions that handle each task in a table, and call the function by the pointer stored in table[n]. In this case, you can have any number of functions, and the time needed to decide which one to call will still be the same.
Re:A question if anyone can answer by Jimmy_B · 2003-12-25 15:58 · Score: 1

Suppose that depending on the value of a byte (random, so that any value has the same probability of happening) you have to do some work. You can write a switch with 256 choices. This will raise the time needed to do all the required tests linearly.
Actually, this is incorrect at least in gcc with optimization enabled. While a switch statement is conceptually equivalent to a long if-else if-else chain, the compiler produces a jump table so that the branch reduces to an array lookup of an address, which is O(1). That, by the way, is why you can't have complicated comparisons in a switch - the jump table trick wouldn't work anymore, so it would be no better than an if chain. Sparse possibility spaces are an exception, since in those cases the jump table would take an excessive amount of space.
Re:A question if anyone can answer by Anonymous Coward · 2003-12-25 17:37 · Score: 0

Hmm, I think any decent compiler (for C, that is) would turn a switch into highly optimized code where the number of choices would not matter much (which, anyway, is why the case's (in C) have to be values, can't be expressions).
Re:A question if anyone can answer by lindsley · 2003-12-25 19:02 · Score: 2, Informative

They achieved O(1)ness by making it extremely rare to have to do linear searches of greater than one. Not impossible, but very rare.
There are 140 possible priorities. Per processor. Each priority has its own pair of queues -- runnable-and-not-yet-run (active), and runnable-but-waiting-for-more-time (expired). (If a process isn't runnable, it doesn't even appear in the run queue; already an improvement over 2.4). In the 2.4 scheduler, the system task list WAS the runqueue list, and you had to inspect it, from beginning to end, holding a lock, to find the next best process to run. As the system got busier, guess what -- that search got longer and your holding of the lock had a bigger, cascading effect. With multiple cpus, you might finally have multiple processors decide on the same process ... then the first one grabs it and n-1 processors would have to go back and find the next best choice. Maybe they would conflict again. In 2.6, with 140 priorities often your queue of best processes is very short because you have so many places to put runnable processes, and on a per-processor basis no less.
But let's take a worst case example -- a heavily loaded system. You've got 1000 processes, but a lot of them are daemons waiting for something to happen. Still, as luck would have it, you have 200 processes runnable and waiting to run because your web server just got hit, and 4 processors handling them. Let's further assume that due to bad luck, all of them are stuck on only one of those cpus. And let's slap them with incredibly bad luck and place them ALL at the same priority, which is unlikely but certainly possible. Now that O(1) flies out the window, right? because you have 200 processes in one queue, just like before and now you have to look at all of them to figure out who runs next, right? Wrong. It's a FIFO linked list, and since you know they're all the same priority, you do the fair thing and take the one at the head. Took you a mere blink to determine the highest priority queue that had something (find first bit over a bitmap for the queues) and then a simple assignment to take the first thing off the list and select the fairest choice. As these processes use up their timeslices they are placed on the expired queue, again in FIFO order, and when the active is empty, with another simple assignment the expired becomes the active.
What's the down side? Well, the 2.4 scheduler had one queue (the tasklist) for all the cpus, and that meant it was self-balancing. You didn't end up with "200 processes on one processor"; as the processors needed processes they were handed one from The One Queue. However, in 2.6, a process doesn't randomly wander from its processor's queue. You do need to occasionally go move processes from processor to processor to insure they're not waiting too long for their turn. Otherwise, in the worst case, one processor has all of the runnable processes (all but one waiting) while the other processors sit idle.
So to summarize, under heavy load, 2.4's scheduler spent more and more time agonizing over which process to run, when you really needed it to be more and more efficient. 2.6 spends the same time under heavy and light load, which means more of the cpu is available to actually do the work. The effect is similar to removing three layers of middle management :)
Re:A question if anyone can answer by Anonymous Coward · 2003-12-25 19:44 · Score: 0

No, this is emphatically not true.

The O(1) scheduler doesn't ensure the tasks are running extremely fast, it ensures the kernel can pick the next task to be run extremely fast. The length of time the task is run for is not affected (unless this has been changed elsewhere). The only result of the 0(1) scheduler is that you would increase performance, because more tasks could be run, instead of deciding which one to run.

Yes, decreasing the amount of time allocated to a process will increase the percentage of time spent in a context switch (all other things being equal), but that isn't what the O(1) scheduler does.

Posting as AC cause I don't care to create an account.
Re:A question if anyone can answer by BlueBiker · 2003-12-25 20:41 · Score: 1

A wonderfully lucid explanation, someone pls mod parent up.

Does 2.6 have SMP load balancing based on priority? I'm wondering about the case of a system where CPU 0 has ten high priority processes and CPU 1 has ten low priority processes. Presumably both CPU's could continue crunching away on their respective runqueues, but you'd think it'd be better to distribute the high priority processes across available CPU's.

Re:Not a very INSIGHTFUL article. by Anonymous Coward · 2003-12-25 11:56 · Score: 0

Redundant? Where was the first comment mentioning this?

What about the hardware? by tloh · 2003-12-25 12:02 · Score: 0, Offtopic

My first experience with Linux a few years ago was rather awkward when I had to learn how to prune Redhat 5.2 enough so that my 486 worked decently as a GUI desktop. Granted, the latest Linux still runs better than the latest Redmond has to offer on the the same hardware. I still think in order for Linux to gain more legitimacy, we need to defeat some of our own FUD. Some Linux evangelists I've met have made outrageously vague and ambiguous remarks regarding the speed and efficiency of Linux that make no sense when applied to a modern multimedia PC. It wouldn't be supprising if there are Linux novices out there who have been told of the extreme efficiency of the Linux kernel and are trying to run the latest kernel on obsolete hardware. I knew nothing of linux when I first started out, but I had a geek background and I was stubborn. Most folks weaned on Microsoft may not be as persistant. If you are just starting out with Linux as I was a few years ago, how do you make the appropriate hardware decisions? Does all this new kernel stuff matter to you? If so, why? How do you know it's important?

--
Stay sentient. Don't drink bad milk.

English 101 by kantai · 2003-12-25 12:03 · Score: 1

introduces a number sweeping improvements

Looks as if someone forgot a certain important preposition :-)

Re:English 101 by Anonymous Coward · 2003-12-25 12:47 · Score: 0

It was not forgotten, it was there; but it was swept off. :-)
Re:English 101 by Anonymous Coward · 2003-12-25 13:03 · Score: 0

Looks as if someone forgot a certain important punctuation mark.
Re:English 101 by geogeek6_7 · 2003-12-25 13:22 · Score: 2, Funny

Err. Oops. I blame Soviet Russia.

sidebars??? by mefus · 2003-12-25 12:03 · Score: 3, Insightful

Isn't that what a sidebar is for?

Couldn't they have linked out into more in depth treatments, and saved the complexity for the interested (technically) reader while saving 'readability' for the non-technical ppl?

--
mefus
In Open Society, GPL Software frees YOU!

Ars got it slightly wrong about O(1)... by andy666 · 2003-12-25 12:06 · Score: 5, Informative

O(1) doesn't mean the time is constant, but that in the limit it is bounded by a constant. It can get enourmous, then shrink for large inputs. It could go to 0 in fact.

Re:Ars got it slightly wrong about O(1)... by qc_dk · 2003-12-25 12:55 · Score: 2, Insightful

that is not entirely true it means that for the worst case input, there exist a constant k and a smallest number of processes np where the following equation is fulfilled: Time_To_Switch_n_proces(p)< k where p>np that is not exactly the same as your definition.
Re:Ars got it slightly wrong about O(1)... by Anonymous Coward · 2003-12-25 19:19 · Score: 0

The part about p>np is there to handle the case t(n) = O(f(n)) where f(0)=0 but t(0) is non-zero. In our case f(0)=1, so the definition reduces to "the time is limited by a constant k". Just choose k = limsup(t(n), n=0..Infinity).

Not surprised by this moderation by Anonymous Coward · 2003-12-25 12:10 · Score: 0

Don't post anything about problems with Linux.. you'll get modded down for making Linux look back, and we can't have any of that

Re:Not surprised by this moderation by Anonymous Coward · 2003-12-25 12:12 · Score: 0

s/back/bad/;
Re:Not surprised by this moderation by Anonymous Coward · 2003-12-25 12:40 · Score: 0

>> Don't post anything about problems with Linux.. you'll get modded down for making Linux look back, and we can't have any of that

Linux is not Windows and Slashdot is not Microsoft, son.

Part of the zealotry here is improving Linux. Bugs are not costs, they are opportunities, ladders to fame... oh, and this is basic, you really should've noticed it on your own.

Check your facts.
Re:Not surprised by this moderation by Anonymous Coward · 2003-12-25 13:20 · Score: 0, Flamebait

Non-sequitur.

The person posted something that made Lunix look bad and guess what.. he got modded down.

End of story. Moderation around here is biased and sucks.
Re:Not surprised by this moderation by Anonymous Coward · 2003-12-30 01:41 · Score: 0

Meta Moderation fixes the system, Don't just click 'fair' on everything, take the time to read the post in context.

what is the best 64 bits architecture? by Anonymous Coward · 2003-12-25 12:12 · Score: 0

The best scheme of page translation for the 64 bits architecture quicker is:

Scheme page-16KiB translation tri-level

Each page-16KiB contains 2Ki-entries

The virtual memory of the processes can address upto 2Ki*2Ki*2Ki 16KiB-pages = 8Gi 16KiB-pages = 128 TiB (huge space for many big applications).

open4free

Ah-ha! by tsanth · 2003-12-25 12:18 · Score: 2, Insightful

Thanks; that third paragraph alone was much, much more informative than the linked article. Perhaps, if the editors can be bothered to edit a bit more (unlikely), could Ars provide links to these more technical articles?

Re:Not a very INSIGHTFUL article. by Anonymous Coward · 2003-12-25 12:18 · Score: 0

Nowhere. The moderators are just stupid.

Re:Flash Blocker? by Anonymous Coward · 2003-12-25 12:23 · Score: 0

Use a Mozilla with the Adblock-plugin

Deep algorithm analysis? by Anonymous Coward · 2003-12-25 12:32 · Score: 1, Interesting

While O(1) certainly might sound impressive to someone who's taken a first semester computer science course in university, anyone beyond that would obviously realize you've got to worry about the average case analysis.

To achieve O(1) you usually have to do some tricks that will get you quite a large constant. With an appropriately small number of processes, which I assume is the case with most computers, even an O(n^n) scheduler could be faster if the constant is very, very small.

Have any sites done an average case analysis to show at what point the O(1) scheduler is faster than the old 2.4 kernel? Certainly it isn't always faster or else the old one really did suck.

Re:Deep algorithm analysis? by vadim_t · 2003-12-25 12:54 · Score: 1

Well, my computer has 114 processes running at the time. O(n^n) would be *very* bad here. Heck, it'd probably be bad even if I had just 10 running.

I don't think this will happen any time soon, but a way of working around a slow O(1) algorhitm would be simply measuring the amount of data, and switching to the algorhitm that's faster. Say, with 10 processes running it uses the old one, with more than 100 it switches to O(1).
Re:Deep algorithm analysis? by Anonymous Coward · 2003-12-25 13:03 · Score: 1, Informative

Yes, those questions are central to this discussions. Maybe they are dealt with in the original extended article whose link the author kindly posted above (but is slashdotted right now).

I seem to recall seeing benchmarks about O(1) in Kerneltrap, which confirmed that for a small (less than 50? I can't remember) number of processes, the old scheduler was better.

My main interest is the desktop/notebook/pda environments, though of course servers will benefit from this scheduler (due to their usually higher number of processes).

Even on the desktop, the O(1) could lead to benefits for those who load lots and lots of apps and keep them running. This is not my traditional usage today, as I turn off the computer at night.

But this can change. If anyone is interested, there's a Linux ecology howto, which presents ideas on how to make computers less power hungry.
Re:Deep algorithm analysis? by lkaos · 2003-12-25 13:41 · Score: 0

I'm just using this machine as a typical desktop (I've just got mozilla and gaim running at the moment) and I've got 84 processes. Remember, the scheduler works with any process, thread, or whatever.

O(1) = 1
O(log n) = 7
O(n) = 84
O(n log n) = 537
O(n*n) = 7056

So, even with a damn good O(log n) scheduler, an O(1) scheduler can afford to have a constant factor 7 times worse than a O(log n). So, I don't think this is much an issue here since n is usually pretty large.

--
int func(int a);
func((b += 3, b));
Re:Deep algorithm analysis? by Iamnoone · 2003-12-25 13:50 · Score: 2, Informative

To achieve O(1) you usually have to do some tricks that will get you quite a large constant.
I dont think that is necessarily true. You can (as they are doing here) trade memory use for speed by enumerating a bucket for every possible option and having a method to decide which bucket to look in. The memory requirements change for the number of options but the speed of the decision stays at a (low constant) O(1).
In general, its a matter of building clever, info-rich lookup keys and organizing the data into a data structure that has the same access time for each key.

However, from http://67.163.90.104/~jim/scheduler.html :
Basically a function (find_busiest_queue) is called which will return another runqueue if there exists a runqueue that has 25%+ more runnable processes than the current process. If there is such a runqueue then the runqueues are locked (via each runqueues spinlock) and processes are transfered from the busier runqueue to the current runqueue, then the runqueues are unlocked.

Since I think people lump the loadbalancing portion of an SMP system in with the "scheduler" then clearly the overall scheduling and running of processes is not O(1).
I haven't heard of/read about Sun/HP/IBM/SCO* speaking about their SMP system schedulers and runqueue balancers separately. Anyone? Anyone?

* Yeah, that is a joke, because SCO would never talk about their scheduler (scheduler headers files, OK, yeah, maybe in an open letter to rich CIO's) since they already have the Linux 2.6 O(1) scheduler in their secret underground stash of "scanned into TIFFs and copied onto CD's" source code - Ha!
Re:Deep algorithm analysis? by MoP030 · 2003-12-25 14:04 · Score: 1

O(n)=84 ??? That is an interesting interpretation of the BigO notation. O(n) means that the function describing the time needed for the algorithm to finish can be upper-bounded by a polynomial of degree 1, p(t) = c1*t + c2, where c1 and c2 may be arbitrarily large. I don't think your value of 7 has any meaning in this context. And while an algorithm with low complexity is always interesting from a theoretical point of view, it may not always be feasible to actually use it if the constants are large.

--
the most sexp i get is my paren-mode.
Re:Deep algorithm analysis? by Fulcrum+of+Evil · 2003-12-25 15:36 · Score: 1

So, even with a damn good O(log n) scheduler, an O(1) scheduler can afford to have a constant factor 7 times worse than a O(log n). So, I don't think this is much an issue here since n is usually pretty large.

Of course, for any desktop application, the diff a factor of 7 is something like 1e-7 versus 7e-7 load. I would imagine this is only important on a big, busy MP box.

--
"We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
Re:Deep algorithm analysis? by Sloppy · 2003-12-25 16:03 · Score: 1

With an appropriately small number of processes, which I assume is the case with most computers, even an O(n^n) scheduler could be faster..
And indeed that has been the case; most of our PCs have been running fine, even with linear schedulers. The thing is, some people are running Linux on some pretty big computers that do a lot of work, and they don't have "an appropriately small number of processes." 2.6 makes Linux more suitable for that type of job.
If you read the original, non-dumbed down article (there's a link to it somewhere in this discussion) you'll see that what they did wasn't really all that complex, so 2.6's scheduler will run just fine on our personal computers with few processes, too.
It looks to me, like 2.6 is Just Plain Better. Not that I've actually dared to let it touch any of my machines, yet. ;-) I fear bugs.

--
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
Re:Deep algorithm analysis? by Anonymous Coward · 2003-12-25 16:40 · Score: 1, Interesting

You forget that O() notation doesn't actually tell you how fast something is. It describes the rate of change with relation to the number of items. So, a scheduler that's O(1) and very complicated could take longer to run than an O(n) scheduler that's less complicated. Then, only for larger values of n is the O(1) better. That was the argument used on the mailing list against this new, much slower scheduler. It is slower for the vast majority of Linux users. The only system I've used it on where 2.6 is faster is one that runs several thousand Java threads in a misdesigned application. 2.6 is about 2% faster than 2.4.23. That's nice since most of our systems are running "on the edge" since our traffic is going up much faster than our budget, but for the others, they're stuck on 2.4.23 for now.
Re:Deep algorithm analysis? by lindsley · 2003-12-25 19:28 · Score: 1

Since I think people lump the loadbalancing portion of an SMP system in with the "scheduler" then clearly the overall scheduling and running of processes is not O(1).

Correct. I think the part of scheduling that most people are interested in is "I have an idle processor; how expensive is it to give it a process?" That process (if you will overlook the pun) has been made effectively O(1). The overall load of the scheduler on the system continues to have some dependency on the load of the system. However, even the 2.6 scheduler has some heuristics now that say, "this level of finesse is no longer important after the load is this high" and start shedding some of the clever stuff after a point. Much of the work now is figuring out those heuristics and where the breakeven point on them are, and not surprisingly a lot of them have to do with load balancing between processors.
Re:Deep algorithm analysis? by Anonymous Coward · 2003-12-25 21:17 · Score: 0

With an appropriately small number of processes, which I assume is the case with most computers, even an O(n^n) scheduler could be faster if the constant is very, very small.

No, actually, you lose. "Most computers" have more processes than you have fingers in one hand.
Re:Deep algorithm analysis? by oo_waratah · 2003-12-25 23:20 · Score: 1

I cannot answer scientifically. All I can say is that my computer is now useable under heavy compiling load. I used to suffer "lost mouse", type ahead and correct ahead for large lines of text, now I see only very small pauses.

Downside is that I had a looping process for about 24 hours and I did not notice it until my compile took 4 times longer than it should, I was using the GUI all this time. With power comes responsability, I guess.
Re:Deep algorithm analysis? by Anonymous Coward · 2003-12-25 23:29 · Score: 0

In SVR4, the time-sharing class uses an event-driven scheduler (basically the same idea as the scheduler in Linux 2.6 but its implementation is quite different), which is O(1) in the number of threads. Thus all SVR4-based operating systems (including SCO UNIX and Solaris) have this scheduler. SCO does not own this concept. It was published in 1986 in a paper from Straathof.
Re:Deep algorithm analysis? by lkaos · 2003-12-27 12:19 · Score: 1

Something certainly can not be bounded by a function of any degree where you allow the coefficients to be arbitrarily large.

Rather, O(n) describes the upper bound of a function. A bubblesort has an O(n) of n*n and rightly so an O(n) of n*n*n because they both are upper bounds.

O(n) has the unfortunate nature to be described both in a traditional sense O(n) = log n where O is a function of the variable n and in a more relaxed sense of O(n) which really is O(n) = n.

O(1) is more properly written as O(n) = 1. This means that for any value of n, the performance is always the same.

I've chosen to demonstrate that since the average number of processes on my machine is 84, then the performance of an O(1) algorithm can be as bad as an O(n) function where n=84.

What I was trying to establish is that since n is relatively high usually, that if you consider a scheduler that was O(log n), then on my desktop, the O(1) scheduler could afford to be 7 times worse in terms of scheduling a single process.

BTW: it also is important to note that any algorithm has an O(1) solution using a lookup table. In fact, this solution also has an extremely low constant factor. However, these solutions are not often practical because of other factors. Performance analysis is a complicated beast.

--
int func(int a);
func((b += 3, b));

Many programs are multithreading now... by Kjella · 2003-12-25 12:32 · Score: 1

Well, only if you want to run only as many threads as you have CPUs. The windows machine I'm using right now has 37 processes, some I'm sure with multiple threads.

Did a quick check here, 54 processes, 461 threads. Something tells me I won't see a computer with 4-500 CPU cores in my desktop, ever. Not to mention that if you did, I'd want some to be more important. I'm sure many of those threads are just sitting there waiting for input, e.g. GUI threads...

Kjella

--
Live today, because you never know what tomorrow brings

Thanks. by Anonymous Coward · 2003-12-25 12:34 · Score: 0

That was extremely informative; thank you for posting it.

-- Super Ugly Ultraman

Re:Flash Blocker? by Anonymous Coward · 2003-12-25 12:37 · Score: 0

Plan9.

Preemption defined incorrectly, I think by vrt3 · 2003-12-25 12:39 · Score: 4, Insightful

In the article, you write:

It is important to remember that a process can be stopped in the middle of its timeslice; this is the purpose of preemption.

The purpose of preemption is not to stop processes in the middle of their timeslice; the purpose is to stop them when the scheduler decides to, not only when the process itself decides to give another process a chance to run. The latter is called cooperative multitasking, and is used in Windows 3.x and MacOS prior to X.

Timeslices are used in order to implement preemption: a process stops either when it blocks (waiting for I/O) or otherwise voluntarily gives up the rest of its timeslice, or when it's timeslice is used up.

About the article in general: I'm surprised about the dumbing down. Other articles on Ars, including but not limited to the series about processor technologies (e.g. the one about hyperthreading) are much more thorough and detailed.

--
This sig under construction. Please check back later.

Re:Preemption defined incorrectly, I think by iusty · 2003-12-26 02:00 · Score: 1

I think you are confusing the things a little: in 2.6, the kernel itself is preemptible, while user processes where always preemptible in Linux.

I mean, your definition is correct, but the big win in 2.6 is that processes can be preempted while being in a syscall (i.e. really the kernel runs), so it is wrong to suggest that previously (2.6) Linux had the same algorithm as Windows 3.x.
Re:Preemption defined incorrectly, I think by vrt3 · 2003-12-26 02:13 · Score: 1

Of course, I never meant to suggest that Linux 2.6 had the same kind of multitasking as Windows 3.x. I know that Linux has had preemptive multitasking for a very long time, even since the very beginning I suppose.

--
This sig under construction. Please check back later.

alpha by Anonymous Coward · 2003-12-25 12:40 · Score: 0

# Important Stuff: Please try to keep posts on topic.
# Try to reply to other people's comments instead of starting new threads.
# Read other people's messages before posting your own to avoid simply duplicating what has already been said.
# Use a clear subject that describes what your message is about.
# Offtopic, Inflammatory, Inappropriate, Illegal, or Offensive comments might be moderated. (You can read everything, even moderated posts, by
adjusting your threshold on the User Preferences Page)

See lovely pictures of Compaq and HP executives here

Re:Flash Blocker? by Anonymous Coward · 2003-12-25 12:48 · Score: 0

lynx?

How is this different from the BSD/VMS schedulers? by Anonymous Coward · 2003-12-25 12:48 · Score: 2, Interesting

Is this new? BSD and before that VMS used the same mechanism, priority queues for twenty years. The VAX architecture even had instructions to do this in 28 instructions. Here is some BSD code:

static struct rq queues[32]; static u_int32_t queuebits; int qi; qi = (current->priority >> 2); SetBit(&queuebits, qi); InsertQueue(&queues[current->priority ], current); qi = FindFirstSet(queuebits); if (RemoveQueue(&queues[qi], &current) == Q_EMPTY) ClearBit(&queuebits, qi);

You might want to look at this article:
www.usenix.org/publications/library/proceedings/al s00/2000papers/papers/full_papers/sears/sears_html /

Technically, this is not O(1) by Anonymous Coward · 2003-12-25 12:49 · Score: 2, Interesting

Let me explain: the Ars article says

When it comes time to select a process, the scheduler finds the highest priority level containing available processes. It then selects a process from the desired priority level. Because the number of priority levels is fixed, the scheduler always takes a constant amount of time.

Yes, but it does need to inspect all the processes within a priority level, right??

As the total number of processes grows, the average number of processes within a priority level also grows, and so does scheduling time. So it's not O(1).

Unless choosing a process from a priority level is also constant time -- say, if it were a round-robin queue.

What am I missing?

Re:Technically, this is not O(1) by Anonymous Coward · 2003-12-25 17:46 · Score: 0

What makes you think picking a process from a given priority level would not be O(1)? If all your M process are in the same priority level, you use any of them. Pickining a (any) process from a given list is O(1): it doesn't matter how many elements you have in the list, you can use the head and be done with it.

Re:Flash Blocker? by kars · 2003-12-25 12:52 · Score: 1

http://www.squarefree.com/userstyles/xbl.html has some info on setting this up in Mozilla. The nice thing about this method is that you'll actually get a clickable button. The flash animation doesn't show up until you click it.

Or you can go to http://www.squarefree.com/bookmarklets/zap.html where you can find bookmarklets that clean up a page after it's loaded. Very handy :)

--
Take life easy: one bit at a time.

Trade offs? by qc_dk · 2003-12-25 13:02 · Score: 1

Does anybody know what tradeoffs if any have been made to make the scheduler O(1)??
Is it a better algorithm implementing the same scheduling scheme or has the scheme changed??
I mean a O(1) scheduler is extremely easy if you are satisfied with a round robin scheduler:

newprocess = processqueue.pop();
processqueue.push(oldprocess);
do_context_switch;

Which would be O(1) for a sensible choice of queue.

Re:Trade offs? by mikera · 2003-12-26 01:26 · Score: 1

Your algorithm is of course O(1) but it doesn't allow for differing priority levels between processes. You need this if you want high priority processes to pre-empt low priority background processes at appropriate moments, which is pretty much essential for real-time or interactive applications.

The Linux O(1) scheduler does this. It implements a queue for each of 140 different priority levels, and then using an O(1) technique to find the queue with the highest priority that still contains a process waiting to run. It doesn't distinguish between processes in the same priority level, so effectively it's doing round-robin for processes of the same priority.

I think the Linux scheduler is pretty excellent from an algorithmic perspective. I don't see any really big trade-offs that have been made. You could write a faster, simpler scheduler that was optimized to handle a very small number of processes but that wouldn't scale well and could only really make sense for specialised embedded applications.

Remember that the real overhead in scheduling typically isn't actually the algorithm itself - it's the large cost of doing context switches, cache flushes etc involved in switching to a new process.

You therefore gain far more by making intelligent scheduling decisions than by squeezing a few cycles out of the scheduler itself. There's therefore still plenty to be done in the fine tuning of the current O(1) scheduler, e.g. getting the best scheduling behaviour for particular classes of processor architecture or application workload.

Re:Flash Blocker? by Anonymous Coward · 2003-12-25 13:04 · Score: 0

easy, dont install that crap

Re:Arse? by Anonymous Coward · 2003-12-25 13:04 · Score: 0

It gets even better: Do you know what the highest rank in the German navy is? It's called "Grossadmiral". I think thats pretty gross!

Well... by tsanth · 2003-12-25 13:06 · Score: 1

Yes, but it does need to inspect all the processes within a priority level, right??

The analysis in the thread above reveals that this needn't be the case: assuming that find_first_bit() takes O(1) time, that makes the search time only as large as the number of priority levels; presumably, that number is also constant. When the article says "selects a process," I believe it ought to mean, "selects the first eligible process in that priority level"--given either a bitmap or a priority queue, that operation takes O(1) time.

Re:Well... by mikera · 2003-12-26 00:45 · Score: 1

You're exactly right.

Note though that there's an advantage to the linux scheduler method over a priority queue: although both can access the (joint) highest priority process in O(1) time, it generally takes O(ln n) time to insert/delete a process in a priority queue, whereas the linux scheduler technique can do this in O(1) time as well.

Re:Multi core proc's -- flamebait? by Anonymous Coward · 2003-12-25 13:10 · Score: 0

Mod me down, but keep the parent post either offtopic, redundant, overrated, or whatever. It's not flamebait , read the fscking moderator guidelines. Djees, stupid mods.

Re:Not a very INSIGHTFUL article. by geogeek6_7 · 2003-12-25 13:12 · Score: 1

If you were say, something other than a reactionist troll, you might have realized a couple of things:

Big-O notation is not intuitive.
Linux.ars is a generalized Linux publication.

Because Big-0 notation is unintuitive, and indeed because the function of a scheduler is not known to most, Ars made an effort to explain it. If you already knew the information it provided, good for you! At least you have intelligence in some areas of your existence.

Because Linux.ars is a generalized publication, they also review software and other such tidbits relating to linux. I suggest following it week to week-- its an enjoyable and informative read.

UNIX catches up with mainframes by Animats · 2003-12-25 13:13 · Score: 5, Informative

For historical reasons, revolving around the "process table" design in PDP-11 UNIX, the schedulers in UNIX-like OSs were originally very dumb. The one in V6 UNIX was so bad that if three processes were compute-bound, the scheduler would stall.

Mainframe schedulers have been O(1) for a long time. The one for UNIVAC 1108 EXEC 8 was O(1) in 1967. (And you can still buy a Unisys ClearPath server with that code in it.)

The traditional implementation is a set of FIFO queues, one for each priority. For systems with only a few priorities, that's just fine. As the number of different priorities increases, you spend too much time checking the empty queues, so there are tree-like structures to get around that problem.

Scheduling today is more complex because of cache issues. If you context-switch too much, you thrash in the CPU caches. On the other hand, everybody has so much memory today that paging is mostly unnecessary.

Re:UNIX catches up with mainframes by Anonymous Coward · 2003-12-26 06:48 · Score: 0

Scheduling today is more complex because of cache issues. If you context-switch too much, you thrash in the CPU caches.

It depends which CPU you're using. For many modern CPUs, the addresses in cache are tagged with their physical addresses. So flushing the cache on a context switch is not necessary.

Linus Didn't steal it from SCO. by Anonymous Coward · 2003-12-25 13:16 · Score: 3, Funny

Please stop posting SCO code here, it only gets Darl all excited.

Re:Linus Didn't steal it from SCO. by Anonymous Coward · 2003-12-25 16:53 · Score: 0

I SUE YOU!!

Kindly,

From the Law office of the lawyer reprenting Mr. MC Bride

Look under "Timeslices and niceness" by Kjella · 2003-12-25 13:19 · Score: 1

Personally, I didn't feel it said much until you got here:

"The scheduler keeps track of all these timeslices with two lists of processes. The first list contains processes that have yet to use their timeslice; the second contains those processes which have used their time. When a process uses its timeslice, the scheduler calculates a new timeslice (...). The process then gets inserted in the second list. When the first list becomes empty, the second list takes the place of the first, and vice-versa."

That's it, basicly... They move the processes from one list (unused) to another (spent), and when empty reverse direction and role. They calculate the new timeslice every time a process is moved, so each task change is in constant time, O(1).

Of course, that doesn't really explain why that's a really good idea. To do that, you'd have to compare it to other ways of doing it, which would get rather involved, and rather technical quite quickly. For one, Linux has been doing it by calculating all the timeslices at once (i.e. the "unused" list), executing the tasks in that list and starting over. Which is O(n), since the task of calculating all the timeslices was increasing with n.

Another thing is the speed of inserting and deleting processes. Which is very simple in the O(1) scheduler - you simply append it to the unused list, or delete it (when it's processed? at a flip? dunno). As opposed to having some kind of list where you'd get "holes" from removals, the constant flow back and forth "closes" any such holes.

It might sound fairly simple, but it's far from trivial making it actually work well. It may be O(1), which is nice for scaling, but it still needs to make a smart schedule. For one, how can it "adapt" to say, many other high-priority processes running, and still being O(1)?

It can't simply assume that just because it has a high priority, award itself a huge timeslice. There may be many of them, or higher prioritated tasks, it's all relative. But if you loop through the process list to check, you have a O(n) algorithm again.

So, there's actually quite a bit of work behind it, which is also why it's first here now in 2.6. If it was that easy, I think someone would have thought of it sooner...

Kjella

--
Live today, because you never know what tomorrow brings

Sounds like an RPG by blixel · 2003-12-25 13:20 · Score: 4, Funny

When a process uses its timeslice, the scheduler calculates a new timeslice by adding the dynamic priority bonus to the static priority. The process then gets inserted in the second list. When the first list becomes empty, the second list takes the place of the first, and vice-versa. This allows the scheduler to continuously calculate timeslices with minimal computational overhead.

Sounds like the rules to an AD&D adventure.

Re:Sounds like an RPG by Tumbleweed · 2003-12-25 17:17 · Score: 1

Sounds more like the rules to Fizzbin...

Re:Flash Blocker? by Santa_Clause · 2003-12-25 13:28 · Score: 1

Thanks for the link. How anybody can read an article with a flashing/animated ad is beyond me.

--
Don't forget, Christmas is coming, and I check my list twice!

A real explanation by dtfinch · 2003-12-25 13:32 · Score: 1

Would tell you enough to understand how it works. We already knew what O(1) means.

Question by wolfb · 2003-12-25 13:34 · Score: 3, Interesting

The mechanism for recalculation of timeslices in previous Linux kernel's was very simple. When every process had its timeslice completely depleted (they were all 0) the kernel would simply go through every process and recalculate its timeslice and start execution again at the highest priority runnable process. While this is the most obvious solution it is also very inefficient, executing in O(n) time.

Ok, its easy to see why this is O(n).

The 2.6 scheduler uses a simple yet effective method for getting rid of this problem, it uses two priority arrays! One priority array is for processes that are runnable, and one priority array is for processes that are not runnable (they have depleted their timeslice). This way if when a process has depleted its timeslice the scheduler simply recalculates its timeslice, removes it from the active array, and inserts it into the expired array.

How is this not O(n)? The time slice calculation still occurs for each process, just not all at once for all processes. Each process still gets its time slice calcuated, it is removed from one queue, and inserted into another. Is there some other unmentioned trick that eliminates the calculations? Or was there something else that made the 2.4 scheduler O(n), such as finding the highest priority process?

So when all processes have depleted their timeslices there is no need to recalculate timeslices for every process, the two arrays are just switched (for the code oriented among us: they are accessed via pointers and the pointers are simply switched).

So the calculation is done per process as they finish their time slice, rather then at the end when all the processes are done. I still don't see why this would imply better efficiency. Am I missing something?

At any rate, thanks for the link, it was much more informative than the published article.

Re:Question by Nexx · 2003-12-25 14:48 · Score: 4, Informative

It's not. However, instead of having a slight pause (that scales linearly with the number of processes in-flight) after all the processes have exhausted their timeslices, there is a constant *tiny* pause after each process exhausting their timeslice. Since previously, the kernel was not preemptable, the O(n) recalculation would potentially cause, on large systems especially, a pause while the timeslices were recalculated. The new scheduler increases system responsiveness by eliminating this mass-recalculation.

I think the majority of major improvements on the 2.6 kernel over 2.4 were placed for increasing system responsiveness.

(and yes, I'm using my Karma bonus)
Re:Question by Fulcrum+of+Evil · 2003-12-25 15:29 · Score: 1

How is this not O(n)? The time slice calculation still occurs for each process, just not all at once for all processes. Each process still gets its time slice calcuated, it is removed from one queue, and inserted into another. Is there some other unmentioned trick that eliminates the calculations? Or was there something else that made the 2.4 scheduler O(n), such as finding the highest priority process?

Well, the scheduler spends roughly the same amount of time scheduling a thread irrespective of the number of runnable processes, so it's O(1). You could call it O(n), with n being the number of timeslices, but that's silly.

--
"We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
Re:Question by AndroidCat · 2003-12-25 16:08 · Score: 1

I can't dive into the pre-neutered version right now (turkey overloadzzzz), but won't this also help systems with large numbers of processors because it won't be recalcing and shuffling processes across processors as much?

--
One line blog. I hear that they're called Twitters now.
Re:Question by wolfb · 2003-12-25 16:32 · Score: 1

No question, I agree that a small calculation per process is better then a large, non-interruptible one every now and then. I just thought there might be something else I missed. Thanks for the response!
Re:Question by Nexx · 2003-12-25 17:57 · Score: 1

probably, but that's a separate issue. You can do processor affinity with the previous O(n) scheduler too.
Re:Question by InfiniteWisdom · 2003-12-25 18:06 · Score: 1

The mechanism for recalculation of timeslices in previous Linux kernel's was very simple. When every process had its timeslice completely depleted (they were all 0) the kernel would simply go through every process and recalculate its timeslice and start execution again at the highest priority runnable process. While this is the most obvious solution it is also very inefficient, executing in O(n) time.

Ok, its easy to see why this is O(n).

This still look like O(1) (i>amortized to me... or amy I missing something?
Re:Question by wolfb · 2003-12-25 18:08 · Score: 1

Well, the scheduler spends roughly the same amount of time scheduling a thread irrespective of the number of runnable processes, so it's O(1).

Well, based on the description I could also argue the same for the old scheduler. Isn't 1*O(n) = n*O(1), with n being the number of runnable processes? In the end, they both consume the same amount of CPU time regardless of n. I was hoping there was more, so I could learn some magic (sufficiently advanced techniques), and because I feel it is misleading to call this O(1) without qualifications or explanation.

You could call it O(n), with n being the number of timeslices, but that's silly.

True... And sorry, but I can't resist a silly reponse. There is an O(1) sort function, if only you don't expect a sorted array after a single call, and if you're willing to ignore how many times you will call sort, and if you only care about characterizing a single unit of work with the O() expression... If there was a suitable application for this algorithm, where the most important consideration was how long a single step of the sort process took, would that justify calling the algorithm O(1)?
Re:Question by krakrjak · 2003-12-25 18:15 · Score: 5, Informative

Let's see if I can remember anything from my algorithms class.

How is this not O(n)?

Read on for an explanation. I do believe you gave the answer to your own question in your post. I will also assume that you understand "Big-O" notation well enough to follow the answer.

Each process still gets its time slice calcuated, it is removed from one queue, and inserted into another.

Correct. You've nailed it. Insertion into a Que operates in O(1) time. Removal from the Que is also a constant time algorithm. A Que inserts at the tail of the list (or the end of the array) and De-Ques from the Head of the list (or the beginning of the array). Ques do not permit searching, sorting or accessing the data after the first element.

If we De-Que a process, run it then calculate some information and Que it up that would run in O(c1*O(1)+c2*O(1)) = O(1).

This is still an oversimplification and does not take into account the possibility of a Priority Que in which insertion runs in O(lg n) and De-Queing runs in O(1). So a heap sort using a Priority Que runs in O(n*lg n).

Hope that helps.
Re:Question by eggnet · 2003-12-25 18:49 · Score: 1

I can't tell if you're an idiot or what, but the complete function of the scheduler is to determine what process runs next. The point of the article is that despite the seeming complexity of adding more processes to a running machine, the scheduler always takes the same amount of time to determine which process should run next regardless of how many processes there are to select from.

A sorting algorithm isn't finished until the data is sorted. A scheduler isn't finished until the next process is selected. You wouldn't measure a sorting algorithm by the time it takes to sort all the data anyone in the world would ever use it to sort, would you?
Re:Question by lindsley · 2003-12-25 19:16 · Score: 2, Informative

So the calculation is done per process as they finish their time slice, rather then at the end when all the processes are done. I still don't see why this would imply better efficiency. Am I missing something?

Depends on whether you're measuring the time to select another process to run, or the overhead necessary to do correct scheduling. Timeslice recalculation must happen, that's true. But on 2.4, it needed to happen in conjunction with other information gathered from the task list. There was one queue: the system task list. And to safely access that you need to take a read side of a read/write spinlock. Seems harmless enough, until you consider the side effect that a read lock will block a write lock from being acquired. So every time the scheduler has to inspect that list, it may be stopping some other part of the scheduler (possibly on another processor) from implementing some decision. In the worst case, you hold a read lock long enough that other processors' decisions become invalid. Once they acquire write permission they realize that the process has died or gone to sleep or is no longer a good choice for other reasons and then they must go back, grab a read lock (!) and search again. Thus blocking others who have reached decisions, etc.

2.6 (O(1)) significantly reduces lock contention. That alone accounts for tremendous savings under load. The cost is a little more memory and moderately more complexity. In this case, it is a very good trade.
Re:Question by cdrudge · 2003-12-26 02:19 · Score: 1

Well, based on the description I could also argue the same for the old scheduler. Isn't 1*O(n) = n*O(1), with n being the number of runnable processes? In the end, they both consume the same amount of CPU time regardless of n. I was hoping there was more, so I could learn some magic (sufficiently advanced techniques), and because I feel it is misleading to call this O(1) without qualifications or explanation.

In this case, the 1*O(n) = n*O(1) doesn't fit this example. If I underdstand everything correctly, with the old scheduler it would have to look through all processes to find the next process to run. If you had ten processes, it would take ten look ups. A million processes would take a million lookups. So to find one next process, it would have to search through n processes, giving it a O(n) runtime. Now with the new scheduler, it does the look up in constant time. The time that it takes doesn't scale up as the number of processes goes up. Since it is fixed, it gets a O(1).

Your "equation" revised would look something like:
O(n) > O(1) or n*O(n) > n*O(1). In each case the left is the old scheduler and the right is the new scheduler. n = the number of processes to be processed.

Lets assume that there are 10 processes to be chosen from. With the old scheduler, you would have to look through all 10 processes to process 1, remove it, process it, then look through 9, remove it, process it, look through 8, etc. So you would have to look through 55 processes ultimately to process them all. With the new way, you only have to do 10.

Of course, I may be completely wrong so in that case, ignore me. :)
Re:Question by wolfb · 2003-12-26 02:35 · Score: 1

You're probably right, and if that is the case, then the answser is perfectly clear.

It just sounded like 2.4 scheduler didn't have to do any work until all timeslices were used up, then it figured out what processes should run next, wait until they all run, and so on.
Re:Question by wolfb · 2003-12-26 04:41 · Score: 1

Responses to my question included plenty of explanations (yours included, setting insults aside), that were not present when I posted my question. Do not take the following as further challanges to the article, the algorithms described, what a scheduler is; its simply a clarification for the example you felt deserved an insult. I might have tried too hard to reduce the sorting example to absurdity, but but the point remains the same.

Suppose an application needs to keep an ordered set of data. Further suppose that one implementation does this with quick sort every once in a while (does many insertions/delitions between sorts), and another implementation maintains the invariant of the sorted data set at each insertion/deltion with bubble sort. Obviously bubble sort is far from the best algorithm for this, but it is still suitable for sake of argument. Bubble sort can maintain the invariant in at most n steps per operation. On the other hand, quick sort needs nlogn steps to sort the data. If sorting is done infrequently enough, then quicksort will be faster overall (nlogn vs n^2). Which implementation is better? If you criteria is to minimize the worst delay, or to amortize the cost of sorting over each operation, then the bubble sort implementation will behave "better" then the quick sort one. However, comarping these two implementations without stating your criteria or qualifications does not give you the license to call bubble sort an O(n) algorithm in the same breath as calling quick sort an O(nlogn) algorithm. It would appear an apples to oranges comparison.

The article compared two algorithms. One that ran once for each timeslice taking a constant time, and another that ran once for every n timeslices, taking time proportional to n. Without further explanation, or your defininition for what the scheduler is, it was not clear why one could be called O(1) while the other was called O(n).
Re:Question by eggnet · 2004-01-04 07:14 · Score: 1

I believe the article explains the basic function of a scheduler before going into the extremely technical description that maybe most people here didn't read. A good summary taken from the article:

Whether your computer is executing 1000 tasks simultaneously or 10, the scheduler will always take the same amount of time to pick the next task.

There are things happening all the time that affect task priority, changing which task should run next, etc. The hard drive is ready with data, same w/ the network card or even your mouse and keyboard input. The point is that the items that you are sorting in effect change their values all the time, so in fact you do not have a maintained sorted list at the beginning of every time slice.

The best in-place sorting algorithm running on unbounded data on an infinite precision computer is O(log(n)). This is basically a component of a binary search type sort, algorithmically no worse than the best generalized sort, O(n log(n))

I'm assuming you've taken an algorithms course, possibly in C.

Let's say you redefine the problem so that you must sort a stream of integers between 1 and 10. You simply make a 10 element array (which, in C, is indexable in O(1) time) of linked lists. When a number comes in, you add it to the appropriate list in O(1) time, either at the head or tail (if you want a FIFO type system).

In an algorithms course, you learn that the best sort is O(n log(n)). If it was a good one, you learn that if you restrict the data being sorted as I have done above, it is possible to attain O(n).

To have an O(1) scheduler, you have to maintain constant execution time in the face of an arbitrary quantity of your processes changing their sorting values between executions. That's not hard, like I did above you can just simplify the problem until you get to O(1), although you'd have to simplify it much further than I did.

What is hard is doing it in such a way that the system doesn't lose data, doesn't force the user or other processes to wait unnecessarily (latency), and utilizes the processors in the system at near 100% efficiency (throughput).

Even to the average geek with a basic grasp of algorithms, it seems hard to beat O(n) when you factor in that external events can change process priorities.

That said, it doesn't seem impossible either. I wonder if it is truly better in every way, even on a 386 (i.e., is it O(1) at the cost of taking more time to run when n is small), or if it is only better on today's machines when the scheduler is using only a small fraction of the CPU power or may even be stored in an on-chip cache marked permanent or something.

Interesting by Anonymous Coward · 2003-12-25 13:38 · Score: 1, Funny

Post a link to an incomplete, semi-technical, Linux-related article on Slashdot front page
Include a Microsoft Windows full page ads in the abovementioned Linux-related article
Post comments with links to a somewhat more complete version of said article
Get modded up with O(n!) speed for saying "I'm the author of that article"
???
Profit!!!

For computing time.. by Kjella · 2003-12-25 13:46 · Score: 1

...I don't think I've ever seen an algorithm where O(1) didn't imply that the time was constant. They concentrated on the important part, O(1) vs O(f(n)). Ok maybe you can construct a formula that is O(1) and non-constant, but it's not very relevant to this article or even computing time algorithms in general.

Personally, I'd think that the difference between constant and constant time disappeared in the "dumbing down". Once you start talking about "being bounded in the limit by a constant", most people won't understand it at sight. Even the ones that once knew it would probably have to look up a math book to remember what exactly that ment.

Either that, or it was simply mixed up in editing. They ment to say "The scheduler is using a constant time, and is so O(1)", rather than "The scheduler is O(1), and is so using a constant time." Either way, I think it's quibbling compared to the relevance and intended audience of the article.

Kjella

--
Live today, because you never know what tomorrow brings

Re:For computing time.. by jhunsake · 2003-12-25 18:05 · Score: 1

I don't think I've ever seen an algorithm where O(1) didn't imply that the time was constant.

Umm, how about the O(1) scheduler presented in the article. When the queues are empty, there is the extra time of switching them. So if

n is the time to pick a process from a nonempty queue

and

p is the time to switch the queues

Then time n + p is bound, but frequently the scheduler takes only time n.
Re:For computing time.. by p3d0 · 2003-12-26 00:47 · Score: 1

How about hash table insertion? The time it takes to do an insertion depends on whether there is a collision, but it is still bound by a constant that does not depend on the size of the input.

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....

Re:Not a very INSIGHTFUL article. by Curtman · 2003-12-25 13:52 · Score: 1

Who was it that started this belief that redundancy necessarily means something has already been stated? It doesn't.

Webster's Revised Unabridged Dictionary (1913)
Redundant Re*dun"dant (-dant), a. L. redundans, -antis, p.

1. Exceeding what is natural or necessary; superabundant; exuberant; as, a redundant quantity of bile or food.

2. Using more words or images than are necessary or useful; pleonastic.

Syn: Superfluous; superabundant; excessive; exuberant; overflowing; plentiful; copious.

Re:Arse? by Anonymous Coward · 2003-12-25 14:04 · Score: 0

whats wierd is they have a rear admiral upper half and rear admiral lower half. nice troll by the way

Preemption and disk requests - educate me by ortholattice · 2003-12-25 14:05 · Score: 4, Insightful

I think this may be important:

The benefits of preemption are great enough that the kernel developers decided to make the kernel itself preemptible. This allows a kernel task such as disk I/O to be preempted, for instance, by a keyboard event. This allows the system to be more responsive to the user's demands. The 2.6 kernel is able to efficiently manage its own tasks in addition to user processes.

One of the most time-wasting, counterproductive things for me on Windows, and to a lesser extent on Linux, is that background tasks can essentially take over the computer because of disk thrashing. Sometimes (on Windows XP) I've seen it literally take minutes to bring a window to the front and repaint it while the disk is thrashing doing god-knows-what in the background. It is impossible to get any serious work done when that is happening. Setting my process to highest priority (on Windows) doesn't seem to help when disk thrashing is involved. I can understand how it might take a few seconds to swap another process out and bring mine in, but not minutes.

Now I don't know exactly what's happening behind the scenes, but as soon as I click on a window to bring it in focus, I want everything on the computer to be devoted to doing just that until its done, regardless of what disk blocks from other processes are in the queue. My disk block requests should go in front of all of the others unconditionally. In other words I should see essentially no more latency than if the computer were completely idle in the background, other than just the minimum delay required to bring it back into memory if it's swapped out. For all I care just completely freeze all other processes on the computer until my request is finished, or at least give me the option to specify that it be done that way. My productivity should override everything else on my own desktop/workstation computer.

So, educate me about what I may be overlooking here, and will 2.6 help this? I mean for Linux of course, not Windows, which may be a lost cause (:

Re:Preemption and disk requests - educate me by edwdig · 2003-12-25 15:53 · Score: 1

What you're seeing is the problem with virtual memory implemented via paging. It works ok if you're only swapping a little bit, but if you have to swap a lot, its terrible since you're only doing it 4k at a time. Swapping a page to disk is has very high overhead, especially when you factor in the disk seek time. So when you're doing a lot of it, the system grinds to a halt. Pentiums support 4 meg pages, but when you get that big, you end up swapping stuff back in very soon after you swapped it out.
Re:Preemption and disk requests - educate me by sholden · 2003-12-25 16:48 · Score: 2, Informative

The computer *is* devoting everything to getting your window to the front.

The application wasn't being used and so its memory was swapped to disk, now that memory is needed again so it needs to be brought back. But before it is brought back some space needs to be cleared which means writing some existing data from memory to disk (which is a slow process).

The solution is simply to have enough memory installed for your working set of applications (which may mean installing more memory, using less applications, or using more memory efficient programs).
Re:Preemption and disk requests - educate me by borgasm · 2003-12-25 16:52 · Score: 4, Informative

Trashing occurs when you have to page out to virtual memory that is mapped to the hard drive. Your computer is spending more time moving data in and out of virtual memory than it is doing the actual computation.

This usually happens when a process is in a loop, and repeatedly uses the same memory blocks, but can't keep them all in physical memory. Thus, it has to replace the pages every time it accesses/modifies them.

Even if a process is at a low priority...if it comes on the CPU, it has the potential to trash memory and slow the works down.

If an application is trashing, then you might be able to get away with scheduling it to not hit the disk. If the OS is trashing, you are SOL.
Re:Preemption and disk requests - educate me by Anonymous Coward · 2003-12-25 19:00 · Score: 0

That CPU-sucking is due to your IDE disks and their software approach to managing the disk. Get yourself some SCSI hard drives and a SCSI card with true processing offload and all that will go away.
Re:Preemption and disk requests - educate me by robhancock · 2003-12-25 19:23 · Score: 2, Interesting

This is more of a virtual memory or disk scheduling issue than process scheduling. Even if a process has full access to the CPU, it still can't run if either it has code or data it needs that are not paged into RAM, or need to be loaded in initially from disk.

Especially in Windows, it seems like if a process is hammering the disk in the background, other processes can be starved from accessing the disk. Windows seems especially braindead at times when multiple processes fight over the disk, and the disk is wildly seeking all over the place and not accomplishing much - it would be much more efficient if it would just devote the disk for a longer time to each request.

I don't know what Windows has in it for disk I/O scheduling, but it's probably safe to say that Linux is a bit more intelligent in this regard..
Re:Preemption and disk requests - educate me by ortholattice · 2003-12-25 19:59 · Score: 1

Even if a process is at a low priority...if it comes on the CPU, it has the potential to trash memory and slow the works down.
Exactly, this is the problem. I want the ability the force all (non-critical, non-realtime) processes not associated with whatever window is currently in focus to freeze completely, until the process associated with the window in focus is idle, waiting for neither CPU nor disk. Or until the user brings another window into focus, in which case this behavior would immediately shift to the new window.
In other words I am suggesting a system option to do this: if the process associated with the window that's in focus is not in the idle state, make everything else freeze. Or even this: an option to freeze everything else unless the process associated with the window that's in focus has been idle for at least x seconds (user settable). This latter would prevent swapping out while the user is typing.
It seems to me this would greatly enhance the user's perception of speed on machines with limited memory, even if it means some background tasks might end up taking longer because they lose out on some otherwise idle CPU time.
Re:Preemption and disk requests - educate me by Halo1 · 2003-12-25 21:23 · Score: 4, Informative

Exactly, this is the problem. I want the ability the force all (non-critical, non-realtime) processes not associated with whatever window is currently in focus to freeze completely, until the process associated with the window in focus is idle, waiting for neither CPU nor disk.
The problem with this scheme is that you can get deadlocks. Suppose the high and low priority processes share a block of memory, and that the high priority process is waiting for a spinlock in that block of memory to be cleared (basically a variable which is currently 1 and which has to be set to 0 by the process/thread that owns the lock) by the low priority process.
In your scheme, the low priority process will never get a chance to run, because the high priority process is running all the time. So the high priority process keeps running forever doing nothing useful.
In other words I am suggesting a system option to do this: if the process associated with the window that's in focus is not in the idle state, make everything else freeze.
So if your email client was checking mail in the background, its connection will time out, file sharing will completely die, background printing fails due to communication time-outs, ...
Or even this: an option to freeze everything else unless the process associated with the window that's in focus has been idle for at least x seconds (user settable). This latter would prevent swapping out while the user is typing.
These kind of things were possible in older systems (e.g. Mac OS 9 and below froze every other application while you were dragging a window around), but modern operating systems cannot work that way anymore. You don't want downloads to stop when they're in the background, firewalls to stop working because they're never the frontmost application etc...

--
Donate free food here
Re:Preemption and disk requests - educate me by ortholattice · 2003-12-26 01:27 · Score: 1

These kind of things were possible in older systems (e.g. Mac OS 9 and below froze every other application while you were dragging a window around), but modern operating systems cannot work that way anymore.
It is too bad they weren't designed differently, so that they could work this way. With only a few realtime exceptions in a desktop machine - MP3 players, CDROM burners (which I think are flawed this way; maybe in the future RAM will be so cheap the necessary minimum data can be pre-buffered locally before writing) - there is no theoretical reason this has to be.
You don't want downloads to stop when they're in the background, firewalls to stop working because they're never the frontmost application etc...
I think that the way most timeouts are designed is fundamentally flawed. A good timeout protocol will always have a way for the client (or server) to say, "I'm busy and can't send more data now, but I'm still alive, so don't disconnect me." Anytime there is a hard timeout, the number must be selected according to human judgment, and there will always be unanticipated situations where it is too short or too long. But I guess it's too late to change this in many existing designs.
BTW my original post was about desktops, not servers, which have different needs. But if a firewall is local to the desktop, then in scheme I described the high-priority process will become "idle" as soon as it's not waiting for CPU or disk, so if it's waiting for a packet from the firewall, then the firewall would indeed start running long enough to deliver that packet and un-idle the high-priority process.
Re:Preemption and disk requests - educate me by BlueBiker · 2003-12-26 01:53 · Score: 1

The point is, you don't actually want what you're asking for. It's easy to say "I want the entire rest of the machine to halt and service my foreground task until I change to a different foreground task", but that has all sorts of bad consequences.

It's not because any protocols or interfaces are badly designed, it's because you're asking for a draconian approach that interferes with the normal functioning of the system when other effective approaches are available.

An example: When you go out jogging, you don't want your liver and kidneys to shut down just because they're not directly related to making you run faster.

Okay maybe a better example is you're downloading the latest ISO's from a relatively slow internet connection. Say for every ten seconds of download time it takes the system a tenth of a second to save it to disk. Surely you don't want to freeze your download by forbidding even an occasional 100ms timeslice to do the disk save.

Likewise, you might have a huge print job which the printer can only accept at a slow rate, but which is a trivial load to the main CPU. There's no reason to freeze the print job when doing so won't make the slightest difference to foreground GUI responsiveness.
Re:Preemption and disk requests - educate me by TheLink · 2003-12-26 02:58 · Score: 1

"I think that the way most timeouts are designed is fundamentally flawed. A good timeout protocol will always have a way for the client (or server) to say, "I'm busy and can't send more data now, but I'm still alive, so don't disconnect me." Anytime there is a hard timeout, the number must be selected according to human judgment, and there will always be unanticipated situations where it is too short or too long. But I guess it's too late to change this in many existing designs."

Your proposal appears badly flawed as it is. So what happens if someone kicks the power plug? Without a timeout the server will wait forever.

If you have good judgement, the timeouts would be fine. If your machine does not respond within a well chosen timeout, it is highly likely that it should be timed out.

Most of us have very limited time.
--
- Too many replies beneath your current threshold
Re:Preemption and disk requests - educate me by argent · 2003-12-26 03:20 · Score: 1

I would just like to agree with Sholden here. If you're thrashing then there is NOTHING any algorithm can do to improve it.
Re:Preemption and disk requests - educate me by tuxedobob · 2003-12-28 21:03 · Score: 1

For starters, Linux is meant to be a fairly lean OS. That said, if you're experiencing this on both OS's (i.e., in Linux), you need more RAM. Period. If that is not an option, you need to do less at once. Or perhaps nothing. ;)

I'm not sure how much you currently have, but I'd guess it's less than 256MB. (At least, I _hope_ so, considering my experience with OS X with 320 on a G3 iBook.*) Upping it to 384 or 512 should help.

Another way to increase your available memory-- and this is for OS X users-- is to upgrade your video card to something that has at least 32MB of VRAM. I got back 50 or 100MB of system RAM once I upgraded mine and OS X started using Quartz Extreme.

*If you're wondering, the only time my iBook thrashes is when I'm downloading 100,000 news headers in Thoth with other apps open.

I use Mac and I grok O(1) schedules, and kernels! by caveat · 2003-12-25 14:06 · Score: 1

My Mac runs a BSD layer on a Mach microkernel (duh), so I know just a little about Unix-ish OS structure and function. Dipshit. (sorry, feeding the troll.) What I want to know is what scheduler does Darwin use? If I had to guess, I'd say Apple uses whatever the BSDs use, but I have noticed a dearth in harcr0e technical press covering Darwin's internals. Sure, YDL might use the 2.6 kernel with that wonderful scheduling Ars talks about, but I like my Photoshop, and my Final Cut Express, and my iTunes - I even adore my OS X/Darwin; it's just as stable and flexible as YDL 3.0 IME, and it's a hell of a lot better supported. So, I'm sure some slashdotter wants to display his/her vast p00ter knowlege...how does the Darwin scheduler work, and how does it stack up against Linux? (Or does Mach handle the scheduling?)

--

Facts do not cease to exist because they are ignored. - Aldous Huxley

If every OS maker settled on a standard... by caveat · 2003-12-25 14:12 · Score: 1

...we could have a standardized hardware scheduler chip that would be uber-fast. Unfortunately, they wont standardize, and to ice the cake there's no truly 'optimum' scheduler for all systems and user loads, so it looks like you're SOL.

--

Facts do not cease to exist because they are ignored. - Aldous Huxley

Re:If every OS maker settled on a standard... by Inthewire · 2003-12-25 16:10 · Score: 1

You wrote ...we could have a standardized hardware scheduler chip...there's no truly 'optimum' scheduler for all systems and user loads...

Gosh, maybe the diversity of optimum schedulers prevents standardization?
Could it be?

--

Writers imply. Readers infer.

Pathetic article by any standards by TCPIllinois · 2003-12-25 14:41 · Score: 0, Troll

Considering that most /. readers are people who have a background in CS the article was too skimpy on details.

Re:Pathetic article by any standards by TheAvatar666 · 2003-12-25 16:34 · Score: 0

What does Counter Strike have to do with thread scheduling?

Couldn't possibly have been moded off-tpoic for... by IBitOBear · 2003-12-25 14:43 · Score: 2

(methinks) Someone has a little something stuck in his craw.

Could there possibly be a reason for an offtopic mod? Lets see:

-- Article/thread about an article about the linux scheduler.

-- Respondent asks a technical support question having nothing to do with the article referenced or the scheduler in general.

-- Respondent gets moded down for jumping off topic because this isn't a Linux support forum nor is it generally an audience who would want it to *become* a support forum.

-- Someone who doesn't like to see a lot of the linux-centricisim in the forum *complains* BECAUSE others who don't want to read a technical support thread mod the offtopic post... "offtopic."

===

As a side note, you rigorous thinker you, a question or comment in the range of "I don't know why" ("this pause here with that message there confuses me", et. al.) indicates that the user wants information about his system and is not prima facia evidence that there is something "making Linux look bad" in any meaningful sense.

Consider: Sort of like saying that a single customer asking "what are the ingreedents of the 'special sauce', cause it feels oily?" must automatically make McDonalds "look bad" for filling the oily sauce role with an oily sauce not understood by a single questioner.

If you took the time to gather one single iota of information you would know that the message he doesn't like is in the startup sequence and happens only once (per boot) and lasts a couple of seconds at most.

If, on the other hand, he even vaguely intimated that the message lasted X seconds and was repeated every N seconds, then there would be some "looking bad" to be had here. If he could *FURTHER* even remotely tie this to the scheduler then the post would suddnely not be "offtopic".

Shame none of that happened. Which kind of leaves you wearing the inflatable rubber ducky on the subway there dude. You probably shoud stop yelling about the rising watter too. (IF you can follow a cultural metaphore there dude. 8-)

====

Point of (bonus) fact, there wasn't enough information in the question to even address an answer. Even so, (and without even checking), if the initrd (etc) system is compiled in there but the user isn't using initrd then he should compile the kernel without this bit. If he *IS* using the initrd image then it is kind of normal that the unpacking and mounting of the initrd image would cause a pause at that point because the kernel is working on something that everything else has to wait for.

So, nothing to see here folks, conspiracy theorist presenting tinfoil-hat theory about an demonstrably offtopic post getting modded down for being demostrably off topic. These arn't the droids your looking for. Move along... 8-)

====
Not hard to see why you posted AC...

--
Innocent people shouldn't be forced to pay for inferior software development.
--"Code Complete" Microsoft Press

O(1) now, what was it before? O(?) by phoenix.bam! · 2003-12-25 14:49 · Score: 1

What was the big-O of 2.4 and earlier?

Horse Costume by AtariAmarok · 2003-12-25 15:14 · Score: 1

whats wierd is they have a rear admiral upper half and rear admiral lower half. nice troll by the way

That sounds like they are filling parts in a large horse costume.

--
Don't blame Durga. I voted for Centauri.

Re:Horse Costume by Anonymous Coward · 2003-12-25 16:15 · Score: 0

Does that costume look like this by any chance?

From the Article Summary... by networkGhettoWhore · 2003-12-25 15:58 · Score: 1

"The recent release of Linux's 2.6 kernel introduces a number sweeping improvements"

"number sweeping"? Is this a new algorithm? Did i miss something here?

--
Natural Selection: self-destruction of the poor and lazy

Already slashdotted? by Anonymous Coward · 2003-12-25 16:06 · Score: 0

(AC, no whoring, etc.) Here's a mirror:

So what does O(1) mean anyhow? Well, the "O" in O(1) is a method of expressing running time for algorithms in computer science known as "Big-Oh" notation. A general mathematical definition of Big-Oh notation is: A function f(n) is known as O(g(n)) when there is a constant c and n0 such that for all n >= n0 there exists a c such that f(n) <= c * g(n). For those that aren't as mathematically inclined, this basically means that an algorithm is known as O(x) where x is always larger or equal to the execution time of the algorithm. So an algorithm that is O(n^2) will always execute in at maximum, quadratic time. Then an algorithm that is O(1) - such as the 2.6 scheduler - executes in constant time. A constant time algorithm is the crown jewel of algorithms so to speak. If an algorithm can be written in a way that makes it execute in constant time, it is almost invariably a very good algorithm.
The scheduler is responsible for choosing what process is going to be executed at an given time, but more than that the scheduler is responsible for choosing when one process will be stopped and another will begin. Andrew Tannenbaum did a good job of providing a high level overview of the scheduler in the following analogy from his book "Operating System Design":
"REMINDER: INSERT TANNENBAUM ANALOGY WHEN YOU GET HOME."
The 2.6 scheduler uses a very interesting method of keeping track of processes. It uses a priority array (struct prio_array) that contains a bitmap that provides one bit for every possible process priority; 140 is the default maximum number of priority levels however since the bitmap is represented as an array of unsigned longs (If this doesn't mean anything to you, don't worry it is not vital to the understanding of the process), a little math shows that the actual number of bits has to be 160. These bits are used to quickly determine the highest priority (lowest actual number) process in this priority array, via a platform specific function find_first_bit (asm-i386/bitops.h for x86 architectures). The bits are initially set to 0 but when a process is inserted into the priority array the bit corresponding to its priority is set to 1. Thus, finding the highest priority process is as simple as checking the bitmap. The priority array also contains an array of lists of processes. The size of this array is equal to the maximum number of possible priorities (MAX_PRIO). So once the bitmap is checked and the highest priority process is found the scheduler just checks the array at the proper spot (the priority found) and runs the first process in the list. This all occurs in constant time, there is no searching through the entire list of processes, which is why it is called the O(1) scheduler. These steps are also followed for insertion or deletion into the array making them O(1) as well.
Classic process management implementations involve using a priority queue (implemented via a binary min-heap) of processes, or even a linked list of processes. While the priority queue implementation would provide O(1) retrieval from the queue, deletion and insertion of new processes execute in O(logn) time. The linked list implementation is even worse executing a retrieval operation in O(n) time! By comparing the 2.6 O(1) scheduler to these past implementations (which are surprisingly still used in many operating systems) it becomes clear just how significant this improvement is.
You might ask now, if we have a method of determining which process is to be run...how does the scheduler know how long a process should be run for? To answer this question we must first understand a timeslice. A timeslice is the maximum amount of time a process may be executed for, and is calculated by the scheduler. A process may execute for as long as its timeslice has not been depleted, however at any time it may be preempted by another process of higher priority which has become runnable (and a few other sit

Re:O(1) now, what was it before? O(?) by Sloppy · 2003-12-25 16:29 · Score: 1

cd /usr/src/linux/kernel
gvim sched.c

Oops, looks like my copy of 2.4 (gentoo-sources) already has the O(1) scheduler backported? Heh, 2.4 already has the feature in question. ;-)

Ok, ssh over to my Slackware 7 box that is running 2.2... It looks like 2.2's scheduler is O(n), where n is number of processes. (It goes through a linked list of all the processes and picks one with the best "goodness".) So I guess 2.4's original scheduler was probably about the same: linear.

Heh, I hardly ever look at this stuff. Whenever I do, I'm kind of amazed at how small and managable the Linux kernel is. It really doesn't look like all that big of a project. Anyone can read and understand this stuff, and things are easy to find. I'm serious; dive in, and there it all is. A shame they indent with tabs instead of spaces, but I guess that's another religeous war. And the gotos, oh, the gotos...

--
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.

Now equivalent or better than BeOS for latency? by mattr · 2003-12-25 16:53 · Score: 1

I used to run BeOS on a dual 604e 9600 mac, it felt waaaay more responsive than my rh9 on a 128MB 500MHz laptop (dell inspiron 7.5k, which was pc of the year a few years ago). Thrashing is a big reason, I've got more running on linux I'm sure, but the BeOS' preemptive multitasking just made it feel like such problems which plague all windows systems at least, were nonexistent for BeOS.

Is the new kernel going to reach/exceed this level of performance for the desktop? This would be a big thing for the corporate desktop feel too, plus for me. Not that it is really important but can you play multiple movies and do other tasks without jumping? SGI Irix didn't used to be able to do that on even high end machines, don't know about now (well they will be running 2.6 I suppose..). I'd like to know if a modern distro will run in a sane manner for a gui desktop on 128MB of memory, (yeah I'll get more memory but..) and to know if I should make sure to put 2.6 on systems that need it (I'm thinking about a server which will do both I/O and visualization).

Re:Now equivalent or better than BeOS for latency? by CableModemSniper · 2003-12-25 17:03 · Score: 1

Ok, I didn't use BeOS for any extended length of time, (I installed MAX or something like that, one of the distros that were based on the PE) and this is purely anecdotal, but I recently put 2.6.0 on the same box as I had beos on, and my first thought was, damn this is as good as BeOS. I had a movie playing in mplayer, XMMS playing a song, (esd was doing the audio mixing btw), OpenOffice open, Mozilla, and Galeon, and it was lightning fast. This is on a box that in 2.4.2x would "skip" when playing music in xmms (and not doing anything else other than having GAIM open). Maybe its not quite at the responsiveness of BeOS yet, but it is damn close.

--
Why not fork?
Re:Now equivalent or better than BeOS for latency? by argent · 2003-12-26 03:13 · Score: 1

The scheduler is unlikely to be the reason BeOS seems faster. The big problem you're dealing with on your Linux box is the RAM: unless you're running a lightweight window manager and a small set of fairly tightly coded applications you don't have nearly enough RAM for practical use. Either add more RAM, or get rid of all that "gui desktop" bloat.

And if you think Linux is bad when it's thrashing, try running BeOS with insufficient memory. You'll wish you were back in Windows 3.11 on a 486/25.

Re:I use Mac and I grok O(1) schedules, and kernel by Anonymous Coward · 2003-12-25 17:50 · Score: 0

God you suck.

CDs bringing Windows to a stall by wackybrit · 2003-12-25 18:57 · Score: 1

you do something in Windows Explorer that blocks (spin up a CD drive) and everything else (the "desktop", task bar, etc) all become unresponsive...

Yes, this is MAJORLY bad. I've had whole machines become totally unusable (mouse moves, that's about it) simply because of a dirty CD put in the drive, or a buggy drive (perhaps the mechanism is loose, etc). Windows wants to down the entire machine to devote all computational power to waiting for the CD drive! Linux is not off entirely scot-free, but it's definitely better at handling this situation, IMHO.

There must be a way of reducing the priority on these actions?

--
mogorific carpentry experiments

Re:CDs bringing Windows to a stall by drsmithy · 2003-12-28 18:07 · Score: 1

Windows wants to down the entire machine to devote all computational power to waiting for the CD drive!
This is not computational power that is being used, it is applications blocking while they wait for I/O to complete. Applications that aren't waiting for CD I/O (or in the case of poorly configured IDE machines, IDE disk I/O on the same channel) should chuff along nicely.

Re: byteboyz trash-talk by Anonymous Coward · 2003-12-25 18:58 · Score: 0

Nice work BigA: Ya bi*chslapped a well-deserving dweeb, while advancing & informing the yeomanry. We're now ready for part_2 !

Knock yourself out -- write a better one by Anonymous Coward · 2003-12-25 19:20 · Score: 0

bitch

GUI layer also has performance bottlenecks by BlueBiker · 2003-12-25 19:25 · Score: 1

We've been speaking of kernel scheduling, but a given GUI implementation can also have a dramatic effect on whether delays in one part of the system will result in sluggish UI response. IIRC, KDE is largely single-threaded and there are lots of cases where it will be momentarily hung-up waiting on some global resource. Maybe someone could elaborate.

Re:GUI layer also has performance bottlenecks by The+Snowman · 2003-12-25 19:57 · Score: 1

Yes, but KDE being laggy won't impact Apache or ProFTPd running in the background. Put a CD in the drive, and *everything* on a Windows box slows to a crawl until the OS reads the CD.

--
24 beers in a case, 24 hours in a day. Coincidence? I think not!

O(1) not just in 2.6 by robhancock · 2003-12-25 19:27 · Score: 1

One thing I haven't really seen pointed out is that the O(1) scheduler is not strictly present just in the 2.6 kernel - a number of distributions have had backported versions in 2.4-based systems; I believe Red Hat has had it integrated into their kernels since Red Hat 8, for example..

Re:I use Mac and I grok O(1) schedules, and kernel by Anonymous Coward · 2003-12-25 20:16 · Score: 0

If you "grok" (who uses crap like this in real life anyways, except people who have something to prove? Not me, and I have a degree in CS and 15 years experience. Go read the jargon file some more, it makes you look supercool) kernels and your Mac so much, why don't you know what "Darwin" uses? It's based on FreeBSD, so, "dipshit", it definately uses whatever the "BSDs" use. It IS BSD, running on top of Mach. Yes the MACH layer is also scheduling itself, but your question was about "Darwin". Darwin is FreeBSD on Mach.

From Apples site: "We should note, however, that apart from a few architectural differences (such as our use of the Mach kernel), we try to keep Darwin as compatible as possible with FreeBSD (our BSD reference platform)."

You want "harcr0e" [sic] technical press covering Darwin's internals? Start reading the FreeBSD coverage. Though, at the hardware layer (device drivers), Darwin is interfacing with Mach, while FreeBSD interfaces with hardware. The scheduler is software only, thus, untouched.

Just enough "knowledge" to think you know what you are talking about. Not singling you out, (though it fits), I mean 92% of ./ posters...

AC because I can't be bothered to make an account.

Uneducated guess... by Tokerat · 2003-12-25 20:33 · Score: 2, Informative

...but wouldnt' the fact that only one process is operated on at a time (i.e. a fixed amount of computing) as opposed to scanning the enitre list of processes (could be one...could be thousands) each time make the difference between O(n) and O(1)? Sure, things will be slower if you have 10,000 things to check as opposed to one, but at least here we're not checking 10,000 things for each and every one in the list...

(IANAKH)

--
CAn'T CompreHend SARcaSm?

Re:Uneducated guess... by wolfb · 2003-12-26 01:59 · Score: 1

it sounded like the 2.4 scheduler scaned once for every n process running, taking time proportional to n, while the 2.6 does it for each process taking a constant time. in other words, the old process runs less frequently but longer, the new one more frequently but shorter. it was not explained (though should have been obvious) why one approach was necesserily better then the other.

O/T by wackybrit · 2003-12-25 20:36 · Score: 1

I like the bit at the top of your homepage by the way, and I feel exactly the same about telemarketers.. getting calls from them is a bright spot in the day when you can exercise passive phone terrorism over the poor workers at Sprint and MCI.

--
mogorific carpentry experiments

Google bombs... by Anonymous Coward · 2003-12-25 21:31 · Score: 0

There's also: unelectable.

Debian with 2.6.0 (OT) by Fizzl · 2003-12-25 22:46 · Score: 1

I have one toy-box which I use to test my stuff before I put the changes on my server. However, it takes me quite a while to sync it again with the server if I totally mess it up. So I'm still hesitant to try this all by my own.
I'm currently running 2.4.23, but I am very interested about the cryptoloop and new scheduler.

So, to the point. Anyone have a link to describe how to install 2.6.0 to debian-stable?

I briefly tried it, and googled a bit, and got the impression that it isn't exactly trivial....

--
Bot Assisted Blogging

Re:Debian with 2.6.0 (OT) by Anonymous Coward · 2003-12-26 15:56 · Score: 0

You need to get the module-init-tools package if you want your kernel to support dynamically loadable modules. It contains replacements for modprobe, insmod, etc. You should be able to find this package for woody on backports.org or apt-get.org.

Otherwise, it's pretty simple. You'll need to familiarize yourself with some of the changes in configurable options between 2.4 and 2.6, otherwise you'll probably accidentally leave out support for your console or something silly like that.

noah

Interesting by tigga · 2003-12-25 23:21 · Score: 1

That's much more informative article..

However there are some stones unturned ;)

I wonder how different scheduler classes get scheduled.
And how does scheduler differentiate load balancing for processes running on same core (SMT) and on different processors.

It would be nice to have more technical article published somewhere...

This is not a troll, but please explain... by haxor.dk · 2003-12-25 23:46 · Score: 1

...why ARS gets so much press space here on SlashDot ? I mean, I also write a lot of articles about technical stuffs, but I never get any press space ?

Has /. become a front for AT ?

Someone explain the 2.4 scheduler, then? by argent · 2003-12-26 03:26 · Score: 1

As far as I can tell from the description in the article, there's nothing particularly new in this O(1) scheduler: what was described could have been the scheduler on just about any OS I can think of: one queue per priority level, pick the next available job from the highest level queue. Could have been taken from the Amiga manuals in 1985... either there's something missing from the description of the new algorithm or there was something really strange about the 2.4 scheduler.

What did it used to do?

Dumbing Back Up Again by The+Monster · 2003-12-26 03:30 · Score: 1

I think this way atleast it appeals to a broader audience, not just the CS audience who would know what a binary min heap is if I were to use it.

I have a modest suggestion for you to relay to your editor. There's this really cool feature of HTML that lets you put in a 'link' to another document. If your 'dumbed-down' article would use one or more of these to let the more technically-minded pursue that detail, you'd have the best of both worlds - an Executive Summary for PHBs and the nitty-gritty look behind das Blinkenlights that the hardcore geek demands.

--

[100% ISO 646 Compliant]
SVM, ERGO MONSTRO.

Good programmers should know their Big-O... by javabandit · 2003-12-26 03:55 · Score: 2, Interesting

The article was okay, I guess. Very, very elementary. I would assert that most schedulers are O(1), though. Its really nothing new. This article was probably good for someone who never heard of a scheduler or worked with one.

Although the article was about scheduling, I would have like to have seen some more text about Big-O notation and why it is important to programmers. To me, any good programmer should have a good concept of Big-O notation, how it relates to traversing data structures (such as trees), et cetera.

The article did not make a point as to why good schedulers use priority queues. Which is simple, really. All priority queues are O(1) for getting the next item in the queue. It has nothing to do with scheduling, but it is the nature of the priority queue itself... as a data structure.

I would have liked to have seen the article in its whole form. I wonder if it went into more depth about Big-O, et cetera.

Re:Good programmers should know their Big-O... by evilviper · 2003-12-26 18:14 · Score: 1

I would have like to have seen some more text about Big-O notation and why it is important to programmers. To me, any good programmer should have a good concept of Big-O notation, how it relates to traversing data structures (such as trees), et cetera.

I've found the perfect book for you: http://www.amazon.com/exec/obidos/tg/detail/-/0767 907485/002-5100424-2443250?v=glance

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant

explorer.exe == bad design outside of the kernel by IGnatius+T+Foobar · 2003-12-26 04:27 · Score: 4, Informative

Windows XP, for example, isn't all that great in this area, and often one process can too easily slow things down. This is even further emphasized when all the processes you're working with are actually the main "explorer.exe" process; eg, you do something in Windows Explorer that blocks (spin up a CD drive) and everything else (the "desktop", task bar, etc) all become unresponsive...

Unfortunately, even the best kernel process scheduler in the known universe would not fix this design flaw in Microsoft Windows, because what you're seeing is not a thread/process scheduling problem.

As you correctly observed, many Windows programs are forced to make "shell calls" -- which doesn't mean the same thing it does in Unix, where you fork a /bin/sh and tell it to do something for you. It means that you send "events" to the currently-running EXPLORER.EXE with an idea of what you want it to do for you.

The reason it bogs down when it's busy is because it is waiting on a single event queue. Mounting/unmounting of media, network lag, other processes sending/receiving messages, etc. all give EXPLORER.EXE more events to wait on. That's why it's a bottleneck, even when there's plenty of CPU to go around.

It's an awful design, and it's one of those things that's fundamentally broken about Windows.

--
Tired of FB/Google censorship? Visit UNCENSORED!

Plural's by Cardbox · 2003-12-26 04:47 · Score: 1

Many of your plurals, but not all, use apostrophe-s: eg. "algorithm's" but "processors". I don't believe that this is random and I'm very interested to know if you can see the subconscious rule you're following to decide when to use s and when to use apostrophe-s.
This is really not a veiled criticism. I'm genuinely interested in linguistic phenomena, and I am convinced there must be a rule somewhere. I can't see it myself and I'm hoping that you can help.

Worthwhile tradeoff? by sorbits · 2003-12-26 09:09 · Score: 1

From the article:

While the priority queue implementation would provide O(1) retrieval from the queue, deletion and insertion of new processes execute in O(logn) time. The linked list implementation is even worse [...] surprisingly still used in many operating systems

I think the reason is quite simple: Generally you have very few running tasks, so even though inserting a new one is O(lg N), it's basically free, whereas checking 160 bits (to find the first one set), can only be done efficiently by exploiting (hopefully existing) architecture specific assembler instructions, and is still slower than using a normal priority queue (assuming more than 32 task priorities are defined).

Adding to this, one does not need to keep a priority queue of all active tasks, but can use a priority queue of at which priorities we have tasks running, and use buckets like in Ingolfs schedular -- that way we get an upper bound on inserting/removing: logarithmic in the number of priorities (so O(1)).

So that gives us a faster schedule, but a worst-case slower insert/remove (but still O(1)). So what to prefer? That depends on how the machine will be used (and if it even allows efficient bit field operations).

From the article it sounds like other things were also changed with the introduction of Ingolfs schedular (like changing the timer frequency), and that may explain the perceived performance increase? I do not use Linux myself, so I cannot comment on these subjective things.

Unlikely to match BeOS but... by Sits · 2003-12-26 09:56 · Score: 1

A 2.6 distro will probably feel quite snappy on modern (1Ghz+) machines (RH9 is still 2.4 based with a few patches to reduce latency but there is no kernel preemption and that makes a noticible difference).

Robert Love pointed out that BeOS often traded throughput for latency (point 2) coupled with that fact it was highly threaded for its reknowed responsiveness. On weaker (PII or less) machines I suspect BeOS will always feel faster but there will come a point where CPUs context switch so quickly extra effort beyond preemption will not be worth the effort.

By the way, most desktop OSes these days have preemptive multitasking. It is only older OSes such as OS9 and Windows 3.11 which do not.

Re:I use Mac and I grok O(1) schedules, and kernel by MochaMan · 2003-12-27 16:39 · Score: 1

From Apples site: "We should note, however, that apart from a few architectural differences (such as our use of the Mach kernel), we try to keep Darwin as compatible as possible with FreeBSD (our BSD reference platform)."

They're referring to the BSD userland (ls, rm, cp, etc.) there. Those have nothing (or very little) to do with the kernel or microkernel -- as can be evidenced by the fact that they all can be recompiled on most versions of UNIX, and many of those have very different schedulers. But as someone with a degree in CSc and 15 years of experience, I'm sure you were aware of that and just got emotionally excited.

The kernel/microkernel in Darwin have basically nothing in common with FreeBSD. Feel free to confirm this for yourself by reading the sources to both.

Re:explorer.exe == bad design outside of the kerne by functor0 · 2003-12-27 20:06 · Score: 1

Interesting. Which "shell calls" are we talking about exactly? It would be nice to know which these are and then spawn a thread to handle them if possible.

Re:explorer.exe == bad design outside of the kerne by man_ls · 2003-12-28 18:14 · Score: 1

Windows API calls that are handeled by the running shell, explorer.exe.

Different notations by man_ls · 2003-12-28 18:21 · Score: 1

While functionally similar, I always learned the "Big O" notations as:

O(k)
O(kn)
O(k * log n)
O(k * n log n)
O(kn^2)
O(k * n^n)

etc.

where K is some given constant...slightly more clear than O(1) for constant time.

Re:explorer.exe == bad design outside of the kerne by functor0 · 2003-12-29 17:55 · Score: 1

Are you telling me that *ALL* Win32 API calls are handled through explorer.exe!? That's ridiculous.

Re:explorer.exe == bad design outside of the kerne by man_ls · 2003-12-30 01:17 · Score: 1

Only if they're shell calls...Generally, filesystem operations are handeled by Explorer; other API calls are made to the respective services providing that functionality.

Slashdot Mirror

ArsTechnica Explains O(1) Scheduler

295 comments