The Really Fair Scheduler

Coming soon to a linux kernel near you: by El_Muerte_TDS · 2007-09-01 08:58 · Score: 3, Funny

The fancy fair scheduler.

Re:Coming soon to a linux kernel near you: by Megane · 2007-09-01 11:32 · Score: 3, Funny

I'm waiting for the Science Fair Scheduler. And the ladies out there might want to try the Vanity Fair Scheduler.

Re:Coming soon to a linux kernel near you: by BronsCon · 2007-09-01 12:10 · Score: 3, Funny

"fancy fair scheduler"

Oh, FFS.

--
APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
Re:Coming soon to a linux kernel near you: by Anonymous Coward · 2007-09-01 15:01 · Score: 0

I'm still waiting to hear an announcement of at least some consideration for a scheduler that knows about IO. They can call it Poop On Your Face And Rape You In The Nostrils Scheduler for all I care if it would just exist!
Re:Coming soon to a linux kernel near you: by Anonymous Coward · 2007-09-01 17:31 · Score: 0

No, it should be the RCFS (Really Completely Fair Scheduler).
Re:Coming soon to a linux kernel near you: by gowen · 2007-09-01 20:53 · Score: 5, Funny

How about the Scarborough Fair Scheduler, that allocates Parsley, Sage, Rosemary and Thymeslices.

--
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
Re:Coming soon to a linux kernel near you: by Your.Master · 2007-09-01 22:57 · Score: 1

I hate you for this and yet I have to congratulate you.
Re:Coming soon to a linux kernel near you: by mindwhip · 2007-09-02 11:39 · Score: 2, Funny

Darkmoon Fair Scheduler... gives you prizes for throwing junk software at it...

--
[The Universe] has gone offline.
Re:Coming soon to a linux kernel near you: by fractoid · 2007-09-02 14:35 · Score: 1

Oooh, burn! Never... want... to see... evil bat eye... or glowing scorpid blood... again!

--
Rampant carbon sequestration destroyed the Dinosaurs' tropical paradise. I'm here to help repair the damage.
Re:Coming soon to a linux kernel near you: by gowen · 2007-09-02 18:45 · Score: 1

That's pretty much how I feel about myself...

--
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.

Still waiting for the IFS by amliebsch · 2007-09-01 08:58 · Score: 4, Funny

Still waiting for Steve Jobs' "Insanely Fair Scheduler."

--
If you don't know where you are going, you will wind up somewhere else.

Re:Still waiting for the IFS by Xtravar · 2007-09-01 09:02 · Score: 4, Funny

Still waiting for Steve Jobs' "Insanely Fair Scheduler."
Wouldn't that be named something more like iFS or iSched?

God forbid we drop the lower-case I naming convention. It stands for "interwebs compatible".

--
Buckle your ROFL belt, we're in for some LOLs.
Re:Still waiting for the IFS by JoeCommodore · 2007-09-01 09:10 · Score: 4, Funny

Sure would be better than the "Multicolored Pinwheel of Wait" part of OS X now.

--
"Enjoy what you're doing! If it becomes drudgery, you're doing it wrong!" - Jim Butterfield
Re:Still waiting for the IFS by LiquidCoooled · 2007-09-01 09:41 · Score: 3, Funny

Sorry, Apple already has designs on the iSched moniker.
Where else would you keep your iLawnmower?

--
liqbase :: faster than paper
Re:Still waiting for the IFS by Anti-Trend · 2007-09-01 09:59 · Score: 3, Interesting

Agreed. While I recognise and appreciate the humor in your comment, this is the main reason I use Debian on the desktop rather than OS X -- I multitask heavily. A Linux kernel with a Desktop preemption model and 1000Hz Timer frequency is a Godsend for those who push their PCs a tad too hard on a regular basis. I would like to see a simplified version of the scheduler, but all said CFS isn't as bad as everybody makes it out to be.

--
Working in a DevOps shop is like playing in a band made up entirely of keytarists.
Re:Still waiting for the IFS by Solra+Bizna · 2007-09-01 10:50 · Score: 1

Upgrade to 1GB of RAM (2GB on Intel) and you won't see it anymore. (usually.)

-:sigma.SB

--
WARN
THERE IS ANOTHER SYSTEM
Re:Still waiting for the IFS by ScrewMaster · 2007-09-01 11:44 · Score: 1

I think we should hold a conference for kernel developers the world over to air their concerns about this issue.

We could call it the "International Scheduler Fair".

--
The higher the technology, the sharper that two-edged sword.
Re:Still waiting for the IFS by YU+Nicks+NE+Way · 2007-09-01 12:12 · Score: 1

Nonsense.

I see the pinwheel many times a day, and that's on a fully tricked out MacBookPro.
Re:Still waiting for the IFS by Breakfast+Pants · 2007-09-01 12:40 · Score: 1

and BOOM!

--
WHO ATE MY BREAKFAST PANTS?
Re:Still waiting for the IFS by earnest+murderer · 2007-09-01 13:20 · Score: 2, Informative

Upgrade to 1GB of RAM (2GB on Intel) and you won't see it anymore. (usually.)

-:sigma.SB
Depends a lot on your situation.

Even with many many gigabytes of RAM there are many situations where Apple's applications (or the OS) just sit there and do nothing (or spin that pinwheel like they've nothing better to do) and you wonder if they crashed or what... Often enough, no. They're just doing the wrong or stupid thing and it eventually recovers. How often you see it depends a lot on your usage pattern.

None of these (near as I can tell) have anything to do with the scheduler. Just shoddy code and bad decisions.

--
Platform advocacy is like choosing a favorite severely developmentally disabled child.
Re:Still waiting for the IFS by Anonymous Coward · 2007-09-01 15:18 · Score: 0

That's a feature from NeXTSTEP. The beachball is just the Aqua skin for it.
Re:Still waiting for the IFS by tlhIngan · 2007-09-01 15:33 · Score: 1

Still waiting for Steve Jobs' "Insanely Fair Scheduler."

Alas, such a thing won't exist... and we'll end up with the "Reality Distortion Fair Scheduler" - it looks like things are fairly scheduled, but deep inside, the scheduler isn't. It's just everyone is so happy using MacOS X with the RDFS that they don't notice *grin*.

(yes, the above was a joke - laugh).
Re:Still waiting for the IFS by coryking · 2007-09-01 15:52 · Score: 1

Oh yeah? Well Vista runs fine on 512mb of ram. I never see the pinwheel. Did you do something stupid like turn off ReadyBoost?

Oh wait. Wrong OS. Sorry.

FWIW, I get the pinwheel on my 1gb macbook sometimes while I'm in firefox. My "real" box that I do most of my work on runs Vista /w 2gb of ram and it does the same, only doesn't give me the visual queue that the mac does. All OS's suck in their own creative way. Your mileage may vary.
Re:Still waiting for the IFS by The+MAZZTer · 2007-09-01 16:50 · Score: 1

Yeah, that's one thing Microsoft got right. I mean, it's an HOURGLASS that never stops running! Incredible!

Oh wait. They replaced it with a teal pinwheel in Vista, I forgot. Pfft.
Re:Still waiting for the IFS by Alsee · 2007-09-01 17:46 · Score: 1

I think I am going to write and submit a scheduler, just so I can name it the My Scheduler Is Better Than Your Scheduler Scheduler.

-

--
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
Re:Still waiting for the IFS by davester666 · 2007-09-01 18:42 · Score: 1

Sure would be better than the "Multicolored Pinwheel of Wait" part of OS X now.
Except the pinwheel of death is pretty much unrelated to process scheduling [or at least not directly related to it]. The pinwheel of death is presented if the Application hasn't asked for an event in the last 3 seconds or so. This generally happens when it's either waiting for a resource [vm, disk, something in the kernel] or it's doing processing in a loop without asking for user input. Most of the time, it happens because a programmer took the lazy way out and does too much processing on the main event loop instead of spinning off another thread or doing bits of work on a timer.
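
A rough sketch of that pattern, in Python rather than anything Mac-specific: the "main loop" keeps coming back for events while a worker thread does the slow part. The names (`slow_job`, `event_loop`) and the timings are made up for illustration; no real toolkit is structured exactly like this.

```python
import queue
import threading
import time

def slow_job(results):
    """Stand-in for long processing that should not run on the main loop."""
    time.sleep(0.2)
    results.put("done")

def event_loop(poll_interval=0.01):
    """Toy main loop: stays responsive by polling while a worker runs the slow job."""
    results = queue.Queue()
    worker = threading.Thread(target=slow_job, args=(results,))
    worker.start()
    ticks = 0
    while True:
        ticks += 1  # each tick is a chance to handle user input or redraw
        try:
            outcome = results.get(timeout=poll_interval)
            break
        except queue.Empty:
            continue
    worker.join()
    return outcome, ticks

if __name__ == "__main__":
    outcome, ticks = event_loop()
    print(outcome, "after", ticks, "loop iterations")
```

If `slow_job` were called directly inside the loop instead, the loop would go silent for the whole job, which is the situation that brings up the pinwheel.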

--
Sleep your way to a whiter smile...date a dentist!
Re:Still waiting for the IFS by The+One+and+Only · 2007-09-01 22:54 · Score: 1

only doesn't give me the visual queue that the mac does
You mean "cue". "Queue" is a data structure, or the thing you stand in at the post office (which functions just like the data structure).

--
In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
Re:Still waiting for the IFS by Ian+Alexander · 2007-09-02 08:31 · Score: 1

Upgrade to 1GB of RAM (2GB on Intel) and you won't see it anymore. (usually.)

-:sigma.SB

My school has two G5 Powermacs. They're both equipped with 2 GB of RAM and two 2 GHz G5 processors.

I see the spinny-rainbow-wheel-of-death all the time when I use those machines.

Does it... by markov_chain · 2007-09-01 09:03 · Score: 3, Interesting

help in the case when a process goes nuts allocating memory, and stops the GUI dead in its tracks? No Alt-Ctrl-Backspace, no switching to console, unbearably slow remote login...

--
Tsunami -- You can't bring a good wave down!

Re:Does it... by outZider · 2007-09-01 09:10 · Score: 0, Flamebait

The solution there would be 'FreeBSD'. :)

--
- oZ
// i am here.
Re:Does it... by DaleGlass · 2007-09-01 09:12 · Score: 5, Informative

I don't think any scheduler will help you with that. The slowness is due to the swapping in and out from the disk, and that's going to be limited by the horribly slow speed of the disk.

You could tweak things to make this a less likely occurrence though.

Disable overcommit by echo 2 > /proc/sys/vm/overcommit_memory. No more OOM killer killing some random unrelated process. Memory allocations will fail and programs will be able to handle that correctly.

Set some memory limits in /etc/security/limits.conf

Avoid having too much swap space. It's awfully slow, if you're using it too much all you'll manage is to run more things slower.

Get more RAM, it's cheap. If you're regularly swapping then you definitely should.
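
Before changing the overcommit setting it may help to see what it currently is. A small sketch, assuming a Linux box where the /proc file exists; the helper names are invented for this example, and the three mode descriptions follow the kernel's documented values 0, 1 and 2.

```python
from pathlib import Path

OVERCOMMIT_FILE = Path("/proc/sys/vm/overcommit_memory")

MODES = {
    0: "heuristic overcommit (the default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}

def describe_overcommit(mode):
    """Translate the numeric mode into a short description."""
    return MODES.get(mode, "unknown mode")

def current_overcommit_mode():
    """Return the current mode as an int, or None where the file is unavailable."""
    try:
        return int(OVERCOMMIT_FILE.read_text().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    mode = current_overcommit_mode()
    print(mode, describe_overcommit(mode))
```

Reading the file needs no special rights; writing a new value to it, as in the echo command above, needs root.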
Re:Does it... by markov_chain · 2007-09-01 09:21 · Score: 1

Thanks for the suggestions, I'll try some out.

I'm far from knowledgeable about what's possible to do right now using various tuning knobs. I guess I'm surprised that the GUI doesn't get priority over this sort of runaway process, but I have to temper this with saying that I never played with adjusting the nice level of various relevant processes.

Increasing the RAM size is not a solution though, since the kind of runaway process that causes the freeze will allocate everything it can anyway.

--
Tsunami -- You can't bring a good wave down!
Re:Does it... by cnettel · 2007-09-01 09:26 · Score: 1

Well, it should be possible for a scheduler to realize "oh, this process causes thrashing, I'll give it like 30 secs to see if it calms down, if not I'll freeze any more hard page errors caused by it for another 30 secs". Basically, in addition to thread quanta, introduce another level of longtime quanta for stuff that won't complete soon anyway. The worst killer here is when you have two processes, basically independent, that would each fit in RAM, but the scheduler insists on keeping them switching several times a second, so they will just swap each other out.
Ok, I know there are attempts to solve this. I think one scenario I've got several times in OS X, where NO thread is running and the amount of pageins is low, is due to some heuristic trying to stop behavior like this, but tying workset management and scheduling closer together might make sense in many kernels.
Re:Does it... by Colin+Smith · 2007-09-01 09:27 · Score: 1

ulimit is your friend.

--
Deleted
Re:Does it... by pe1chl · 2007-09-01 09:37 · Score: 1

I'm surprised that the GUI doesn't get priority over this sort of runaway process

That is because the GUI is just a set of processes running under the same mechanism, not some special part of the kernel or something like that.
Re:Does it... by Anonymous Coward · 2007-09-01 09:43 · Score: 3, Informative

(on a bash shell)
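(ulimit -v 4096; command_that_uses_memory)
This limits the virtual memory available to command_that_uses_memory (the value is in kilobytes, and the subshell keeps the limit from sticking to your login shell). Once the limit is reached its allocations fail, which usually takes the program down. But do you really want firefox forcibly killed every time you visit youtube?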
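
The same idea from Python, as a sketch: cap the address space (the limit that `ulimit -v` controls) for one child process only. `run_with_memory_limit` is a made-up helper name, and the 512 MiB cap and 1 GiB allocation are arbitrary numbers picked so the effect is visible.

```python
import resource
import subprocess
import sys

def run_with_memory_limit(argv, limit_bytes):
    """Run argv in a child whose address space (RLIMIT_AS) is capped at limit_bytes."""
    def apply_limit():
        # Runs in the child just before exec; only the soft limit is changed.
        soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        new_soft = limit_bytes if hard == resource.RLIM_INFINITY else min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))
    return subprocess.run(argv, preexec_fn=apply_limit, capture_output=True, text=True)

# A child that tries to grab 1 GiB and reports what happened.
GREEDY = (
    "try:\n"
    "    block = bytearray(1 << 30)\n"
    "    print('allocated')\n"
    "except MemoryError:\n"
    "    print('MemoryError')\n"
)

if __name__ == "__main__":
    result = run_with_memory_limit([sys.executable, "-c", GREEDY], 512 * 1024 * 1024)
    print(result.stdout.strip())
```

Note that the child here is not killed by anyone; its allocation simply fails and it gets to decide what to do about it. Programs that never check for that are the ones that fall over.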
Re:Does it... by markov_chain · 2007-09-01 09:47 · Score: 1

Right, but that set of processes could be run at a higher priority (a lower nice value), which in theory would result in them preempting the runaway process. I'll shut up now because this is easy to test and I've never done it.

--
Tsunami -- You can't bring a good wave down!
Re:Does it... by ForumTroll · 2007-09-01 09:47 · Score: 5, Funny

But do you really want firefox forcibly killed every time you visit youtube?
Yes.

--
"A Lisp programmer knows the value of everything, but the cost of nothing." - Alan Perlis
Re:Does it... by DaleGlass · 2007-09-01 09:50 · Score: 1

Well, I'm not an expert in scheduling stuff, but that sounds pretty complicated.

Say, what if you really need to run a process that causes the box to swap like mad? It could be that you're say, trying to build MAME, which seems to have a couple of files that make gcc consume about 512MB RAM. Now what if you need to do this on a box with just 384MB? Having the scheduler keep pausing it would only make it longer.

Then, the most evil type of swap death is a positive feedback loop. For example, mail servers. Too much mail arrives, resulting in some swapping. This in turn slows down the mail server, spamd, etc. Mail gets processed slower than usual and the number of running processes increases. More mail arrives slows things down even more. And so on until the server dies completely.

On my desktop, swap death occasionally happens when compiling. kdevelop + second life + vmware + konqueror + kmail + a mono app can quite easily bring my 4GB RAM box very near the limit, but not to swapping just yet. Then if I start the compilation process, it can easily result in 2 copies of gcc wanting 512MB each, and things grind to a halt.
Re:Does it... by Just+Some+Guy · 2007-09-01 10:35 · Score: 4, Informative

Avoid having too much swap space. It's awfully slow, if you're using it too much all you'll manage is to run more things slower.
FreeBSD likes lots of extra swap space. An idle system will notice that some process hasn't run in a month and will push it to swap, proactively freeing RAM for something else that might want it. Note that it will only page out a process's data segment; its code segment uses the filesystem itself for paging (why copy "firefox" into swap when there's already a perfectly readable copy on the filesystem?).

Unless, of course, you unlink its executable file, in which case it allocates swap to hold the file first. Which also illustrates that while unnecessary computational complexity is bad, willingness to do complex things when the situation demands can lead to some pretty cool stuff.

--
Dewey, what part of this looks like authorities should be involved?
Re:Does it... by DaleGlass · 2007-09-01 11:32 · Score: 1

FreeBSD likes lots of extra swap space. An idle system will notice that some process hasn't run in a month and will push it to swap, proactively freeing RAM for something else that might want it. Note that it will only page out a process's data segment; its code segment uses the filesystem itself for paging (why copy "firefox" into swap when there's already a perfectly readable copy on the filesystem?).

Unless, of course, you unlink its executable file, in which case it allocates swap to hold the file first. Which also illustrates that while unnecessary computational complexity is bad, willingness to do complex things when the situation demands can lead to some pretty cool stuff.

Linux does all that as well.

What I mean here is that there is a limit to the amount of swap space that it makes sense to allow to be used. I've seen people discussing really overkill things like "I'm going to get a 15K RPM SCSI drive for the swap partition and assign all of it to swap". Just that you could assign 40GB to swap (on a 64 bit box you probably can, haven't tried), doesn't mean it'd be a good idea. The system would be as good as dead long before it used any significant fraction of that.

There's a point where adding more swap is only going to allow the system to run even worse instead of having the process die and fix the problem.
Re:Does it... by diegocgteleline.es · 2007-09-01 11:36 · Score: 1

....why?

Just curious, it has been many years since FreeBSD offered me performance advantages over Linux. These days it's pretty much the contrary: the last time I tried the supposedly SMP-optimized newest versions of FreeBSD, the system would fall into FreeBSD's Big Giant Lock doing some simple dist tasks on a 2-CPU machine. And when I want a BSDish Unix OS I have OpenSolaris....
Re:Does it... by a_n_d_e_r_s · 2007-09-01 11:39 · Score: 1

One can use nice(1) to give a program higher or lower priority to the scheduler.

So if you have a program that hogs the CPU - be nice(1) to it! :-)
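
For what it's worth, a sketch of doing the same thing from Python instead of the nice(1) command. `run_niced` is an invented helper name. It can only make the child nicer (lower priority); going the other way with a negative increment normally needs root.

```python
import os
import subprocess
import sys

def run_niced(argv, increment=5):
    """Run argv with its niceness raised by `increment` (i.e. at lower priority)."""
    return subprocess.run(
        argv,
        preexec_fn=lambda: os.nice(increment),  # applied in the child only
        capture_output=True,
        text=True,
    )

if __name__ == "__main__":
    # The child just reports its own niceness; os.nice(0) reads it without changing it.
    child = [sys.executable, "-c", "import os; print(os.nice(0))"]
    print("parent niceness:", os.nice(0))
    print("child niceness: ", run_niced(child).stdout.strip())
```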

--
Just saying it like it are.
Re:Does it... by Ash-Fox · 2007-09-01 12:03 · Score: 1

On my desktop, swap death occasionally happens when compiling. kdevelop + second life + vmware + konqueror + kmail + a mono app can quite easily bring my 4GB RAM box very near the limit, but not to swapping just yet.
Dale, I envy your machine.

--
Change is certain; progress is not obligatory.
Re:Does it... by hedwards · 2007-09-01 12:27 · Score: 1

I can almost always kill any process off that I need to, and it does so promptly. The only thing which has ever prevented me from doing so was if the kernel froze. And that is not often.

Ctrl-Alt-Fn always works, as does Ctrl-Alt-Backspace, but if those don't work there are much more serious problems for me to worry about.

As long as I've been using freebsd, I have had no problems with the scheduling. The scheduling for Linux is hopefully better now, because last time I loaded it up the scheduler was completely unacceptable. The inability of it to handle anything more than an mp3 and the wm was pretty much unbearable. But from what I gather that has made huge improvements in recent times so it probably isn't the problem I recall.
Re:Does it... by shaitand · 2007-09-01 15:31 · Score: 1

'There's a point where adding more swap is only going to allow the system to run even worse instead of having the process die and fix the problem.'

Not to mention, despite what BSD does to proactively free RAM you don't want to do that unless there is a shortage of ram in the first place. After all, the program that has been idle for a month might kick up and do something and if nothing else needs the ram it is using, it will be more responsive if it is still in RAM than if it is sitting in swap on a box with 3gb of free ram.

What I have noticed (from observation, not study of the memory system in the kernel) is that the size of the swap file, or possibly the size of swap relative to free memory, seems to impact when swapping begins. I use a swap that is ram*2 when using less than 512mb ram, for 512mb I use swap of equal size, and for more than that, swap is half the size of system memory. After all, it's more advantageous to avoid swapping in the first place than to improve the performance of swap.
Re:Does it... by Alsee · 2007-09-01 17:52 · Score: 1

Can we have the user forcibly killed every time they visit MySpace?

-

--
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
Re:Does it... by Creechur · 2007-09-01 19:22 · Score: 1

Disable overcommit by echo 2 > /proc/sys/vm/overcommit_memory. No more OOM killer killing some random unrelated process. Memory allocations will fail and programs will be able to handle that correctly.
FYI, I used to do this on an embedded system - until the first time I had to run a multi-threaded app on it. Per-thread stacks are allocated in advance (so that they're spaced far enough apart to allow for expansion without stepping on each other), which causes any multi-threaded process to allocate lots of memory that it doesn't really need. The situation is better on a desktop than on an embedded system just due to the available memory, but you'll still run into issues if you turn off overcommit and run any apps that use lots of threads.
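
To put rough numbers on that, a sketch of the arithmetic plus one way to ask for smaller stacks. The 8 MiB default is an assumption (a common Linux value for `ulimit -s`, not something measured here), the 256 KiB figure is arbitrary, and both function names are made up for this example.

```python
import threading

ASSUMED_DEFAULT_STACK = 8 * 1024 * 1024  # assumption: a common Linux default (ulimit -s)
SMALL_STACK = 256 * 1024                 # arbitrary smaller size for this sketch

def reserved_stack_bytes(thread_count, stack_size):
    """Address space set aside for thread stacks, whether it gets used or not."""
    return thread_count * stack_size

def run_threads_with_small_stacks(thread_count):
    """Start thread_count trivial threads using SMALL_STACK-sized stacks."""
    previous = threading.stack_size(SMALL_STACK)  # applies to threads started afterwards
    finished = []
    try:
        threads = [threading.Thread(target=finished.append, args=(i,))
                   for i in range(thread_count)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return len(finished)

if __name__ == "__main__":
    mib = 1024 * 1024
    print(reserved_stack_bytes(100, ASSUMED_DEFAULT_STACK) // mib, "MiB with default stacks")
    print(reserved_stack_bytes(100, SMALL_STACK) // mib, "MiB with small stacks")
    print(run_threads_with_small_stacks(8), "threads ran")
```

Under the assumed default, a hundred threads reserve 800 MiB of stack they will mostly never touch, which is the kind of allocation strict accounting counts against you.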
Re:Does it... by pe1chl · 2007-09-01 21:09 · Score: 1

This will not accomplish much.
For one, in Linux the process priority is dynamically adjusted. So a program that hogs the CPU will automatically decrease in priority so that it gets all CPU time remaining after other processes that use little CPU have got their share. It will not really starve lower-priority processes, as happens on a completely priority determined scheduler with static priorities (found in realtime kernels, in Windows NT, etc).

But, another issue is that a process that makes the system slow by allocating lots of memory and causing lots of paging to disk (as described here) is not a CPU hog and will thus not be controlled by the process priority. Even at a very low priority, such tasks can still slow down the system.
You need tuning of the system parameters (via /proc or sysctl) to avoid that.
Re:Does it... by LLuthor · 2007-09-01 23:25 · Score: 1

That depends entirely on the use case for that swap. I run an app that uses 5-8GB worth of temp files each run (and often more than one instance is running), on a system with only 16GB of ram, the extra swap space is used for backing a large tmpfs filesystem. This gets way better performance than running them on a real filesystem (we used to use ext3, then xfs).

We currently allocate ~80GB of swap for this machine (on a raid-1), and most of it gets used from time to time, and the kernel has a much easier time with mmap() on tmpfs than on any real fs.

--
LL
Re:Does it... by marcosdumay · 2007-09-02 03:55 · Score: 1

"Memory allocations will fail and programs will be able to handle that correctly."

One can dream, right? It would be more correct as "programs should be able to handle that", although they won't.

--
Rethinking email
Re:Does it... by DaleGlass · 2007-09-02 05:09 · Score: 1

Should be read as "they will gain the possibility of handling that correctly". Doesn't mean they will perform a clean shutdown, but at least they'll be able to if they're programmed correctly.

If the OOM killer kills the process it will immediately terminate without any hope of performing a safe shutdown. That sort of thing can be quite nasty for some applications. Imagine for example an application getting killed when it's halfway through writing some data file.
Re:Does it... by TheLink · 2007-09-02 19:08 · Score: 1

You probably want to adjust your memory overcommit tunables then.

I've got 2GB of RAM, vm.overcommit_ratio=95, vm.overcommit_memory = 2 and swap at 384MB (might be better with _less_, but some processes just allocate memory without using it).

Disk transfers when swapping are usually about 10MB/sec, and so 384MB = more than long enough...

I'd rather have processes get an out of memory error quickly than have my entire machine go into a swap death spiral.

So what if the mail server dies because it can't allocate any more mem (set ulimits accordingly), at least I can still ssh in.
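
Working those numbers through, as a sketch: with vm.overcommit_memory=2 the kernel's commit limit is swap plus overcommit_ratio percent of RAM (huge pages are ignored here for simplicity). The function names are invented for this example; the inputs are the figures quoted above.

```python
def commit_limit_mb(ram_mb, swap_mb, overcommit_ratio):
    """Commit limit under vm.overcommit_memory=2: swap plus a percentage of RAM."""
    return swap_mb + ram_mb * overcommit_ratio / 100

def seconds_to_fill_swap(swap_mb, disk_mb_per_sec):
    """How long it takes to write the whole swap area at a given disk rate."""
    return swap_mb / disk_mb_per_sec

if __name__ == "__main__":
    print(commit_limit_mb(2048, 384, 95), "MB may be committed in total")
    print(seconds_to_fill_swap(384, 10), "seconds to fill swap at 10MB/sec")
```

So the box refuses allocations past roughly 2.3GB of commitments, and even in the worst case the swap fills in well under a minute rather than dragging on.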
Re:Does it... by paropaco · 2007-09-03 04:17 · Score: 1

Unless, of course, you unlink its executable file, in which case it allocates swap to hold the file first.

Why doesn't it just take into account the fact that the file is in execution in its reference count and leave the file where it is on the file system until it terminates? "unlink" only decrements the reference count, it does not free the file. Seems much simpler to me.
In addition, if it does as you say, FreeBSD does not just need to allocate the swap to hold the file, it actually needs to take every page from the executable file that has not yet been loaded in memory and actually copy it to the swap. Where's the coolness in that?
By the way, I didn't find the paragraph which led you to conclude that that's how it works in there.
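
The reference-count behaviour is easy to see from user space on any POSIX system. A minimal sketch (the function name is made up): the name disappears at unlink time, but the data stays readable through the descriptor that is still open.

```python
import os
import tempfile

def read_after_unlink(payload=b"still here"):
    """Create a file, unlink it while it is open, then read it back via the open descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                      # drops the name, not the data
        name_gone = not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))     # the open descriptor still works
    finally:
        os.close(fd)                         # last reference gone: now the space is freed
    return name_gone, data

if __name__ == "__main__":
    print(read_after_unlink())
```

This only shows the file-descriptor case, not what a kernel does for a running executable's pages, but it is the same reference counting I was describing.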
Re:Does it... by outZider · 2007-09-04 11:32 · Score: 1

The scheduler on FreeBSD never locks me out of the system, never takes down the kernel under high load, and always gives me a chance to save a system without hard powering it. I'm using mostly Linux here at work now, of both Debian and Red Hat varieties, and I can't say the same for it at all. If given the choice, I would move everything here to FreeBSD for just the scheduler.

I'm also ticked that I am now Flamebait, but I guess it makes sense.

--
- oZ
// i am here.

Interestingly rigorous by heinousjay · 2007-09-01 09:03 · Score: 3, Interesting

I'd have to imagine doing so much work to prove a particular implementation's value mathematically is a good step toward depoliticizing the scheduler. That should help in what's been a contentious piece of the kernel of late.

--
Slashdot - where whining about luck is the new way to make the world you want.

Re:Interestingly rigorous by ianare · 2007-09-01 09:46 · Score: 3, Informative

One would hope, but it doesn't look like it's going that way. If you look at Ingo's reply, then Roman's reply to that, you can see what could be the start of yet another flame fest :
Hi,

On Fri, 31 Aug 2007, Ingo Molnar wrote:

> So the most intrusive (math) aspects of your patch have been implemented
> already for CFS (almost a month ago), in a finegrained way.

Interesting claim, please substantiate.

> Peter's patches change the CFS calculations gradually over from
> 'normalized' to 'non-normalized' wait-runtime, to avoid the
> normalizing/denormalizing overhead and rounding error.

Actually it changes wait-runtime to a normalized value and it changes nothing about the rounding error I was talking about. It addresses the conversion error between the different units I was mentioning in an earlier mail, but the value is still rounded.

> > This model is far more accurate than CFS is and doesn't add an error
> > over time, thus there are no more underflow/overflow anymore within
> > the described limits.

> ( your characterisation errs in that it makes it appear to be a common
> problem, while in practice it's only a corner-case limited to extreme
> negative nice levels and even there it needs a very high rate of
> scheduling and an artificially constructed workload: several hundreds
> of thousand of context switches per second with a yield-ing loop to be
> even measurable with unmodified CFS. So this is not a 2.6.23 issue at
> all - unless there's some testcase that proves the opposite. )

> with Peter's queue there are no underflows/overflows either anymore in
> any synthetic corner-case we could come up with. Peter's queue works
> well but it's 2.6.24 material.

Did you even try to understand what I wrote? I didn't say that it's a "common problem", it's a conceptual problem. The rounding has been improved lately, so it's not as easy to trigger with some simple busy loops. Peter's patches don't remove limit_wait_runtime() and AFAICT they can't, so I'm really amazed how you can make such claims.

> All in one, we dont disagree, this is an incremental improvement we are
> thinking about for 2.6.24. We do disagree with this being positioned as
> something fundamentally different though - it's just the same thing
> mathematically, expressed without a "/weight" divisor, resulting in no
> change in scheduling behavior. (except for a small shift of CPU
> utilization for a synthetic corner-case)

Everytime I'm amazed how quickly you get to your judgements... :-( Especially interesting is that you don't need to ask a single question for that, which would mean you actually understood what I wrote, OTOH your wild claims tell me something completely different.

BTW who is "we" and how is it possible that this meta mind can come to such quick judgements?

The basic concept is quite different enough, one can e.g. see that I have to calculate some of the key CFS variables for the debug output. The concepts are related, but they are definitively not "the same thing mathematically", the method of resolution is quite different, if you think otherwise then please _prove_ it.

bye, Roman
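
For anyone who wants to see what kind of rounding error is being argued about, here is a toy illustration. It is not the CFS code or Roman's patch; the scale factor, the weight of 1991 and the 1000 ns time slices are hypothetical values chosen for the example. It only shows that dividing by a weight at every step and rounding each time drifts away from dividing once at the end.

```python
NICE_0_LOAD = 1024  # scale factor assumed for this sketch

def normalized_per_step(delta_ns, weight, steps):
    """Divide by the weight at every step, rounding down each time."""
    total = 0
    for _ in range(steps):
        total += delta_ns * NICE_0_LOAD // weight
    return total

def normalized_once(delta_ns, weight, steps):
    """Accumulate the raw runtime and divide by the weight a single time."""
    return delta_ns * steps * NICE_0_LOAD // weight

if __name__ == "__main__":
    per_step = normalized_per_step(1000, 1991, 10_000)
    once = normalized_once(1000, 1991, 10_000)
    print(per_step, once, once - per_step)
```

The gap grows with the number of steps, which is why one side calls it a conceptual problem and the other says it only shows up at very high scheduling rates.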
Re:Interestingly rigorous by try_anything · 2007-09-01 10:29 · Score: 2, Interesting

Math is reliable, but it's slow going, even for very simple math.

People prefer verbal reasoning, even though all kinds of logical errors can slip in undetected, for the simple fact that they can read it at the speed of speech -- even if they really shouldn't.

This is PAINFULLY evident in the software world. I imagine even kernel developers tend to be lazy this way.
Re:Interestingly rigorous by HeroreV · 2007-09-01 15:26 · Score: 3, Insightful

When will people learn that being rude doesn't help? If you want somebody to work with you, you need to play nice. It's not pleasant, and it's not easy to make yourself calm down and act like a pussy, but it's important if you ever want any collaboration.

Example:
Interesting, but I don't see this. Can you point it out?

I think you misunderstood me. It may not be a common problem, but it is a conceptual problem. The rounding has been improved lately, so it's not as easy to trigger with some simple busy loops. Peter's patches don't remove limit_wait_runtime() and AFAICT they can't, so I don't see how what you said can be correct.

I'm worried about how quickly you judged this issue, and that you haven't been more in contact with me discussing it. This issue is important to me, and I'd really like to work with you to get it resolved.
Re:Interestingly rigorous by ccp · 2007-09-04 22:38 · Score: 2, Interesting

When will people learn that being rude doesn't help? If you want somebody to work with you, you need to play nice. It's not pleasant, and it's not easy to make yourself calm down and act like a pussy, but it's important if you ever want any collaboration.

(emphasis mine)

Very true, but I have this suspicion that some hackers' rudeness is intended to piss people off and keep the field, the spotlight, and the presumed "glory" to themselves.

Sad thing is, it works a lot of the time, and you can always blame old trusty Asperger's.

Cheers,
CC

The Infintely Fair Scheduler of Solomon by WombatDeath · 2007-09-01 09:04 · Score: 4, Funny

In which no process gets any resources at all. I've also been considering a quantum scheduler, in which each CPU cycle is assigned to every process simultaneously.

Shit, I've just figured out why I'm a project manager.

Re:The Infintely Fair Scheduler of Solomon by roman_mir · 2007-09-01 09:23 · Score: 2, Informative

Pay per scheduler, the kind that allocates time to processes that are initialized by the highest paying bidder. I am aiming for a CEO.

--
You can't handle the truth.
Re:The Infintely Fair Scheduler of Solomon by Anonymous Coward · 2007-09-01 09:28 · Score: 0

That actually sounds quite a lot like some old mainframe schedulers, or even the classic "Lottery Scheduler".
Re:The Infintely Fair Scheduler of Solomon by raftpeople · 2007-09-01 11:01 · Score: 1

You're already at 5 Funny, so all I can do is say that is a pretty dang funny post, the second line made me laugh out loud (not just the LOL that everyone types, but the real laugh out loud where the guy next to you wonders what the hell you're doing)
Re:The Infintely Fair Scheduler of Solomon by Antique+Geekmeister · 2007-09-01 12:05 · Score: 1

I thought you were an off-shore helpdesk?
Re:The Infintely Fair Scheduler of Solomon by aj50 · 2007-09-01 12:27 · Score: 1

Aim carefully, but make sure no-one catches you

--
I wish to remain anomalous
Re:The Infintely Fair Scheduler of Solomon by fahrbot-bot · 2007-09-01 17:28 · Score: 1

I've also been considering a quantum scheduler...
Otherwise known as the Heisenberg Uncertainty Scheduler.
The main problem with this is you can know which process is scheduled or which will be next, but not both. In fact, the act of scheduling would probably alter the scheduler itself.

--
It must have been something you assimilated. . . .
Re:The Infinitely Fair Scheduler of Solomon by sparcnut · 2007-09-02 04:33 · Score: 0

I've also been considering a quantum scheduler, in which each CPU cycle is assigned to every process simultaneously.
This is pretty much what hyperthreading/SMT attempts to do (except instead of "every process" it's 2 processes).

--
perl -e 'print $i=pack(c5, (41*2), sqrt(7056), (unpack(c,H)-2), oct(115), 10);'

Really fair with IFB by phiber9 · 2007-09-01 09:05 · Score: 1

I just hope it will work as-advertised with IFB

--
Xatrix Security - Computer Security news portal

Fuck this. by Anonymous Coward · 2007-09-01 09:11 · Score: 5, Funny

Let's just go back to cooperative multitasking like Mac OS where everything was simple.

Re:Fuck this. by morgan_greywolf · 2007-09-01 10:57 · Score: 0, Redundant

Let's just go back to cooperative multitasking like Mac OS where everything was simple.

And we don't do that because cooperative multitasking SUCKED. Both Apple and Microsoft completely agree on that, because each of their next-generation OSes (Windows NT and Mac OS X) replaced cooperative multitasking with preemptive multitasking. And Unix has always had preemptive multitasking.

--
My blog
Re:Fuck this. by Anonymous Coward · 2007-09-01 11:09 · Score: 0

... and that would be just like Windows 3.1
Re:Fuck this. by bcat24 · 2007-09-01 11:42 · Score: 3, Funny

Woosh!
Re:Fuck this. by coryking · 2007-09-01 15:46 · Score: 2, Funny

Dude. And Windows 3.1 rocked. You don't see many security bugs with Windows 3.1, do you? It is like the most secure OS ever!
Re:Fuck this. by Anonymous Coward · 2007-09-01 18:17 · Score: 0

Double whoosh!

Re:Coming soon by El_Muerte_TDS · 2007-09-01 09:11 · Score: 1

Nobody is stopping you from using Windows Me

Re:Coming soon by JazzyMusicMan · 2007-09-01 09:12 · Score: 0, Flamebait

I would love to use your unfair scheduler! A scheduler with liberal tendencies might block all this spyware. We know how fond 'W' is of spying...

Re:Coming soon by ScrewMaster · 2007-09-01 09:15 · Score: 4, Funny

Of course, there's the companion "pork barrel scheduler" which randomly spawns useless processes in order to take time from those that deserve it.

--
The higher the technology, the sharper that two-edged sword.

Why not swappable? by jimmyhat3939 · 2007-09-01 09:15 · Score: 3, Interesting

What I don't understand is why these schedulers can't just be swapped out by the users. I know there was some discussion of this, and it was vetoed by the kernel maintainers. It makes a lot of sense to me to just allow users to insert kernel modules with schedulers and just do something in the /proc filesystem to go between them. Then people could use whatever they like, and if they write their own, they wouldn't have to recompile the kernel.

After all, isn't that the idea of open source software -- may the best code win?

--
Free Conference Call -- No Spam, High Quality

Re:Why not swappable? by cnettel · 2007-09-01 09:32 · Score: 5, Informative

The scheduler is at the very heart of the kernel. It's relatively hard to make the logic for choosing what to context-switch, and when, modular while keeping the actual context switches fast enough. Different schedulers tend to have different ideas on what stats to keep, and you want all of it to have good memory locality. After all, we should remember that this is a piece of code that's relevant tens or hundreds of times per second, no matter what you do with your machine.
Re:Why not swappable? by Anonymous Coward · 2007-09-01 10:01 · Score: 0

I think the main concern (probably rightly) is performance. Any kind of hot swappable architecture where you could change out your process scheduler on the fly is going to add computational overhead compared to one which is known at compile time.
Re:Why not swappable? by dhasenan · 2007-09-01 10:05 · Score: 2, Interesting

Then don't allow them to compile schedulers as modules -- force each kernel to have a single scheduler built in. Then it's a matter of specifying the interface and then linking in a different object file.

It's doable (easy, even), it doesn't require significant investment from a kernel maintenance perspective, and it cuts through a fair bit of politicking.
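To make "specify the interface, then link in a different implementation" concrete, here is a rough sketch in Python rather than kernel C. Everything in it is hypothetical: the `Scheduler` interface, the two toy policies and the `build_scheduler` helper are invented for illustration, and the choice is made once at "build" time, never swapped while running.

```python
import heapq
from abc import ABC, abstractmethod
from collections import deque

class Scheduler(ABC):
    """The hypothetical fixed interface the rest of the kernel codes against."""

    @abstractmethod
    def enqueue(self, task, priority=0):
        """Make a task runnable."""

    @abstractmethod
    def pick_next(self):
        """Return the next task to run, or None if nothing is runnable."""

class RoundRobin(Scheduler):
    """Toy policy: run tasks in arrival order and ignore priority."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, task, priority=0):
        self._queue.append(task)

    def pick_next(self):
        return self._queue.popleft() if self._queue else None

class StrictPriority(Scheduler):
    """Toy policy: highest priority first, arrival order breaks ties."""

    def __init__(self):
        self._heap = []  # entries are (negated priority, arrival number, task)
        self._arrivals = 0

    def enqueue(self, task, priority=0):
        heapq.heappush(self._heap, (-priority, self._arrivals, task))
        self._arrivals += 1

    def pick_next(self):
        return heapq.heappop(self._heap)[2] if self._heap else None

# Stand-in for a build-time config option: exactly one entry gets "linked in".
AVAILABLE = {"rr": RoundRobin, "prio": StrictPriority}

def build_scheduler(config_choice):
    return AVAILABLE[config_choice]()
```

Callers only ever see `enqueue` and `pick_next`, so a different policy is a one-word config change. The cost the other replies worry about is that every call goes through an indirection the compiler could otherwise have inlined.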
Re:Why not swappable? by Anonymous Coward · 2007-09-01 10:11 · Score: 0, Insightful

Then don't allow them to compile schedulers as modules -- force each kernel to have a single scheduler built in.

Hooray for rebuilding the kernel and rebooting whenever you want to switch to a different workload!
Re:Why not swappable? by Anonymous Coward · 2007-09-01 10:17 · Score: 1, Insightful

as opposed to always running with a scheduler unsuited to your workload?
Re:Why not swappable? by Anonymous Coward · 2007-09-01 10:35 · Score: 0

In Linus' letter about why he chose CFS, he explained why he blocked any efforts for making schedulers pluggable. It largely boiled down to him wanting to force his choices on the community, since few distributions would take the extra effort of applying patch sets he disapproved of. Considering that he admitted that he chose CFS because the author was his friend, it's not surprising. Remember that Linus has moved into a pure management role, so just like your boss, he has his favorites and acts accordingly.
Re:Why not swappable? by treke · 2007-09-01 11:05 · Score: 2, Insightful

The simplest answer is that the developers who have the final say don't want to do it that way. They think that it's better for the kernel to have one single scheduler that gets widely tested against every type of load than to have multiple schedulers that tend to only get tested in their areas of optimization.
Re:Why not swappable? by diegocgteleline.es · 2007-09-01 11:26 · Score: 1

Because this patch is just an improvement over CFS, and should either be merged into mainline's CFS or completely rejected?
Re:Why not swappable? by mikael · 2007-09-01 11:42 · Score: 1

Not sure if this is an urban legend or not, but function calls between separate source code files could take longer than functions in the same source code file because the compiled executable code could end up on separate virtual memory pages. I would guess that modern compilers would optimise the code to avoid this problem.

--
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
Re:Why not swappable? by Anonymous Coward · 2007-09-01 12:31 · Score: 0

You just described the current state of schedulers on FreeBSD:
Put "options SCHED_4BSD" in the kernel config file if you want the modernized BSD scheduler, or "options SCHED_ULE" for the new but somewhat experimental one. There was also a fork ("SCHED_SMP") of that again for a while, but it was merged back into SCHED_ULE.
A kernel config file must contain exactly one scheduler.
Re:Why not swappable? by Anonymous Coward · 2007-09-01 14:55 · Score: 1, Informative

The Solaris kernel supports several schedulers which can be changed during runtime. You can also have different processes under different schedulers at the same time.
Re:Why not swappable? by coryking · 2007-09-01 15:55 · Score: 1

What is the deal with that? I could never figure out which was a good one for a "non enterprise" production box (as in, I can deal with 99.999). The "cool new one" always had huge warnings but seemed so tempting. Is there a FM somewhere that explains the difference between FreeBSD schedulers?
Re:Why not swappable? by Anonymous Coward · 2007-09-01 18:45 · Score: 0

Personally, I think an easy solution to this is just to have multiple kernels, with one for each scheduler. Sure, a single scheduler would have to be chosen as the default, however I believe that the major distros (Ubuntu, Debian, RedHat, etc) should have packages (via APT, YUM, etc) so all a user has to do is click a check mark in Synaptic and boom, they can reboot and use a different scheduler.

I also am in favor of having multiple schedulers (once they're added as above) available on a GRUB or LILO menu. I mean, I'd like to be able to boot each one and try each one on the same system with the same hardware and the same programs and then just "apt-get remove" the ones that are slower.

Anyhow, if the big battle is over which kernel is the default, that's fine, but I believe that every distro should let the USERS choose (via a simple, precompiled package on the package management system of choice) and I think that after everyone was given a choice, as someone said, the best code will win.

That said, I don't believe any scheduler is perfect for everyone. Many say that the old CK branch with the deadline scheduler provides better response on a desktop. I'd like to test that, but I can't, as a kernel install has failed for me 4 times (compiles, but won't boot). If it was an apt package, that wouldn't be an issue. Just the same, I'm sure that schedulers other than deadline perform better in a server farm, so I wouldn't suggest making deadline the default. I'd just prefer that it be an option for an average joe, instead of being restricted to the people who can recompile (or in my case, boot) their own.
Re:Why not swappable? by sonpal · 2007-09-02 01:46 · Score: 3, Interesting

One could say the same about filesystems - but we figured out how to abstract the filesystem API in UNIX a long time ago. This led to a lot of innovation in filesystems - ext2, ext3, ReiserFS, AFS, ZFS, etc. I think we might see similar innovation in schedulers if the scheduler were pluggable. At the very least, I suspect that Con Kolivas would still be a kernel developer had we supported pluggable schedulers, and that alone might justify making the scheduler pluggable.

I expect that there would be a performance impact if the scheduler were pluggable - modular and optimized do not generally go together. However, the worst case performance of any scheduler dominates the user experience, so IMHO, it is worth accepting a small performance penalty to enable competition and innovation toward reducing the worst case performance.
Re:Why not swappable? by Zombywuf · 2007-09-02 02:59 · Score: 1

Or, running a scheduler that is appropriate for any workload.

--
If you can read this you've gone too far.
Re:Why not swappable? by edwdig · 2007-09-02 11:33 · Score: 1

Not sure if this is an urban legend or not, but function calls between separate source code files could take longer than functions in the same source code file because the compiled executable code could end up on separate virtual memory pages. I would guess that modern compilers would optimise the code to avoid this problem.

Virtual memory pages are typically 4 KB, so in a program of any substantial size, the odds are pretty high of any given function call crossing a page boundary. If you get a source file with a few thousand lines of code in it, you're likely to run into those issues.

This isn't an issue with kernel code, as the kernel isn't swappable. You'd have some pretty serious issues if your scheduler was swapped out to disk...
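The page-boundary arithmetic above is easy to check by hand. A small sketch, assuming the 4 KB page size quoted in the post; the addresses in the usage lines are invented for illustration and do not come from any real binary.

```python
PAGE_SIZE = 4096  # 4 KB, the typical page size mentioned above

def page_of(address):
    """Index of the virtual memory page containing `address`."""
    return address // PAGE_SIZE

def crosses_page(caller_addr, callee_addr):
    """True if a call from caller_addr to callee_addr lands on another page."""
    return page_of(caller_addr) != page_of(callee_addr)

def pages_spanned(code_size_bytes):
    """Pages needed to hold this much code, rounded up."""
    return -(-code_size_bytes // PAGE_SIZE)  # ceiling division

if __name__ == "__main__":
    print(crosses_page(0x1000, 0x1FFF))  # same page
    print(crosses_page(0x1FFF, 0x2000))  # adjacent bytes, different pages
    print(pages_spanned(64 * 1024))      # pages in 64 KB of code
```

With 64 KB of code already covering 16 pages, most calls between arbitrary functions will cross a page whether or not they share a source file.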

This post by fishthegeek · 2007-09-01 09:18 · Score: 3, Funny

has been scheduled for use by the slashdot server farm on September 6, 2007 at 14:54:23. Please refresh this page at that time for fishthegeek's insightful comment.

Automatically generated by:
Slashdot Predictive Post Scheduler v 2.12.02-16

--
load "$",8,1

Re:This post by The+MAZZTer · 2007-09-01 16:56 · Score: 1

I can't wait! No really. I'm not going to wait.

What about the neocon scheduler? by Anonymous Coward · 2007-09-01 09:24 · Score: 2, Funny

Completely rejecting both liberal and conservative ideals, it allocates time slices only to processes that already have them.

This is a "great" way to run things and if it ever goes to a vote, I hope lkml ops can be convinced to go the diebold route.

Re:What about the neocon scheduler? by Anonymous Coward · 2007-09-01 17:49 · Score: 0

it allocates time slices only to processes that already have them.
That sounds pretty conservative to me.

What about the really greedy scheduler... by Anonymous Coward · 2007-09-01 09:24 · Score: 1, Funny

That just takes all the cycles and keeps them for itself?

Re:What about the really greedy scheduler... by ozmanjusri · 2007-09-01 11:19 · Score: 4, Funny

Microsoft has patented that for the Vista scheduler

--
"I've got more toys than Teruhisa Kitahara."
Re:What about the really greedy scheduler... by Epaminondas+Pantulis · 2007-09-01 23:28 · Score: 1

No, in Vista the feature of eating spare CPU cycles is being taken care of by the Network Driver.

Re:Coming soon by Anonymous Coward · 2007-09-01 09:27 · Score: 1, Funny

The completely unfair scheduler, which takes all the time from processes that deserve it and gives it to processes that are blocked. Otherwise known as the liberal scheduler.

As opposed to the "REALLY completely unfair scheduler" (otherwise known as the conservative scheduler or "not nice" scheduler), which takes time from processes that need it desperately and gives it to the top one tenth of one percent of processes that are swimming in priority and don't need it.

PFS by Altesse · 2007-09-01 09:29 · Score: 1, Funny

I'm waiting for a true revolution : PFS, the Porn Fair Scheduler. All processes related to porn (playback, download, etc.) receive much larger time slices than everything else.

Re:PFS by zootm · 2007-09-02 00:40 · Score: 1

I'm personally bewildered by the very concept of pornography so computationally-complex that it requires extra CPU time.

Are you raytracing beads of sweat? Or trying to graph metrics for shame in real-time?
Re:PFS by harkabeeparolyn · 2007-09-02 12:58 · Score: 1

Interactive CG porn, rendered in real-time; the women/men/animals on screen respond to your cries as the scene progresses. When you're trying to achieve simultaneous orgasm with a simulated orgy, timing is critical.
Re:PFS by zootm · 2007-09-02 13:21 · Score: 1

Honestly I find it hard to believe that this does not exist already. This is not an invitation to track it down and post links.

Insightful video clip about Linux schedulers by rpp3po · 2007-09-01 09:29 · Score: 0, Offtopic

The lecturer is no native English speaker. So sometimes you have to replace the word 'base' with 'scheduler'. The clip shows deep insight into what Con Kolivas really feels is going on right now.

http://www.scene.org/redhound/AYB.swf/

Re:Insightful video clip about Linux schedulers by rpp3po · 2007-09-01 09:32 · Score: 0, Offtopic

http://www.scene.org/redhound/AYB.swf ;)

More flame bait? by Bryan+Ischo · 2007-09-01 09:29 · Score: 4, Insightful

I read the article in question. There is obviously much disagreement about the value of the Really Fair Scheduler, and so I must assume that "derrida" and the Slashdot editors are once again just trying to invite more people to the flame-fest as usual.

The comments on the article at the linked-to site suggest that there are potentially flaws in the logic behind the Really Fair Scheduler, and that its author has ignored advancements in the CFS that make most (or all?) of its improvements irrelevant. Also there are many suggestions that the author of the Really Fair Scheduler, some guy named Roman something-or-other, is raging on the kernel lists rather than working cooperatively to improve the Linux scheduler.

Given what I have seen, I suspect that the Really Fair Scheduler is going nowhere, and that "derrida" knows that and is just trying to add more fuel to the flame-fire by posting about it on Slashdot.

Re:More flame bait? by icepick72 · 2007-09-01 10:20 · Score: 1

and so I must assume that "derrida" and the Slashdot editors
I don't know who you are but in cases like this we need facts and not assumptions, not perceptions, not mild understandings of issues.
Re:More flame bait? by try_anything · 2007-09-01 10:32 · Score: 1

I suspect what we need in cases like this is for everyone who wouldn't have known about this except through Slashdot to just STFU and GTFA. So, err, this will be my last post in this thread :-)
Re:More flame bait? by Dr.+Spork · 2007-09-01 10:36 · Score: 4, Insightful

You could be right, but Roman is in a tough position, because he's arguing for a change that he thinks is big, and Ingo seems to be trying to sap his enthusiasm by telling him to essentially "work on what we're doing" when Roman wants to have a debate about the best architecture for the scheduler.
In order to help give substance to the debate, Roman coded together some proof-of-concept stuff, but instead of his architectural ideas being looked at seriously and critically, Ingo instructs him to strip away most things and "we'll use it." That really should seem to everyone on the sidelines like Roman's ideas are being ignored without debate. Now, maybe Ingo is polite, Roman's work just sucks, and Ingo won't confront him on it. But if that's not the case, maybe there should be a (non-flamey) debate about the best architecture for the scheduler.
Re:More flame bait? by Hooya · 2007-09-01 14:33 · Score: 2, Funny

Or perhaps he's dreading having to say:

My name is Ingo Molnar, you kill -9ed my scheduler. Prepare to oops!
Re:More flame bait? by Anonymous Coward · 2007-09-01 20:13 · Score: 0

You could be on to something. I too just read through the thread. Early testers found starvation:
link
There's also a new reply from Ingo dated today, that RFS is more complex (increased code size), is not so fast and is less fair:
link
While it 'removes complexity' by removing comments, debugging code and other stuff.
Re:More flame bait? by Hotsphink · 2007-09-02 05:41 · Score: 1

No, not really. Read the whole thread. Roman's work really doesn't change the structure of the CFS much at all, and the structural change he does make can be applied independently of the other changes. It's true that Roman and Ingo disagree about how "big" of a change it is: Roman thinks it's major, but if you keep reading, you'll see that he's only referring to the mathematical changes; Ingo feels it's a minor refinement of the math that could be done to the existing CFS without disrupting much else.

So Roman is trying to propose an architectural change and Ingo is sidelining him? Well, he does propose one structural change and one mathematical one, which sums up to an architectural change, I suppose -- but either can be done independently of the other, on the current code base. The remaining changes are renaming a file, removing lots of comments, and removing some of the capabilities of the existing CFS (which are orthogonal to what Roman is working on.)

It certainly does not sound like Ingo secretly thinks that Roman's work just sucks, but he is unconvinced that Roman is attacking a particularly relevant problem. Roman is fixated on rounding errors, which Ingo feels are rarely relevant. Either of them could be correct for all I know.
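For a feel of what kind of rounding error is at stake, here is a toy illustration in Python. It is emphatically not the arithmetic of CFS or of Roman's patch; it only assumes a scheduler that splits a period among weighted tasks using integer division, and the function names are invented.

```python
def integer_shares(period_ns, weights):
    """Split a period among tasks by weight, truncating to whole nanoseconds."""
    total = sum(weights)
    return [period_ns * w // total for w in weights]

def lost_per_period(period_ns, weights):
    """Nanoseconds that truncation leaves unassigned in each period."""
    return period_ns - sum(integer_shares(period_ns, weights))

if __name__ == "__main__":
    weights = [1, 1, 1]
    print(integer_shares(1000, weights))   # three equal truncated shares
    print(lost_per_period(1000, weights))  # the remainder nobody gets
```

Whether a leftover nanosecond here and there ever adds up to something a user notices is exactly the point the two of them disagree on.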

Beyond that honest disagreement, which will have to be resolved with test cases, I have to say it's hard not to side with Ingo. Roman's work could clearly have been presented in a way that would have been much easier to both understand and integrate, but instead he chose to (among other things) copy one whole file and rename it rather than patching it. I've merged in enough messy third-party patches to recognize a situation where someone just can't be bothered to do all of the work needed when you are developing on a shared code base. Imagine that someone finds an off-by-one bug in your code: it should have been while (a <= b), but you wrote while (a < b). They send you a patch with that change along with reformatting all of your comments and re-indenting 80% of the code. Not helpful. Even if their version is prettier. (And if it is, then splitting it into two patches would get all of the advantages, at the cost of more work for the patch submitter.)

It's a trap by Anonymous Coward · 2007-09-01 09:31 · Score: 0

Con has announced that he's really pissed off about this new development, and is divorcing Linux for a second time.

ingo's reply by ianare · 2007-09-01 09:32 · Score: 4, Informative

Ingo's reply can be found here. Roman's reply to that is here and here

Re:ingo's reply by icepick72 · 2007-09-01 10:17 · Score: 1

Question: Can't someone run both schedulers through the same series of severe test cases (unit testing) and analyze the results, allowing the authors of each to add more test cases as needed to prove their points? At some point the strengths and weaknesses of each will become apparent. At the end of the day, the results will be the proof.
Re:ingo's reply by budgenator · 2007-09-01 11:07 · Score: 2, Interesting

The problem is that Linux is used on three obvious kinds of systems: servers, workstations and desktops. The developers tend to be very sensitive to the server and workstation areas, so at the end of the day it'll be test cases that favor servers vs. test cases that favor desktops. What makes me wonder is why they don't develop three, each one optimized for a particular usage pattern, and just let me select the kernel I want with GRUB. It should be possible to modify init to select the correct rc.conf for each pattern as well.

--
Apocalypse Cancelled, Sorry, No Ticket Refunds
Re:ingo's reply by wellingj · 2007-09-01 13:38 · Score: 1

Don't forget the embedded spectrum, which likes Ingo's -rt patch, which is currently being merged into the kernel.
I actually think that it's at the heart of why Linus has given Ingo the go-ahead to do the CFS scheduler, because ultimately the CFS and -rt scheduler will be one and the same, or CFS layered on top of -rt. What this means is more usage of the vanilla kernel for embedded devices instead of the 'other' real time Linux derivatives such as RT Linux from FSMLabs and the RTAI patch.

--
Money is the root of all evil?
Re:ingo's reply by Anonymous Coward · 2007-09-02 02:56 · Score: 0

Flame war averted.

Re:Coming soon by arth1 · 2007-09-01 09:34 · Score: 4, Insightful

You're more insightful than you think. I don't want a fair scheduler. I want a very unfair one, that favours my favourite processes. And I want one that has as little overhead as possible -- a scheduler so complex that it eats 20% of the available cycles just to figure out who to give the remaining 80% to, I have no use for.

How about I really fair scheduler you ?! by Anonymous Coward · 2007-09-01 09:43 · Score: 0

How about I really fair scheduler you ?! Let us see how you like that, slimeball !

Linux Kernel Whining List by rpp3po · 2007-09-01 09:44 · Score: 2, Funny

poor guy... :(

But where is the Linux IO Scheduler? by Anonymous Coward · 2007-09-01 09:52 · Score: 5, Insightful

Screw the CPU scheduler at this point. The kernel folks are missing the obvious and utter brokenness of the IO scheduling. These bugs have been outstanding about a year now!! And it's not just AMD64 anymore either. Quoth the kernel bug report:

"Now, as far as this bug being AMD64 only. We develop a portable data analysis tool and we run it on Intel Core Mobile systems (Sony UX series, Panasonic Toughbook series) and see this bug or one almost exactly like it on those platforms as well."

http://bugzilla.kernel.org/show_bug.cgi?id=7372
http://bugzilla.kernel.org/show_bug.cgi?id=8636
http://www.nabble.com/IO-activity-brings-my-desktop-to-its-knees-(2.6.22.1-ck1)-t4192136.html
http://forums.gentoo.org/viewtopic-t-482731-start-500.html

At first, deadline IO was touted as an answer, but that doesn't completely fix things.
Some say Native Command Queueing is broken. One person claims deadline + NCQ disabled helps.
Some say the kernel's vfs_cache_pressure settings help, while others refute it (compare kernel bug report versus page 21 of the gentoo forum thread). But no one understands what's really broken in the kernel.

Can we please get Ingo working on IO scheduling? PLEASE?

Re:But where is the Linux IO Scheduler? by ls671 · 2007-09-01 11:40 · Score: 1

But no one understands what's really broken in the kernel.

Hi ! This is Steeven Baldmer. See, that's why we, at ms, offer the only reliable solution. I can assure you that every ms developer knows exactly what is going on everywhere in our systems.
Seriously, mod parent up

--
Everything I write is lies, read between the lines.
Re:But where is the Linux IO Scheduler? by asc99c · 2007-09-01 12:10 · Score: 1

A great point. I/O to discs and networks has a few cases where the standard performance is poor. I've recently got a copy of Windows Vista and this seems to be even worse than I've seen before - performance during file copies across the network is horrible.

I've got a media server and HTPC networked up, and I'd really like to be able to be watching one film from the server, while copying a new DVD back to the server. Right now, setting the player priority to high has no effect on the network transfer and I can't get a watchable film. Network I/O is always going to have uncontrollable outside influences, but in this case at least, the OS could sort things out.

Also I run a photos website on a Linux box. Occasionally the Google spider would be indexing the site and cause lots of photos to be resized. This uses some CPU but the major problem is disc I/O. I'd love to be able to just set Apache processes to low I/O priority so this wouldn't affect performance for other stuff too adversely.
Re:But where is the Linux IO Scheduler? by Anonymous Coward · 2007-09-01 12:12 · Score: 1, Informative

There are also Fedora bug reports with similar problems. I have two machines that were affected with this with some of the later kernels in Fedora 5. And now with Fedora 7 I am seeing something similar, but not quite the same. I have another machine that I haven't seen the problem on.
Re:But where is the Linux IO Scheduler? by Alsee · 2007-09-01 18:05 · Score: 1

performance during file copies across the network is horrible.

Maybe you missed the recent Slashdot story, but Vista networking is seriously hosed (down to like 5% or 10% of capacity) when there's any audio active. It was a semi-deliberate design choice by Microsoft. People are having a shitfit over it, and Microsoft says they are working on a patch for the issue.

-

--
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
Re:But where is the Linux IO Scheduler? by Anonymous Coward · 2007-09-01 22:57 · Score: 0

"I'd love to be able to just set Apache processes to low I/O priority so this wouldn't affect performance for other stuff too adversely."

You can. Just use the "ionice" command, it's in every decent distribution.
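As a concrete version of that suggestion, here is a small Python wrapper around `ionice`, assuming the util-linux `ionice` binary is on the PATH and using its `-c` (class) and `-p` (pid) options. The helper names are made up, and how much effect the idle class has depends on the I/O scheduler in use.

```python
import subprocess

IDLE_CLASS = 3  # ionice scheduling classes: 1 realtime, 2 best-effort, 3 idle

def ionice_command(pid, io_class=IDLE_CLASS):
    """Build the argv that asks ionice to change the I/O class of `pid`."""
    return ["ionice", "-c", str(io_class), "-p", str(pid)]

def set_idle_io(pid):
    """Put an already-running process into the idle I/O class.

    Raises CalledProcessError if ionice refuses, and FileNotFoundError
    if the binary is not installed.
    """
    subprocess.run(ionice_command(pid), check=True)
```

For the Apache case you would call `set_idle_io` once per worker pid, or start the server under `ionice -c 3` in the first place so the children inherit the class.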
Re:But where is the Linux IO Scheduler? by ModMeFlamebait · 2007-09-01 23:46 · Score: 1

Also I run a photos website on a Linux box. Occasionally the Google spider would be indexing the site and cause lots of photos to be resized. This uses some CPU but the major problem is disc I/O.
If you're generating thumbnails on the fly without caching, you get what you ask for.

--
Pavlov. Does this name ring a bell?
Re:But where is the Linux IO Scheduler? by Molf · 2007-09-02 05:07 · Score: 1

Unfortunately, ionice prioritises only I/O writes, not reads, which are the cause of the problem most of the time. In my experience, using ionice actually makes *no difference at all*, which really sucks.

On a related topic, I was recently amazed to discover that, despite the Linux kernel's shockingly awful I/O scheduling performance (face it folks, it really sucks hard), Windows XP is considerably worse. Want to move a large file from one disk to another? Might as well go and grab a coffee; your machine will be a doorstop while it's transferring. (I still think Windows is better at CPU scheduling though (my preference is to set it for a server load even on the desktop).)
Re:But where is the Linux IO Scheduler? by asc99c · 2007-09-02 06:46 · Score: 1

Only the first access causes the resize - after that, the resized image is cached. It's just usually the Google spider gets there before anything else, normally within a couple of hours of new photos being uploaded.
Re:But where is the Linux IO Scheduler? by Chris+Snook · 2007-09-02 14:44 · Score: 1

a) NCQ really is broken on a whole bunch of drives.

b) You have 4 GB RAM, 12 GB swap, 10 GB swap free, and 1 GB swap cached, with less than 250 MB buffers + cached, and you're setting vm.swappiness=20. You might mitigate the problem by raising vm.swappiness to a higher value, but you really need to close some firefox tabs or buy more RAM. No amount of I/O scheduling work will fix your performance.

--
There's no failure quite as dissatisfying as a complete and total solution to the wrong problem.
Re:But where is the Linux IO Scheduler? by c0d3h4x0r · 2007-09-03 00:34 · Score: 0, Flamebait

Welcome to Linux, where things are implemented not according to the greatest user need, but according to which things are of interest to geeks with their heads up their collective asses.

--
Moderator hint: a comment is neither "Flamebait" nor "Troll" if it is true.
Re:But where is the Linux IO Scheduler? by c0d3h4x0r · 2007-09-03 08:28 · Score: 1

It can't be flamebait if it's true (which it is).

--
Moderator hint: a comment is neither "Flamebait" nor "Troll" if it is true.

mirror, mirror, on the RAID by r00t · 2007-09-01 09:53 · Score: 3, Funny

Who's the fairest scheduler made?

Re:mirror, mirror, on the RAID by flyingfsck · 2007-09-01 10:32 · Score: 1

The Fairy Scheduler: Twenty Dollars, same as in town...

--
Excuse me, but please get off my Pennisetum Clandestinum, eh!
Re:mirror, mirror, on the RAID by chmod+a+x+mojo · 2007-09-01 12:37 · Score: 1

The "Fairy scheduler"....same as in town? Where are you going, San Francisco?

--
To err is human; effective mayhem requires the root password!

Mod parent up by ardor · 2007-09-01 10:18 · Score: 2, Insightful

He's right on. IO has a much bigger impact.

--
This sig does not contain any SCO code.

Re:Coming soon by Anti-Trend · 2007-09-01 10:34 · Score: 3, Informative

Hmmm, ever heard of nice?
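For the record, the same knob is reachable from code. A minimal sketch using Python's `os.nice` on a Unix-like system; the helper name is made up. Note that it can only make the calling process less favoured: going the other way (a negative increment) normally needs root.

```python
import os

def lower_own_priority(increment=5):
    """Raise this process's nice value by `increment`; return (before, after)."""
    before = os.nice(0)         # an increment of 0 just reports the current value
    after = os.nice(increment)  # a higher nice value means a lower CPU priority
    return before, after

if __name__ == "__main__":
    before, after = lower_own_priority()
    print(f"nice value went from {before} to {after}")
```

So "favouring my favourite processes" without root mostly means running everything else at a higher nice value.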

--
Working in a DevOps shop is like playing in a band made up entirely of keytarists.

Re:Does it, buy a lot of RAM by ls671 · 2007-09-01 10:45 · Score: 1

http://slashdot.org/comments.pl?sid=252305&cid=19910521

--
Everything I write is lies, read between the lines.

Flamebait or necessary oversight? by amightywind · 2007-09-01 10:54 · Score: 0, Troll

Unfortunately, Ingo Molnar's opportunistic and unethical behavior in stonewalling the valuable contributions of Con Kolivas invites increased outside scrutiny of his handling of scheduler contributions. In my opinion he should be replaced.

--
an ill wind that blows no good

Come on by Anonymous Coward · 2007-09-01 10:59 · Score: 1, Interesting

People, don't you understand? Ingo Molnar is the favourite puppy of Linus, so completely forget anybody else getting a chance to make a better scheduler for the kernel...
It's the usual politics and corruption...

Math is only reliable up to a point by Goonie · 2007-09-01 11:11 · Score: 3, Insightful

A fair proportion of the time, the mathematics applied in computer science (and, probably, most other disciplines) starts with simplifying and often unrealistic assumptions.

Not that maths isn't useful, but much of the time it can't give you definitive answers for the questions you really want answers to, only somewhat related, simpler ones.

--

Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)

Re:Math is only reliable up to a point by try_anything · 2007-09-01 12:27 · Score: 1

The practice of making simplifying assumptions is mainly a problem when modeling performance. In this case, the guy was just describing the calculations that his code made. You don't have to model every aspect of behavior to model some aspects rigorously. Math bugs in microprocessors aside, I can't think of any reason why he would have needed to compromise rigor. (A typical mistake here would have been to ignore some limitations of computer arithmetic, but the limitations of computer arithmetic were central to the problem he was trying to solve, so I bet he took them into account.)

Undefined language semantics and other kinds of unknowables limit what you can model mathematically, but your code will only work reliably if you can avoid dependence on such things. The only good reasons to avoid a mathematical approach are sheer intractability and insignificance of the product, and a process scheduler is 1) not that complex, and 2) definitely important enough to merit a mathematical approach.

Re:Coming soon by daeg · 2007-09-01 11:32 · Score: 1

Why would we want a Windows kernel in Linux...?

Re:Coming soon by ls671 · 2007-09-01 11:55 · Score: 1

Read the other posts about I/O. nice on Linux only works on CPU cycles and is close to useless because I/O speeds haven't risen at the same rate as CPU speeds have. I mean, I have quad-core CPUs that have a lot of spare cycles. Still, even if I renice some process to 19, it can choke the machine because it is an I/O-intensive process that takes 0.01% of the CPU. So even at nice 19, one process can bring your machine down. We need a better I/O scheduler. Something similar to software RAID (the mdX_raidX processes), which only takes a small portion of the available I/O bandwidth when you add a new partition to an array.

--
Everything I write is lies, read between the lines.

Not quite accurate by LinuxGeek · 2007-09-01 12:00 · Score: 2, Interesting

Linus chose the scheduler written by the person that best interacted within the existing developer structure and responded to problem reports. The rejected scheduler may have been slightly better, but the developer was much less cooperative and responsive to bug reports. He killed his own project because of attitude.

--

Kindness is the language which the deaf can hear and the blind can see. - Mark Twain

Re:Not quite accurate by Antique+Geekmeister · 2007-09-01 12:10 · Score: 3, Funny

I guess he should pull a Theo de Raadt, and release an OpenLinux kernel now?
Re:Not quite accurate by Anonymous Coward · 2007-09-01 14:03 · Score: 2, Informative

Actually, he admitted that he didn't pay very much attention and may have taken one incident as the norm. That single incident was in response to a troll who submitted faulty bug reports and ignored the reasons why they were rejected. Linus stated he didn't care that he may be wrong, since in the end he got a better scheduler from a developer he knows.

As I said, this is more about management and politics than a choice based on technical details. Personally I don't care which scheduler won, but it wasn't on their merits.

Re:Coming soon by Anonymous Coward · 2007-09-01 12:04 · Score: 0

Ever heard of ionice?

Actually, by kwabbles · 2007-09-01 12:05 · Score: 1

You can recompile your kernel with a different scheduler if you wish.

--
Just disrupt the deflector shield with a tachyon burst.

Sausages by chiok · 2007-09-01 12:10 · Score: 5, Funny

"To retain respect for sausages and Linux schedulers, one must not watch them in the making."
-- Otto von Bismarck (paraphrased)

Re:Coming soon by ls671 · 2007-09-01 12:19 · Score: 2, Interesting

Ever heard of ionice?

I experimented with it, but not in depth. As far as I remember, ionice didn't help a lot compared to real mainframe I/O scheduler. I have always felt that Linux was weak on I/O scheduling and other posts tend to confirm what I suspect.

Now, if you tell me that I can do real I/O scheduling with ionice and that you have managed to accomplish that. I might give it a second try, more in depth this time.

Also, please specify kernel tweaking parameters to cause ionice to act as a real I/O scheduler.

Again, I might not have experimented with ionice enough to possess an accurate picture but other posts on this thread seem to lead to what I assumed so far.

--
Everything I write is lies, read between the lines.

run-time swap hard, boot-time not so hard by davidwr · 2007-09-01 12:26 · Score: 1

Changing scheduling after boot is not easy.

However, it should be a boot-time option. The compile scripts should let you add in as many schedulers as you like, and select the default scheduler.

--
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.

User Driven Scheduler by elmartinos · 2007-09-01 12:40 · Score: 3, Funny

Writing a fair scheduler is difficult. Why not let the user decide? I propose a popup message for each context switch: "Hello, it seems the CPU is doing a context switch. Which application do you want to allow to run this time?".

--
Open Source Alternatives

arguing as anonymouse by Anonymous Coward · 2007-09-01 13:13 · Score: 0

I think the two people in question are the primary "anonymous" users posting.

-anonymous

Re:Coming soon by Tribbin · 2007-09-01 13:27 · Score: 1

I've not heard of that. Where do I get it? I've been looking for such a solution for a very long time.

It's not in the debian repositories so I get the feeling there is something wrong with it. (?)

--
If you mod this up, your slashdot background will turn into a beautiful sunset!

What about the TSNF Scheduler? by kybred · 2007-09-01 13:51 · Score: 1

For the text-ers out there?

TSNF.

On my old work machine by el_munkie · 2007-09-01 13:54 · Score: 1

It came up all the time. This was a G5 with 4 GB of RAM. It usually only made an appearance when I tried to get to a downed server through the Finder. The other apps were usable, but Finder was out for about five minutes as it figured out what the problem was. This could also happen through a program's file menu dialogs, so if I was trying to open a file in Photoshop and misclicked on a toasted server in the sidebar, Photoshop froze.

Re:Coming soon by Anonymous Coward · 2007-09-01 14:28 · Score: 0

ionice is a part of Debian's util-linux package in sid. ionice is not in lenny's util-linux version. It appears it is new to util-linux.

mirror mirror by Anonymous Coward · 2007-09-01 15:26 · Score: 0

This should have been posted only on slashdot's mirror. Only it could tell us who's the fairest of them all

Now for the important question by DeVilla · 2007-09-01 15:35 · Score: 2, Insightful

Does Linus like him? More than Ingo?

Personally by Azari · 2007-09-01 15:44 · Score: 1

...I'm going to hold out for the Renaissance Faire Scheduler, so that I can finally get some use out of the Elizabethan hardware I've been hanging on to for so long.

Re:Personally by Harold+Halloway · 2007-09-01 22:37 · Score: 1

This will probably appear in my upcoming distribution Ye Olde Linuxe, complete with half-timbered desktop and a Plague Pit-shaped Recycling Bin.

Re:Coming soon by Gothmolly · 2007-09-01 16:32 · Score: 1

Something wrong with Debian, you mean?

--
I want to delete my account but Slashdot doesn't allow it.

Re:Coming soon by Anonymous Coward · 2007-09-01 17:11 · Score: 1, Informative

The meaning of "fair" in this case is that it allocates CPU time equally to programs running at the same priority. Mostly this is managed by allocating certain "timeslices". The reason you want the fairest scheduler you can get is so that your priorities are properly respected and processes at the same priority aren't discriminated against in terms of CPU time, something the old scheduler failed at.
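
To make that notion of "fair" concrete, here is a minimal sketch of a weighted-fair scheduler. It is a toy model, not the CFS or RFS code: the function name `simulate_fair`, the fixed slice length, and the rule "always run the task with the least weighted runtime so far" are assumptions chosen for illustration.

```python
import heapq

def simulate_fair(weights, slice_ns, steps):
    """Toy weighted-fair scheduler (illustrative, not kernel code).

    weights maps a task name to its weight; a higher weight means a
    larger CPU share. On every step the task that has accumulated the
    least weighted runtime gets the next slice.
    """
    # Heap entries are (weighted_runtime, name); ties go to the lower name.
    heap = [(0.0, name) for name in sorted(weights)]
    heapq.heapify(heap)
    cpu_time = {name: 0 for name in weights}
    for _ in range(steps):
        weighted_runtime, name = heapq.heappop(heap)
        cpu_time[name] += slice_ns
        # A heavier task is charged less per slice, so it is picked more often.
        heapq.heappush(heap, (weighted_runtime + slice_ns / weights[name], name))
    return cpu_time

equal = simulate_fair({"a": 1, "b": 1, "c": 1}, 1_000_000, 300)
skewed = simulate_fair({"hi": 2, "lo": 1}, 1_000_000, 300)
```

With equal weights every task ends up with the same total CPU time; with a 2:1 weight ratio the heavier task receives twice the time of the lighter one.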

It's time for a paradigm shift by timrichardson · 2007-09-01 17:38 · Score: 1

Instead of all the communist central planning nonsense trying to come up with ever cleverer politburo schemes, we should have a Market-Based Scheduler: CPU resources should be auctioned every 100ms. Let the market decide.

Re:It's time for a paradigm shift by TerranFury · 2007-09-02 06:12 · Score: 1

Best. Post. Ever. I wish I had mod points.

(Viva Ayn Rand! Drive your Lexus around in a grove of Olive trees! Flat taxes! Home schooling! Power to the individual!^H^H^H^H^H^H^H^H^H^H^Hauthoritarian corporations!)

YET: Distributed control is a big deal now. Lots of people are doing research into -- well, yes -- using "market" systems to get groups of, say, mobile robots to do something without having a single controller. Markets are really just negative feedback systems -- which can sometimes converge to what you want, sometimes to local minima, or sometimes never at all, instead going unstable.

So, I wonder: Is there distributed control work to be done in, say, task scheduling in compute clusters? Can some sort of market approach allow individual CPUs to reach optimal or near-optimal schedules with minimal inter-CPU communication?

This might actually not be a horrible idea. And even if it is -- hey, you can probably publish a paper on it, right?
Re:It's time for a paradigm shift by scumdamn · 2007-09-03 03:24 · Score: 1

Yeah, let the invisible hand of the kernel decide! Atlas Multitasked.

Smarter write throttling is the answer by Spoke · 2007-09-01 17:57 · Score: 4, Interesting

It's fairly well known that large writes to the filesystem can cause huge read delays.

This seems to be aggravated by a number of conditions listed in the links posted by the parent post, but it's also aggravated when using ext3 and ordered data journaling as well (which is the default on most systems).

There is some work being done to reduce the huge latency in reads that can occur during heavy write loads with the "per device dirty throttling" patchset. Initial results look very promising.

LWN article: Smarter write throttling
per device dirty throttling -v8

This patch set seems to hold a lot of promise in being able to fix this problem, but I'm not sure what the latest status is or what kernel it will make it into. It could make it into 2.6.24 at the earliest.

Re:Smarter write throttling is the answer by Spoke · 2007-09-01 18:06 · Score: 2, Informative

Here's a post on how the above patchset can improve the responsiveness of the system under heavy write load:

huge improvement with per-device dirty throttling

And the thread referencing the latest version of the patch posted to lkml:

per device dirty throttling -v9

It's only a matter of time by Harold+Halloway · 2007-09-01 18:44 · Score: 1

I am looking forward to the 'Fairly Fair Scheduler.'

Next week: by bytesex · 2007-09-01 19:23 · Score: 3, Funny

Next week: a completely new scheduler, written by Ingo, in 05:12:43.33213, called the 'Astoundingly Fair Scheduler', which doesn't look at all like this new improvement, especially - hey look ! Something shiny ! And in two weeks time, a defence written by Linus Torvalds, detailing why the AFS is so much better than the RFS, and why Ingo can be trusted so much more when it comes to maintaining stuff like that.

--
Religion is what happens when nature strikes and groupthink goes wrong.

Re:Next week: by maxume · 2007-09-01 23:58 · Score: 1

The Largely biased scheduler?

--
Nerd rage is the funniest rage.
Re:Next week: by Gen.Anti · 2007-09-02 02:46 · Score: 1

Funny joke indeed, but as a Freudist, I think guys feel an irrational urge to behave unfairly and/or preach about fairness, subconsciously induced by the word (accidentally) fair being brought to their attention with hypnotic regularity ;-D

Re:Coming soon by Anonymous Coward · 2007-09-01 19:28 · Score: 0

$ apt-cache show schedutils
Package: schedutils
Status: install ok installed
Priority: optional
Section: utils
Installed-Size: 88
Maintainer: Guus Sliepen
Architecture: i386
Version: 1.5.0-1
Depends: libc6 (>= 2.3.6-6)
Description: Linux scheduler utilities
These are the Linux scheduler utilities - schedutils for short. These programs
take advantage of the scheduler family of syscalls that Linux implements across
various kernels. These system calls implement interfaces for scheduler-related
parameters such as CPU affinity and real-time attributes. The standard UNIX
utilities do not provide support for these interfaces -- thus this package.
.
The programs that are included in this package are taskset, chrt and ionice.
Together with nice and renice (not included), they allow full control of
process scheduling parameters.

Re:Coming soon (oh yes it is in debian) by rdebath · 2007-09-01 20:21 · Score: 1

It's in the schedutils package in Debian stable. You need a 2.6 kernel.
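
For anyone who wants to try it from a script, here is a small sketch that builds an ionice command line. The helper name `ionice_command` is made up for this example; the `-c`, `-n` and `-p` options and the class numbers (1 realtime, 2 best-effort, 3 idle) are the ones documented in the ionice man page. As far as I know the priorities are only honored by the CFQ I/O scheduler, so treat the effect as dependent on your kernel configuration.

```python
IO_CLASSES = {"realtime": 1, "best-effort": 2, "idle": 3}

def ionice_command(pid, io_class="idle", level=None):
    """Build the argument list for `ionice` (hypothetical helper).

    level is the priority within the class, 0 (highest) to 7 (lowest);
    it only applies to the realtime and best-effort classes.
    """
    args = ["ionice", "-c", str(IO_CLASSES[io_class])]
    if level is not None:
        if io_class == "idle":
            raise ValueError("the idle class takes no priority level")
        if not 0 <= level <= 7:
            raise ValueError("level must be between 0 and 7")
        args += ["-n", str(level)]
    args += ["-p", str(pid)]
    return args

# Example: put an already running process (PID 1234 is a placeholder)
# into the idle class, so it only gets disk time nobody else wants.
idle_cmd = ionice_command(1234)
low_cmd = ionice_command(1234, "best-effort", 7)
```

Passing the resulting list to `subprocess.run` would apply the setting; changing another user's process or using the realtime class normally needs root.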

that would be iNtarwebs by RichiH · 2007-09-01 21:09 · Score: 1

that would be iNtarwebs

fair, unfair, deal with it by KZigurs · 2007-09-01 21:52 · Score: 1

Bunch of school kids. Life is unfair, deal with it.

Suggestions for next iterations: Ass of a scheduler, bastard scheduler, unfair bully scheduler, depressed goth scheduler... (I will leave the exercise of figuring out the allocation semantics to reader)

Re: cooperative multitasking by jonadab · 2007-09-01 23:18 · Score: 1

You might be interested in the FreeDOS project. HTH.HAND.

--
Cut that out, or I will ship you to Norilsk in a box.

Review feedback by Ingo+Molnar · 2007-09-02 02:29 · Score: 5, Informative

Oh my gosh, the Linux scheduler is on Slashdot. Again! :-)

Frankly, this amount of interest in the Linux scheduler is certainly flattering to all of us Linux scheduler hackers, but there are certainly more important areas that need improvement: 3D support, the MM / IO schedulers, stability, compatibility, etc. (There's also the FreeBSD scheduler that went through a total rewrite recently - and it got not a single Slashdot article that i remember.)

But i digress. A couple of quick high-level points (most of the details can be found in the discussions on lkml):

I find the RFS submission interesting and useful, and i have asked the author to split the patch up a bit better, to separate the core idea from optimizations and unrelated changes - to ease review and merging of the changes, and to make the changes bisectable during QA after they have been applied to the mainstream kernel. (That is how patches are typically submitted to the Linux-kernel mailing list - it's a basic requirement before anything can be merged. CFS for example was applied to the 2.6.23 development tree in form of a series of 50 (!) separate patches. (And the scheduler works at every patching/bisection point.))

I also pointed him to the latest "bleeding edge" scheduler tree, which already implements the same non-normalized form of math and makes some of the rounding and performance arguments moot i believe. (lkml mail).

There are some issues where i disagree with Roman at the moment: even when comparing to unmodified current upstream CFS, i think Roman makes too much out of rounding behavior and i have asked him to substantiate his claims with numbers (lkml mail).

The current precision/rounding of CFS is better than one part in a million. (in fact it's currently even better than that, but i'm saying 1:1000000 here because we could in the future consciously decrease precision, if performance or simplicity arguments justify it.)
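
As a rough illustration of where rounding error of this kind comes from, the sketch below replaces a division by a task weight with a multiply-and-shift by a precomputed reciprocal and measures the relative error. It is not taken from the CFS source: the 32-bit shift, the base weight of 1024, the sample weights, and the function names are all assumptions made for the example.

```python
from fractions import Fraction

SHIFT = 32   # assumed fixed-point precision of the reciprocal
BASE = 1024  # assumed weight of a default-priority task

def scaled_exact(delta_ns, weight):
    """Exact value of delta_ns * BASE / weight as a rational number."""
    return Fraction(delta_ns * BASE, weight)

def scaled_fixed_point(delta_ns, weight):
    """Same quantity using an integer reciprocal and a right shift."""
    inv = (1 << SHIFT) // weight          # rounded-down reciprocal
    return (delta_ns * BASE * inv) >> SHIFT

def relative_error(delta_ns, weight):
    exact = scaled_exact(delta_ns, weight)
    approx = scaled_fixed_point(delta_ns, weight)
    return float(abs(exact - approx) / exact)

err_default = relative_error(4_000_000, 1024)  # power-of-two weight
err_light = relative_error(4_000_000, 15)      # weight that does not divide 2**32
```

In this toy setup a power-of-two weight gives no error at all, and a weight of 15 over a 4 ms interval stays far below one part in a million. Larger weights or a smaller shift would lose more precision, which is the kind of trade-off the paragraph above refers to.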

I can understand his desire towards creating interest in his patch, but IMO it should not be done by unfairly (pun unintended ;) trash-talking other people's code. The math code in CFS that achieves precision has gone through more than 5 complete rewrites already in the 20-plus CFS versions, and the current variant was not written by me but was largely authored by Thomas Gleixner and Peter Zijlstra.

New, better approaches are possible of course and the math is relatively easy to replace, due to the internal modularity of CFS. So we are keeping an open mind towards further improvements. (which includes the possibility of total replacements as well. Dozens of times has my own kernel code been replaced with new, better implementations in the past - and that includes large parts of the scheduler too. In fact only ~30% of current kernel/sched.c was authored by me, the rest has been written by the other 90+ scheduler contributors, according to the git-annotate output that covers the past ~2.5 years of kernel history. Beyond that numerous other people have contributed to the scheduler in the past.)

About the submitted code: it was a bit hard to review it because the new code did not contain any comments - it only included raw code - which is very uncommon for patches of such type. The email gave the theoretical background but there was little implementational detail in the patch itself connecting the theory to practice.

So to drive this issue forward i have today posted a question to Roman in form of a tiny patch that extracts only his suggested new math from his patch and applies it to CFS. If it is indeed what Roman intended then we can analyze that in isolation and in more detail. The patch is as small as it gets:

include/linux/sched.h | 1 +

Re:Review feedback by MrCopilot · 2007-09-02 05:17 · Score: 2, Informative

Nice to see you interested in our interest. I've read your lkml responses and they reinforce Linus' decision to choose you to maintain the scheduler (IMHO).
It should be pointed out to all kernel hackers: the kernel is the product, not a place for their pet project unmodified. No offense to Roman. This part of the code is a bit beyond me, but your approach to his patches seems reasonable. I hope he follows up with the patches you requested. We all want a faster "Fair" scheduler.
Like many here, I was intrigued by Con's claims of 3d improvements, and any work that moves us closer to better 3d Gaming is a good thing for Linux as a whole. It's good to see so much work in the Scheduler to squeeze every slice of time as much as possible.
Only one question, shouldn't you be working rather than reading Slashdot? I know it's the weekend, just wanted to be on the other side of that question for once.
Just in case no one has said it lately, Thank You for your efforts.

--
OSGGFG - Open Source Gamers Guide to Free Games
Re:Review feedback by scumdamn · 2007-09-03 04:40 · Score: 2, Insightful

Speaking of unfair, I think it's completely unfair for you to ruin a wonderful flame war based on supposition and misunderstanding. How dare you roll up on Slashdot busting caps with your reasoned approach, data, and project management skills? Now this topic of conversation is hosed because nobody can BS their way through why you're a bad guy who's stepping on Roman's neck because of some daddy issues or something. Sheesh, of all the gall...

Misunderstanding... by Junta · 2007-09-02 03:03 · Score: 2, Interesting

At least in linux, and I presume FreeBSD's swap strategy is similar, you miss the point. Let's look at two scenarios, one with proactive swapping, one without, and a malloc comes in that exceeds system memory.

Non-proactive case:
-kernel sees malloc, knows it lacks physical memory to accommodate, malloc is blocked while kernel does housekeeping.
-kernel picks the appropriate amount of pages to write to swap, then writes those pages to swap space, taking a while since block storage IO is excruciatingly slow.
-After the extremely long previous step, the memory is freed
-the malloc is allowed to continue, but only after the milliseconds it took to execute the drive write. Everything apart from the drive write was on the microsecond scale, so the malloc was delayed by a factor of thousands.

Proactive case:
-The system has some idle time, with nothing immediately better to do, the kernel notes free swap space and flags some appropriate memory as what would be swapped out if and when the system was in need, and copies it to disk, but it *leaves it in memory*. The kernel remembers that while these pages are indeed in memory, it can zap them and be able to restore. This is the critical point, the data in memory has not been *moved* to swap, it has been *copied* to swap.
-Program using that data randomly kicks back to life. Its needed data is on disk, but that is a moot point because it is in physical memory too, so it isn't slowed down. The kernel might take this opportunity to re-evaluate, when idle, which pages it thinks are mostly unwanted.
-Later on, a program needs to malloc and physical memory is exhausted, the kernel blocks to do housekeeping, finds pages that it knows it has copied to disc, frees them and uses them to satisfy the malloc, within microseconds.

Proactive swapping causes extra IO activity during idle time but, if implemented correctly, it does not hurt pages that were copied to swap unnecessarily, and it makes swapping on actual demand nearly trivially fast. It may be wasteful to have gobs of swap, and certainly if the swap has the sole copy of tons of data then performance is hopeless, but don't assume that seeing the swap-used count go up 'mysteriously', without significant mallocs going on, will slow down later access to the data written to swap.
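
The two scenarios above can be sketched with a toy cost model. This is only an illustration of the argument, not how the Linux or FreeBSD VM is implemented: the `Page` class, the function names, and the two cost constants are invented for the example.

```python
from dataclasses import dataclass

DISK_WRITE_US = 5000  # assumed cost of writing one page to swap
RECLAIM_US = 1        # assumed cost of dropping a page from RAM

@dataclass
class Page:
    owner: str
    on_swap: bool = False  # True if an up-to-date copy already sits on swap

def preswap(pages, count):
    """Idle-time work: copy up to `count` pages to swap but keep them in RAM."""
    copied = 0
    for page in pages:
        if copied == count:
            break
        if not page.on_swap:
            page.on_swap = True
            copied += 1
    return copied

def allocate(pages, needed):
    """Free `needed` pages from a full RAM; return the stall in microseconds."""
    # Prefer pages that already have a copy on swap: no write is needed.
    victims = sorted(pages, key=lambda p: not p.on_swap)[:needed]
    stall = 0
    for page in victims:
        stall += RECLAIM_US if page.on_swap else DISK_WRITE_US + RECLAIM_US
    victim_ids = {id(p) for p in victims}
    pages[:] = [p for p in pages if id(p) not in victim_ids]
    return stall

# Non-proactive case: every eviction has to wait for a disk write.
cold = [Page("app") for _ in range(8)]
cold_stall = allocate(cold, 4)

# Proactive case: four pages were copied during idle time beforehand.
warm = [Page("app") for _ in range(8)]
copied = preswap(warm, 4)
warm_stall = allocate(warm, 4)
```

Under these made-up costs the allocation stalls for about 20 ms without pre-copying and for a few microseconds with it, while the pre-copied pages stayed usable in RAM until the moment they were reclaimed.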

--
XML is like violence. If it doesn't solve the problem, use more.

Really Simple Really Fair Scheduler (For Real) by Anonymous Coward · 2007-09-02 03:07 · Score: 0

Check it out.

IBM's dispatcher for a change ? by i · 2007-09-02 03:45 · Score: 1

IBM has a really good scheduler (called the "dispatcher") in their mainframes (z/OS et al). Couldn't Linux try a version of that one for a change?
After all, IBM is running Linux on their Mainframes.

--
Mundus Vult Decipi

But it's so Yummy! by toddhisattva · 2007-09-02 04:32 · Score: 1

CFS has a considerable algorithmic and computational complexity

Tenderize your steak, soak it in eggmilk, while it soaks get your breading ready, put the soaked steak in the breading, then into the frying pan.

The algorithm is straightforward, if a little labor-intensive.

I like cutting the steak into smaller pieces, so as to enjoy CFS as a finger food and increase the amount of breading. So computationally, it tends to leave grease and crumbs in my keyboard.

Re:Coming soon by mr_mischief · 2007-09-02 08:45 · Score: 1

Does your system with four CPU cores have four or more fast disks? A lot of the problems I hear about with IO being slow come from people putting two, four, or eight cores into a machine, knowing that their one disk will be the bottleneck. Sure, four or eight disks will still be a bottleneck, but not as much.

How much is your four-core CPU? You can buy a nice Seagate SATA 3.0Gbps PRT 320GB disk for less than $80 on NewEgg. Ideally you could buy twelve, and set up four RAID 5 arrays. Assign one array to be /usr and swap, one to be /home, one to be /var, and one to be /tmp. Alter this scheme as fits your special needs, of course. 12 * $80 = $960. You'd probably also need an extra SATA adapter card unless you're running one hell of a server board.

If that's too steep to go along with your underutilized $500 processor, maybe just get four at $80 each and use single drives instead of arrays. That's $320 and is only a bit more than the cheapest Core 2 Quad.

Of course, you could go even cheaper and use 40GB for /usr and swap unless you're some kind of software collector. You could probably also do with that much for /var and for /tmp for that matter. So get 320 GB for /home, and 40 GB for each of the others, and save a bit more. If you want to stick with the 7200.10 series for the perpendicular recording, the 80GB drives are about $44. So 3 * 44 = 80 = 212.

$212 worth of disks to speed up your reads and writes significantly if you're using a single disk. Or even reuse the single disk for /home and make it a $132 investment.

Of course, this doesn't include shipping or power costs. But you get the idea. You can speed your machine's IO up quite a bit by having more than one disk. It's not just that you're talking about four times the transfer speed, either. You're cutting down on seeks, missed read and write opportunities, cache contention, and command queue length by going to more drives on separate interfaces. On lots of machine workloads, such as a heavily trafficked file server or mail server with local logging, you can, IME, get a machine to handle up to about four times the workload by separating the data and the logging onto separate disks.

Re:Coming soon by ls671 · 2007-09-02 09:08 · Score: 1

Actually, in my case, I found that RAM was a cheaper bang for the buck than a fast array of drives ;-)

http://slashdot.org/comments.pl?sid=252305&cid=19910521

Of course, the best solution depends on the use case, what you suggest could be needed for some applications ;-)

--
Everything I write is lies, read between the lines.

Re:Coming soon by Anonymous Coward · 2007-09-03 01:28 · Score: 0

"Liberals are evil, because they are branded that way. Facts don't matter, truth is what we say it is"

Re:Coming soon by mr_mischief · 2007-09-04 02:52 · Score: 1

That equation, of course, should say '3 * 44 + 80 = 212'. I'm pretty sure I did a preview, too, but I didn't notice that until looking back at it later.

199 comments