Hope For Fixing Longstanding Linux I/O Wait Bug
DaGoodBoy writes "There has been a long-standing performance bug in Linux since 2.6.18 that has been responsible for lagging interactivity and poor system performance across all architectures. It has been notoriously difficult to qualify and isolate, but in the last few days someone has finally gotten a repeatable test case! Turns out the problem may not even be disk related, since the test case triggers the bug just by transferring data between two processes or threads. The test results are very revealing. The developer ran regression tests all the way back to version 2.6.15, and they show that this bug has more than doubled the time needed to run the test in 2.6.28. Many, many people working on improving the desktop performance of Linux will be very happy to see this bug die. I know that I, personally, will find a way to send the guy who found this test case his beverage of choice in thanks. Please spread the word and bring some attention to this issue so we can get it fixed!"
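For anyone wondering what "transferring data between two processes" looks like as a timing test, here is a minimal sketch. It is not the test case attached to the bug report; the function name `time_pipe_transfer`, the pipe-based transport, and the byte counts are all assumptions made for illustration. It forks a writer, reads everything back in the parent, and reports how long the transfer took, which is the kind of number one could compare across kernel versions.

```python
import os
import time


def time_pipe_transfer(total_bytes=16 * 1024 * 1024, chunk_size=64 * 1024):
    """Send total_bytes from a child process to the parent over a pipe.

    Hypothetical helper, not the bugzilla test case.
    Returns (bytes_received, elapsed_seconds).
    """
    read_fd, write_fd = os.pipe()
    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        # Child: the writer. Close the unused read end, then push the data.
        os.close(read_fd)
        chunk = b"x" * chunk_size
        remaining = total_bytes
        try:
            while remaining > 0:
                # os.write may write fewer bytes than requested, so count
                # only what was actually written.
                written = os.write(write_fd, chunk[:min(chunk_size, remaining)])
                remaining -= written
        finally:
            os.close(write_fd)
            os._exit(0)  # leave without running the parent's cleanup code
    # Parent: the reader. Close the unused write end so EOF is seen.
    os.close(write_fd)
    received = 0
    while True:
        data = os.read(read_fd, chunk_size)
        if not data:  # empty read means the writer closed its end
            break
        received += len(data)
    os.close(read_fd)
    os.waitpid(pid, 0)
    elapsed = time.perf_counter() - start
    return received, elapsed


if __name__ == "__main__":
    received, elapsed = time_pipe_transfer()
    print(f"moved {received} bytes in {elapsed:.3f} s")
```

Running the same script on two kernels and comparing the elapsed time would give a rough idea of whether a regression of this kind is present, though a single run is noisy and a real comparison would want many repetitions.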
Dang! I was going for First Post, but my machine was stuck in some weird I/O wait state.
When our name is on the back of your car, we're behind you all the way!
That's because you're not transferring data between yourself and another thread.
bugzilla.kernel.org?
The real "Libtards" are the Libertarians!
Anyone else notice the article 404ing from the front page? I'd say /. needs to fix some bugs/user errors rather than speak about a Linux IO latency most users don't even notice. Just an observation, and if you can read this, they either fixed it or you doctored up a query string like I did :D.
I'm not sure if this is related, but has anyone else noticed KTorrent can really bog your system down without showing any excessive resource usage in KSysGuard? For all I know, it may be passing information between one thread and another, and it's disk I/O intensive.
Been waiting all of 2 years and change for your precious bug fix, 'ave you? You almost had my eyes tearing up there I tell ya: 25 Year Old BSD Bug.
I'm not sure about anybody else here, but I was surprised to see that they mentioned that this will benefit 'Desktop' users.
I think that when it comes to the performance spectrum, Servers would be where this fix is the most needed. Admittedly if you are running a solid server, you should know to use older gen hardware and software that has been proven to be stable. However, some of this 'shiny new' tech coming out is appealing.
How about the Seagate 1500GB drive hang error? To my understanding Windows has been fixed, but the problem still persists in Linux. Could this potentially make a difference? I've been looking to build myself a nice NAS and those 1500GB drives are _cheap_. I can pick one up for about $160. I remember not too long ago that could only get me 80GB.
Heh, well, I just hit the little link and then hit the link at the top to go back to the main topic... then sent an e-mail to /.
I'm sure kernel.org appreciates these links. Now instead of fixing the bug they're putting out fires in the data center...great job slashdot.
If this gets resolved, is there any chance the fix could get ported to Windows? I just had my Dad's XP laptop completely freeze after I plugged in a bog-basic USB thumbdrive. The desktop sprang to life only after I unplugged it. I wish some of the AC Windows fanboys who were hassling me here last week were around to see it. "Ready for the desktop" my ass.
OS not fast enough? Just upgrade your hardware components, preferably to a new, top-of-the-line system.
Oh wait... that's the Windows way of doing things.
Yeah, exactly, that's why volunteers have been hard at work to find and fix the (published, admitted) bug. Just like Win... Oh, wait.
wow, not just badsummary, utterly worthless summary. Here's the relevant discussion from LKML. Yes, this is all of it.
Peter Zijlstra wrote:
> Andrew Morton wrote:
> In http://bugzilla.kernel.org/show_bug.cgi?id=12309 the reporters have
> identified what appears to be a sched-related performance regression.
> A fairly long-term one - post-2.6.18, perhaps.
> Testcase code has been added today. Could someone please take a look
> sometime?

There appear to be two different bug reports in there. One about iowait,
and one I'm not quite sure what it is about.
The second thing shows some numbers and a test case, but I fail to see
what the problem is with it.
This somewhat deflates the excitement evident in the OP. I mean, I know what he's talking about, these apparently random 1-2 second FREEZES while working, but if the guys on LKML aren't talking about it, it's probably not really being worked on.
It must also affect servers, because none of the links is transferring data either.
Kevin Smith on Prince
That's because you're not transferring data between yourself and another thread.
But he is transferring data between himself and another sockpuppet.
I trrrrrrrrrrrrrrranssssssssfer data betwwwwwwwwwwwwwwwwwwwwwween threads alllllll the time......
Don't blame me, I voted for Baltar.
Sure, because every Windows developer is a lazy motherfucker who doesn't like his work, plays Solitaire the whole day long, and never ever works on fixing things for the love of the art. Hard-working, enthusiastic developers are a Linuzz monopoly.
It's time to realise that Abble's products are the biggest abomination these days. Just say NO to the dumb iAbble way!!
It's because people don't want to wait for a bugfix for over 2 years. They need fast systems NOW, and when a performance bug which doesn't get fixed can be solved by buying faster hardware, that's what they do.
This is your sig. There are thousands more, but this one is yours.
I am overjoyed that my suspicions have finally been vindicated. I've been working 10+ hours a day on Linux for the last 13 years and you tend to get in tune with your environment (I can still recite my DOS bootup tune on my XT today even though I haven't worked on it for 20 years :-). Some time ago, after installing a new flavour of Linux, I immediately started complaining to fellow workers that something had gone wrong in the kernel, but it was not annoying enough to really do something about it; you start living with it. It manifests sometimes when I compile: my system simply locks up for 20-30 seconds, which is something I never experienced before. I'd say it happens once out of every 50 compiles of the same program with gcc. During such occurrences, I can't access anything on my desktop, which annoys me because I typically switch to another kterm session to prepare to run the build whilst compiling (to keep up the productivity and all that). I have also seen strange ratios of I/O to CPU wait in 'top' nowadays, but can probably ascribe that to CPUs that just became ridiculously fast and the way top calculates its scores. Nevertheless, I've mumbled over and lambasted I/O wait in Linux ever since a very specific time in the past, and even though I haven't noted the exact date, I'm sure it's related to this. Anyway, I found this intriguing enough to create a Slashdot account after years to share my joy that the bug's days are hopefully numbered now.
For what it is worth, the problem is real.
We have experienced massive negative effects with our MySQL server; downgrading to an earlier Linux kernel solves the problem. This has been very difficult to debug as we never guessed that the OS would be a factor... we figured it had to be something we were doing. Only by chance did we try another distro / kernel, only to find that everything starts working fine when you downgrade.
From your tone, I assume that you will be sending this guy something of great value?
Why do you not have the courage to say your name when you post?
You should enable DMA.
Change is certain; progress is not obligatory.
and here I was thinking that those pauses were because I had firefox open with >5 tabs for >1 hour.
Users notice this a ***lot***: https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/131094
...when you insist on doing development in the 'stable' kernel tree and expect vendors to stabilise it.
Genius!
2.6.28 introduced an option for Preemption.
(Processor type and features --> Preemption model.)
Pick voluntary kernel preemption.
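If you are not sure which preemption model your running kernel was built with, the kernel config file lists it. A minimal sketch follows, assuming your distro ships the config as plain text (commonly under /boot, but that location is an assumption and varies by distro). The function name `preemption_model` is made up for this example; the three `CONFIG_PREEMPT*` symbols are the ones behind the "Preemption model" menu choice.

```python
def preemption_model(config_text):
    """Return the menu label of the preemption model enabled in config_text.

    config_text is the contents of a kernel .config file. Returns None if
    none of the three preemption symbols is set to 'y'.
    """
    labels = {
        "CONFIG_PREEMPT_NONE": "No Forced Preemption (Server)",
        "CONFIG_PREEMPT_VOLUNTARY": "Voluntary Kernel Preemption (Desktop)",
        "CONFIG_PREEMPT": "Preemptible Kernel (Low-Latency Desktop)",
    }
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if value == "y" and key in labels:
            return labels[key]
    return None


if __name__ == "__main__":
    # Illustrative fragment, not taken from any real machine.
    sample = (
        "# CONFIG_PREEMPT_NONE is not set\n"
        "CONFIG_PREEMPT_VOLUNTARY=y\n"
        "# CONFIG_PREEMPT is not set\n"
    )
    print(preemption_model(sample))
```

To use it on a real system you would read the config file into a string first and pass that in; the sketch deliberately avoids hard-coding a path.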
but he would not be receiving 100k a year from individuals that were happy with his work.
Quite a few of the core kernel developers actually _are_ paid to work on the kernel these days.
Whether or not the end users pay for it has nothing to do with whether or not the developers get paid.
In addition, if this test case has come from the community, then it would _never have happened_ if the kernel was not an open source project.
Advanced users are users too!
oh is that behaviour due to this bug???? because that was happening on my dad's ubuntu computer
Well, that would explain a lot.
Dewey, what part of this looks like authorities should be involved?
I've been using the "Preemptible Kernel (Low-Latency Desktop)" option for years on my Gentoo systems, and haven't seen any problems with it. The little bit of overhead seems negligible compared to the voluntary option.
That's been in there for a long time, not just in 2.6.28.
http://bugzilla.kernel.org/show_bug.cgi?id=12309
davecb5620@gmail.com
I think way too many people are blaming their issues on this bug. Some of them may be valid, but others probably have something misconfigured, or maybe it only affects certain hardware. I don't experience this bug. My interactivity does not suffer when I do anything I/O intensive.
Time makes more converts than reason
Not every one, but there's definitely too many of them.
Please someone fix the damn economy for crissakes.
Ah, okay. I'll start coding that right away.
"I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)
Or those people who didn't have a sufficiently high-paying job to manage basic necessities in the pre-crash era. Or people who, despite the desire and intelligence to do excellent work, failed to find a job before the beginning of the recession.
Any simplified model of the economy is bound to fail, because the economy is not simple. Not everyone who doesn't have massive savings spent themselves retarded. Not everyone who spent themselves retarded is going to suffer now. Sadly, it doesn't work like that.
The CFQ I/O scheduler was introduced as the default in 2.6.18 and has been available since 2.6.13. I wonder if that has something to do with it. I'm going to test it out on my home machines later today and have a look-see.
Supposedly it can be disabled in favour of the AS (anticipatory) scheduler, either at runtime through /sys/block/hda/queue/scheduler or with the "elevator=as" boot option.
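Before switching anything, it helps to see which scheduler a disk is using right now. The sysfs file lists the available schedulers on one line with the active one in square brackets. Here is a minimal sketch that reads and parses it; the function names are made up for this example, and the device name `hda` is just the one from the path above, so substitute your own.

```python
def parse_scheduler_line(text):
    """Parse the contents of a queue/scheduler sysfs file.

    Returns (active, available), where active is the bracketed name
    (or None if nothing is bracketed) and available lists every name.
    """
    available = []
    active = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # strip the brackets marking the active one
            active = token
        available.append(token)
    return active, available


def read_scheduler(device="hda"):
    """Read the scheduler line for a block device (device name is an assumption)."""
    path = f"/sys/block/{device}/queue/scheduler"
    with open(path) as f:
        return parse_scheduler_line(f.read())


if __name__ == "__main__":
    # Illustrative line; the real contents depend on how the kernel was built.
    print(parse_scheduler_line("noop anticipatory deadline [cfq]\n"))
```

The sketch only reads. Changing the scheduler at runtime means writing one of the listed names to that same file as root.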
Just disrupt the deflector shield with a tachyon burst.
In all fairness, the 20+% annual inflation we're going to see starting in 2010 will bring housing values back to their peaks by 2013.
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
Yeah, but yours is just a duplex issue, been known about for years, and easily fixed.
Yes, they do. It's out there in a lot of naming permutations, with a lot of different causes - video, browser, disk, X, general high I/O, etc. I have personally run into this bastard a number of times.
Long-term US bonds are going to be unsellable internationally a year or two from now
Yeaaaaaaaahhhh...'cause you know, all the other countries in the world that sell long-term bonds have perfect economies. The interest rate paid by the U.S. Gov't might go up a little to increase demand, but people will still be buying U.S. treasuries for the foreseeable future. As in the rest of your life.
Advice: on VPS providers
Yeah, and open source software is of the highest quality, never leaking any memory or having buffer overruns.
Cut the false dichotomy crap, kids.
You got free testing. Keep adding hardware and submitting /. stories until the system remains responsive...
-B
Ash and Hickory, straight-grained and true, make excellent bludgeons, dandy for the cudgeling of vegetarians.
It's happily not an either/or situation. If you didn't buy newer hardware in the meantime, this bug being fixed will speed the kernel up on your old hardware. If you did buy new hardware, you get that extra speed plus the speed boost when the bug is fixed.
The #1 holder (China) is cutting back on US Treasuries, instead "strongly encouraging" that money be lent inside China. Nobody else can take up the slack.
Kevin Smith on Prince
Sure it can. If China were to dump all of their US Treasuries on the market tomorrow, you'd see the effective interest rate jump up a couple points and they'd be sold out to other investors, likely within the same day if not within hours. There is always a demand for high-quality bonds.
Advice: on VPS providers
You've obviously not kept up with events. For a year now, the US has been under attack over its AAA credit rating. This was BEFORE the market meltdown, etc.
From January 10th, 2008: http://www.reuters.com/article/bondsNews/idUSN1017237120080110
Since then, we've had a deficit that's ballooning, revenues dropping like a stone, and unemployment going up, up and away despite a fed rate of 0%; the sub-prime crisis is now calculated to affect at least 17% of ALL mortgages in the US.
Interest rates will have to go up a LOT to compensate for the inflationary effect of printing up all that new deficit spending. Do you really want to return to the days of 20% prime rate interest, like in April, 1981? Or 21.5% in December of 1980?
From October 14th, 1978 to May 20th, 1985, even the most credit-worthy couldn't get loans below 10%. How many people with prime mortgages can afford a 15% mortgage? How many businesses are viable if they have to pay 18% interest on their loans and bonds?
Kevin Smith on Prince
No, iluvcapra was having a stroke.
Or, alternatively, it could be that the US dollar should be massively devalued relative to its current state, and it's only the frozen credit markets and the need to build reserves that are keeping it propped up. The dollar would be the new peso by now if it weren't for that.
I'll have a bajillion dollars thanks!
With the latest wrong-headed bailouts (Merrill Lynch) diverting even more capital to propping up bad investments and bad actors, inflating its value away is inevitable.
The government should have allowed the failures, kept its powder dry, then moved in after the market correction to help pick up the pieces. It would have been cheaper and more effective.
Kevin Smith on Prince