Linus Torvalds Says 'Buggy Crap' Made It Into Linux 4.8 (theregister.co.uk)
Two days after Linus Torvalds announced the release of Linux 4.8, he began apologizing for a bug fix gone bad. The Register reports: "I'm really sorry I applied that last series from Andrew just before doing the 4.8 release, because they cause problems, and now it is in 4.8 (and that buggy crap is marked for stable too)." The "crap" in question is an attempt to fix a bug that's been present in Linux since version 3.15. Torvalds rates the fix for that bug "clearly worse than the bug it tried to fix, since that original bug has never killed my machine!" Torvalds isn't happy with kernel contributor Andrew Morton, who he says is debugging with a known bad use of BUG_ON(). "I've ranted against people using BUG_ON() for debugging in the past. Why the f*ck does this still happen?" Torvalds writes, pointing to a 2002 post to the kernel mailing list outlining how to do BUG_ON() right. He later adds "so excuse me for being upset that people still do this shit almost 15 years later."
"I've ranted against people using BUG_ON() for debugging in the past. Why the f*ck does this still happen?"
Maybe because ranting is not an effective method of communicating?
Those kinds of people who think they have to have the latest kernel are so amusing, though. Bleeding edge often means you'll bleed.
As I'm not a developer, I had to read through some of the comments left to the original stories to figure out what the fuss was all about.
Maybe most Slashdot readers are more focused than I am on coding and already know all of this. But what I learned is that essentially, sticking this BUG_ON line someplace in the code causes Linux to do the equivalent of a Windows blue screen of death when it hits it. It's a purposeful way to cause an instant system halt because you believe the software should never reach that spot in the code, and if it does, you're worried that data corruption will result -- so better to halt things than let that happen.
It sounds like even back in 2002 though, Linus was expressing his dislike for using it and recommended a WARN_ON alternative that would just alert people to the issue but let things continue.
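For anyone else who doesn't write kernel code, here is a rough user-space sketch in C of the difference as I understand it. It is not the kernel's implementation: the real macros live in the kernel's own headers and do a lot more. Every name starting with SKETCH_ or sketch_ is made up for illustration, and abort() is only a stand-in for the kernel stopping. What it tries to show is that a WARN_ON-style check reports the problem and hands the condition back so the caller can bail out gracefully, while a BUG_ON-style check gives the caller no chance to recover.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins, NOT the kernel's actual macros. */

static int sketch_warn_count = 0;   /* how many warnings have fired so far */

/* WARN_ON-style helper: report the problem, then return the condition
 * so the caller can decide how to recover. */
static int sketch_warn_on(int cond, const char *expr, const char *file, int line)
{
    if (cond) {
        sketch_warn_count++;
        fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
    }
    return cond;
}

#define SKETCH_WARN_ON(cond) \
    sketch_warn_on(!!(cond), #cond, __FILE__, __LINE__)

/* BUG_ON-style macro: report the problem and stop immediately.
 * abort() here is an assumption standing in for the kernel halting. */
#define SKETCH_BUG_ON(cond)                                      \
    do {                                                         \
        if (cond) {                                              \
            fprintf(stderr, "BUG: %s at %s:%d\n",                \
                    #cond, __FILE__, __LINE__);                  \
            abort();                                             \
        }                                                        \
    } while (0)

/* Hypothetical caller: dropping a reference that is already at zero is
 * a bug, but a recoverable one, so warn and return an error instead of
 * taking the whole program down. */
static int sketch_put_ref(int *refcount)
{
    if (SKETCH_WARN_ON(*refcount <= 0))
        return -1;      /* recoverable: tell the caller it failed */
    (*refcount)--;
    return 0;
}
```

Calling sketch_put_ref() on a counter that is already zero prints a warning and returns -1, and the program keeps going. Swapping the check for SKETCH_BUG_ON(*refcount <= 0) would end the program right there, which is roughly the behavior being complained about when recovery was possible.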
The thing is, I'm not entirely sure Linus's anger is warranted here. It sounds like basically, he's of the philosophy that "the code must go on". In other words, it's almost always better to keep the system running, despite any bugs, than to kernel panic and stop the whole thing. Perhaps in a world of virtual machines and servers running a whole slew of different processes at the same time, there's logic to this? (E.g., if one of your boxes is needed to perform DNS, DHCP and/or other basic functions for a whole network, you'd probably rather it keep doing those things, even if a bug is hit that means a process reading/writing data to files someplace else gets a critical error that could corrupt records in a database or improperly truncate some other file it was working with.)
BUT .... this could just as easily be subjective, based on where the bug lies and what it impacts, vs. what YOU consider a mission critical use of the machine in question. If BUG_ON saves data from loss, maybe that really is better for SOME users than letting it go on generating/logging warnings that people aren't going to notice right away?
I get the idea Linus leans the direction he does on this issue mainly because he wants any kernel he approves as "stable" to have that appearance, buggy or not.
Hmm, Andrew Morton is one of the two most trusted kernel developers, the other one being Al Viro. These two get absolutely no kid gloves from Linus when they do something they really, really, *really* ought to know better than to do. OTOH, he pretty much merges anything these two send to him blindly on trust, *even when he is less than one week away from shipping it as a stable kernel*.
Does that make it a bit more clear?
And, believe me, these two get at most one or two rants like that every 15 years. They are *that* good, and every single kernel developer worth something *knows* it. They get that level of rant with names included, because it is the other side of getting that level of trust out of Linus. Linus pretty much trusts Andrew Morton to be Linus himself.
And actually, so does everyone else [in the Linux kernel *upstream* community], as far as I know. So, yeah.
He's not apologizing. Saying "I'm sorry blank blank blank" doesn't constitute an apology.
What makes you think he was trying to apologize? He's highly irritated that one of his closest generals let something bad through. He reacted in normal Linus fashion, but I don't think you could get to where Andrew Morton is without being familiar with what Linus expects and how he reacts when his expectations aren't met. You have to have really, really thick skin to work at that level.
So, yeah, it wasn't polite. But, if you've ever worked at the higher levels of a big company, "polite" isn't even vaguely a consideration. The only reason that Linus's rants make Slashdot news is because they happen in public. At any other company, a person in Andrew Morton's position is likely to be verbally berated, but it all happens behind closed doors.
If anything, in this case, people familiar with Andrew Morton probably find it slightly amusing that a Jedi Master can make mistakes too.
If I remember right, Andrew Morton has a Slashdot account, so maybe he can chime in.
If you actually read the thread, that's basically where he says it's appropriate, and only then.
The problem appears to be that people are using that feature in situations where recovery is feasible and desirable, or they're using it under the assumption that it only impacts people running special development kernels.
Log in or piss off.
I understand that this is a troll post, but it still makes me laugh. Software guys over the age of 40 are a pain in the ass to work with because they've seen so much idiocy over their careers that they can spot it immediately and point it out. If you are in your 20s and working with other people in their 20s, the echo chamber you live in is certainly re-affirming, but not particularly conducive to writing good software.
I don't think the problem is that everyone else is a moron (though there are a lot of them to be sure); it's that everyone else has a different plan, agenda, goal, technique. So one person who lives and breathes the code is annoyed at other people who may want to get home early, are new to the code, are under a lot of pressure to get it done fast, and so forth. At the workplace, should you accuse the employee of being incompetent at the task, or blame the manager for assigning a person without the necessary skills and experience to that task?