Bitten By the Red Hat Perl Bug

Posted by kdawson on Friday August 29, 2008 @03:36AM from the 100x-between-friends dept.

snydeq writes "Smart coders always optimize the slowest thing. But what if 'the slowest thing' is the code supplied by your vendor? That was exactly the situation Vipul Ved Prakash discovered when he tinkered with a company Linux box on which Perl code was running at least 100 times slower than expected. The code, he found, was running on CentOS Linux, using Perl packages built by Red Hat. So Prakash got rid of the Perl executable that came with CentOS, compiled a new one from stock, and the bug disappeared. 'What's more disturbing,' McAllister writes, 'is that this Red Hat Perl performance issue is a known bug,' first documented in 2006 on Red Hat's own Bugzilla database. Folks affected by the current bug have two options: sit tight, or compile the Perl interpreter from source — effectively waiving your support contract. If a Linux vendor can't provide comprehensive maintenance and support for the open source software projects you depend on, McAllister asks, who ever will?"

waiving your support contract? by dougmc · 2008-08-29 03:39 · Score: 5, Insightful

Installing your own perl under /usr/local, leaving the system one alone under /usr, that waives your support contract?
Seems unlikely, and if actually true, remarkably stupid.
(However, messing with the perl under /usr, that would be a mistake. It could easily break other things that depended on that specific version ...)
1. Re:waiving your support contract? by Richard_at_work · 2008-08-29 03:55 · Score: 5, Insightful
  
  No, it doesn't waive your support contract, but it does mean you will be relying on a subsystem that is not supported by the vendor - which validates the 'effectively' modifier in the original statement.
2. Re:waiving your support contract? by Dolda2000 · 2008-08-29 03:58 · Score: 4, Insightful
  
  Even if it is true, the nice thing with a free operating system is that one can at least fix the bug oneself, support contracts voided or not. Try doing the same if there's a problem with Exchange or IIS.
3. Re:waiving your support contract? by no1home · 2008-08-29 04:55 · Score: 5, Informative
  
  Very true. And this has been an ongoing issue with Linux adoption... I have a friend who runs mega-million-dollar, mission-critical systems and they've had to move off of Linux in favor of (Sun? Don't remember right now). It isn't about functionality. It isn't about open source. It's about support. Red Hat, et al. want to be enterprise systems, and claim to offer enterprise support. But they don't perform enterprise support. As indicated here, change something to fix a bug, and you don't get support for that piece anymore. More, they won't support a system that doesn't have the latest updates, which is a problem on mission-critical systems. We don't update needlessly, and we certainly don't update to 'today's' patch. We have to wait and be sure the patch is stable and provides an improvement without risking our mission.
  Until the players selling support realize all of this, Linux will be a difficult sell for such key systems (and the PHBs all think ALL their systems are mission-critical).
  Keep in mind, I say this lovingly.... I want Linux to succeed and prefer it over the popular alternative.
  
  --
  I hope this comment is well received... I could have moderated instead!
  
  Persecutors will be violated!
4. Re:waiving your support contract? by /ASCII · 2008-08-29 05:19 · Score: 5, Insightful
  
  The company I work for does support for any Linux distribution, custom compiled packages, whatever. If the customer uses non-standard packages and oddball solutions, it often takes more time to solve their problems, but since we bill by the hour, that's their problem.
  I find it hard to believe that businesses such as ours are unusual.
  
  --
  Try out fish, the friendly interactive shell.
5. Re:waiving your support contract? by KaZen · 2008-08-29 05:19 · Score: 5, Interesting
  
  This is nearly opposite my experience. I'm working at a very large Wall Street firm.
  Red Hat does *not* tell us: "Oh, I'm sorry you're not running the latest support pack, no support for you."
  We've had to run a modified GCC for a while and Red Hat, *again*, didn't say "You've changed it, so no support for you." What they *did* say was, "Can you reproduce this on *our* gcc?" Which again is better than we've gotten from some other vendors.
  We're still running AS4U4 in some places and RH has worked with us to track down bugs. Sometimes it ends with: "This was fixed in 4.5, please update." Sometimes it ends with "This is a bug, and here is the HF, please update to the released version when it becomes available."
  In fact I sometimes have a hard time getting our Admins to open tickets with the *right* vendor, because they'd rather open a ticket with RH, because it gets solved sooner. (Course that is more a dig on HP, Veritas, EMC and some other "Enterprise Software" companies.)
  Unfortunately for both us and RH, we don't like to update either, and even when RH has proven an update solves the problem, it's hard to get the Admins to actually update the boxen.
6. Re:waiving your support contract? by Grimwiz · 2008-08-29 05:44 · Score: 4, Interesting
  
  My experiences are the opposite to that listed above. RedHat have been more forgiving and sensible supporting live production systems than my experiences from SUN or Veritas. Both experiences were for mega billion-dollar companies.
  
  --
  -- Don't believe everything you read, hear or think
Good thing it was open source by stjobe · 2008-08-29 03:41 · Score: 4, Insightful

Yeah, well, good on Mr. Prakash I guess. Good thing he had the option of rebuilding from source; I can think of a few other operating systems and applications where that simply isn't an option.
So, score one for open source I guess, headline be damned.

--
"Total destruction the only solution" - Bob Marley
Don't throw out the baby with the bath water by SirGarlon · 2008-08-29 03:41 · Score: 4, Insightful

Just because Red Hat made one high-profile mistake, doesn't mean their support service is without value. Jump to conclusions much?

--
[Sir Garlon] is the marvellest knight that is now living, for he destroyeth many good knights, for he goeth invisible.
1. Re:Don't throw out the baby with the bath water by canUbeleiveIT · 2008-08-29 04:00 · Score: 5, Interesting
  
  Just because Red Hat made one high-profile mistake, doesn't mean their support service is without value.
  
  Perhaps not, but I know that they will never get another dime of my money.
  
  For years, I always purchased Red Hat even though I never had occasion to use the support that came with it. I was (and still am) bought into the FOSS concept and wanted to make it work for me and my business. But I stopped sending RH my money sometime around 8.0, when I called their support to try and get some help with a printer issue. I would have been satisfied if they had been able to get either one of my printers (HP LaserJet 1100 or LaserJet 4L) to work with RH. A surly woman with almost unrecognizable English--obviously reading off of a cue card--tried for a few minutes and then dismissed my support case with the comment that "RedHat doesn't work with all printers." When I mentioned that I had paid for RedHat just so that I could have support, she hung up on me. I called back and got another support person, an equally incompetent and rude tech.
  
  Eventually, someone at experts-exchange.com gave me the answer to my problem. Now I just download CentOS and if I need support, I pay someone on a case-by-case basis.
2. Re:Don't throw out the baby with the bath water by Just+Some+Guy · 2008-08-29 04:42 · Score: 4, Funny
  
  Eventually, someone at experts-exchange.com gave me the answer to my problem.
  You had me until then.
  
  --
  Dewey, what part of this looks like authorities should be involved?
CentOS is NOT Red Hat by Anonymous Coward · 2008-08-29 03:46 · Score: 5, Informative

If it is "supplied by CentOS" then it was compiled by "CentOS", not Red Hat. Red Hat Enterprise Linux had a hotfix for this weeks ago. So if Vipul had been using a Red Hat product, he would not have had this problem.
1. Re:CentOS is NOT Red Hat by Anonymous Coward · 2008-08-29 03:58 · Score: 5, Insightful
  
  And recompiling doesn't invalidate his support contract; as a CentOS user he doesn't have one.
  The summary is bullshit.
Re:That's what you get. by timster · 2008-08-29 03:48 · Score: 4, Insightful

Well, I'm anything but a hardcore Perl hacker -- just use it to pragmatically list some rubbish now and then -- and I've never even heard of compiling your own Perl.
In truth, it's NOT like GCJ in the least. GCJ is a relatively immature JVM built from an entirely different codebase than the Sun JVM. "Vendor" Perl and "real" Perl ought to be substantially the same thing.
Just like all the foundation-level vendor tools, I would expect Perl to be built correctly on any official distro release. I shouldn't need to build my own GCC, my own Python, my own X, or my own Perl.

--
I have seen the future, and it is inconvenient.
Article is a troll by wrook · 2008-08-29 03:50 · Score: 4, Insightful

CentOS is *not* an OS that Red Hat provides support for. So, in terms of support, you get what you pay for. The bug is fixable by recompiling Perl? Great. Submit the fix to the maintainers. End of story.
But, supposing that you *did* pay for support and you ran into this problem... It's a known bug with low priority. So get them to fix it. You're paying for support. Hold your vendor to their promises.
And if they don't fix it, find another vendor. That's the beauty of open source. If you need support and your current supplier sucks, you can find another.
But it's completely disingenuous to complain that recompiling your Perl binary will void your support contract *when you have no such contract*.
1. Re:Article is a troll by A+beautiful+mind · 2008-08-29 03:58 · Score: 5, Informative
  
  Still think it's a troll?
  
  This is what a perl core hacker has to say about the issue:
  
  It seems that there is still a problem with RedHat's packaged perl 5.8."8"**. RedHat seem to have an aggressive policy of incorporating pre-release changes in their released production code. This would not be so bad if they actually communicated back with upstream (i.e. me and the other people on the perl5-porters mailing list), or demonstrated that they had sufficient in-house knowledge that they didn't need to. But evidence suggests that neither is true, certainly for 5.8.x
  
  Let me stress that there has never been this problem in any released Perl, 5.8.7, 5.8.8, 5.10.0, and it won't be in 5.8.9 either when it comes out. The problem was caused by changes I made in the 5.8.x tree that RedHat integrated. End users reported the first bug something like 2 years ago, and RedHat closed it as "upstream patch" rather than reporting back "you know that pre-release change you made, that we integrated - well, it seems to have some problems"
  
  (...)
  
  For their versions affected, RedHat merely need to put out a patch integrating changes 31996, 32018, 32019 and 32025 which FIX IT, are documented as FIXING IT, and are from NOVEMBER 2007.
  
  --
  It takes a man to suffer ignorance and smile
  Be yourself no matter what they say
Re:That's what you get. by SatanicPuppy · 2008-08-29 04:03 · Score: 4, Insightful

Well, I am harder core than the average schmo where Perl is concerned, so for me it's a requirement...The vendor version is always inferior. Most forums will tell you the same thing.
But like I said, if you don't really need it, it's fine. I doubt the average user would ever run into this problem.

--
ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
CentOS is compiled using the same tools and source by SuperBanana · 2008-08-29 04:18 · Score: 4, Insightful

Every time Redhat releases a hotfix, CentOS grabs the source and compiles it. They use the exact same toolchain to compile the exact same source. The only difference between a redhat package and a CentOS package is that CentOS has replaced "Redhat" everywhere, because Redhat started using trademark law to keep them from doing what the GPL entitled them to do (it got so bad that at one point, Redhat was threatening CentOS over even mentioning Redhat on their website.)
Let's keep our eye on the ball, here: this is a known bug, in Redhat's bug tracker, since 2006. Fixes have been commonplace since 2007, and only just now did Redhat get around to fixing the problem. The question remains: what good is Redhat over CentOS (the only difference being logos and a support contract) if they ignore a major performance bug for two years?

--
Please help metamoderate.
Don't link to bugzilla by MSG · 2008-08-29 04:18 · Score: 4, Informative

I don't know how many projects have asked Slashdot not to link to bugzilla. It makes the system unusable for the developers trying to get work done. Here's the text currently in the bugzilla entry (edited to meet slashdot's filter requirements):

Bug 379791 - perl bless/overload performance problem
Summary: perl bless/overload performance problem
Status: VERIFIED
Product: Red Hat Enterprise Linux 5
Component: perl (Show Red Hat Enterprise Linux 5/perl bugs)
Version: 5.2
Platform: All Linux
Priority: urgent
Severity: high
Target Milestone: rc
Assigned To: Marcela Maslanova
QA Contact: desktop-bugs@redhat.com
URL:
Whiteboard: GSSApproved
Keywords: ZStream
Depends on:
Blocks:
Reported: 2007-11-13 07:14 EDT by Nigel Metheringham
Modified: 2008-08-29 10:30 EDT (History)
Fixed In Version:
Release Notes:

Description From Nigel Metheringham 2007-11-13 07:14:04 EDT
RHEL5 perl shows the same performance issues as the Fedora 7 perl did - see Bug #196836 and Bug #253728
This has been demonstrated in the recent perl update perl-5.8.8-10.el5_0.2
Same fix needs taking across to RHEL, ideally as a update release rather than waiting for next major release cycle. I do not have RHEL5.1 to test against right now, but the timing of the Fedora fixes leads me to believe these would be much too late for the 5.1 release cycle.

-- Comment #2 From Martin Kutter 2007-11-30 05:24:01 EDT --
The issue can be observed running the benchmark script from the recent SOAP::WSDL package. To do so, download SOAP-WSDL-2.00_24 (and its dependencies) from CPAN, run perl Build.PL && perl Build, cd into benchmark and run perl -I../blib/lib 01_expat.t

This is the Output from RHEL4:
perl -I../lib 01_expat.t
Name "DB::packages" used only once: possible typo at 01_expat.t line 2.
Benchmark: timing 5000 iterations of Hash (SOAP:WSDL), XML::Simple (Hash), XSD (SOAP::WSDL)...
Hash (SOAP:WSDL): 4 wallclock secs ( 3.48 usr + 0.01 sys = 3.49 CPU) @1432.66/s (n=5000)
XML::Simple (Hash): 7 wallclock secs ( 7.19 usr + 0.03 sys = 7.22 CPU) @692.52/s (n=5000)
XSD (SOAP::WSDL): 6 wallclock secs ( 6.06 usr + 0.01 sys = 6.07 CPU) @823.72/s (n=5000)

And this (with reduced n) is from RHEL5 (different machine, perl-5.8.8-10):
Benchmark: timing 500 iterations of Hash (SOAP:WSDL), XML::Simple (Hash), XSD (SOAP::WSDL)...
Hash (SOAP:WSDL): 1 wallclock secs ( 0.59 usr + 0.00 sys = 0.59 CPU) @847.46/s (n=500)
XML::Simple (Hash): 1 wallclock secs ( 1.06 usr + 0.00 sys = 1.06 CPU) @471.70/s (n=500)
XSD (SOAP::WSDL): 11 wallclock secs (11.34 usr + 0.01 sys = 11.35 CPU) @44.05/s (n=500)

Increasing the number of runs shows the O(n^2) nature of the performance problem - increasing the number of runs by a factor of 10 increases the runtime for the XSD bench by a factor of nearly 100:
Name "DB::packages" used only once: possible typo at 01_expat.t line 2.
Benchmark: timing 5000 iterations of Hash (SOAP:WSDL), XML::Simple (Hash), XSD (SOAP::WSDL)...
Hash (SOAP:WSDL): 6 wallclock secs ( 6.19 usr + 0.03 sys = 6.22 CPU) @ 803.86/s (n=5000)
XML::Simple (Hash): 11 wallclock secs (11.20 usr + 0.02 sys = 11.22 CPU) @ 445.63/s (n=5000)
XSD (SOAP::WSDL): 851 wallclock secs (847.36 usr + 2.28 sys = 849.64 CPU) @ 5.88/s (n=5000)

-- Comment #3 From RHEL Product and Program Management 2007-12-03 15:47:35 EDT --
This request was evaluated by Red Hat Product Management for inclusion, but this component is not scheduled to be updated in the cur
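
As a rough check on the O(n^2) claim in the bug report quoted above, the two RHEL5 runs (n=500 and n=5000) can be used to estimate how runtime scales with the iteration count. This is only a sketch: the CPU times are copied from the quoted benchmark output, while the helper name growth_exponent and the assumption that runtime follows a simple power law t ~ n^k are illustrative, not part of the bug report.

```python
import math

# CPU seconds copied from the RHEL5 benchmark output in the bug report:
# benchmark name -> (CPU s at n=500, CPU s at n=5000)
cpu_seconds = {
    "Hash (SOAP:WSDL)": (0.59, 6.22),
    "XML::Simple (Hash)": (1.06, 11.22),
    "XSD (SOAP::WSDL)": (11.35, 849.64),
}


def growth_exponent(t_small, t_large, n_small=500, n_large=5000):
    """Estimate k in t ~ n**k from two timings (hypothetical helper).

    Assumes a pure power law, which two data points cannot confirm.
    """
    return math.log(t_large / t_small) / math.log(n_large / n_small)


exponents = {
    name: growth_exponent(small, large)
    for name, (small, large) in cpu_seconds.items()
}

for name, k in exponents.items():
    print(f"{name}: runtime grows roughly as n^{k:.2f}")
```

With only two data points per benchmark this cannot prove quadratic behaviour, but an exponent near 1 for the two hash-based benchmarks and well above 1 for the XSD benchmark would be consistent with what the reporter describes.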
Re:That's what you get. by amorsen · 2008-08-29 04:49 · Score: 4, Interesting

The vendor version is always inferior.
The vendor version in this case has a bug fixed. The bug caused incorrect behaviour. In this case the vendor version is only inferior if you prefer fast but incorrect results. There isn't anything wrong with preferring fast incorrect results over slow correct results, but most people probably want slow and correct to be the default if given the choice.
Fast and correct always wins, and the real Perl hackers are working on that. In the meantime we take what we can get.

--
Finally! A year of moderation! Ready for 2019?
Re:That's what you get. by chromatic · 2008-08-29 05:32 · Score: 5, Informative

Fast and correct always wins, and the real Perl hackers are working on that.

No released version of Perl ever had this bug. Red Hat pulled a patch from a development version of Perl and maintained it over released versions of Perl which did not need it. That's the source of this bug. The Perl developers fixed this bug before releasing the next stable version of Perl.

--
how to invest, a novice's guide
Re:That's what you get. by jc42 · 2008-08-29 06:11 · Score: 4, Insightful

There isn't anything wrong with preferring fast incorrect results over slow correct results, but most people probably want slow and correct to be the default if given the choice.
Well, I'd be a bit careful about making such general statements. There is evidence that people aren't generally that intelligent.
I remember back in the 1970s, when I was at a large university that shall remain unnamed, and a bunch of CS people did a detailed study of the Fortran that accounted for fully half the runs on the campus's central mainframe (which shall also remain unnamed). They found that fully half the runs produced at least some incorrect output due to undetected integer overflows. The hardware gave interrupts for floating-point overflows, but for integers, it just set a flag bit, and you needed to test that flag to catch overflows. The compiler had an option to generate such tests, but it was off by default. The vendor said they did this because they had found that most customers preferred faster code.
The local gang didn't believe this, so they did a bit of a survey. They asked lots of users of the Fortran code whether they would prefer their programs to catch all arithmetic errors if this meant that the code ran slower, or if they would prefer faster code that sometimes didn't catch errors. Roughly 90% of the people they asked this said that they'd want the faster code. Later on, I ran across references to similar tests at other schools, with similar results.
Personally, I was shocked by this. This mainframe was used to do the computing for most of the scientific work on campus, and scientific computing was almost entirely done in Fortran. So half their data runs had undetected incorrect output. They now knew this, and they still preferred the faster speed to correct output.
Somehow, I suspect that this situation hasn't changed. I've dug into various programming languages since then, to learn how they handle this and other potential sources of erroneous results. Most current languages still ignore things like overflow flags by default. Some have no way of enabling the tests of such flags.
Yes, I know lots of ways of explicitly testing for such errors myself. I've done it a lot, because I know I can't rely on others to enable the builtin tests (when they exist) when they recompile the code. But when looking at other people's code, I almost never see anything that will detect overflows. When you're N levels deep in function calls, you usually have no way of verifying the possible range of the current function's args, so there's no way of proving that an overflow can't happen.
Sometimes I'm amazed that our systems run as well as they do, given this sort of nonchalant attitude towards known sources of hardware errors. And I do a lot of paranoid, defensive programming, even though I know that my employers probably don't want it because it slows down the software.
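
To make the trade-off concrete, here is a minimal sketch of the two behaviours described above: an add that silently wraps, and an add that tests for overflow and reports it. Python integers never overflow, so the 32-bit signed word is simulated; the word size and the function names wrapping_add and checked_add are assumptions for illustration, not details of the mainframe or compiler in the story.

```python
# Assumed 32-bit two's-complement word; the real machine's word size is not stated.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrapping_add(a, b):
    """Add like hardware that only sets a flag nobody checks: wrap silently."""
    return (a + b - INT32_MIN) % 2**32 + INT32_MIN


def checked_add(a, b):
    """Add with an explicit range test: slower, but never silently wrong."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} + {b} does not fit in 32 bits")
    return result


print(wrapping_add(INT32_MAX, 1))   # wraps around to the most negative value
try:
    checked_add(INT32_MAX, 1)
except OverflowError as err:
    print("caught:", err)
```

The checked version pays for one extra comparison per operation, which is the cost the surveyed users declined to pay.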

--
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
Re:That's what you get. by Karellen · 2008-08-29 08:57 · Score: 4, Insightful

Reminds me of something I heard a wise hacker say once, when someone tried to convince him that their new version of some code was better than his, because it ran in 10% of the time his did but produced (slightly) wrong results in a few cases...
"If it doesn't have to produce correct results, I can make my version use no memory and run in zero time."

--
Why doesn't the gene pool have a life guard?