Slashdot Mirror


Optimizing Perl

An anonymous reader writes "Perl is an incredibly flexible language, but its ease of use can lead to some sloppy and lazy programming habits. We're all guilty of them, but there are some quick steps you can take to improve the performance of your Perl applications. This article looks at the key areas of optimization, which solutions work and which don't, and how to continue to build and extend your applications with optimization and speed in mind."

68 comments

  1. Error 400 by hattmoward · · Score: 1, Funny
    Proxy Error: Unable to connect to remote host "ltsgnd001k.sby.ibm.com" or host not responding - URL "http://ltsgnd001k.sby.ibm.com/developerworks/library/l-optperl.html?ca=dgr-lnxw01OptPerl", errno: 79

    Does this mean you can't optimize Perl? =D

    Disclaimer: I use Perl almost exclusively for programming.

    1. Re:Error 400 by fatphil · · Score: 1

      "Disclaimer: I use Perl almost exclusively for programming."

      Good, it's useless for babysitting, or cooking, for example.

      FP.

      --
      Also FatPhil on SoylentNews, id 863
  2. Best PERL Optimization trick ever: by torpor · · Score: 3, Funny

    rm -rf /usr/bin/perl

    { Just Use Python! :) }

    --
    ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
    1. Re:Best PERL Optimization trick ever: by hattmoward · · Score: 1

      Why would you need -r if /usr/bin/perl is one file, maybe a symlink? -f is possible too, but unlikely.

    2. Re:Best PERL Optimization trick ever: by torpor · · Score: 1

      heh heh .. to make damned sure it's gone. for good.

      (okay the -r is superfluous. so sue.)

      --
      ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
    3. Re:Best PERL Optimization trick ever: by the+quick+brown+fox · · Score: 2, Funny

      It's a little ironic that you wrapped "Just Use Python! :)" in curly braces, isn't it?

    4. Re:Best PERL Optimization trick ever: by Anonymous Coward · · Score: 0

      that's like saying, "Just use use Cobol."

      twit.

    5. Re:Best PERL Optimization trick ever: by torpor · · Score: 1

      Irony? Like Goldy, only not as Yellow?

      --
      ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
    6. Re:Best PERL Optimization trick ever: by Anonymous Coward · · Score: 0

      that should be: use Inline::Python;
      but for speed it's better to "use Inline::C;"
      (or even ASM)

    7. Re:Best PERL Optimization trick ever: by Samhain138 · · Score: 2, Insightful

      They said optimizing.
      You are aware that Python sucks when it comes to speed?
      You should look at the python-dev mailing list about how many times people mentioned that Python is slow, especially for regular expressions.
      One time someone suggested opening a perl process from within Python.
      So think twice before suggesting slow languages.

      NOTE: I personally hate Python, so I might not be too objective, but Python being slow compared to Perl is objective!

    8. Re:Best PERL Optimization trick ever: by AuMatar · · Score: 1

      I don't get why people act like Python is a perl killer. Python is an interpreted OO language with limited string handling built into the main language and an unwieldy large standard library. Seems more like a Java competitor than perl.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    9. Re:Best PERL Optimization trick ever: by torpor · · Score: 2, Informative

      why should a language be all about string handling? that's what good libs are for.

      this article starts out with the 'some people program in Perl and use terrible habits' point. the problem is, Perl allows you into this bad habit territory, by design of the language.

      string handling is just one use for a language. python has plenty of superb string handling libs. it's also very difficult to get into the same 'bad habit' territory that you can get into with Perl..

      my original post was to make the point that if you don't want to have to 're-optimize' your badly-written Perl code, just use Python in the first place.

      --
      ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
    10. Re:Best PERL Optimization trick ever: by AuMatar · · Score: 1

      True, but then you can get the same libraries for C, C++, Java, etc. Python still doesn't seem to try to fill the same gap that perl does; it's much more of a Java competitor, without the applet nonsense.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    11. Re:Best PERL Optimization trick ever: by torpor · · Score: 1

      i've never thought of Python as a replacement for Java. I've thought of it as a replacement for Perl, many, many, many times over, and have used it in that capacity quite well, too.

      That said, I'm learning Ruby these days. Ruby continues to impress me .. I found a Ruby script to do administration on HTML files which nearly brought tears to my eyes, it was so well-written, so fun to understand, and it worked so damned well.

      --
      ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
  3. Working now! by Anonymous Coward · · Score: 0

    Got it now. Mirrored here, JIC. Please hit the link if you can, because they have a "Rate this article" thing at the bottom (I don't know if the form still works from the mirror, relative resources and such.), and we should give the author the good karma he/she deserves.

    -hattmoward

  4. Optimization Rules! by FooAtWFU · · Score: 4, Insightful
    The first rule of program optimization: Don't do it.
    The second rule of program optimization (FOR EXPERTS ONLY!): Don't do it yet.

    -- fortune

    --
    The World Wide Web is dying. Soon, we shall have only the Internet.
    1. Re:Optimization Rules! by Smallpond · · Score: 4, Insightful

      +1 insightful.

      Look at his first example, which is concatenating 1 million strings. His "bad" time is 5.2 seconds and the good time is 1.7. Who cares? Nobody uses perl to do high-performance computing. Imagine you are extracting 1M strings from a database and doing something with them. Would you care about a 3 second difference?

      It's OK to write good code, but it's better to make your code clear and not dependent on clever tricks.

    2. Re:Optimization Rules! by Anonymous Coward · · Score: 0

      It's because of assholes like you that I had to buy a new computer this year, which is an order of magnitude faster than my old computer, to continue to perform the same tasks that I used the old computer for. Thanks.

    3. Re:Optimization Rules! by Anonymous Coward · · Score: 1, Interesting

      Unfortunately, perl is now used in bioinformatics an awful lot. Genetic analyses are (rather frighteningly) regularly carried out in perl. While any computer scientist would point out that lisp or, hell, even python, would be more appropriate, yes, that 3 second difference probably does matter to some people.

    4. Re:Optimization Rules! by AuMatar · · Score: 1

      I'd care about a 3 second difference if I was sitting waiting for an answer, yes.

      Attitudes like this are a large part of why software sucks so badly. Hardware enjoys a huge speed increase every year, but my computer can't do any more because software has slowed by about that much. Newsflash- I don't care if new computers are faster, the speed of my existing hardware didn't increase.

      Performance matters. And just like security, it can't be an afterthought. You need to design for efficiency; trying to shove it in later requires more effort than it would have up front, more expense, and frequently a lot of redesign and more bugs. Spend time up front looking for ways to optimize, and your programs will be better for it.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    5. Re:Optimization Rules! by Pseudonym · · Score: 2, Insightful
      Performance matters. And just like security, it can't be an afterthought.

      Performance always matters. A program which computes the monthly payrolls is useless if it takes three months to run. However...

      1. Performance may not be an issue in your particular program. If it's good enough, it's good enough.
      2. "Performance" may not mean what you think it means. For one application, throughput (i.e. the amount of work you can do right before overload) is the most important consideration. For others, scalability, latency, memory usage or cache utilisation might be more important. Moreover, "performance" may not mean what you thought it meant when you started design.
      3. A few minutes with the back of an envelope before you start coding often beats hours spent with profilers afterwards.
      4. Time spent optimising the wrong piece of code is time wasted.
      5. If performance really, really matters, plan to write a copy to throw away.
      6. (And this is the most important of all...) The best favour you can do to your code with respect to performance is to write clean, well-compartmentalised code with good internal APIs and lots of unit and system tests. When you find a performance problem (and you will), you should be able to swap out bad code and swap in good code and (almost) nothing else will break. The corollary of this is that with performance, you need to design it in, but not necessarily code it in.
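
      Point 6 can be sketched roughly as follows. This is only an illustration under stated assumptions: join_fields_v1(), join_fields_v2(), versions_agree() and compare_speed() are hypothetical names, not anything from the article. The idea is that callers depend on one small interface, a test confirms the replacement behaves identically, and only then is speed measured with the core Benchmark module.

      ```perl
      use strict;
      use warnings;
      use Benchmark qw(cmpthese);

      # Hypothetical internal API, version 1: builds the string piece by piece.
      sub join_fields_v1 {
          my (@fields) = @_;
          my $out = '';
          foreach my $i (0 .. $#fields) {
              $out .= ',' if $i > 0;
              $out .= $fields[$i];
          }
          return $out;
      }

      # Candidate replacement with the same interface, using the built-in join().
      sub join_fields_v2 {
          my (@fields) = @_;
          return join(',', @fields);
      }

      # Tiny regression check: both versions must agree before one is swapped in.
      sub versions_agree {
          my (@fields) = @_;
          return join_fields_v1(@fields) eq join_fields_v2(@fields);
      }

      # Measure only after correctness is established; each candidate
      # runs for at least one CPU second.
      sub compare_speed {
          my @fields = (1 .. 50);
          cmpthese(-1, {
              v1 => sub { my $s = join_fields_v1(@fields) },
              v2 => sub { my $s = join_fields_v2(@fields) },
          });
          return;
      }
      ```

      Calling compare_speed() prints a comparison table; the numbers depend entirely on the machine and perl build, so none are quoted here.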
      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    6. Re:Optimization Rules! by fatphil · · Score: 1

      Brilliant post.

      I'm a speed freak, everything I do needs to be fast (I work with computational number theory), but I code almost entirely in C, and only very very rarely resort to assembly language. I often end up with faster code than those that are 100% assembly language, due to investment in your #3 and #6.

      And all of my house-keeping tasks (management of lists of candidates, or factors, or whatever) I code in Perl, because of your #1 (and #3).

      FP.

      --
      Also FatPhil on SoylentNews, id 863
  5. Here's My Style Guide by justanyone · · Score: 5, Informative
    I know, many people means many styles.
    Here's my style guide, something I developed using Perl for over 5 years now.

    Pardon the length, it's unavoidable.

    Perl Coding Conventions and Style Guide
    By Kevin J. Rice, Kevin@justanyone.com

    General conventions:

    Read the Perl style guide (http://theoryx5.uwinnipeg.ca/CPAN/perl/pod/perlstyle.html), and follow the conventions therein, especially the following:
    4-column indent
    Blank lines between chunks that do different things.
    Use mnemonic variables- the names must mean something useful. No one character names!
    Variable naming conventions:
    $ALL_CAPS_HERE constants only (beware clashes with perl vars!);
    $Some_Caps_Here package-wide global/static (also prefix 'gv_', see below);
    $no_caps_here function scope my() or local() variables;
    Be consistent.
    Be nice.

    Specific Coding Practices:

    1. Always do a 'use strict;' at the beginning of every module and script. This catches both subtle errors and bad coding practices.

    2. Programs should pass 'use warnings;' with a minimum of warnings before going into production. Note: turning off warnings in production is sometimes required for security or stability purposes. Solve root cause for all warnings if possible; don't just eliminate immediate cause.

    3. Turn on Taint checking for all cgi / web enabled scripts. Invoke with "perl -T" or "#!/usr/bin/perl -T".

    4. Use spaces for indenting, not tab characters. No file should contain any tab characters. These display differently in various terminals/editors, and mixing spaces and tabs makes code very messy. Most modern editors can be set to automatically insert spaces in place of tabs.

    5. Each subroutine should perform one distinct task. Feel free to break down lengthy (i.e., more than 1-2 screenfuls of code) subs. This means almost all subroutines should be 120 lines or less; longer ones should be justified in code review.

    6. Code blocks, when more than 1 or 2 lines, should have the block { } at the same indentation level to aid visual clarity of where that block starts/stops. Example:

    if ($condition == $value)
    {
        $another_var++;
        $two = $three + 12;
    }
    else
    {
        $four = $pi * 13;
    }

    7. Fully parenthesize stuff like "if ($a >= 5 || $b > 4)" into "if (($a >= 5) || ($b > 4))" so the user has no need to know/get wrong the order-of-operations. This includes one line conditionals like, "if (a) {}" - don't do: "if a {}".

    8. Evals: Always use evals when doing system calls. If otherwise using them, always comment/explain why. If you know something might 'die', explain it specifically, since it probably isn't obvious.

    9. Explicitly 'return' values at the end of every sub. Don't EVER use the last statement's value as a default return value; someone modifying the code later might not know you're depending on that value.

    10. All modules must explicitly end with '1;' to provide a return value for the module.

    11. Minimize the use of map() due to its confusing nature.

    12. Use parentheses around all function calls, such as sort($a, $b) instead of "sort $a $b;" to make it obvious a function call is occurring. Prefer not to use the Perl subroutine operator, as in "&subroutinename($arg1, $arg2);" just do "subroutinename($arg1, $arg2);".

    13. Don't use the 'unless' verb. Instead of, "unless($foo) {...}", code: "if (!$foo) {...}". The 'unless' verb is plenty confusing due to its uniqueness to Perl.

    14. Modules and scripts, when over 200 or so lines, should have a logMessage() subroutine that allows for various levels of logging (0=silent, 1=minimal, 4=normal, 8=verbose): logMessage(1, "message");

    15. Use a main() sub for all scripts, and include an explicit exit with an exit code appropriate to the platform you're on. Do not
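
    A minimal sketch of how rules 1, 2, 6, 9, 14 and 15 might look together in one small script. The names $LOG_LEVEL and add_numbers() are hypothetical, and the caller() guard is an added assumption (not part of the guide) so that the file can also be loaded by a test without exiting.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical threshold, using the scale from rule 14
    # (0=silent, 1=minimal, 4=normal, 8=verbose).
    my $LOG_LEVEL = 4;

    # Run main() and exit with its return code, but only when this file
    # is executed directly rather than loaded from another file.
    if (!caller())
    {
        exit(main());
    }

    # Returns the logged line, or nothing if the message was filtered out.
    sub logMessage
    {
        my ($level, $message) = @_;

        if ($level > $LOG_LEVEL)
        {
            return;
        }

        my $line = "[$level] $message\n";
        print STDERR $line;
        return $line;
    }

    sub add_numbers
    {
        my ($first, $second) = @_;
        my $sum = $first + $second;
        return $sum;    # explicit return, per rule 9
    }

    sub main
    {
        logMessage(1, "starting");
        my $total = add_numbers(2, 3);
        logMessage(8, "total=$total");    # filtered at the default level
        return 0;                         # exit code handed to exit()
    }
    ```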

    1. Re:Here's My Style Guide by hattmoward · · Score: 3, Interesting

      I usually put the CVS $Log: $ at the end of the script, after an __END__, and place a note about it at the top. You can also use the following snippet if you want to make your module's version equivalent to the CVS revision number: use vars qw($VERSION); $VERSION = ('$Revision: $ ' =~ /(\d+\.\d+)/)[0];

    2. Re:Here's My Style Guide by Anonymous Coward · · Score: 2, Interesting

      Oops! I'm so used to <code> tags at Perl Monks...

      use vars qw($VERSION);
      $VERSION = ('$Revision: $ ' =~ /(\d+\.\d+)/)[0];

    3. Re:Here's My Style Guide by Anonymous Coward · · Score: 0

      About constants: look at the "constant" pragma. You get real scalar constants (that can be optimized by the compiler) instead of fake $UNTOUCHABLE_SCALARS_THAT_SOMETIMES_GET_SCREWED_UP . Caveat: constant hashes can't be done with the "constant" pragma, you have to use Hash::Util's lock_hash(). That way you don't get auto-vivified keys in your precious, precious hash.
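
      A minimal sketch of both suggestions, assuming the core constant pragma and Hash::Util's lock_hash(); MAX_RETRIES and %TIMEOUTS are made-up names for illustration.

      ```perl
      use strict;
      use warnings;
      use Hash::Util qw(lock_hash);

      # A compile-time constant via the core "constant" pragma.
      use constant MAX_RETRIES => 3;

      # Hypothetical lookup table; lock_hash() makes its keys and values read-only.
      my %TIMEOUTS = (connect => 5, read => 30);
      lock_hash(%TIMEOUTS);

      # Reading a key that is not in a locked hash dies instead of quietly
      # returning undef, so a typo in a key name surfaces immediately.
      my $typo_survived = eval { my $t = $TIMEOUTS{conect}; 1 };

      # Assigning to an existing key of a locked hash dies as well.
      my $write_survived = eval { $TIMEOUTS{read} = 60; 1 };
      ```

      Both eval blocks are expected to fail here, leaving $typo_survived and $write_survived undefined and the hash unchanged.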

      Others:

      4. Also: always display any tabs as 8 characters, never change that value. Make sure everyone who touches your code knows it's 8 characters! 8 characters!! That way even if some jackass decides to mix tabs and spaces, you're still all on the same page. But they're still jackasses.

      7. Is "if a {}" even valid syntax? I don't think so:

      % perl -e 'if 1 {}'
      syntax error at -e line 1, near "if 1"
      Execution of -e aborted due to compilation errors.


      12. and 14. conflict with the use of underscores as name separators: don't use StudlyCaps, use_underscores_instead, for method/sub names. Be consistent with your variable and function naming!

      20. Use Params::Validate if appropriate.

      25. Document all public methods in POD!

      29. Huh? EOF is EOF. I don't need 5 wasted lines to tell me where EOF is.

      30. Document in POD! SYNOPSIS!

    4. Re:Here's My Style Guide by duggy_92127 · · Score: 1
      4. Use spaces for indenting, not tab characters. No file should contain any tab characters. These display differently in various terminals/editors, and mixing spaces and tabs makes code very messy. Most modern editors can be set to automatically insert spaces in place of tabs.

      I've never understood this one. Granted, mixing spaces and tabs makes for messy code, but there are two solutions to that: Use only spaces, or use only tabs.

      Why not tabs? They're easier to type when you're deeply nested, just hit Tab five times instead of 20 spaces. And for display, if I like an 8-char indent but you like a 3-char indent, we just set our editors differently and we're both happy. That's a feature, not a bug.

      Please, somebody, give me a truly valid reason why spaces are better than tabs, empirically. Thanks.

      Doug

    5. Re:Here's My Style Guide by Anonymous Coward · · Score: 0

      Some text editors will by default turn "\t" into " "^n (with n=3, 5 or 8 usually) for semi-good reasons. Empirically, if you're dealing with a bunch of disorganized devs, at least a few will be either incapable or unwilling to change this setting. Since editors generally don't go the other way, it is arguably easier to just use spaces to start with.

      I fucking hate spaces, but it's the lesser evil in certain situations. With emacs (which is admittedly 40% of the problem in the 1st place...) it's pretty easy to go from one to the other.

    6. Re:Here's My Style Guide by justanyone · · Score: 1

      Doug:

      As I mentioned above in a reply to that post, the varying depth of tabs can really get you in trouble.

      My editor (http://ultraedit.com/), when I hit the tab key, inserts 4 spaces. Thus the ease of tabbing over to column 20 is indeed 5 keypresses. However, if my coworker does the same thing with his tab settings at 8, he hits tab twice and then puts in 4 spaces to align it. Ug. Or, hits tab 10 times if he's using a tabsize of 2. Yuck again.

      Empirically, you want a study that says that mixed use wastes time vs. just paying attention. I think that's a good idea for a study at CMU, but I already have experienced the massive sucking sound of my time being wasted cleaning up and aligning code so it looks clean and straight (yah, being a little anal retentive, but it actually saves time in the long run).

      I believe this to be a RELIGIOUS issue and thus we'll never convert each other. I apologize if I've offended your God; I recognize he exists but chose not to worship him, I've got my own.

      -- Kevin

    7. Re:Here's My Style Guide by Haeleth · · Score: 1

      Please, somebody, give me a truly valid reason why spaces are better than tabs, empirically. Thanks.

      Every editor worth using has a function that makes it automatically insert spaces when you press Tab - but very few have functions that make them automatically insert tabs when you press Space.

      Therefore, it's easier to configure your editor to insert the right sort of spacing whichever key you press if you're using spaces rather than tabs. Therefore, using spaces means you're less likely to end up with mixed indentation.

      I don't know if you'll consider that a truly valid reason, but it's the best I can think of.

      The downside to mapping your Tab key like that is that it makes writing Makefiles a real PITA...

  6. And then stress-test with Slashdot ... by xmas2003 · · Score: 4, Interesting
    I use Perl for my halloween webcam - same code is used in the christmas webcam ... and thought I had it in pretty decent shape ... until the Slashdot thundering herd descended on it and gave it one heck of a stress test.

    For instance, flock is your friend ... and as I outline in my slashdot effect analysis you had better be prepared to handle race conditions. Ignoring the web server overload (mod_perl would have helped here), the code actually hung in there fairly well as I've learned from past "mistakes" when I've seen some pretty funky error messages crop up ... but even this time around, there were two minor corner cases I failed to account for (had never been "tickled" before) ... but those are fixed now so I'll be "more" ready if my christmas lights show up on Slashdot again ... but then again, you are never really "ready" for Slashdot! ;-)
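
    As a rough illustration of the flock point (not the webcam code itself, which isn't shown here), this sketch guards a read-modify-write on a counter file with an exclusive lock. The sub name increment_counter() and the one-number-per-file layout are assumptions made for the example.

    ```perl
    use strict;
    use warnings;
    use Fcntl qw(:DEFAULT :flock :seek);

    # Increment the number stored in $path and return the new value.
    # The exclusive lock keeps two concurrent requests from both reading
    # the same old value and losing one of the updates.
    sub increment_counter {
        my ($path) = @_;

        sysopen(my $fh, $path, O_RDWR | O_CREAT)
            or die "Cannot open $path: $!";
        flock($fh, LOCK_EX)
            or die "Cannot lock $path: $!";

        # An empty (newly created) file counts as zero.
        my $line  = <$fh>;
        my $count = defined($line) ? int($line) : 0;
        $count++;

        # Rewrite the file in place while still holding the lock.
        seek($fh, 0, SEEK_SET)  or die "Cannot rewind $path: $!";
        truncate($fh, 0)        or die "Cannot truncate $path: $!";
        print {$fh} "$count\n"  or die "Cannot write $path: $!";

        # Closing the handle flushes the write and releases the lock.
        close($fh) or die "Cannot close $path: $!";
        return $count;
    }
    ```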

    --
    Hulk SMASH Celiac Disease
  7. Universal Guilt by Anonymous Coward · · Score: 3, Funny
    Perl is an incredibly flexible language, but its ease of use can lead to some sloppy and lazy programming habits. We're all guilty of them

    Are you accusing me of writing PERL? Come over here and say that again!

  8. No offense by PrvtBurrito · · Score: 0, Flamebait

    No offense but, I think that programming in Perl is a sloppy lazy programming habit.

    --
    Laboratree - Scientific collaboration based on OpenSocial.
    1. Re:No offense by Anonymous Coward · · Score: 3, Insightful

      No offense, but if you require language constructs to keep your code clean and disciplined, you're a sloppy and lazy programmer.

    2. Re:No offense by Anonymous Coward · · Score: 1, Funny

      I never said I or any programmer writes perfect code, however I feel the blame for lazy and/or sloppy mistakes should be upon the programmer, not the tool.

      My approach to sex is the same to programming: have fun, be kinky, but don't be lazy and take proper precautions. I don't blame the girl if I'm a bad lay.

      Why the hostility about being a geek and not getting laid? You realize you're arguing with an anonymous programmer on slashdot about programming... not exactly a non-geeky thing to do.

    3. Re:No offense by ameoba · · Score: 1

      Nobody considers themselves to be the sloppy, lazy coder that needs a language to keep their code good, but they can all name somebody else who is. Using clean languages keeps the other guy in line.

      --
      my sig's at the bottom of the page.
  9. Orcish maneuvre by eyeye · · Score: 1

    ||= very handy optimization... especially in a persistent environment such as mod_perl or speedy cgi.

    --
    Bush and Blair ate my sig!
    1. Re:Orcish maneuvre by jdowland · · Score: 1

      but what does it do?

    2. Re:Orcish maneuvre by hattmoward · · Score: 1

      It's a shorthand operator. If you want to add a number to a variable, you can do this:

      $foobar += 42;
      Which is effectively this:
      $foobar = $foobar + 42;

      ||= works the same way, but with the || operator, so

      $quux ||= 'foo';
      works like
      $quux = $quux || 'foo';
      It sets $quux to 'foo' if $quux doesn't already contain something that evaluates to a "true" condition. Otherwise, it does nothing at all to $quux.
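
      A minimal sketch of the caching use usually meant by the "Orcish maneuver" ("or-cache"), assuming a hypothetical expensive sub called slow_length(); %length_cache and $compute_calls exist only for illustration. One caveat: ||= recomputes whenever the cached value is false (0 or ''), so on Perl 5.10 or later the defined-or form //= may be the better fit.

      ```perl
      use strict;
      use warnings;

      my %length_cache;          # results remembered between calls
      my $compute_calls = 0;     # counts how often the slow path runs

      # Stand-in for an expensive computation.
      sub slow_length {
          my ($word) = @_;
          $compute_calls++;
          return length($word);
      }

      # Look in the cache first; compute and store only on a miss.
      sub cached_length {
          my ($word) = @_;
          return $length_cache{$word} ||= slow_length($word);
      }
      ```

      In a persistent environment such as mod_perl, %length_cache survives between requests, which is where the saving comes from.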
  10. perl optimization vs general optimization by BinLadenMyHero · · Score: 2, Insightful

    Many optimizations listed in the article are not pertinent to Perl, but to any programming language, and as such are inappropriate to be there. Like the part about avoiding calling functions inside loops, short-circuit logic, sorting, etc..

    But there are some good tips there, too: the part about string handling, references, and the AutoLoader.

  11. some comments by BinLadenMyHero · · Score: 2

    4. I disagree. The tab is useful. The fact that they display differently in various terminals/editors is a FEATURE! But I agree that mixing spaces and tabs is a bad idea.

    6. bullshit. That's personal taste.

    13. utter bullshit. What's confusing about 'unless'? It may be unique to perl, but it's a pretty obvious english word.

    14. bah

    16. good! good that you don't prohibit gotos

    18. bullshit. The $_ variable has a nice semantic value. Of course it should be used only in small blocks.

    21. hmm.. Sometimes it's nice to declare them near place they're used.

    22. Sometimes it makes sense to declare many inter-related variables on one line. Like my ($display_width, $display_height);

    23. nice. I prefer to use just g_

    26. I like to use dashes instead #---------

    29. hun?

    1. Re:some comments by justanyone · · Score: 4, Insightful
      4. Tabs: Tab is a nasty character that is not visibly different from x number of spaces. Lots of people like tabs. That's fine. Lots of people don't. That's fine, too. But, when 2 people work on the same code, bad stuff happens. Spaces ALWAYS get mixed in. This is bad. The easiest method to elim this prob is No Tab Chars. This can get religious, but BADLY ALIGNED CODE LEADS TO CODING ERRORS! This is a frequent mistake and costs time (and therefore money and anger).

      6. The "bullshit... personal taste" aspect of brace alignment is both true and misleading. Really, it doesn't matter which way you do it, as long as you're consistent. But, with multiple people working on the same code, consistency is difficult. I've always done it with left brace on the left margin so I could easily see what lined up where. If your rule is opposite, fine, but USE ONLY ONE and code looks much nicer.

      13. UNLESS (pardon my french) = stronzino (a little piece of shit). It's in the language to assist removal of a single ! 'not'. This can really confuse people. I'm not the smartest guy, nor the dumbest, but sometimes I see it and just go, "huh?". I'm not used to it. Neither have been many other Perl coders I know when we've spoken about it.

      14. I take it by "Bah" you don't like scripts to log their actions. I've fought this recently with a 'know-it-all' type who wanted to build something fancy to do logging "when I get around to it". Yuck. Keep it simple, log what's going on so you can trace it later. Simple text files with "just did this, value=12" can help tremendously in debugging production problems. Users never know what they did; error messages never can contain enough info about what happened before.

      16. GOTOs are evil. I admit to some brainwashing by CS profs on this, but have dealt with enough spaghetti code to agree with it. Yes, there are times when it's good. But, in my last 100,000 lines of Perl, I haven't had to use it yet. So, it must not be vital. My goal is simplicity of code, not speed, since who cares about speed most of the time anyway, unless it's really bad, in which case there's probably something you're doing wrong otherwise.

      18. $_ is valuable only until you need to know what's in it. Then, you need a real variable name. You also may need that var to stick around past the next function call. I say, use 'my $request = $_; ' or something to grab $_ and make it obvious.

      21. Declaring vars near use is good ONLY in subs. If you have:
      exit(main());
      sub main
      {
          do_jack($GV_DEF_ONE);
      }
      my $GV_DEF_ONE = 12;
      sub do_jack
      {
          ....
      }
      you'll get an error during parsing (under 'use strict') because $GV_DEF_ONE has not been declared yet.

      Regardless, Global vars are hard enough to spot and should be rare, declare them all at the top of the module to make it bloody obvious you're using one.

      22. I can sometimes agree to my ($a, $b) = split(',', $inlist); but not disparate vars all crammed together on one line, it's not readable, the vars are hidden, not aligned and initialized, etc.

      29. Lines of hashes visually indicate end of file. I can always tell I have the last page of a printout when all my files end with 5 or so rows of hashes. Just convention and a good idea, not a hard-fast rule.
    2. Re:some comments by BinLadenMyHero · · Score: 0, Troll

      WTF???

      Lameness filter encountered.
      Your comment violated the "postercomment" compression filter. Try less whitespace and/or less repetition. Comment aborted.

      13. There are times the 'unless' fits right in. Like when the action will take place by default, but have some exception. Like

      print $dirname unless $dirname =~ /\A\.\.?\z/;

      16. Spaghetti code is evil, not goto. Since Perl has more flexible flow control, I don't remember ever using it there, but when coding in C, I sometimes use goto to cleanly escape a function that deals with many resources. It really can make the code much simpler and more readable by avoiding having many exit points.
      resource* get_r1()
      {
          resource *r1 = NULL;
          resource *r2 = NULL;
          resource *r3 = NULL;
          r1 = alloc_resource();
          if(r1 == NULL) {
              perror("blah");
              return NULL;
          }
          r2 = alloc_resource();
          if(r2 == NULL) {
              perror("blah");
              free_resource(r1);
              return NULL;
          }
          r3 = alloc_resource();
          if(r3 == NULL) {
              perror("blah");
              free_resource(r1);
              free_resource(r2);
              return NULL;
          }
          use_resources(r1, r2, r3);
          free_resource(r2);
          free_resource(r3);
          return r1;
      }


      I much prefer:
      resource* get_r1()
      {
          resource *r1 = NULL;
          resource *r2 = NULL;
          resource *r3 = NULL;
          r1 = alloc_resource();
          if(r1 == NULL) goto error;
          r2 = alloc_resource();
          if(r2 == NULL) goto error;
          r3 = alloc_resource();
          if(r3 == NULL) goto error;
          use_resources(r1, r2, r3);
          goto clean;
      error:
          perror("blah");
          /* free r1 and clear it so the caller gets NULL, not a dangling pointer */
          if(r1 != NULL) { free_resource(r1); r1 = NULL; }
      clean:
          if(r2 != NULL) free_resource(r2);
          if(r3 != NULL) free_resource(r3);
          return r1;
      }


      29. Oh!.. a printout. That makes sense. I just couldn't think of a reason for those lines. It's just that I never chop trees to look at code.

    3. Re:some comments by jgrahn · · Score: 1
      The tab is useful. The fact that they display differently in various terminals/editors is a FEATURE!

      Please name an editor or terminal which doesn't treat TAB in the way God intended. Unless the user selects the option "I want my text files to be incompatible with all other tools in the known world". Which, granted, a surprising number of people do ...

    4. Re:some comments by nutsy · · Score: 1

      MS-DOS/Windows edit.com doesn't! It converts tabs to spaces on load, and (maybe) converts (some) spaces to tabs on save. Of course, God probably doesn't intend for people to waste their time with edit.com in the first place.

    5. Re:some comments by Gabe+the+Programmer · · Score: 1

      Sorry, but if you think "unless" is confusing, then you're absolutely nuts.

      How is "unless" MORE confusing than "if not this is true?"

      die "Connection lost" unless $connection->ping;

      Thank God for unless.

  12. Article is seriously flawed by Anonymous Coward · · Score: 2, Interesting

    I'm still working through it, but I cannot reproduce its purported effects.

    First it has a syntactical error with the "x" operator; it puts the number on the left and the string on the right, but the actual syntax is the other way around. If the author had actually tried to run his examples, he would have noticed this.

    Then the author says that putting as much text in a single-quoted string as possible is better, and says that something like:
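
    For reference, a minimal sketch of the operand order for the repetition operator: the string (or parenthesized list) goes on the left and the count on the right. Written the other way around, as in 20 x '-', the count '-' is treated as 0 and the result is an empty string. The variable names here are illustrative only.

    ```perl
    use strict;
    use warnings;

    my $rule  = '-' x 20;    # string on the left, count on the right
    my @zeros = (0) x 5;     # a parenthesized list repeats as a list

    print "$rule\n";
    print scalar(@zeros), " zeros\n";
    ```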
    print 'aaaaaaaaaaaaaaaaaa',"\n" ;
    is better than:
    print "aaaaaaaaaaaaaaaaaa\n";

    I just tested this, and not only could I find no difference between single and double-quoted strings with the same amount of text, the suggested "improvement" with two separate strings, above, was significantly SLOWER than the second version.

    At this point I lost interest (and respect) and stopped checking. But don't take my word for it, try it yourself! Are you getting different results?

    1. Re:Article is seriously flawed by Anonymous Coward · · Score: 0
      I agree that the author was not clear about passing multiple arguments to print(), however I think he shouldn't have used constant strings as an example.



      use Benchmark;
      my ($a,$b,$c) = ("a"x100,"b"x200,"c"x300);
      open(NULL,">/dev/null" );
      timethese(1_000_000, {
      "concat" => sub { print NULL "$a$b$c\n" },
      "multi-args" => sub { print NULL $a,$b,$c,"\n" },
      });



      produces:



      Benchmark: timing 1000000 iterations of concat, multi-args...
      concat: 8 wallclock secs ( 6.95 usr + 0.24 sys = 7.19 CPU) @ 139082.06/s (n=1000000)
      multi-args: 5 wallclock secs ( 4.94 usr + 0.25 sys = 5.19 CPU) @ 192678.23/s (n=1000000)



      So not a huge performance gain, but definitely a measurable one.
    2. Re:Article is seriously flawed by Anonymous Coward · · Score: 0
      Oops, I suppose I should've explained that he shouldn't have used constant strings ("a","b","c") vs. scalar variables ($a,$b,$c) because constant strings can be optimized and don't go through the same codepath. Using Terse as the author talked about:
      % perl -MO=Terse -e 'print $a,$b,$c'
      LISTOP (0x81310b0) leave [1]
      OP (0x81f8980) enter
      COP (0x8133ef8) nextstate
      LISTOP (0x8131038) print
      OP (0x81f89a0) pushmark
      UNOP (0x8131060) null [15]
      SVOP (0x8131100) gvsv GV (0x813e8f4) *a
      UNOP (0x8130fc0) null [15]
      SVOP (0x8130fe8) gvsv GV (0x813af40) *b
      UNOP (0x8131088) null [15]
      SVOP (0x8131010) gvsv GV (0x812f944) *c



      vs.
      % perl -MO=Terse -e 'print "$a$b$c"'
      LISTOP (0x8131178) leave [1]
      OP (0x81f88e0) enter
      COP (0x8137580) nextstate
      LISTOP (0x8131128) print
      OP (0x81f8980) pushmark
      UNOP (0x81310d8) null [67]
      OP (0x819d1b8) null [3]
      BINOP (0x81310b0) concat [2]
      BINOP (0x8131038) concat [1]
      UNOP (0x8131060) null [15]
      SVOP (0x8131100) gvsv GV (0x813af34) *a
      UNOP (0x8130fc0) null [15]
      SVOP (0x8130fe8) gvsv GV (0x813af4c) *b
      UNOP (0x8131088) null [15]
      SVOP (0x8131010) gvsv GV (0x813e8c4) *c



      Those additional calls can really add up in a tight loop.

      At this point slashdot's posting code is preventing me from posting this, so I had to remove all the leading spaces in the Terse output above
  13. A better way to Sort than in the article. by cryptor3 · · Score: 4, Informative
    In the article, the author mentions that a faster way of implementing this sort:
    my @marksorted = sort {sprintf('%s%s%s',
    $marked_items->{$b}->{'upddate'},
    $marked_items->{$b}->{'updtime'},
    $marked_items->{$a}->{itemid}) <=>
    sprintf('%s%s%s',
    $marked_items->{$a}->{'upddate'},
    $marked_items->{$a}->{'updtime'},
    $marked_items->{$a}->{itemid}) } keys %{$marked_items};
    is this sort, which pre-computes a "sort" field for each record. (of course, at the expense of memory):
    map { $marked_items->{$_}->{sort} = sprintf('%s%s%s',
    $marked_items->{$_}->{'upddate'},
    $marked_items->{$_}->{'updtime'},
    $marked_items->{$_}->{itemid}) } keys %{$marked_items};
    my @marksorted = sort { $marked_items->{$b}->{sort} <=>
    $marked_items->{$a}->{sort} } keys %{$marked_items};
    I argue that this implementation is flawed, because the fields can run together, so for example, if you had the following data:

    Data object 1: upddate = 111, updtime = 1100, itemid = 200
    Data object 2: upddate = 1111, updtime = 100, itemid = 200

    So both strings would have a sort value of 1111100200, but of course, data object 1 should be sorted before data object 2. Using delimiters in the sprintf statement will ensure that different fields are marked as different, but they will interfere with the sort order.
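
    The collision is easy to reproduce. The sketch below builds the concatenated key for the two records above, and then, as one possible remedy that is not from the article, a zero-padded fixed-width key. The field widths (8, 6 and 6 digits) are an assumption chosen only for this illustration.

```perl
use strict;
use warnings;

# The two hypothetical records from the comment above.
my %obj1 = (upddate => 111,  updtime => 1100, itemid => 200);
my %obj2 = (upddate => 1111, updtime => 100,  itemid => 200);

# Plain concatenation, as in the article's precomputed sort key.
my $plain1 = sprintf('%s%s%s', @obj1{qw(upddate updtime itemid)});
my $plain2 = sprintf('%s%s%s', @obj2{qw(upddate updtime itemid)});

# Zero-padded fixed-width fields (the widths are an assumption).
my $fixed1 = sprintf('%08d%06d%06d', @obj1{qw(upddate updtime itemid)});
my $fixed2 = sprintf('%08d%06d%06d', @obj2{qw(upddate updtime itemid)});

print "plain: $plain1 vs $plain2\n";   # the two keys are identical
print "fixed: $fixed1 vs $fixed2\n";   # the two keys differ
```

    Fixed-width keys like these are 20 digits long, so they would have to be compared as strings with "cmp" rather than with "<=>".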

    Another problem is that if your sort string is too long, perl may convert it to a floating point number and thus lose the data from the later fields.
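
    A small sketch of that precision loss, assuming a typical build where large numbers fall back to double-precision floating point. The two 21-digit keys are invented for the example and differ only in the last digit.

```perl
use strict;
use warnings;

# Two hypothetical 21-digit sort keys that differ only in the last digit.
my $key_a = '200401011200000000001';
my $key_b = '200401011200000000002';

# Numeric compare: both strings are converted to floating point first,
# and at this magnitude a double cannot tell them apart.
my $numeric = $key_a <=> $key_b;

# String compare: works character by character, so the last digit counts.
my $string = $key_a cmp $key_b;

print "numeric compare: $numeric\n";
print "string compare:  $string\n";
```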

    The more correct way to do this sort is
    my @marksorted = sort {
    $marked_items->{$b}->{'upddate'} <=> $marked_items->{$a}->{'upddate'} ||
    $marked_items->{$b}->{'updtime'} <=> $marked_items->{$a}->{'updtime'} ||
    $marked_items->{$b}->{'itemid' } <=> $marked_items->{$a}->{'itemid'}
    } keys %{$marked_items};
    The added benefit of this method is that it definitely won't have overflow problems (which may be the case in the above examples, because "<=>" is the numeric compare operator). Had the author used "cmp", there would then be a quantity of numeric comparisons proportional to the length of the sort string.

    The other benefit of my sort is that it is more flexible. You can change the "<=>" operator to a "cmp" operator if one of your fields is string data.

    The sort that I propose (one I've been using) may or may not be faster than the "faster" sort proposed by the author, but then again, speed is nothing without correctness.
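
    To make the field-by-field sort above easy to try, here is a self-contained sketch with three invented records. Only the data is new; the comparison is the same chain of "<=>" tests joined by "||".

```perl
use strict;
use warnings;

# Hypothetical records keyed by item name.
my $marked_items = {
    first  => { upddate => 20040101, updtime => 1200, itemid => 7 },
    second => { upddate => 20040102, updtime =>  900, itemid => 3 },
    third  => { upddate => 20040102, updtime =>  900, itemid => 5 },
};

# Descending order; later fields are only consulted when earlier ones tie.
my @marksorted = sort {
       $marked_items->{$b}{upddate} <=> $marked_items->{$a}{upddate}
    || $marked_items->{$b}{updtime} <=> $marked_items->{$a}{updtime}
    || $marked_items->{$b}{itemid}  <=> $marked_items->{$a}{itemid}
} keys %$marked_items;

print "@marksorted\n";   # third second first
```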
  14. AKA "How to micro-optimize in perl" by fizbin · · Score: 3, Insightful

    Look, this kind of "squeeze the last bit of performance" exercise can be nice fun for assembly, or possibly C, programmers, but when have you had something that was acceptable as a perl script, but only after extensive optimization?

    Better yet, I would have liked pointers on how to test code snippets for performance (such as illustrating the use of Benchmark or Devel::SmallProf), and then possibly a few pointers like this. (and why was Memoize left out of an article like this?) This sounds like someone writing perl who'd rather be writing assembly code.

    In optimizing my (and others') perl scripts, the best tools I've found are the profiler and an understanding of what the code is supposed to do. That, and changing the nature of deployment of the program - from a cgi script to mod_perl, for example. All these little techniques are chasing after grains of sand, when there's a big rock right in front of your face.
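
    As a rough illustration of the Memoize module mentioned above, here is a minimal sketch. The fib() function is a made-up stand-in for any pure, repeatedly called routine; whether caching helps a real script is something to confirm with a profiler first.

```perl
use strict;
use warnings;
use Memoize;

# Deliberately naive recursive Fibonacci: exponential time without caching.
sub fib {
    my ($n) = @_;
    return $n if $n < 2;
    return fib($n - 1) + fib($n - 2);
}

# Replace fib() with a caching wrapper; the recursive calls hit the cache too.
memoize('fib');

print fib(30), "\n";   # 832040
```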

    1. Re:AKA "How to micro-optimize in perl" by Anonymous Coward · · Score: 0

      yes, don't dismiss anything out of hand, it's a big world and there's lots of possibilities.
      think: regex inside critical inner loop, but that's only one of many many possibilities

  15. The example is flawed; the theory is ok by cryptor3 · · Score: 1
    Take his examples with a grain of salt. A number of the examples may be flawed in some cases, mostly because the quantities of data involved are so small that performance bottlenecks are moved to other areas. This is a good lesson for performance benchmarking: Know where your bottlenecks are. The ideas in the article are mostly valid, but they just don't apply when this is not where your bottleneck is.

    Incidentally, I am getting a slightly better speed on the singlequote example (as claimed). My times are 12s vs 14s.

    The primary bottleneck here is in the IO of the print statement itself. I bet that the string interpolation is probably very fast compared to the buffering slowness induced start/stop-ness of the second print statement. Most likely you have a very fast CPU.

    Buffering makes all the difference in the world. From some of the benchmarking I've done previously, I have a hypothesis that "\n" sent to a print statement will trigger a buffer flush after it finishes sending that string off to the print statement.

    In other words, an exaggerated version of foreach(1..1000) { print "blah\n";} will be slower than foreach (1..50) { print "blah\ntimes ten" }

    I have seriously changed the execution time of one script from 980 ms to 98 ms just by fixing bad buffering from the print statement. I think that the process of splitting up the print statements probably made it wait (more often) for the I/O resources.

    In my particular case, I had a CGI script running off localhost, that was outputting about 30kb of text, looping through 80 or 90 records of data. Rather than doing the print statements directly, I buffered everything to a string, and then printed it at the end of the loop. As a tradeoff, I think I may have actually settled on flushing my string buffer at the end of each loop cycle.
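
    The buffer-to-a-string approach described above might look roughly like this. The record layout and the HTML row format are invented for the sketch; only the pattern of appending inside the loop and printing once afterwards comes from the description.

```perl
use strict;
use warnings;

# Hypothetical records standing in for the 80 or 90 rows mentioned above.
my @records = map { +{ id => $_, name => "item$_" } } 1 .. 90;

# Append each row to a scalar instead of calling print inside the loop.
my $out = '';
for my $rec (@records) {
    $out .= "<tr><td>$rec->{id}</td><td>$rec->{name}</td></tr>\n";
}

# A single print call hands the whole page to the output layer at once.
print $out;
```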

    Just for the record, I think the article is good in the sense that if you didn't realize that what you were doing might be a performance issue, it's a good wakeup call to go and benchmark your code. But don't just follow the advice on faith.

    test:
    my $trials = 5000000;

    open(FH, ">testfile.txt");
    $startt1 = time;
    for (1..$trials) {
    print FH "aaaaaaaaaaaaaaaaaa\n";
    }
    $endt1 = time - $startt1;
    close FH;
    print "Elapsed time: $endt1\n";

    open(FH, ">testfile.txt");
    $startt1 = time;
    for (1..$trials) {
    print FH 'aaaaaaaaaaaaaaaaaa', "\n";
    }
    $endt1 = time - $startt1;
    close FH;
    print "Elapsed time: $endt1\n";

    unlink 'testfile.txt';
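
    Since the built-in time only has one-second resolution, a variant of the test above using the core Time::HiRes and File::Temp modules may give finer-grained numbers. This is only a sketch: the helper name time_writes and the trial count are made up, and the closure call adds the same overhead to both cases.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);       # sub-second replacement for the built-in time
use File::Temp qw(tempfile);

# Run $writer->($fh) $trials times against a fresh temp file and
# return the elapsed wall-clock seconds, including the final flush on close.
sub time_writes {
    my ($trials, $writer) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    my $start = time;
    $writer->($fh) for 1 .. $trials;
    close $fh or die "close failed: $!";
    return time - $start;
}

my $trials = 100_000;
my $one = time_writes($trials, sub { print { $_[0] } "aaaaaaaaaaaaaaaaaa\n" });
my $two = time_writes($trials, sub { print { $_[0] } 'aaaaaaaaaaaaaaaaaa', "\n" });

printf "one string:  %.3fs\n", $one;
printf "two strings: %.3fs\n", $two;
```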
  16. Optimizing schmoptimizing by cryptor3 · · Score: 3, Interesting

    I have a friend who works at a big company that provides a lot of "utility" to its customers.

    They run perl scripts all the time to crunch text files containing lots of data coming in from remote sensors and stuff like that. He told me that the more senior guys have the philosophy: "Optimize? nah, just let it run the extra 20 minutes."

    And they're talking about scripts that get run in a cron job DAILY.

    1. Re:Optimizing schmoptimizing by Anonymous Coward · · Score: 1, Insightful

      If the resource use isn't a problem and it leads to better readability and maintainability -- then, good for them! That's the right call.

    2. Re:Optimizing schmoptimizing by cryptor3 · · Score: 1

      Certainly true. But the impression I got from my friend was that the code they wrote was a little bit redundant, and they just didn't bother to remove the redundancies. I think readability and maintainability were not really affected either way.

    3. Re:Optimizing schmoptimizing by Anonymous Coward · · Score: 0

      I have a friend who uses perl to process data that has to be sent out in real time. He has to optimize his scripts quite a bit. From what he tells me, he usually uses C for these parts, and then calls it from perl though. Anyways, there are people out there who need optimized code still. Believe it or don't.

    4. Re:Optimizing schmoptimizing by 12357bd · · Score: 1

      Then, maybe it was a 'if it's not broken...' case.

      --
      What's in a sig?
  17. Perl is the only language ... by Anonymous Coward · · Score: 2, Funny

    ... where the optimized code is easier to read.

  18. unless by hding · · Score: 3, Informative

    isn't unique to Perl. It exists, for example, in Common Lisp.

  19. parse tree by Doc+Ruby · · Score: 2, Interesting

    Perl compiles its code into an intermediary "tree" of logic nodes (Perl "opcodes"). Are there any topology strategies for optimizing that tree, in the graph itself? Any visualization tools that let Perl generate the tree, then let a programmer change the tree, then complete the compilation of the new tree to new code? Is Parrot/Perl6 making any of these strategies more feasible, or are they all going away?

    --
    make install -not war

    1. Re:parse tree by Anonymous Coward · · Score: 0

      Anything is possible in Parrot/Perl6 - just wait (x + 2) years where x is always next year.

  20. My favorite perl joke: by mshiltonj · · Score: 4, Funny

    DAY 1:
    Manager: How many lines of code did you write today?
    Developer: One.

    DAY 2:
    Manager: How many lines of code did you write today?
    Developer: One.

    DAY 3:
    Manager: How many lines of code did you write today?
    Developer: One.
    Manager: Are you telling me that in three days you've only managed to write three lines of code?
    Developer: You don't understand -- I've been working on the same line of code all three days.
    Manager: (pauses) You're writing in perl again, aren't you?

  21. you're a dick by Anonymous Coward · · Score: 0

    go back to your ivory tower, you useless person

  22. THE EDITOR IS THE PROBLEM by mgkimsal2 · · Score: 1

    My editor (http://ultraedit.com/), when I hit the tab key, inserts 4 spaces.

    *THIS HITS THE NAIL ON THE HEAD*

    You've got it configured to *insert* spaces when you hit tab. I don't recall ultraedit doing that by default (haven't used it in a few years tho). Most editors I use by default will *RENDER* a tab as X spaces, not actually *CREATE* X spaces. If it renders as X, you can easily change the rendering. But once they've become spaces, you can't go back (easily anyway).

  23. My 0.02 Euro ;) by kompiluj · · Score: 3, Interesting

    Well, for me the greatest optimization is Perl itself, which lets you quickly write potent code. It spares the programmer's time, which costs much more than machine time. And as to optimization - well I think that a good optimizing compiler should do the job - you know I couldn't recognize my inefficient C++ code after running it through the Intel C++ compiler - it has improved soooo much!

    --
    You can defy gravity... for a short time