Are There Perl Optimization Guides?
ara818 asks: "I have written a 4,000-line personal web assistant using Perl. After getting everything to work I am now working on making the code run faster. The problem I keep running into though is that there are so many ways to do the same thing in Perl that I don't know which is faster. Right now, I am working on intuition but I'd really like a site or book that could give me at least a few pointers or some guidelines. Is there any such resource available for Perl, or for that matter, other popular programming languages?"
- Use hashes instead of linear searches. Instead of iterating over @keywords to see if $_ is a keyword, build a hash from it once and test membership with a single lookup (see the %keywords loop further down).
- Consider using foreach, shift, or splice rather than subscripting.
- Use the "use integer" pragma.
- Avoid goto
- Avoid printf if print will work
- Avoid $&, $`, and $'
- Avoid using eval on a string. Eval of a string forces recompilation every time the eval is executed. In particular, use symbolic references instead of eval to construct variable names: ${$pkg . '::' . $varname} = &{ "fix_" . $varname}($pkg)
- Avoid eval inside a loop. Put the loop into eval instead, to avoid redundant recompilations of the code.
- Avoid run-time-compiled patterns, that is, /$pattern/. Use the /pattern/o (once only) pattern modifier to avoid pattern recompilations when the pattern doesn't change over the life of the process. For patterns that change occasionally, you can use the fact that a null pattern refers back to the last successfully matched pattern: run a dummy match such as "foundstring" =~ /$currentpattern/ (it must succeed), then match with // inside the loop.
- Short-circuit alternation is often faster than the corresponding regular expression. So /one-hump/ || /two/ is likely to be faster than /one-hump|two/.
- Reject common cases early with next if inside a loop. As with simple regular expressions, the optimizer likes this. You can typically discard comment lines (/^#/) and blank lines (/^$/) even before you do a split or chop.
- Avoid regular expressions with many quantifiers, or with big {m,n} numbers on parenthesized expressions.
- Maximize length of any non-optional literal strings in regular expressions. This is counterintuitive, but longer patterns often match faster than shorter patterns. That's because the optimizer looks for constant strings and hands them off to a Boyer-Moore search, which benefits from longer strings. Compile your pattern with the -Dr debugging switch to see what Perl thinks the longest literal is.
- Avoid expensive subroutine calls in tight loops.
- Avoid getc; use sysread instead (for single-character I/O only).
- Use readdir rather than <*>. To get all the non-dot files within a directory, use opendir, readdir, and closedir (see the opendir snippet further down).
- Avoid frequent substr on long strings
- Use pack and unpack instead of multiple substr invocations.
- Use substr as an lvalue rather than concatenating substrings.
- Use s/// rather than concatenating substrings.
- Use statement modifiers and the equivalent "and" and "or" operators instead of full-blown conditionals. Statement modifiers and logical operators avoid the overhead of entering and leaving a block. They can often be more readable too.
- Use $foo = $a || $b || $c instead of an if/elsif chain that tests $a, $b, and $c in turn (the chain is shown further down).
- Set default values with $pi ||= 3;
- Don't test things you know won't match. Use last or elsif to avoid falling through to the next case in your switch statement.
- Use special operators like study, logical string operations, and the pack 'u' and unpack '%' formats.
- Beware of the tail wagging the dog. Misstatements resembling (<STDIN>)[0] and 0 .. 2000000 can cause Perl much unnecessary work. In accord with UNIX philosophy, Perl gives you enough rope to hang yourself.
- Factor operations out of loops.
- Slinging strings can be faster than slinging arrays.
- my variables are normally faster than local variables.
- tr/abc//d is faster than s/[abc]//g
- Print with a comma separator may be faster than concatenating strings.
- Prefer join("", ...) to a series of concatenated strings.
- Split on a fixed string is generally faster than split on a pattern. That is, use split(/ /, ...) rather than split(/ +/, ...).
- system("mkdir ...") may be faster on multiple directories if mkdir(2) isn't available.
- Cache entries from passwd and group and so on.
- Avoid unnecessary system calls.
- Avoid unnecessary system() calls.
- Keep track of your working directory rather than calling pwd each time.
- Avoid shell metacharacters in commands -- pass lists to system and exec where appropriate.
- Set the sticky bit on the Perl interpreter on machines without demand paging: chmod +t /usr/bin/perl
- Using defaults doesn't make your program faster.
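Several of the string tips in the list come without code, so here is a minimal sketch of four of them: unpack in place of repeated substr, tr///d, substr as an lvalue, and join. The fixed-width record layout and all variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical fixed-width record: 4-char id, 10-char name, 6-char amount.
my $record = "0042" . "Camel     " . "001250";

# One unpack call instead of three separate substr calls.
# The A format strips trailing spaces from each field.
my ($id, $name, $amount) = unpack("A4 A10 A6", $record);

# tr///d deletes single characters without involving the regex engine.
my $text = "a1b2c3abc";
(my $stripped = $text) =~ tr/abc//d;

# substr as an lvalue edits the string in place.
my $line = "Hello, world";
substr($line, 7, 5) = "Perl!";

# join with an empty separator instead of a chain of concatenations.
my $joined = join("", "foo", "bar", "baz");

print "$id|$name|$amount\n";
print "$stripped\n$line\n$joined\n";
```

Whether any of these is actually faster for your data is something to measure rather than assume.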
The same chapter also lists Space Efficiency, Programmer Efficiency, Maintainer Efficiency, Porter Efficiency, and User Efficiency. The sections often contradict one another.
my %keywords;
for (@keywords) {
    ++$keywords{$_};
}
Then test $keywords{$_} for a nonzero value to see if $_ is a keyword.
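The same hash-lookup idea as a complete script, as a rough sketch. The keyword list, the word list, and the %is_keyword name are invented for the example; a hash slice is used here as one alternative to the loop.

```perl
use strict;
use warnings;

# Hypothetical keyword list, for illustration only.
my @keywords = qw(if else while for foreach);

# Build the lookup table once; a hash slice does it in one statement.
my %is_keyword;
@is_keyword{@keywords} = (1) x @keywords;

# Each test is now a single hash lookup instead of a scan of @keywords.
my @words = qw(while banana for perl);
my @found = grep { $is_keyword{$_} } @words;

print "keywords found: @found\n";
```

The payoff grows with the number of lookups: the hash is built once, and every later test avoids walking the whole list.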
"foundstring" =~
    /$currentpattern/;    # Dummy match, must succeed
while (<>) {
    print if //;
}
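On Perl 5.005 and later there is another way to avoid recompiling a pattern that is only known at run time: compile it once with qr// and reuse the compiled object. This is not the null-pattern trick described in the list, just a sketch of a different technique with the same goal. The pattern and sample lines are made up.

```perl
use strict;
use warnings;

# Hypothetical pattern text that is only known at run time.
my $currentpattern = 'c[aou]t';

# Compile once, outside the loop.
my $re = qr/$currentpattern/;

# Reuse the compiled pattern for every line.
my @lines   = ("a cat sat", "a dog ran", "cut here", "nothing");
my @matched = grep { $_ =~ $re } @lines;

print "$_\n" for @matched;
```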
print if /one-hump/ || /two/;
is likely to be faster than:
print if /one-hump|two/;
at least for certain values of one-hump and two. This is because the optimizer likes to hoist certain simple matching operations up into higher parts of the syntax tree and do very fast matching with a Boyer-Moore algorithm. Complicated patterns defeat this.
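Since the answer depends on the patterns and the data, it may be worth timing both forms yourself. Below is a minimal sketch using the core Benchmark module; the sample lines, the sub names, and the iteration count are arbitrary choices for illustration, and the numbers will differ between Perl versions and machines.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample data; substitute lines from your own input.
my @lines;
for my $i (1 .. 300) {
    push @lines, ($i % 3) ? "line $i has one-hump" : "line $i has nothing";
}

# Two separate simple matches joined by ||.
sub count_short_circuit {
    my $n = 0;
    for (@lines) { $n++ if /one-hump/ || /two/ }
    return $n;
}

# One pattern containing the alternation.
sub count_alternation {
    my $n = 0;
    for (@lines) { $n++ if /one-hump|two/ }
    return $n;
}

# Run each version 2000 times and print a comparison table.
cmpthese(2000, {
    short_circuit => \&count_short_circuit,
    alternation   => \&count_alternation,
});
```

The same harness works for most of the other tips: put the two candidate forms in two subs, check that they return the same result, then compare.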
while (<>) {
    next if /^#/;
    next if /^$/;
    chop;
    @line = split(/,/);
}
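A self-contained variant of the reject-early loop, as a sketch. It reads from an in-memory filehandle so it runs without an input file, and it uses chomp rather than chop so that only a trailing newline is removed. The input text and the @records name are invented.

```perl
use strict;
use warnings;

# Made-up input; an in-memory filehandle stands in for a real file.
my $input = "# comment\n\nalpha,1\nbeta,2\n# another\ngamma,3\n";
open(my $fh, '<', \$input) or die "cannot open in-memory file: $!";

my @records;
while (my $line = <$fh>) {
    next if $line =~ /^#/;    # comment line: skip before doing any work
    next if $line =~ /^$/;    # blank line
    chomp $line;              # removes only a trailing newline
    push @records, [ split /,/, $line ];
}
close $fh;

print scalar(@records), " records\n";
```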
opendir(DIR, ".");
@files = sort grep(!/^\./, readdir(DIR));
closedir(DIR);
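The directory listing above can also be written with a lexical directory handle and an error check on opendir. The sketch below assumes nothing about your file layout: it builds a scratch directory with File::Temp (a core module) and fills it with made-up file names so that it can run anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with made-up file names, removed at exit.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(b.txt a.txt .hidden)) {
    open(my $out, '>', "$dir/$name") or die "cannot create $dir/$name: $!";
    close $out;
}

# Lexical directory handle plus an error check on opendir.
opendir(my $dh, $dir) or die "cannot open $dir: $!";
my @files = sort grep { !/^\./ } readdir($dh);
closedir($dh);

print "@files\n";
```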
if ($a) { $foo = $a; }
elsif ($b) { $foo = $b; }
elsif ($c) { $foo = $c; }
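The || chain and the ||= default from the list, in runnable form. One caveat the tips don't mention: || and ||= test truth, so a legitimate 0 or empty string is treated as "not set". On Perl 5.10 and later, //= tests definedness instead. This is a sketch; the variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical settings; the names are for illustration only.
my ($from_cli, $from_env, $built_in) = ("", 0, "default");

# First true value wins; "" and 0 are skipped because both are false.
my $foo = $from_cli || $from_env || $built_in;

# ||= assigns only when the left side is currently false.
my $pi;
$pi ||= 3;

# Caveat: ||= also replaces a legitimate 0, while //= (Perl 5.10+)
# only replaces an undefined value.
my $retries  = 0;
my $with_or  = $retries;
$with_or ||= 5;
my $with_dor = $retries;
$with_dor //= 5;

print "$foo $pi $with_or $with_dor\n";
```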