(Useful) Stupid Regex Tricks?
careysb writes to mention that, in the same vein as '*nix tricks' and 'VIM tricks', it would be nice to see one on regular expressions and the programs that use them. "What amazingly cool tricks have people discovered with respect to regular expressions in everyday life as a developer or power user?"
Beautiful regexp that validates RFC 822 addresses: Mail-RFC822-Address.html
Unselfish actions pay back better
Of course, you can do better still. For MAC addresses, try:
^([[:xdigit:]]{2}:){5}[[:xdigit:]]{2}$
[:xdigit:] is the POSIX character class for hexadecimal digits, i.e. a-fA-F0-9.
The {5} repeats the 'XX:' section five times, so it doesn't have to be written out by hand.
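For anyone who wants to try the MAC pattern outside of grep or Perl, here is a rough Python sketch. Python's re module does not support POSIX classes like [[:xdigit:]], so the class is spelled out as [0-9A-Fa-f]; the helper name is_mac is made up for illustration.

```python
import re

# Python's re has no [[:xdigit:]], so the hex class is written out explicitly.
# Five "XX:" groups followed by a final "XX".
MAC_RE = re.compile(r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}")


def is_mac(text: str) -> bool:
    """Hypothetical helper: True if text is six colon-separated hex pairs."""
    # fullmatch anchors at both ends, playing the role of ^ and $.
    return MAC_RE.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_mac("00:1A:2b:3C:4d:5E"))  # True
    print(is_mac("00:1A:2b:3C:4d"))     # False (only five pairs)
```

This only covers the colon-separated form; dash- or dot-separated notations would need a different pattern.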
For pretty much any useful stock problem solved by regular expressions, see Perl's Regexp::Common module. A lot of these patterns are fiendishly complicated in order to deal with edge cases properly.
This regex matches a number: integer or float, scientific notation or plain, plus or minus...
[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?
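A quick way to see what the number pattern picks up is to run it over a line of text. The sketch below uses Python's re module, which accepts this pattern unchanged; find_numbers is a hypothetical helper name, not something from the post.

```python
import re

# The pattern from the post, copied verbatim.
NUMBER_RE = re.compile(
    r"[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?"
)


def find_numbers(text: str) -> list[str]:
    """Hypothetical helper: return every substring the pattern matches."""
    return [m.group(0) for m in NUMBER_RE.finditer(text)]


if __name__ == "__main__":
    print(find_numbers("x = -3.14, y = 6.02e23, z = 42"))
    # ['-3.14', '6.02e23', '42']
    print(find_numbers("abc123"))
    # [] because \b refuses digits glued to a preceding word
```

The \b anchors are what keep it from matching digits embedded in identifiers, at the cost of some edge cases: for example, a leading-dot mantissa followed directly by an exponent, like .5e3, is not matched as a whole.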
Colorless green Cthulhu waits dreaming furiously.
I wonder why such FAQs are still posted on a site like Slashdot. We now have a great repository for exactly this kind of question:
http://stackoverflow.com/questions/tagged?tagnames=regex&sort=votes&pagesize=15
Magic stuff like this is not working: /\([FB][ot]o\).*\1/ although that seems to be the closest description of what we wanted.
In Perl, I did /([FB][ot][o]).*\1/ and it seemed to work as you wanted. Also, if you're using a regex engine that supports lazy (non-greedy) quantifiers like Perl does, I would use them in this case. It can reduce backtracking. In Perl, put a ? after the *.
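To see the difference between the greedy and lazy versions, here is a small sketch in Python, whose re module uses the same unescaped-parentheses grouping and \1 backreference as Perl. The sample string is invented for the demonstration.

```python
import re

# Same idea as the Perl pattern: capture Foo/Boo/Fto/Bto, then require
# the exact same text again later via the \1 backreference.
GREEDY = re.compile(r"([FB][ot]o).*\1")
LAZY = re.compile(r"([FB][ot]o).*?\1")

sample = "Foo x Foo y Foo"  # made-up test string

greedy_match = GREEDY.search(sample).group(0)
lazy_match = LAZY.search(sample).group(0)
mismatch = GREEDY.search("Foo then Boo")

if __name__ == "__main__":
    print(greedy_match)  # Foo x Foo y Foo  (runs to the last repeat)
    print(lazy_match)    # Foo x Foo        (stops at the first repeat)
    print(mismatch)      # None: \1 must repeat "Foo", and "Boo" is different text
```

Both versions agree on whether a line matches at all; they only differ in how much of the line the match covers.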
It seems both Opera and ping in Windows interpret individual parts with leading zeros as octal. More interestingly, Opera also accepts hexadecimal. That makes constructing a regexp that validates any arbitrary IP address, and not just a valid dot-decimal, a bit more cumbersome.
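For comparison, a strict dot-decimal check is still manageable. The following is only a sketch in Python of one common way to write it; it deliberately rejects leading zeros and hex, so it does not cover the octal and hexadecimal forms that Opera and ping accept. The name is_dotted_decimal is made up.

```python
import re

# One octet: 0-255 with no leading zeros.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
# Three "octet." groups followed by a final octet.
IPV4_RE = re.compile(rf"(?:{OCTET}\.){{3}}{OCTET}")


def is_dotted_decimal(text: str) -> bool:
    """Hypothetical helper: True only for plain dot-decimal IPv4."""
    return IPV4_RE.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_dotted_decimal("192.168.0.1"))  # True
    print(is_dotted_decimal("256.1.1.1"))    # False (octet out of range)
    print(is_dotted_decimal("010.0.0.1"))    # False (leading zero, would be octal elsewhere)
    print(is_dotted_decimal("0x7f.0.0.1"))   # False (hex not accepted here)
```

Accepting the octal and hex spellings as well would mean adding more alternatives per octet, which is where it gets cumbersome.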
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
I personally like the regex-builder mode in Emacs as well. This one allows you to build a regexp while highlighting all matches in the current buffer.
Of course, this should probably have been posted in the emacs thread earlier, but I think it is probably a good match for this thread as well :)
To start it, just use M-x regexp-builder (an alias for M-x re-builder).