Slashdot Mirror


Next Generation Regexp

prostoalex writes "Jeffrey E. F. Friedl, author of the newly published 2nd edition of Mastering Regular Expressions, wrote a feature article for O'Reilly Network on recent innovations in the regular expression world. You'd think that an area such as regular expressions would be fairly stable, but according to the author, 'when I started to work on the second edition of Mastering Regular Expressions and started refocusing on the field, I was rather shocked to find out how much had really changed'. The article's behind-the-scenes purpose is apparently to push a new book that O'Reilly published this month, but it has great educational value for anyone involved with practical extracting and reporting."

11 of 248 comments

  1. regexp by Anonymous Coward · · Score: -1, Offtopic

    regexp regexp

  2. Do we need complex acronyms? by poopbot by Anonymous Coward · · Score: -1, Offtopic

    Credits: dmg

    Yet again the Linux so-called elite, backed up by their pseudo intellectual cohorts of the w3c conspire to ruin Linux's chances in the marketplace by sowing confusion and complexity. As someone with years of experience in the marketing world, I am constantly amazed at the willingness of the W3C and other bodies to pollute the acronym space with their content free "TLAs".

    Basic marketing 101 (and an undergrad course in psychology) would tell them that the normal person is only capable of remembering approximately 7 items of data in their short-term memory, but now we have to remember HTTP, HTML, XML, XSL, DTD, PHP, SSL, DSL, ADSL, ISDN, Perl, etc etc etc.

    This is a text book example of the tail wagging the dog from a marketing perspective.

    I have been following the standardisation of the web for many many months now, but one thing has become clear, E-commerce will NEVER become popular so long as there are so many confusing acronyms involved. The guys in charge of marketing Linux absolutely MUST work to reduce the number of acronyms. One possible solution would be to merge those protocols which are not all that different. For example, why not merge XML with SGML ? (they could call it XSGML or SXGML or perhaps XMSGML), they seem to address the same problems. Or would that be too simplistic a solution for their pampered elitist ivy-league minds to comprehend ?

    If something is not done URGENTLY, and I mean URGENTLY, Linux (and other more experimental derivatives such as FreeBSD) can never hope to be taken seriously as an e-commerce platform by the people who count - the accountants.

    The miracle of Linux is that anyone actually runs it at all, considering one seems to require a masters in computer science to install it! (contrast this with NT4 which was so easy to install, we let our receptionist upgrade her own machine).

    As usual my "open source" advice is free. Hopefully this time my valuable advice will be taken into account the next time the w3c smell an acronym brewing.

    Finally, in conclusion, as an American, I am saddened that the Internet seems to have been commandeered by a European based protocol. Was America so short of talent we had to buy the HTML protocol from Tom Berners-Lee at CERN ?

    Think of the security implications of the world's strongest economy running an e-commerce protocol developed by a foreigner from Socialist Europe. Remember the wall has not been down for that long. Who knows what kind of trojans might be lurking within the depths of these complicated protocols.

    I am afraid I am behind Al Gore on this point, how can this be necessary in the home of smart corporations such as Microsoft and Intel ? The answer is the vast subsidies given by European socialist governments to fund development of the HTML specification.

    The solution is clear. The federal government should mandate and strongly subsidise the use of Microsoft software for all US corporations involved in e-commerce. Only with a US-developed set of protocols can we be assured of the security of our transactions.

    - posted by poopbot: because we're all crapflooders at heart


  3. All Hail the Cool-Owl Book! by Icepick_ · · Score: 0, Offtopic

    These owls are much, much cooler than this one.

  4. Re:Contentless article by Anonymous Coward · · Score: -1, Offtopic

    contentless isn't a word.

  5. Re:Contentless article by Anonymous Coward · · Score: -1, Offtopic

    Mod this guy up, biatches!

  6. radio contest please troll thank you by Anonymous Coward · · Score: -1, Offtopic
  7. Re:Contentless article by Anonymous Coward · · Score: -1, Offtopic

    you're an idiot

  8. Re:regexp and programmers by Jonny+Ringo · · Score: 0, Offtopic

    my $poophead = "Over the course of my career I have come to the rather firm opinion that you are not worth much as a coder if you do not know regular expressions. I don't care what language(s) you're proficient in, or if you've memorized every single design pattern the GoF has ever conceived, or do 4-foot by 6-foot UML diagrams in your head. If you can't do regexps then you're missing a basic skill. I bought Friedl's book a couple of years ago, and although I wound up not using many of the Perl-related stuff the rest of the book helped me out immensely.
    A programmer without knowledge of regular expressions is like a carpenter without a hammer.";

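    # \b matches at word boundaries without consuming the surrounding spaces; /g replaces every standalone "I"
    $poophead =~ s/\bI\b/me the poophead/g;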

    print "$poophead\n";

  9. Re:.NET regexps and Microsoft's documentation by Tablizer · · Score: 1, Offtopic

    Seen on an IBM Bimbo's forehead:

    This Mind is Intentionally Left Blank

  10. K shall rule the world by Jayson · · Score: 1, Offtopic

    when people become intelligent enough to use it and Arthur finishes K4. Watch Kuro5hin within the next month for an Introduction to K to appear (I will also submit it here, but I doubt it will get posted).

  11. Re:palestine is for suckaz by Anonymous Coward · · Score: -1, Offtopic
    I would like to see all the muslims exterminated, everywhere.

    Muslims are primitive pigs who must be eliminated for the sake of humanity.