ICFP 2001 Task
David Mentré writes: "The ICFP 2001 programming contest TASK is now available. The objective is to write an optimizer (aka compressor) for an HTML-like language. It must work for arbitrarily big inputs and in a limited wall-clock time. Can you guess what will be the winning language? ;)" We already announced the contest, but now that the task is available, it might be fun to look at. So what will the contestants come up with? Anyone think perl might be the language of choice? :)
Slashdot editors suck, and that's the story i'm sticking to!
We don't need = (except in initializers, of course)...
Hmm... I have better things to do than figuring out an algorithm for this, really I do...
--
Barry de la Rosa,
public[at]bpdlr.org
-- /. ID is lower than Bruce Perens'!
Barry de la Rosa,
public[at]bpdlr.org
My
Just to point out, there's a good functional XML parser/processor available for Scheme - as everyone here knows, XML is just a verbose way of writing certain Lisp S-Expressions. This software will allow you to load any XML in as a lump of Scheme, and it can then be treated like any other Scheme data structure you've been taught how to process in CS 101. See www.lh.com/~oleg/ftp/Scheme/xml.html
for example: <WEIGHT unit="pound">
<NET certified="certified"> 67 </NET>
<GROSS> 95 </GROSS>
</WEIGHT>
becomes:
(WEIGHT (@ (unit "pound"))
(NET (@ (certified)) 67)
(GROSS 95)
)
Choice of masters is not freedom.
Java?
Anyone? Ok, I'm actually serious, don't mod me up +1 Funny.
While that's partly true, it won't help you. The task is designed for functional languages, and it shows. Functional languages have algebraic data types and pattern matching built in. In C you'd have to implement this yourself, plus all the code required for memory management and write the optimiser, all in three days.
That's why I predict O'Caml will win. Of all the languages which have native support for precisely this kind of task, it's the one that also has the support for beating the time limit.
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
Java does not have pattern matching, unless you think method dispatch is pattern matching. :-)
Having said that, Java would be a good general-purpose language to solve this problem in. I don't like your chances given the three day limit, though.
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
No, flexibility will be the deciding factor, closely followed by speed of development. Speed of execution won't make a huge difference, because the judges are going to throw in a test case so large and complex that no program can find the "optimal" sequence in the given wall time on the test machine. So the winner will be the program which can make the best use of the time available.
The winner will use a set of optimisation techniques (from simple transformations to expanding the string out into decorated characters and re-encoding), and will be able to combine these techniques seamlessly. And this will be coded in three days.
This is not a test for a systems programming language like C. This is a test for a language in which you can write complex decision logic quickly.
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
I predict that the winner will probably be O'Caml, and for one reason only: it will beat the time limit. Moreover, this will have little to do with the speed of the O'Caml implementation.
Your program has to complete within a given wall time (not CPU time). This is supplied as a command-line argument and environment variable. The winner will be the tool which can stop computing before the time is up and produce the output. The winning program will be one that iteratively improves the input string, but is monitored by another thread which stops the optimiser thread when the time is almost up, then grabs the best solution so far and performs the output.
O'Caml just happens to have good built-in primitives to write this kind of "interrupt the thread when the time is up and let me know what you did" computation.
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
Thats because all of the good C engineers have jobs and are too busy to fsck around with stupid games like this. Caml programmers bored from a lack of any employment in the real world had no problem finding the time to dedicate to this problem.
Someone you trust is one of us.
Since most can handle it, you may even want to "compress" head and body statements... And, of course, leave out closing tags.
print "404 Not FoundThe requested URL was not found on this server.";
This message is provided under the terms outlined at http://www.bero.org/terms.html
At least for correctness. Think about it..
All you have to do, is parse the document, and then spew it back out in a smaller representation. There are lots of booleans in there, and those are as far as i can see what you really need to worry about.
Of course, I'm not gonna bother to do this one...
Anyhow, speed will be the deciding factor on this one. I wouldn't be surprised if the winner is written in plain ole C.
Stop the brainwash
I don't think lisp is the most popular functional language any more.. I would say it's probably SML or O'Caml. Lisp hasn't won in any of the previous years, I don't think.
I agree wholeheartedly, though, that Perl won't be powerful enough for this kind of AST manipulation. I like that the task will probably make slashdot kids think that perl is extremely appropriate -- that will put them in their place all right.
I am positive Brainf*ck will be on the podium.
I'm afraid I am not suffinciently advanced in these languages to bet on them, but I reckon they would give the other contenders a run for their money.
XSLT is designed not to have side effects and is instead more of a functional language meaning that you can't assign values to variables dynamically. Considering that this task may require quite a lot of storing of state before deciding which optimization to make I think it is rather unlikely that XSLT is a good language to use to solve this particular problem.
Then again I didn't look too hard at the task but it may be that the judges have designed it in such a way that it is amenable to functional programming considering that it is the being run by the International Conference on Functional Programming. Even then most functional languages do allow you to store state, even Lisp so an XSLT solution would have to be rather clever.
--
You're missing the fact that the tags have to be paired. Ensuring the tags are paired isn't a problem, but ensuring the tags are paired optimally is where the difficulty arises.
:). I'm writing my entry in Haskell and I have a fairly decent optimizer except for these cases and it's currently a 110 line Haskell program.
Consider: <B><I>Hello</I></B>World&l t;B><I>Foo</I></B>
Should be: <B><I>Hello<PL>World</PL>F oo</I></B>
But how do you know if it's more optimal to go with a PL tag vs closing the B and I? You would need the array to go further down and see that you will need the B and I again so you don't bother to close it.
This is the problem I'm running in to now
--BEGIN SIG BLOCK--
I'd rather be trolling for goatse.cx.
Things you think are in the Constitution, but are not.
If you need me, I'll be doing a preliminary victory dance over in the QBASIC camp...
A helluva lot simpler. One form- (function-name param1 param2... paramN). Not a whole bucket of syntax rules like you find in those BCPL derived languages. Oh well, let the children speak their C, C++, and Java and leave the meat to the parents.
Working toward a usable PDA environment in the spirit of Newton OS: Dynapad
Perhaps. But C++ never was very usable for a big project to me either. So much work for so little result. Smalltalk has worked for me, but to each his own I supose.
Working toward a usable PDA environment in the spirit of Newton OS: Dynapad
ie, it doesn't have to be a functional programming language.
Logo! Logo! Bow before the awesome power of the turtle!
He put his boots up on the table and made a face. "The sig," he smirked. "You can waste your life in search of the sig."
As far as time limits are concerned, may I suggest Mercury. It's a functional/logical hybrid that's somewhat unorthodox for VHLLs in that it's a compile-only rather than interpreted-only or interpreted/compiled. Its designers claim that it can get similar performances to programs of similar functionality written in C, alhtough I haven't seen any benching, and I'll believe it when I see it. :)
Anyone think perl might be the language of choice? :)
No, because Perl isn't a functional language. *sigh*.
Actually, they say they will test with inputs as large as ``a few megabytes''. Note that it is always legal to merely echo an input if you can't figure out how to optimize it...
Regardless of how we do in the competition, we're learning a lot about our programming language and its implementation, anyhow. The task isn't a perfect fit for our language, which is good, since it has tickled some of the dark corners already.
Two days to go!
Lisp is not just a functional language, as any functional purist will tell you with an air of disdain. It supports multiple paradigms.
I don't know Perl, so the following is a real question, not merely rhetorical: Can you create new functions on the fly in Perl as you can in Lisp? That is, can you have a function1 that returns a new function2 as its value, where function2's definition is relative to the parameters that were passed into function1 at its time of creation? That is, can you create closures in Perl? In Lisp, this is done as follows:
Just passigning functions around does not make a language functional. Heck, C can do this function pointer. Truly functional languages support the creation of closures (among other things). Can Perl do this?
why would C be inferior for solving this task?
It's a great choice. You'll kill the guys working in x86 assembler.
I'd like to create a Forth Team as I consider this rather ignored language as one of the most elegant solutions in terms of language simplicity as well as interpreter efficiency.
--
Trolling using another account since 2005.
Did anyone else spot the similarity between the name of their mark-up language and another language?
No, your children are not the special ones. Nor are your pets.
It takes a little up-front learning if you're used to, say, C. The syntax is different from other languages, but actually simpler.
MSFT was among the 1999 winners, but their entry was in Haskell, not Visual Basic. Unfortunately their writeup has disappeared from the MSFT web site.
Hint: 20 levels of bracing means you should be breaking that function down into smaller, reusable functions. You don't tend to do that in C++ because it's such a drag -- cut, paste, declare, etc. In Scheme new functions are cheap and easy.
That aside, who says you have to use it for a large project before it's fun? I'm having fun with a couple of 3KLOC projects.
- PLClub submitted two separate entries using OCaml, either of which would have won the contest.
- Camls 'R Us took second place.
- Galois Connections took third with their Haskell entry.
- The Merry Mercurians took fourth with their Mercury entry.
1999:- Camls 'R Us mopped up the competition with their 3585-line OCaml entry
- The 1250-line Haskell Entry that took 2nd place was written in a mere 24 hours.
1998:Everyone knows that Integer Basic had much better performance than Applesoft, as long as you can tolerate only having integers. 10 COLOR=3 20 PLOT 10,10 umm, what else was the program supposed to do ?
Thanks to relentless innovation from Redmond, the choice is clear. A team working in Visual Basic will clearly triumph.
Art At Home
Perl hasn't a chance. As much as we all love it, it is not up to the task of writing complex programs such as this. The winner will probably be written in LISP, because it is the most popular functional language. The only other possible winners are better functional languages such as Caml or Haskell.
Even Slashdot wants to hide some things
You mean this story is thin, neat, and dresses well?
Perl does in fact have proper closures. E.g.
#!/usr/bin/perl -w
use strict;
sub make_adder {
my $addend = shift;
sub {my $x = shift; $x + $addend}}
my $adder = make_adder(3);
print &$adder(2), "\n";
IMHO not as nice as Lisp or ML, but there nonetheless.
I think I'll use Scheme
Casual Games/Downloads
Everyone is welcome.
print "
<HTML>
<HEAD>
<TITLE>404 Not Found</TITLE>
</HEAD>
<BODY>
<H1>Not Found</H1>
The requested URL was not found on this server.
</BODY>
</HTML>";
"Your superior intellect is no match for our puny weapons!"
I think I will just submit cat(1). Should beat all others in the "correctness" and "speed of optimisation" department!
...but it might be hard to make it really fast. Maybe if you wrote it in assembly?
A somewhat simple but slow way of doing it would be to create an array in which each non-tag character has a set of flags for bold, italic, whatever and a set of fields for color, size, etc. Then simply go through the newly created array and put in tags when ever the flags of fields change.
It should be possible to optimize that... but I imagine significantly more efficient ways to do it will be found.
I think this method will create highly optimized output... I can't think of anything to screw it up off the top of my head... but maybe I'm just missing something really obvious.
Perhaps if some of the tags turn other tags off? I don't remember. That could screw it up.
Comments?
It is a miracle that curiosity survives formal education.-Albert Einstein