Slashdot Mirror


Can You Do the Regular Expression Crossword?

mikejuk writes "Programmers often say that regular expressions are fun ... but now they can be a whole lot of fun in a completely new way. Want to try your hand at a regular expression crossword? The idea is simple enough: create a crossword-style puzzle with regular expressions as the 'clues.' In case you don't know what a regular expression is, it is a way of specifying what characters are allowed using wild-card characters and more. For example a dot matches any single character, an * any number of characters and so on. The regular expression crossword is more a sort of Sudoku puzzle than a crossword, however, because the clues determine the pattern that the entries in a row have to satisfy. It also has to use a hexagonal grid to provide three regular expressions to control each entry. This particular regular expression crossword (pdf) was part of this year's MIT Mystery Hunt. This annual event is crammed with a collection of very difficult problems and the regular expression crossword, created by Dan Gulotta from an idea by Palmer Mebane, was just a small part of the whole, and yes there is a solution."

82 of 115 comments

  1. Solution by Anonymous Coward · · Score: 1, Interesting

    I'll post the solution as a regular expression

    *

    ^there you have the solution, the infinite plays of Monkey-Shakespeare, and the answer to life and the universe, and everything.

    1. Re:Solution by Stradenko · · Score: 4, Informative

      I think you mean .*

    2. Re:Solution by Coolhand2120 · · Score: 4, Informative

      The article summary was wrong about * and so are you. At least the language in the summary leaves much to be desired: although they are correct about it being a quantifier, they leave off the part that it applies to the previous character or subexpression. * = the previous character or subexpression zero or more times. As Stradenko pointed out, to get ANY character you need . (period). To get any character zero or more times you need .* (period asterisk). To get the solution to anything with more than one line you need [\s\S]*.

      So you're pretty far off the mark as far as 42 goes.
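
The distinctions drawn above can be checked directly. The following is a minimal sketch using Python's `re` module (the thread itself names no particular engine, so the choice of Python and the sample text are assumptions); other dialects may treat newlines differently.

```python
import re

# A bare "*" has nothing to repeat, so Python refuses to compile it.
try:
    re.compile("*")
    bare_star_compiles = True
except re.error:
    bare_star_compiles = False

text = "line one\nline two"  # hypothetical two-line input

# "." does not match a newline by default, so ".*" cannot span both lines.
dot_star_spans = re.fullmatch(r".*", text) is not None

# "[\s\S]" matches any character at all, newline included.
class_spans = re.fullmatch(r"[\s\S]*", text) is not None

# re.DOTALL is the flag that makes "." match newlines as well.
dotall_spans = re.fullmatch(r".*", text, flags=re.DOTALL) is not None

print(bare_star_compiles, dot_star_spans, class_spans, dotall_spans)
```

In this dialect the `[\s\S]*` workaround and the `re.DOTALL` flag give the same result; which one is available depends on the engine.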

    3. Re:Solution by dkf · · Score: 1

      To get any character zero or more times you need .* (period asterisk). To get the solution to anything with more than one line you need [\s\S]*.

      That depends on the RE dialect; some treat newline as an ordinary whitespace character by default.

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    4. Re:Solution by BlackPignouf · · Score: 1

      Sorry, your solution is wrong.
      "HCXRCMIIIHXLS" doesn't fit the regexp.
      Try http://twoevils.net/cross-regex.html

    5. Re:Solution by swilver · · Score: 1

      You're right. The center row should be "HRXRCMIIIHXLS".

      I also preferred doing it on the printout, much easier for this kind of puzzle.

    6. Re:Solution by hoggoth · · Score: 1

      I hate you.

      I taught my son regexes so he could help me, and he is walking around mumbling 'I hate .* '

      --
      - For the complete works of Shakespeare: cat /dev/random (may take some time)
  2. Just solving it is easy. by Anonymous Coward · · Score: 2, Funny

    Solving it without going insane, on the other hand, is an entirely different story.

    1. Re:Just solving it is easy. by gd2shoe · · Score: 1

      Tell me about it

      Why the hey did they have to put 2/3 of the clues upside down? That was cruel.

      (And yes, I realize it was an attempt at uniformity, to have every line take the form of clue-answer. Still, it is impossible to retain that form without having most of the clues upside down no matter how you turn the page. If it's merely to slow down students in the competition, I call unnecessary roughness. Judging students' ability to read math upside down is worthless compared to the value of a good puzzle.)

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    2. Re:Just solving it is easy. by Sun · · Score: 1

      It being upside down was less of an issue than the fact I kept losing my place on the page, because the problem does not have an at-a-glance orientation. I solved that by drawing a couple (four, actually) of arrows.

      Shachar

  3. Obligatory xkcd by XDLMAO · · Score: 2, Funny
    1. Re:Obligatory xkcd by Anonymous Coward · · Score: 1

      And here I thought it was going to be: http://xkcd.com/356/

    2. Re:Obligatory xkcd by Anonymous Coward · · Score: 2, Funny
    3. Re:Obligatory xkcd by fche · · Score: 5, Insightful

      Randall should draw a comic about obligatory xkcd references.

    4. Re:Obligatory xkcd by Anonymous Coward · · Score: 1

      I got nerd sniped. Luckily I wasn't in the road. Only took about 2 hours.

    5. Re:Obligatory xkcd by MagicM · · Score: 1

      http://xkcd.com/356/

      I haven't gotten anything useful done all morning!

    6. Re:Obligatory xkcd by Anonymous Coward · · Score: 1

      That would just result in obligatory obligatory xkcd reference references being posted whenever an obligatory xkcd reference is posted.

  4. Great idea, but... by stephanruby · · Score: 3, Insightful

    It's a great idea, but the puzzle given is too complicated.

    If they really want to popularize this concept among programmers, many of whom have forgotten regular expressions even if they had once mastered them, they should really create much simpler puzzles in a mounting order of difficulty.

    Hopefully, someone enthused by the idea will create and publish such puzzles.

    1. Re:Great idea, but... by Anonymous Coward · · Score: 1, Insightful

      I don't know a single programmer who has forgotten regular expressions. Who are the "many" you speak of?

      Besides, rather than the puzzle being too complicated, maybe your brain is too simple?

    2. Re:Great idea, but... by Anonymous Coward · · Score: 5, Funny

      The only thing difficult about the puzzle is the format in which it is presented. How many people have printers? Of those, how many have working printers? And, of those, how many also have paper?

    3. Re:Great idea, but... by Anonymous Coward · · Score: 2, Funny

      s/forgotten/never learned/g

    4. Re:Great idea, but... by Goaway · · Score: 2

      It takes about an hour to solve. It isn't terribly complicated.

    5. Re:Great idea, but... by stephanruby · · Score: 1

      You're right. My brain must be too simple.

      Nothing gets by you.

    6. Re:Great idea, but... by SQLGuru · · Score: 4, Funny

      How many people have printers? Of those, how many have working printers? And, of those, how many also have paper?

      I have all of those........but no ink.

    7. Re:Great idea, but... by pipatron · · Score: 1

      I pasted the URL to the PDF in gimp, solved the puzzle in a layer. Not really rocket surgery here.

      --
      c++; /* this makes c bigger but returns the old value */
    8. Re:Great idea, but... by hcs_$reboot · · Score: 1

      It's a great idea, but the puzzle given is too complicated

      The puzzle was obviously designed by a program - so the solution should also come from software.

      --
      Slashdot, fix the reply notifications... You won't get away with it...
    9. Re:Great idea, but... by CAIMLAS · · Score: 1

      Yes. This.

      I use regex (pcre) on a daily basis. This? This hurt my head. Holy shit that puzzle is hard. (Granted, I hate crossword puzzles... maybe I'm not old enough yet.)

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    10. Re:Great idea, but... by Anonymous Coward · · Score: 2, Insightful

      Really? I'm about as far from a Regex Guru as you can get and frequently advocate against using them for anything but the simplest task and I was able to solve it in about 45 minutes or so. When you first sit down with it, it looks near impossible, but there are a handful of hexes that can be deduced immediately and after getting a few more it's not that much harder than a sudoku.
      I thought the puzzle was challenging, but not overwhelmingly so in any way and would love to see more of them.

    11. Re:Great idea, but... by Sam+H · · Score: 1

      I think you haven't actually given it a try. The clues are written as regexes, which requires knowing the syntax, but it's actually a pretty easy logic puzzle.

      --
      God, root, what is difference ?
    12. Re:Great idea, but... by MightyYar · · Score: 1

      Are you still in your 20s or something? You'll forget anything you don't use in a while. When I did web scraping I got quite good at regex. Now, I have to look some things up, especially when dealing with lookaheads and lookbehinds and other slightly more esoteric features. I am working through the puzzle, though.

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    13. Re:Great idea, but... by ArsenneLupin · · Score: 2

      You could always squirt out some more ink, I mean, you are an octopus, right?

      ... and if you aren't, just use black paper...

    14. Re:Great idea, but... by MightyYar · · Score: 1

      Sorry, didn't mean to imply that the regexes in this puzzle used lookaheads or lookbehinds. There's nothing even remotely esoteric about the ones in the puzzle.

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    15. Re:Great idea, but... by BlackPignouf · · Score: 1

      Ruby, for example.

    16. Re:Great idea, but... by saveferrousoxide · · Score: 1

      the Visual Studio replace tool

    17. Re:Great idea, but... by UCFFool · · Score: 1

      I was inspired, so I made an easier one. http://mariolurig.com/crossword/

      --
      "The more pity, that fools may not speak wisely what wise men do foolishly" - Touchstone,Shakespeare's "As You Like It"
    18. Re:Great idea, but... by Daniel+Klugh · · Score: 1

      According to O'Reilly's "Unix in a Nutshell", ex (and hence vi), sed and ed support the \number feature. I know from experience that vim and GNU emacs support it. In fact, GNU emacs's "dired" ("directory editor") feature lets you rename files automatically, allowing you to convert floating point numbers in file names to fixed point (so they sort correctly) and to correct extensions (e.g. /\(.*\)\.jpe/\1\.jpeg/).
      (n.b. you have to put backslashes before parentheses to make them meta-characters in emacs)

      --
      Daniel Klugh
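
The `\number` backreference in the rename example above carries over to most engines. This is a rough Python equivalent, offered only as an illustration: the file name is hypothetical, the `$` anchor is an added assumption to avoid touching names that merely contain ".jpe", and Python groups with plain parentheses rather than the backslashed ones emacs needs.

```python
import re

old_name = "photo.jpe"  # hypothetical file name

# (.*) captures everything before the extension; \1 re-inserts it in the replacement.
new_name = re.sub(r"(.*)\.jpe$", r"\1.jpeg", old_name)

print(new_name)
```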
  5. simple? by bitingduck · · Score: 4, Funny

    There's probably already a CPAN module for solving it...

  6. Apologies to Betteridge by MtHuurne · · Score: 3, Interesting

    Solved it a few days ago. It was fun. It's not as hard as it looks.

    and yes there is a solution

    In fact, there is exactly one solution.

    1. Re:Apologies to Betteridge by hcs_$reboot · · Score: 3, Funny

      Well done, you deserve your 1?[0-9] points.

      --
      Slashdot, fix the reply notifications... You won't get away with it...
    2. Re: Apologies to Betteridge by almitydave · · Score: 1

      The "?" means match zero or one times, not any single character. In this context, it means the 1 is optional.

      --
      my, your, his/her/its, our, your, their
      I'm, you're, he's/she's/it's, we're, you're, they're
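
To see what the optional `1` in `1?[0-9]` allows, one can enumerate the numbers it accepts. A small sketch in Python (the range of candidates tested is an arbitrary choice):

```python
import re

pattern = re.compile(r"1?[0-9]")

# Collect every number from 0 to 99 whose decimal form matches in full.
matching = [n for n in range(100) if pattern.fullmatch(str(n))]

print(matching)
```

With the `1` optional, the pattern accepts one digit on its own or a digit preceded by a single `1`.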
  7. Rules? by Yojimbo-San · · Score: 1

    Where are the rules? Just the grid isn't much help; for example the clue N.*X.X.X.*E on a length 9 line might be NXxXxXE (length 7). A colleague has just looked at the solution and my hypothesis is that each regex fully describes the line (i.e. /^clue$/) but it would be nice to be sure ...

    --
    Quick wafting zephyrs vex bold Jim
    1. Re:Rules? by MtHuurne · · Score: 1

      I haven't seen any part of this puzzle other than the grid itself, but if you interpret every clue as a match for the full line like you said, there is exactly one solution.
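
The difference between the two readings of a clue can be sketched in Python, where `re.search` is unanchored and `re.fullmatch` behaves like `/^clue$/`. The two 9-character lines below are made up for illustration and are not taken from the puzzle.

```python
import re

clue = r"N.*X.X.X.*E"

padded_line = "??NXxXxXE"   # hypothetical 9-character line containing the 7-character example
nine_line = "NAAXBXCXE"     # hypothetical 9-character line that starts with N and ends with E

# Unanchored search is satisfied by the 7-character substring.
found_unanchored = re.search(clue, padded_line) is not None

# Full-line matching rejects the padded line...
full_padded = re.fullmatch(clue, padded_line) is not None

# ...but accepts a line the clue describes from first character to last.
full_nine = re.fullmatch(clue, nine_line) is not None

print(found_unanchored, full_padded, full_nine)
```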

    2. Re:Rules? by DarwinSurvivor · · Score: 1

      Seeing as some have .* at the start and end, it strongly implies that it must match the entire line.

    3. Re:Rules? by DarwinSurvivor · · Score: 1

      Then again, the ^'s sprinkled around seem to imply the opposite.

    4. Re:Rules? by Anonymous Coward · · Score: 2, Insightful

      The rules are anchored to the ends. Printing a ^ and $ on each clue is redundant and silly, when a moderately intelligent person could easily figure that part out for themselves.

    5. Re:Rules? by Goaway · · Score: 1

      ^ has multiple meanings in regexes.

    6. Re:Rules? by Aristos+Mazer · · Score: 2

      Not useless. It has to match the whole line. If the regular expression matches zero characters, then the rest of the line is left as the next token in the string. You're thinking of it as a parser... think of it as the results of a parser -- the parser ran, and it returned the complete line of characters as a token when given this regular expression. Does that help you understand why this works?

    7. Re:Rules? by deek · · Score: 1

      That makes more sense. You're right, I was trying to match the regular expression to the line, instead of the line as a result of the regular expression. Still, it would have been nice to remove the ambiguity and wrap each clue with ^ and $. No matter how redundant or silly it seems to Anonymous Cowards.

    8. Re:Rules? by pipatron · · Score: 4, Informative

      Everywhere ^ is used in the puzzle it means that it matches anything not in the group. For example [^abc] would match any character except a, b and c

      --
      c++; /* this makes c bigger but returns the old value */
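
The two meanings of `^` mentioned in this subthread can be shown side by side. A minimal Python sketch with made-up test strings:

```python
import re

# Inside brackets, a leading ^ negates the character class.
negated = re.compile(r"[^abc]")
not_abc = [ch for ch in "abcdx" if negated.fullmatch(ch)]

# Outside brackets, ^ anchors the match to the start of the string.
anchored = re.compile(r"^abc")
at_start = anchored.search("abcdef") is not None
in_middle = anchored.search("xxabc") is not None

print(not_abc, at_start, in_middle)
```
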
    9. Re:Rules? by Anonymous Coward · · Score: 1

      Mystery hunt tradition is that many puzzles have no rules, you have to figure out what to do as well as solve the puzzle...

    10. Re:Rules? by DarwinSurvivor · · Score: 1

      GAH, yes, forgot about that. Need to practice some more regex apparently...

  8. I have a flight from Seattle to Boston by XnavxeMiyyep · · Score: 1

    I have a flight from Seattle to Boston that stops in NYC tonight. Looks like I'll have something to do! Hope I remember all the regex syntax...

    --
    I put the 't' in electrical engineering.
  9. Breaking news. by mutube · · Score: 4, Funny

    Yvonne Lee, Community Manager at Dice.com writes,

    ^\\([^ ()]+\\)\\(([0-9]+\\),\\([0-9]+\\))"

  10. Interactive by Ozan · · Score: 5, Informative

    No need to print out the puzzle, somebody made an interactive version:
    http://twoevils.net/cross-regex.html

    1. Re:Interactive by Garridan · · Score: 1

      Gee, thanks. I made a hex grid in gedit and solved it in that referencing the pdf for the clues. And NOW you post the widget. On the puzzle: that was a blast! I want more! My only disappointment with the puzzle is that a certain amount of meta-puzzling (this part of that clue provides no info unless...) proved to be useful -- I never used that knowledge 'just in case', but it was never wrong. I prefer puzzles with misleading meta-hints to trap fools. (instead of me being made the fool)

    2. Re:Interactive by BlackPignouf · · Score: 1

      nhpeha sdiomomthfoxn xaxphmmommmmrhhm cxnmmcrxemcmccccm mmmmmh rxrcmiiihxlsoreoreoreorev cxcchhmxccrrrrhhhrru ncxdxexlerrddmmmmgcchhcc

      Upcased and without spaces is the solution.

  11. Parenthesis puzzle by jabberw0k · · Score: 1

    The article asks if anyone has composed other programming puzzles, like a parenthesis puzzle.

    Any LISP program should qualify for that one.

  12. The problem with expression by shoeman_g · · Score: 1, Funny

    If you have a programming problem that requires regular expressions, you now have two problems.

  13. That's not right by iYk6 · · Score: 2

    For example ... an * [matches] any number of characters and so on.

    No. That's shell expansion, not regular expression. To match any number of characters, you would use ".*".
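
The shell-versus-regex distinction can be tried out with Python's `fnmatch` module (shell-style globbing) next to `re`. The file name is hypothetical.

```python
import fnmatch
import re

name = "crossword.pdf"  # hypothetical file name

# Shell-style glob: "*" on its own stands for any run of characters.
glob_matches = fnmatch.fnmatchcase(name, "cross*")

# Regex: "*" repeats the preceding item, so "cross*" is "cros" plus zero or more "s".
regex_star_only = re.fullmatch(r"cross*", name) is not None

# The regex that behaves like the glob needs ".*".
regex_dot_star = re.fullmatch(r"cross.*", name) is not None

print(glob_matches, regex_star_only, regex_dot_star)
```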

  14. Re:ObBetteridge by Lorens · · Score: 1

    My first thought when seeing the crossword was that to make sure there aren't two or more answers, you place the clue between egrep and /usr/share/dict/words . . . and that effectively cured me of any desire of actually doing the crossword.

  15. Shouldn't that be... by FuzzNugget · · Score: 1

    In many places, there is /.*/ But shouldn't that be /.+/ ? Or am I to assume that it's accepting spaces? Most likely not.

    1. Re:Shouldn't that be... by seebs · · Score: 1

      No, they mean ".*". A .* is zero or more characters. In some cases, yes, that means zero.

      --
      My blog: http://www.seebs.net/log/ --- My iPhone/iPad app: http://www.seebs.net/seebsfrac/
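
The zero-length case is easy to demonstrate. A short Python sketch with invented patterns (not clues from the puzzle) contrasting `.*` with `.+`:

```python
import re

# ".*" may match nothing at all, so "A.*B" accepts "AB".
star_on_empty = re.fullmatch(r"A.*B", "AB") is not None

# ".+" needs at least one character between A and B.
plus_on_empty = re.fullmatch(r"A.+B", "AB") is not None
plus_on_one = re.fullmatch(r"A.+B", "AxB") is not None

print(star_on_empty, plus_on_empty, plus_on_one)
```
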
  16. amb in Lisp by Ckwop · · Score: 1

    Set up an amb for each square. Then use "require" with each regular expression defined across the grid.

    Problem solved - generically - for all time!

    It's not the most efficient solution in the world, but it'll probably still solve it faster than you?
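
As a rough illustration of the generic approach, here is a sketch in Python rather than Lisp. It swaps `amb` for plain exhaustive search, and it works on a small rectangular grid rather than the hexagonal one, so it is not a solver for the actual puzzle. The function name `solve`, the 2x2 clues and the two-letter alphabet are all invented for the example.

```python
import itertools
import re

def solve(row_clues, col_clues, alphabet):
    """Return every grid (as a tuple of row strings) that satisfies all clues."""
    n_rows, n_cols = len(row_clues), len(col_clues)
    row_res = [re.compile(c) for c in row_clues]
    col_res = [re.compile(c) for c in col_clues]
    solutions = []
    # Try every possible filling of the grid, one letter per cell.
    for cells in itertools.product(alphabet, repeat=n_rows * n_cols):
        rows = ["".join(cells[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]
        cols = ["".join(row[c] for row in rows) for c in range(n_cols)]
        # Each clue must describe its whole line, as discussed in the thread.
        rows_ok = all(p.fullmatch(s) for p, s in zip(row_res, rows))
        cols_ok = all(p.fullmatch(s) for p, s in zip(col_res, cols))
        if rows_ok and cols_ok:
            solutions.append(tuple(rows))
    return solutions

# Hypothetical 2x2 puzzle over the alphabet "AB".
solutions = solve(["A.", ".B"], ["[^B]*", "B+"], "AB")
print(solutions)
```

Exhaustive search grows exponentially with the number of cells, so anything the size of the real grid would need pruning or a backtracking search of the kind `amb` provides.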

    1. Re:amb in Lisp by gd2shoe · · Score: 1

      I've not dealt with Lisp, so I only think I know what you're saying. I made a Sudoku solver once. It worked immediately. All the time. Every puzzle. There are still people out there who derive enjoyment out of solving Sudoku puzzles.

      This puzzle idea is far more interesting than Sudoku. The fact that a computer solver can be written without great effort doesn't really diminish it.

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
  17. solved by empgodot · · Score: 1

    It took me about an hour to solve it. I printed it out and additionally to filling in letters I marked the cells where I am not 100% sure that I had the correct letter. I had to do some rollbacks of a few marked cells. I'd say it is as hard as a medium sudoku.

  18. What do you mean "Can you do..."? by seebs · · Score: 1

    I think the question is, can you not do it? Answer for me: No.

    My strategy: I wrote a program which read in a grid of letters (it actually just ignored spaces, so I laid them out in a hex shape), did the collating to produce the strings for each direction, then did, for each clue, four matches: ^re$, ^re, re$, and re. It then displayed the best match it had found. I'd post what this looks like, but the Slashdot comment system won't let me. (Apparently, "too many junk characters", and also no way to make spaces work.)

    And it produced one of these dumps for each of the three sets of clues.

    Then I ran that in another window in a loop, once a second, and started solving. Was super fun. Got it done early enough to sleep some, too. A++. Fun. Would solve more.

    --
    My blog: http://www.seebs.net/log/ --- My iPhone/iPad app: http://www.seebs.net/seebsfrac/
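
The "four matches" idea described above (`^re$`, `^re`, `re$`, and `re`) might look something like the following in Python. This is only a guess at the shape of that checker: the function name `best_match` and the test strings are hypothetical, and the step that collates a hex grid into lines is left out.

```python
import re

def best_match(clue, line):
    """Return the strictest anchoring under which clue matches line, or None."""
    body = f"(?:{clue})"  # group the clue so anchors apply to all of it
    attempts = [
        ("^re$", f"^{body}$"),
        ("^re", f"^{body}"),
        ("re$", f"{body}$"),
        ("re", body),
    ]
    # Try the strictest form first and fall back to looser ones.
    for label, pattern in attempts:
        if re.search(pattern, line):
            return label
    return None

print(best_match(r"N.*E", "NXE"), best_match(r"N.*X", "NXE"))
```

Reporting the strictest form that succeeds gives a quick hint of how close a partially filled line is to satisfying its clue.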
  19. Re:Easy? by seebs · · Score: 1

    I am pretty sure that it is implicitly "all letters".

    --
    My blog: http://www.seebs.net/log/ --- My iPhone/iPad app: http://www.seebs.net/seebsfrac/
  20. star * by Trevelyan · · Score: 1

    Most of the regexes are qualified with a star *, which means 0 or more times. So since the regex allows 0 matches I can put in whatever I like. Maybe they meant + ? I'm not going to look at the solution. I will just concentrate on the few chars that are not suffixed by a * .

    1. Re:star * by gd2shoe · · Score: 2

      If you put a space anywhere in the puzzle, at least one of the clues will fail. The only solution that works is made entirely of the capital letters found in the clues. Don't believe me? Try to find a complete solution with a space in it.

      And yes, they really do mean .* There are several of those that match empty strings, so you need to be on your guard. There is a single + in the puzzle, right where it needs to be.

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
  21. It's actually easy (really) by Sam+H · · Score: 2

    It took me less than 10 minutes to complete that crossword. It's actually easy, because the clues always give enough information to immediately place a letter somewhere with minor thinking; no tracking back is ever needed (unlike in some Sudoku grids where it's often easier to "try" a number, then cancel if an inconsistency appears).

    Actually most of the clues can be easily translated to natural language, which makes the puzzle understandable to the average person: [^M]*M[^M]* means "there is one and only one M in this line", (RX|[^R])* means "every R in this line must be followed by an X", etc.

    --
    God, root, what is difference ?
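
The two plain-language readings given above can be checked against a few sample lines. A small Python sketch, assuming whole-line matching; the helper names and the test strings are invented:

```python
import re

def exactly_one_m(line):
    # [^M]*M[^M]*: any non-M characters, one M, any non-M characters.
    return re.fullmatch(r"[^M]*M[^M]*", line) is not None

def every_r_followed_by_x(line):
    # (RX|[^R])*: the line is built only from "RX" pairs and non-R characters.
    return re.fullmatch(r"(RX|[^R])*", line) is not None

print(exactly_one_m("AMB"), exactly_one_m("AMMB"))
print(every_r_followed_by_x("ARXB"), every_r_followed_by_x("ARB"))
```
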
    1. Re:It's actually easy (really) by BlackPignouf · · Score: 1

      If you're being honest, 10 minutes is impressive.
      It took me 2h10m, and I still need to finish (i.e. begin) a paper due tomorrow.

    2. Re:It's actually easy (really) by insecuritiez · · Score: 1

      It took me 2 hours. Nothing about it was very hard but most cells can only be filled in after quite a few other cells in the row are already filled in. This makes the number of logically deducible cells available at any given time somewhat low.

  22. goggles google? by johnsnails · · Score: 1

    Can Google Goggles solve it like sudokus?

  23. Inversion by devnullkac · · Score: 1

    And here I was thinking the crossword clues would be as normal, but the answers in the grid would themselves be regular expressions.

    --
    What do you mean they cut the power? How can they cut the power, man? They're animals!
  24. Re:ObBetteridge by gd2shoe · · Score: 1

    Except this isn't actually a crossword puzzle. It doesn't make any words, only odd series of letters. As the summary said, it's more like Sudoku, with a crossword bent.

    --
    I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
  25. Re:Easy? by gd2shoe · · Score: 1

    Nope. In fact, there can be no spaces. (It's not a rule, there just aren't any, deductively. Consider it a free hint.)

    --
    I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
  26. Solution by swilver · · Score: 1

    Took about an hour to solve, but I'm already insane.

    http://hjohn.home.xs4all.nl/RegEx-Solution.jpg

  27. Re:ObBetteridge by sudon't · · Score: 1
    $ grep r*d*m* twl |grep ^........$ |wc -w

    29766

    --
    -- sudon't

    Air-ride Equipped

  28. Re:ObBetteridge by IAmGarethAdams · · Score: 1

    I'm assuming since most of the clues begin and end with .* that they're all intended to be anchored to the start and end of the word. So that particular clue matches an 8 character string consisting only of (zero or more) Rs followed by (zero or more) Ds followed by (zero or more) Ms. I don't think you'll find anything in the word list completely matching that
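
The effect of anchoring on this clue can be reproduced in Python. Because every part of `r*d*m*` is optional, an unanchored search succeeds on any string by matching nothing, which may be why the count in the parent comment is so large. The four-word list below is made up and stands in for the `twl` word list.

```python
import re

words = ["hello", "random", "RRDDDMMM", "xyzzyxyz"]  # hypothetical stand-in word list

# Unanchored: the pattern can match the empty string, so every word passes.
unanchored = [w for w in words if re.search(r"r*d*m*", w, re.IGNORECASE)]

# Anchored at both ends: only runs of R, then D, then M pass.
anchored = [w for w in words if re.fullmatch(r"r*d*m*", w, re.IGNORECASE)]

print(unanchored, anchored)
```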

  29. Re:Misleading lead by IAmGarethAdams · · Score: 1

    I must have missed the day that 0 stopped being part of the set of "any number"

  30. Re:Misleading lead by IAmGarethAdams · · Score: 1

    Sorry, I just figured out you're talking about the missing '.' which is of course a mistake in the description

  31. Interactive version by wasteland.junk · · Score: 1

    Lacking a printer, I threw together an interactive version: http://jimbly.github.com/regex-crossword/ Enjoy!