Slashdot Mirror


Tetris Is Hard To Test

New submitter JackDW writes: Tetris is one of the best-known computer games ever made. It's easy to play but hard to master, and it's based on a NP-hard problem. But that's not all that's difficult about it. Though it's simple enough to be implemented in one line of BBC BASIC, it's complex enough to be really hard to thoroughly test.

It may seem like you can test everything in Tetris just by playing it for a few minutes, but this is very unlikely! As I explain in this article, the game is filled with special cases that rarely occur in normal play, and these can only be easily found with the help of a coverage tool.

18 of 169 comments (clear)

  1. One line? by GrahamCox · · Score: 4, Funny

    If anybody wrote code like that for me, they'd be made to sit on the naughty step and think very, very hard about what they'd done.

    1. Re:One line? by lgw · · Score: 4, Informative

      Except that is not "one line". It is six lines. Any program can be a "one-liner" if there is no limit on the line length. Well, unless you writing it in Python.

      The line length limit is 256 bytes, of course. And these hacks are the basic-equivalent of the C obfuscation contest.

      As the authors say: "I'd like to think it is self documenting. The code speaks for itself; even if what it has to say is not very nice."

      --
      Socialism: a lie told by totalitarians and believed by fools.
    2. Re:One line? by BlackHawk-666 · · Score: 4, Informative

      In ASCII, but many BASICs will reduce keywords down to a single byte.

      --
      All those moments will be lost in time, like tears in rain.
    3. Re:One line? by Ed+Avis · · Score: 3, Insightful

      It's analogous to testing itself. Testing cannot prove the absence of bugs, though it can find them. Similarly a coverage check cannot show that your test suite is adequate, but it can show it to be inadequate (or perhaps reveal dead code to prune). Nobody is claiming that coverage is the be all and end all of testing. That does not mean it is useless to measure it.

      --
      -- Ed Avis ed@membled.com
  2. in Soviet Russia by Trepidity · · Score: 4, Funny

    It may seem like you can test everything in Tetris just by playing it for a few minutes, but this is very unlikely! As I explain in this article, the game is filled with special cases that rarely occur in normal play, and these can only be easily found with the help of a coverage tool.

    Tetris doesn't need coverage tool to test you. Everything about you.

    Code-coverage tool is crutch for weak capitalist engineer. Tetris is Soviet technology, forged by people's will.

  3. Nice advertisement by Anonymous Coward · · Score: 5, Insightful

    From a company promoting automated WCET analysis. Hah!

    1. Re:Nice advertisement by phantomfive · · Score: 4, Insightful

      Normal users don't test all cases of a game.

      Maybe not, but as soon as you tell yourself, "I don't need to test this code, a normal user will never get to it;" you can be certain that after saying that, a user will find a way to break it. The Gods of Eternity will laugh at you.

      --
      "First they came for the slanderers and i said nothing."
  4. replacing line feeds with terminators is not a 1-l by raymorris · · Score: 4, Informative

    In BBC BASIC, a colon is a statement terminator, much like a semicolon in languages with C-style syntax. The linked code is therefore not a one-liner by any meaningful definition of the term. One could replace all of the linefeeds in Linux kernel source with semicolons and other appropriate terminators. That wouldn't make the kernel a one-liner.

  5. Re:Perl-standard line length by JackDW · · Score: 3, Informative

    A BBC BASIC line is limited to 256 bytes, including the line number. It is impressive to squeeze the whole game into that space. There's also a 140-byte Javascript version, also very impressive, though it has fewer features and doesn't have all the block types.

    --
    You're an immobile computer, remember?
  6. Re:One Line by dgatwood · · Score: 3, Informative

    It's simple enough to implement in a shell script. At least three or four of us have done it over the years.

    --

    Check out my sci-fi/humor trilogy at PatriotsBooks.

  7. perhaps. I wonder if it NP-hard by raymorris · · Score: 4, Interesting

    The fact that probability is involved doesn't mean there's not an optimal strategy, of course, where optimal is defined as "highest expected score" (score X probability). So figuring out an optimal strategy is a hard problem - how hard is it?

    If the probability of a certain series of shapes coming next were 100%, we'd have an NP-hard problem, agreed? Does another probability make it easier or harder? Harder, if anything. That's provable because the probability version can be solved by solving each of the potential series as if each were known. What's harder than NP-hard? It may well still be NP-hard. It can't be of any more solvable complexity class.

    1. Re:perhaps. I wonder if it NP-hard by phantomfive · · Score: 4, Funny

      What's harder than NP-hard?

      Intractable.

      --
      "First they came for the slanderers and i said nothing."
    2. Re:perhaps. I wonder if it NP-hard by allo · · Score: 3, Informative

      I disagree.
      For a stochastic process your greedy "take the option with the best chance" algorithm may work, it may fail completely, just depending on the random numbers. If you have an stochastic polynomial algorithm, you have a chance to get the same or better expectation value than your "optimize global then choose greedy" algorithm. Both approaches may win or fail, but in the deterministic game the np-complete version always wins, while the "shortcut" version cannot compete. In the stochastic version, the shortcut may be as good as the optimal solution, because you cannot get the global optimum anyway so choosing a local one may be a good choice.

  8. Infomercial for a code coverage tool? by 140Mandak262Jamuna · · Score: 4, Interesting
    Code coverage tool seems like a good idea from some theoretical stand point. But in practice number of code paths multiply rapidly and getting all the paths executed would involve unreasonably long time. Further rarely called procedures or rarely executed is just one class of problems. There are functions that will execute million times correctly and misbehave once in a million or once in a billion calls. For example I came across a bug in something so simple like calculating the centroid of a triangle. Absurdly simply code that adds the x, y, z coordinates of the vertices and divides by 3. That is all. In dealing with output of some CAD software, when the smallest angle of the triangle fell below 1.0e-08 radians, it returned a wrong value of the centroid. Typical sanity checks based on mathematical facts, like centroid of a triangle can never be outside the triangle will not work. The code that checks inside-or-outside of triangle is far more complex than the centroid code. The floating point truncation errors make this kind of sanity check useless. You can't even plot it on the screen and look at the centroid. OpenGL is implemented in single precision.

    So at some point you reach a point of diminishing returns. It might not be worth making sure every line got tested when there are procedures that have a bug that happens in one in a billion calls. My philosophy is, "Perfection is the goal. Doing better than the last release is the shipping criterion".

    --
    sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
  9. Nonsense -- make your own test suite by rs1n · · Score: 4, Insightful

    Why would you only test your code via normal use? Why wouldn't you just create a test suite that actually tests all the scenarios? In the case of tetris, you can simply force a sequence of pieces that will enable you to reach the scenarios described in the article. Or you can even start the game with a pre-made board.

    Has slashdot really become a means for tech companies to inject free advertisement by a simple blog post made to look like real journalism?

  10. Beware coverage tools by Anonymous Coward · · Score: 3, Insightful

    The article makes it sound like coverage tools help! If you're not familiar with them, they tell you which bits of code have been run, not how many of the N cases of that code have been executed.

    So the code might fail with a particular combination of inputs, but the coverage tool is more interested in which bits of the code have been execute.

    It's one of these tools and metrics that non-technical managers use to substitute for an ability to read code.

  11. C64 BASIC version within a screen of code by Neo-Rio-101 · · Score: 3, Interesting

    It's a 15 liner.
    Note that the {CBM-x} represents the graphic on that particular key (press the c= key and the letter to produce it)

    1 a$="efijefijefijefijbfjnhijkbfjnhijkijfgaefjijfgaefjefjkiefbefjkiefbbfjidefj"
    2 a$=a$+"abeieijkaeijijkgabfjiefgehijebfj@abe@dhe":o=207:dime(o):forx=0to111
    3 print," {CBM-M}"," {CBM-G}":p=asc(mid$(a$,x+1)):e(x)=(pand3)+(pand12)*10:next:m=2024
    4 print," {CBM-T}{CBM-T}{CBM-T}{CBM-T}{CBM-T}{CBM-T}{CBM-T}{CBM-T}{CBM-T}{CBM-T}":gosub6:goto7
    5 pokei+e(r),c:pokei+e(r+1),c:pokei+e(r+2),c:pokei+e(r+3),c:c=160:r$="d":return
    6 i=1152:r=h:c=32:gosub5:j=int(rnd(0)*7)*16:r=j:gosub5:r=h:h=j:i=i+9:return
    7 gosub6:w=i:t=i:g=r:k=240:l=1278
    8 gosub15:c=32:gosub5:r=-r*b-g*notb:g=r
    9 i=w:w=i+40:gosub5:gosub15:ifbthen12
    10 getk$:g=randkor((r-4*(k$="s"))and15)
    11 t=w:w=w+(k$=l$)-(k$=r$):l=l+40:goto8
    12 c=o:gosub5:m=m-(l0:w=-t*b-w*notb:l$="a":return

    --
    READY.
    PRINT ""+-0
  12. Re:Sort of spammy, also not convincing by wrook · · Score: 3, Insightful

    Code coverage tools will not tell you if your tests are sufficient. They simply tell you what lines of code were hit. They don't tell you whether or not the line of code was hit while doing a meaningful test. In fact, it is trivially easy to write "tests" that exercise 100% of the code but have no expectations at all.

    What code coverage tools tell you is what code you definitely haven't tested. If you haven't run that line of code in your tests, you definitely haven't tested it. This is useful information, but not essential if you have a good team. My current team is quite comfortable writing tests. We do most things TDD and without trying hard our average code coverage is 96%. I occasionally wander through the other 4% to see if it is worth testing and most of the time it isn't. Occasionally I will find the odd piece of logic that was jammed in hurriedly without tests, but on our team it is quite rare. On the other hand, I have worked on teams that were not comfortable writing tests and mostly wrote them after writing production code. On those teams we would get about 75% test coverage with holes you could drive a bus through. A code coverage tool was very useful for educating people on the need to improve the way they wrote tests.

    I feel very confident I could TDD my way through a tetris implementation and get 100% code coverage without undue effort. I don't think I would find all of the corner cases without help, though. A code coverage tool wouldn't help me in that instance.