Slashdot Mirror


Why You Shouldn't Use Spreadsheets For Important Work

An anonymous reader writes "Computer science professor Daniel Lemire explains why spreadsheets shouldn't be used for important work, especially where dedicated software could do a better job. His post comes in response to evaluations of a new economics tome by Thomas Piketty, a book that is likely to be influential for years to come. Lemire writes, 'Unfortunately, like too many people, Piketty used spreadsheets instead of writing sane software. On the plus side, he published his code ... on the negative side, it appears that Piketty's code contains mistakes, fudging and other problems. ... Simply put, spreadsheets are good for quick and dirty work, but they are not designed for serious and reliable work. ... Spreadsheets make code review difficult. The code is hidden away in dozens if not hundreds of little cells If you are not reviewing your code carefully and if you make it difficult for others to review it, how do expect it to be reliable?'"

27 of 422 comments (clear)

  1. What he's really saying is by Anonymous Coward · · Score: 4, Insightful

    "I don't know how to use spread sheets properly."

    1. Re:What he's really saying is by Anonymous Coward · · Score: 5, Insightful

      To be fair, neither to the vast majority of people who use spreadsheets for important work.

    2. Re:What he's really saying is by Anonymous Coward · · Score: 5, Insightful

      Disagree. I think what he's really saying is "I've had to maintain and develop tools made by people that don't know how to use spreadsheets properly, and I'm fucking sick of it."

    3. Re:What he's really saying is by Tyler+Durden · · Score: 5, Funny

      I know exactly how to use spreadsheets properly. Just don't.

      --
      Happy people make bad consumers.
    4. Re: What he's really saying is by Anonymous Coward · · Score: 4, Insightful

      Most people have no idea how to use a relational database.

    5. Re:What he's really saying is by jd2112 · · Score: 4, Insightful

      "I don't know how to use spread sheets properly."

      Or, I realize that just because I have a hammer not all problems are nails.

      --
      Any insufficiently advanced magic is indistinguishable from technology.
    6. Re:What he's really saying is by Giant+Electronic+Bra · · Score: 4, Funny

      No kidding. Also, it MAY not be that easy to review the code in a spreadsheet, but it is VERY VERY EASY to test it. If you want reliable spreadsheets its PERFECTLY possible to test them to the Nth degree, far more so than with most other code. You have a place to put the tests, and a place to put the expected results, its all rather devilishly simple actually. For that matter you can document the bejeezus out of them too.

      I think spreadsheets are like any sort of simple interpreted language. Idiots can easily blow their left foot off. Real software engineers can also do some very cool stuff. Most of the perl code I've seen is ugly as all hell and pretty worthless, but MY perl code is a thing of beauty that people maintain for years. Its all in how you use the tool.

      --
      "Malo periculosam, libertatem quam quietam servitutem." -- Jefferson
    7. Re: What he's really saying is by Ol+Olsoc · · Score: 4, Interesting
      Just got back from a meeting where I was explaining relational databases. No one really got it, but the Excel expert thought I was full of shit.

      Well, I sort of am, but not for that reason.

      There is nothing worse than trying to get a spreadsheet person up and running on relational databases. They argue with you about every point, then they freak.

      --
      The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
    8. Re:What he's really saying is by lonecrow · · Score: 5, Insightful

      Spreadsheets are just a part of the Darwinism of applications. Some sharp fellow within an organization things its important to start tracking some data point or another. Maybe it gets ignored and forgotten. Other times it grows as other people see its utility and start making requests to track related data points. Eventually you get a multi-worksheet or even multi-workbook spreadsheet masquerading as an application. At some point it becomes far to hard to maintain or understand so they contract out someone like me who moves it to a relational database with a web front end. Everyone is happy!

      This work forms a major part of my work load don't fuck with it!

      Also, it is appropriate. It would be inefficient to develop a proper relational database application on the whim that some set of data points might be useful. Spreadsheets are a proving ground, and important stage in the life cycle of an application.

  2. So what's the alternative? by Cyberax · · Score: 5, Interesting

    So what's the alternative? There are no good and easy to use software packages to create simple data-intensive apps. The closest alternative was VB6 and if I had to chose between it and Excel, I'd choose Excel any day of the week.

  3. Code reviewing a spreadsheet by muhula · · Score: 4, Insightful

    If the inability to code review spreadsheets was a real issue, it wouldn't be too hard to convert spreadsheet functions into a functional language. For non-programmers, a spreadsheet lowers the barrier to entry. This allows people to do something useful and productive who couldn't do so otherwise. That's a good thing.

    1. Re:Code reviewing a spreadsheet by geekoid · · Score: 4, Insightful

      For non programmers modern spread sheet give the user rope, with a noose already premade and a map on where to put your head.

      --
      The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  4. Spreadsheets as a software development platform? by Virtucon · · Score: 4, Insightful

    You're doing it wrong.

    --
    Harrison's Postulate - "For every action there is an equal and opposite criticism"
  5. Re:"spreadsheets" = computation program by geekoid · · Score: 4, Insightful

    Maybe you should read it again?
    His real criticizm is that spreadsheet software is horrible for any high end work, or with anything you want to share, and he is correct.

    "so he probably doesn't know how to use the interface of a spreadsheet very well, which makes the act of checking a formula tedious..."
    it is tedious, even if you are an expert and even if the user uses goof practices.

    "P-hacking is the problem in social science/economics research, not using 'spreadsheets'"
    I don't think you know what P-Hacking is.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  6. Piketty's work will be done for him by matbury · · Score: 5, Insightful

    The fact that Piketty's work describes a damning indictement of the USA's most cherished concept - free market capitalism - means that thousands of neo-liberal economists will pour over every single digit and operator in his spreadsheets looking for anything to negate the findings. If they can't find anything, they'll attack him. When you hear of character attacks against Piketty or some other diversionary tactic, you'll know his data is correct.

  7. Re:Spreadsheets - best and worst thing there is by plover · · Score: 5, Informative

    What people fail to realize is that spreadsheets are like any other form of programming, and therefore should be treated as such. Write tests. Break complex formulas down into named cells. Use references to carry concepts. Beware of globals. Keep small concepts small, simple, and modular. Write more tests.

    Does anybody do that with every spreadsheet they write? Doubtful. I know I only go to all that trouble myself when I have a boatload of inputs that have to get put together. I usually discover about part way in that the sheet is going to be complex enough to need tests. When I do, it's time to start refactoring it, and these are my general steps:

    1. Give cells and ranges meaningful names
    2. Break complex formulas down to several small formulas
    3. Add tests for the formulas
    4. Factor out duplicates

    Of all of these, giving cells and ranges names is the most important, because it makes the sheets readable. I can then usually understand the results well enough to know if my formulas are working, but a complex formula often needs an independent set of tests to prove the discontinuities in the functions are actually where I think they should be.

    --
    John
  8. Re:Spreadsheets as a software development platform by preaction · · Score: 4, Insightful

    Tell that to the entire finance and insurance industry.

  9. Some things stick by TapeCutter · · Score: 5, Informative

    I recall a survey of (non-trivial) corporate spreadsheets in the mid-90's, it went something like 95% had a maths bug, in 80% of cases the bug made the sheet useless, 50% of the spreadsheets were used to make (incorrect) financial decisions. The reason why corporations coffers don't evaporate is that they use thousands of them so the +/-ve affect on the money buffer has a central limit of zero. It's a much more precarious situation if you using a single homespun spreadsheet to run a corner store

    --
    And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
    1. Re:Some things stick by viperidaenz · · Score: 5, Funny

      Were the survey results collated on a spreadsheet?

    2. Re:Some things stick by peragrin · · Score: 4, Funny

      nope Access Database.

      now which is scarier?

      --
      i thought once I was found, but it was only a dream.
    3. Re:Some things stick by timeOday · · Score: 5, Insightful
      The question is whether having the logic squirreled away in code or a DB would have made it more correct, which is a big assumption!

      I really think Piketty deserves a lot of credit for releasing his "source" spreadsheets on such a substantive and controversial work. Most authors do not. If the critiques turn out to be substantial and extensive, I plan on waiting for a second edition with corrections before investing time in reading it.

    4. Re:Some things stick by Sarten-X · · Score: 5, Informative

      I've done audits on spreadsheets. They're not terribly difficult, and I dare say they're easier than many of the code reviews I've been through.

      The most important thing is to understand how to use the spreadsheets. Either use separate worksheets for each major step in the calculation, or at least separate the computations using extra blank space. That serves the same function as code blocks, breaking up the computation into smaller, more manageable, pieces. Each small piece can be audited separately, and it provides a clear trail of how one number becomes another.

      Next, use your formatting, even if it's not in a worksheet ever intended for public viewing. I'm particularly a fan of using conditional formatting to highlight the cells in a sheet (especially minimums and maximums) that will be passed on to the next worksheet. Then it's easy to check that the correct values are being passed, and the intermediate values all make sense.

      Finally, use your fill tools correctly to ensure that the same computation is being applied to all cells. You should be able to audit the top of your worksheet and fill down to the bottom, without any formatting or visual elements getting in the way, and know that the whole worksheet is correct. When reviewing an old worksheet, note that Excel will highlight (with a green corner mark, as I recall) cells that don't fit the pattern.

      Finally, remember that writing an algorithm for a spreadsheet has some of the same pitfalls as any other implementation. Double-check any function of which you're not certain the parameters. Put comments in non-obvious areas. Don't be too clever, and of course, if someone else can't understand your brilliance, you're not being brilliant.

      --
      You do not have a moral or legal right to do absolutely anything you want.
  10. A Formula only an Actuary could Love by turp182 · · Score: 5, Interesting

    There are no corporate secrets below, but I stumbled upon this formula in an actuarial spreadsheet (I'm a developer with an actuarial education).

    The only way this logic could be verified is by breaking the single formula into 20+ different cells with more simple calculations.

    And of course it is in several thousand cells, bringing any computer at all to its knees during calculation.

    A good example of how not to use Excel (but the actuaries don't have access to IT prototyping or core development).

    =IF(F6="050",tiers!$D$21+IF(AND(F6="050",OR(E6="W",E6="X")),tiers!$D$29-tiers!$D$26+VLOOKUP("N"&100&VLOOKUP(L6,tiers!$B$38:$C$49,2,0)&"N/A"&M6/12&"/"&N6&"0-"&R6/1000&C6*1,tiers!$L$2:$W$20969,12,0),IF(OR(E6="W",E6="X"),VLOOKUP("N"&F6*1&VLOOKUP(L6,tiers!$B$38:$C$49,2,0)&"N/A"&M6/12&"/"&N6&"0-"&R6/1000&C6*1,tiers!$L$2:$W$20969,12,0),IF(AND(F6="050",E6="P",M6=36,N6=60),tiers!$D$29-tiers!$D$26+VLOOKUP("E"&100&VLOOKUP(L6,tiers!$B$52:$C$55,2,0)&"N/A"&M6&"/"&N6&VLOOKUP(R6,tiers!$B$57:$C$59,2,0)&C6*1,tiers!$L$2:$W$20969,12,0),IF(AND(E6="P",M6=36,N6=60),VLOOKUP("E"&F6*1&VLOOKUP(L6,tiers!$B$52:$C$55,2,0)&"N/A"&M6&"/"&N6&VLOOKUP(R6,tiers!$B$57:$C$59,2,0)&C6*1,tiers!$L$2:$W$20969,12,0),IF(AND(F6="050",E6="P"),tiers!$D$29-tiers!$D$26+VLOOKUP("E"&100&VLOOKUP(L6,tiers!$B$52:$C$55,2,0)&"N/A"&M6/12&"/"&N6&VLOOKUP(R6,tiers!$B$57:$C$59,2,0)&C6*1,tiers!$L$2:$W$20969,12,0),IF(E6="P",VLOOKUP("E"&F6*1&VLOOKUP(L6,tiers!$B$52:$C$55,2,0)&"N/A"&M6/12&"/"&N6&VLOOKUP(R6,tiers!$B$57:$C$59,2,0)&C6*1,tiers!$L$2:$W$20969,12,0),IF(AND(F6="050",E6="*",M6=6),tiers!$D$29-tiers!$D$26+VLOOKUP(B6&100&VLOOKUP(L6,tiers!$B$38:$C$49,2,0)&IF(Q6=48,"0-4","6-10")&M6&"/"&N6&"0-100"&C6*1,tiers!$L$2:$W$20969,12,0),IF(AND(E6="*",M6=6),VLOOKUP(B6&F6*1&VLOOKUP(L6,tiers!$B$38:$C$49,2,0)&IF(OR(L6="PPH08",L6="PTH08"),"0-9",IF(Q6=48,"0-4","6-10"))&M6&"/"&N6&IF(R6=125000,"100-125","0-100")&C6*1,tiers!$L$2:$W$20969,12,0),IF(AND(F6="050",E6="*"),tiers!$D$29-tiers!$D$26+VLOOKUP(B6&100&VLOOKUP(L6,tiers!$B$38:$C$49,2,0)&IF(OR(L6="PPH08",L6="PTH08"),"0-9",IF(Q6=48,"0-4","6-10"))&M6&"/"&N6&IF(AND(OR(L6="PPH08",L6="PTH08"),R6=100000),"0-100",VLOOKUP(R6,tiers!$B$64:$C$70,2,0))&C6*1,tiers!$L$2:$W$20969,12,0),IF(E6="*",VLOOKUP(B6&F6*1&VLOOKUP(L6,tiers!$B$38:$C$49,2,0)&IF(OR(L6="PPH08",L6="PTH08"),"0-9",IF(Q6=48,"0-4","6-10"))&M6&"/"&N6&IF(AND(OR(L6="PPH08",L6="PTH08"),R6=100000),"0-100",VLOOKUP(R6,tiers!$B$64:$C$70,2,0))&C6*1,tiers!$L$2:$W$20969,12,0),"ERROR")))))))))),IF(AND(F6="050",OR(E6="W",E6="X")),tiers!$D$29-tiers!$D$26+VLOOKUP("N"&100&VLOOKUP(L6,tiers!$B$38:$C$49,2,0)&"N/A"&M6/12&"/"&N6&"0-"&R6/1000&C6*1,tiers!$L$2:$W$20969,12,0),IF(OR(E6="W",E6="X"),VLOOKUP("N"&F6*1&VLOOKUP(L6,tiers!$B$38:$C$49,2,0)&"N/A"&M6/12&"/"&N6&"0-"&R6/1000&C6*1,tiers!$L$2:$W$20969,12,0),IF(AND(F6="050",E6="P",M6=36,N6=60),tiers!$D$29-tiers!$D$26+VLOOKUP("E"&100&VLOOKUP(L6,tiers!$B$52:$C$55,2,0)&"N/A"&M6&"/"&N6&VLOOKUP(R6,tiers!$B$57:$C$59,2,0)&C6*1,tiers!$L$2:$W$20969,12,0),IF(AND(E6="P",M6=36,N6=60),VLOOKUP("E"&F6*1&VLOOKUP(L6,tiers!$B$52:$C$55,2,0)&"N/A"&M6&"/"&N6&VLOOKUP(R6,tiers!$B$57:$C$59,2,0)&C6*1,tiers!$L$2:$W$20969,12,0),IF(AND(F6="050",E6="P"),tiers!$D$29-tiers!$D$26+VLOOKUP("E"&100&VLOOKUP(L6,tiers!$B$52:$C$55,2,0)&"N/A"&M6/12&"/"&N6&VLOOKUP(R6,tiers!$B$57:$C$59,2,0)&C6*1,tiers!$L$2:$W$20969,12,0),IF(E6="P",VLOOKUP("E"&F6*1&VLOOKUP(L6,tiers!$B$52:$C$55,2,0)&"N/A"&M6/12&"/"&N6&VLOOKUP(R6,tiers!$B$57:$C$59,2,0)&C6*1,tiers!$L$2:$W$20969,12,0),IF(AND(F6="050",E6="*",M6=6),tiers!$D$29-tiers!$D$26+VLOOKUP(B6&100&VLOOKUP(L6,tiers!$B$38:$C$49,2,0)&IF(Q6=

    --
    BlameBillCosby.com
    1. Re:A Formula only an Actuary could Love by Waffle+Iron · · Score: 5, Funny

      What is the end result for this one cell?

      42

  11. What you're really saying is by CriminalNerd · · Score: 4, Insightful

    "I never worked in a company with normal people."

    I'm guessing you haven't had the pleasure of working in the typical firm where the company's years-old ENTIRE lifetime of work and data is passed around e-mail as a 80MB Excel attachment.

    1. Re:What you're really saying is by zugmeister · · Score: 4, Funny

      ... where the company's years-old ENTIRE lifetime of work and data is passed around e-mail as a 80MB Excel attachment.

      This... is retarded enough to loop all the way around the spectrum and land squarely on awesome.

  12. Re:Spreadsheets - best and worst thing there is by Boronx · · Score: 4, Insightful

    Spread sheets are such awesome tools that they allow non-programmers to create the same problems that noob programmers do while writing code.