Errors in Spreadsheets are Pandemic
G Roper writes "Studies show that most spreadsheets have critical errors in one percent of their cells, well beyond a permissible level. Here are some news stories about spreadsheet errors. Spreadsheets won't protect a firm from liability when they are audited and spreadsheet errors found: spreadsheets are not secure, provide no audit trail and won't pass HIPAA or Sarbanes-Oxley auditing. How are Slashdotters coping with the proliferation of spreadsheets in the face of greater legal accountability and auditing?"
"How are Slashdotters coping with the proliferation of spreadsheets in the face of greater legal accountability and auditing?"
With a pencil. haha.
Back to ledgers and slide rules I say!
Well as one of my bosses says, "We need more Double E masters." :P.
Alas he doesnt mean Electrial Engineers, but "Excel Experts."
He's very bitter about his education
j^2
From the abstract: "Although spreadsheet programs are used for small "scratchpad" applications, they are also used to develop many large applications. In recent years, we have learned a good deal about the errors that people make when they develop spreadsheets. In general, errors seem to occur in a few percent of all cells, meaning that for large spreadsheets, the issue is how many errors there are, not whether an error exists. "
I think "how many errors, not whether an error exists" is just as true for applications and programs written in any language or using any technology. What's so insidious about spreadsheets is their integrity and the difficulty to maintain that.
Once you start changing any complex spreadsheet you risk and almost guarantee corrupting other parts of the spreadsheet ostensibly okay. The spreadsheet is so inextricably integrated to itself, you pull one string, and some widget a million miles away suddenly misbehaves, though, you're unlikely to notice until later, if at all.
IT should be strict about policy around spreadsheets... spreadsheets are great powerful tools, but they shouldn't be anointed as applications.
I worked on a team that created a large software development workbench. A critical piece of this workbench included a suite of spreadsheets with amazingly complex macros and formulae hidden way out of the casual users' sight. Immediately upon release (and much aligned with my warning and prediction) the workbench fell apart on a daily, even hourly basis, among many teams out in the field. Turns out users were deleting rows in the template spreadsheets deemed irrelevant and unnecessary to their work. Guess what got deleted along with the "unnecessary rows"? Yep, chunks of macros critical to the proper function of the workbench.
Ok, slightly off topic, but why is this posted in Hardware?
"Nobody owns the fucking words man." - James Dean
Blame it on the users of course, especially the accountants.
If every change even a correction needs to be audited save-to-save of a file, then why don't we implement a Wiki style log of changes to the file? I wonder if Open Document Format would easily support this.
The mountains of next-to-worthless data the piles of auto-saves would generate is mind boggling.
Oh You POS
How are Slashdotters coping with the proliferation of spreadsheets in the face of greater legal accountability and auditing?"
My guess it they're not. I've met FIERCE resistance in the past from accounts trying to reform their spreadsheet ways. Every accountant understands the spreadsheet. The Financial Director understands the spreadsheet. If you can't get the Financial Directory to back your plans then any reform is dead in the water.
The problem is born out of bad communication skills. IT generally assumes that just because the FD doesn't understand C++ he is stupid. We see this kind of behavior all the time on Slashdot:
No fucking shit he understood the GPL. Let's see he probably got a 1st class degree in Law, Passed his BVC with flying colours. He then probably got his pupillage with ease (there are twice as many students each year as there are pupillages) and then rose to the Bar. After that, he'd have spent 15 years working cases in the Crown Court. If he didn't understand the GPL he would have fallen at the first hurdle. My brother is a lawyer and understood the GPL before he even took his LPC. By comparison, you're average IT guy is a mere peon. I'd wager that given your average programmer with no C++ experience, the Judge could beat the programmer hands down in a programming contest. These people are very, very smart.
The same is true of Financial Directors and their ilk. They have to take years of qualifications and have decades of experience before they're allowed to do their job. Talking down to them is a recipe for marginalization. So the solution is to talk to them in clear language. None of this bullshit bingo that seems to be infesting every cranny of IT - clear, plain language.
Explain the problem, then explain the solution. They don't want or care to hear about LAMP, AJAX or Web 2.0. This like a builder telling you the type of screws he's going to use to build your house. All that you care is that your house is well built and will last a long time without significant maintenance. All they want to know why they need your solution, how much it will cost and the consequences if they don't do it. Anything else is a waste of their time and will lower the amount of time they have for you.
Simon
At the end of the day, a human is responsible for ensuring that the data in a spreadsheet is kosher. It's been that way since Bob Cratchet scribbled in his notebook under the light of a candle. If anything, this will make a company think twice before replacing an accountant with a secretary.
"How are Slashdotters coping with the proliferation of spreadsheets in the face of greater legal accountability and auditing?"
I don't know about you, but I actually check my work and co-workers cross-check each other's work. Any spreadsheet whose numbers can't easily be checked out on a calculator should be designed such that the information generally flows in one direction and each step of a calcuation is broken out into separate rows whenever possible to make "debugging" easier.
Bill Clinton: Pimp we can believe in. - The Shirt!!!
and what was teh error rate in spereadsheets back when they were done by hand?
and did this study take into account the error in their very own spreadsheets?
and how about error on the error?
and lastly, who cares? get over it. If whatever you're designing, whatever economic model, or budget can't sustain a 1% hit due to error, then you need to really rethink your design and ideas.
Rather,its about how people make mistakes when working on spread shit.
Sample:
Why does yahoo do this
"...they are also used to develop many large applications."
You think they're large now? Wait 'til the million row version of Excel is phased in and you'll see some LARGE spreadsheets.
Spreadsheets are basically a form of visual programming language, so it is unsurprising that bugs occur. They are basically designed so that ordinary people can use them, which means that they lose some of the strictness that is enforcable in a normal programming language. More worringly, I'd say that some of the properties of spreadsheets naturally encourage bugs. For example, when programming, code duplication is considered bad, and shared common code good, because it encourages simplicity and when bugs are found, they can be fixed in a single location. Conversely, in spreadsheets, the user is actually encouraged to duplicate code, with tools that let you "drag down" equations into neighbouring cells. Perhaps we should be wondering if it would be a good idea to create some kind of "next-generation" spreadsheet system that addresses these problems. Whereas programming languages have evolved constantly over the years, spreadsheets remain unchanged.
were you expecting to see a sig here? perhaps you'd rather see the inside of an ambulance!
I wonder what the study used to keep track of data. I bet for sure that they used Excel or SPSS.
How can we be sure that the 1% is an accurate figure? Word is, spreadsheets aren't totally reliable. :)
provide no audit trail
... )
You can provide an audit trail in Excel:
Tools->Share Workbook->click "Multiple Users"->click "Advanced"->select how many days you want to keep a history for.
(It might not be good enough for HIPAA or SA but there is an audit trail
Sounds like you're advocating the wrong policy. How about locking the cells so users don't screw things up? You wouldn't let non-programers alter code, why would you let them alter the spreadsheet?
The world is made by those who show up for the job.
What color is the sky on your planet?
"I'd rather be a lightning rod than a seismometer." -Ken Kesey
Yet another example of the truth of "those who do not understand UNIX are bound to reinvent it, poorly."
"[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz
Don't get me wrong, Excel is pretty damn great.
BUT...no business should be "running the show" on something like Excel. For serious stuff, you need a dedicated (possibly custom-made) application that does all kinds of sanity checks. A properly coded Excel spreadsheet can do a lot of that for you, but it's not really meant for that.
>they shouldn't be anointed as applications.
Everything about them is wrong for being an enterprise application, but sometimes they're better than nothing.
I know a company where the finance rollups and planning are done with a mountain of Excel spreadsheets and scripts. Only one person understands it. Sick, but every attempt to do it right with a real database has failed after burning millions of dollars. It's been like this for at least ten years.
Advice? Fall back on the ancient wisdom of the finance guys and put cross-checks in place so you can at least detect the errors.
You should see the error rate among people who still use pen and paper to make vital calculations.
Doesn't it make you feel good to know that our freedoms are protected by politicans, lawyers and journalists.
Film at 11.
The most rabid believers in American Exceptionalism are the exact same people whose policies are destroying it.
I keep trying to warn my business customers, one of which uses linked spreadsheets for their quarterly accounting (backed up by an auditing firm), that linked spreadsheets are not intended as an enterprise application. But do they listen? Tried to get them to look at alternatives but they keep saying, "It does what we need it to do." But it's always breaking, usually at the worst possible time, and the auditors are constantly pointing out errors.
You can only go so far in protecting customers from their own determined stupidity.
That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
In programming, we have learned to follow strict development disciplines to eliminate most errors.
Uhh no we haven't, most software like most spreadsheets have lots of bugs.
Personally I construct my spreadsheets in small logical steps, with comments using clearly laid out input and outputs to each formula and portion of the sheet.
I then hide these sheets or portions of the sheet.
I then run a few test cases testing nominal and boundary value performance, including invalid input (which should be rejected)
Sort of like many other programs actually.
It is difficult to eliminate the errors, so a better solution is to minimize them. The easiest way to do this is to add extra workbooks named "sheet2" and "sheet3" with thousands of extra cells in them. Then, the percentage of error is 3 times lower. Example:
Before: "sheet1" has 50x50 cells, with 25 errors. That's 25 / 50^2 = 1% errors.
After: Add "sheet2" and "sheet3" with another 50x50 cells. Now, the error rate is 25 / 50^2 / 3 = 1/3 % error.
According to my spreadsheet, that is a much better error rate!
I just don't know. I've never had a problem with keeping track or accuracy of any of my spreadsheets. But then again I use Subversion to keep backups and logs on who uses what and when. Anyways back to waiting for my $2.1 billion tax return.
I checked out the article, and the examples, and I'm impressed. Unfortunately this is the same method used in climate modeling, economic forecasting, genetic engineering, and human drug trials.
Did you check out the original article? Were those studies cited put just in a straight table for illustration, or were they tabulated first in...a spreadsheet?
I have to say, though, that some of the studies are rather dated, and the data isn't all similar. However, the example of "whoops"'s that people have run into were frightening, and those were just financial spreadsheets.
I guess that just goes to show you that spreadsheets are good modeling tools, but they shouldn't be in the hands of everyone in the office preparing the reports. Instead the IT department should be writing permanent applications to make the computations, so then at least it's harder to make changes, so it's harder to accidentally replace a value in a cell or a formula that ultimately costs you $1 billion or so...
Friends help you move. Real friends help you move bodies.
Never forget: 2 + 2 = 5 for extremely large values of 2.
A speadsheet is just like a blank word document or piece of paper. You put stuff on it. Any stuff in fact. Right or wrong, it's just data.
Doing accounting on paper leads to hard-to-read or misread digits, space considerations, inverting numbers, aligning numbers improperly and other key problems. A spreadsheet fixes many of these problems, but when it comes down to it, what's on the spreadsheet is what you put there [or what auto-correct put there]. Same thing- a calculator adds what you enter (or mis-enter). If you entered the wrong thing in a spreadsheet, at least it's easy to spot.
The answer? check your work. Go back and verify the numbers there. Go back and make sure things balance. Have the hard receipts of what you're totalling as a good copy of anything you do.
Why is this even a question on Slashdot? Make a formula to total and check sanity of numbers, which may help. When it comes down to it though, just take care in what you enter and make sure it's right afterwards.
-M
when you see the word 'Linux', drink!
I'm amazed at the rut Excel traps IT-based businesses in. Excel was once the best thing computers did, apart from screensavers and ahead of email. But by the late 1990s Excel should have become merely the GUI for relational databases. Even cheap/free ones like MS-Access and MySQL, if not Oracle, Postgres, SQL-Server. Excel should have had macros programmable in the exact same language as actual databases, like VB (not VBA), Perl or something unique to its vertical integration. Upgrading from the starter DB to the enterprise DB should have been a matter of installing the new backend on the network, and configuring the Excel client.
If that path were taken, Excel would be a manageable platform. Instead, it's trapped in the early 1990s desktop, with all its limitations to collaboration, performance, maintenance and dataflow. Every improvement in those areas is a one shot deal, a hack on a once-elegant app now hacked to death.
Maybe the new generation of open formats and distributed computing services offer a chance to try again. Excel will probably include those, just diluted by all the wrong ways retained as its "legacy".
--
make install -not war
Nobody has ever solved the problem of people becoming confused by the rules as to when inserting a row or column expands the range references in formulas that refer to it. Like memory leaks or buffer overflows, everybody gets all macho and implies that competent people never experience these problems. The syllogism seems to be "Truly competent people do not experience these problems. The computer industry is populated by practitioners of average competence. Therefore, it is not a problem."
In the computer industry, any problem that has existed for more than about five years is no longer seen as a problem and nobody is interested in solving it.
Oh, here's the 1997 reference.
Date: Tue, 8 Jul 1997 17:29:14 -1000
From: "Ray Panko"
Subject: Website on Spreadsheet Research
In recent years, there has been a considerable amount of research on
spreadsheets, including error rates. The Spreadsheet Research (SSR) website
summarizes data from field audits of more than 300 operational spreadsheets
and from experiments involving almost a thousand subjects ranging from
spreadsheet novices to long-time spreadsheet professionals. The results are
pretty chilling. Every study that has tried to measure spreadsheet error
rates has found them and has found them at levels that are deeply
disturbing. The URL is:
http://www.cba.hawaii.edu/panko/ssr/
"How to Do Nothing," kids activities, back in print!
Or to borrow a rule from finite element analysis: don't run the analysis unless you know what answer to expect. i.e. Know your stuff and don't blindly trust a computer.
Then again, CS has had a more succinct version for years: garbage in, garbage out.
The world is made by those who show up for the job.
I will give you what I consider the most blatant and insane example.
U.S. Constitution article III
The trial of all crimes, except in cases of impeachment, shall be by jury; and such trial shall be held in the state where the said crimes shall have been committed; but when not committed within any state, the trial shall be at such place or places as the Congress may by law have directed.
U.S. Constitution: Sixth Amendment
Sixth Amendment - Rights of Accused in Criminal Prosecutions
In all criminal prosecutions, the accused shall enjoy the right to a speedy and public trial, by an impartial jury of the State and district wherein the crime shall have been committed, which district shall have been previously ascertained by law, and to be informed of the nature and cause of the accusation; to be confronted with the witnesses against him; to have compulsory process for obtaining witnesses in his favor, and to have the Assistance of Counsel for his defence.
Now to me and I think to 99.999% of americans the phrase ALL clearly means every single instance.
To judges and lawyers however this is apparently different. As it currently stands you do not have the right to a jury trial.
Service guarantees Citizenship! Questions Guarantee GITMO.... Amerika Uber Alles!
is that spreadsheets blur the lines between program (it does something), data store (it stores data), and report (it formats the data for printing) in such a way that someone eliminating unwated stuff from the report (say to make a printout neater) can in the process screw up the program.
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
I think you're looking for Subversion + Apache with mod_dav_svn. You can mount your subversion repository as a filesystem, and all of your changes will be transparently tracked via subversion.
I've never done this myself, but I was looking for exactly the same thing a while ago, and stumbled on this (rather amazing) bit of setup. Perhaps someone with a little more knowledge of it can fill you in, but I know it's possible, and Google is probably your friend.
This wheel has already been invented, and extremely well by the looks of it. It's even HTTP-based, so it should work through proxies.
About 10 years ago, a Silicon Valley manufacturer of medical imaging equipment hired me to do accounting work for them. Among my many tasks for this firm was the weekly generation of a report based on the company's current accounts receivable balance. I was told that this report was very important since it was used by one of our execs. during his weekly 'power breakfast' meetings with the other heads of the company.
A month after I arrived at the company, I noticed that the numbers didn't look right when I generated this weekly report. I started examining the spreadsheet formulas and soon found a small error in one of the calculations we used to derive our total balances. I notified my manager and we both agreed that the original spreadsheet wasn't giving accurate results. I corrected the formula and then patted myself on the back -- after all, I'd uncovered an error that many people, including my manager, had missed for months. I thought I was in good shape at the company after that because I'd done the right thing. I'd fixed a problem. Yay for me.
However, a week later, my manager brought me into his office to talk about the issue. I was more than a little surprised when he asked me to go to my desk and change the formula back to what we'd used before. I asked my manager if he still agreed with me that the old formula was giving incorrect data. He just smiled and said yes, he agreed with my original assessment. I was right, he told me, but our exec. had still asked him to revert to the old formula, no reasons given.
Shortly after this incident, my manager again brought me into his office. He had a pained look on his face as he began to tell that the company wouldn't be needing my services anymore. My manager never gave me an explanation as to why, but I didn't really need an explanation. Even though I'd uncovered an error in the company's accounting procedures, I'd made an even bigger error in the process -- I made our exec. look bad when he handed out the correct report during his power breakfast meeting. It turned out that the numbers weren't so rosy that week as they'd been in previous weeks. The other company heads wanted to know why. I'm not sure what our exec. told them then, but I can't imagine it made him look good no matter how he tried to spin it.
I suppose, if the numbers had looked better using my correct spreadsheet calculations, maybe I'd have received a raise from that exec. In this particular case, and much to my surprise, the wrong answer was the right answer in his method of bookkeeping. Frustrated by this incident, I left the accounting business soon afterwards. As it later turned out, the company went belly-up years later. Looking back, I like to imagine that reason was that the company's bankers were using spreadsheets based on mathematics instead of wishful thinking. Then again, after seeing what happened with Enron, I wonder if the bankers were in on it too.
Spreadsheets are the triumph of the presentation of data over common sense.
Quite apart from the fact that some of the calculations are just plain wrong (and lack precision and are subject to systemic errors, such as rounding) the 'matrix' is often abused into serving as a data table.
Dan Bricklin may have latched onto a good thing for Apple but the unleashed idiots who used VisiCalc (and have continued through to Excel where the spreadsheet metaphor finally died,) have never had more than a clue as to what it really was.
ANYONE who trusts it to be any more than a "what-if" exercise is an IDIOT.
ANYONE who trusts it even as a "what-if" exercise without backing up the calculations with some sound math is an IDIOT.
ANYONE who uses it to store tabular data is an IDIOT.
Basically, its a fucking nuisance.
I once had to verify some calculation routine that we were using because 'the user said it was wrong.'
I spent two friggin' weeks going over code, trying it over and over again until I could prove that the code was accurate to fifteen digits on either side of the decimal place with ALL the friggin' math equations.
I finally asked my boss where the error was supposed to be. How did the user know that the calculations were wrong?
I was told to wait and he'd go and ask... The end result was that the IDIOT was actually using Excell to calculate yields on some very large and very long term bonds that they were trading.
The IDIOT was actually expecting my software to give him the same results as Excell.
Never mind that Excell is a fuckin' toy with rounding errors on nine digits (I said these were LARGE bonds) and the calculations used Newton's method of approximating integrals.
I felt like killing them all for wasting my time like that.
If they'd have told me what I was REALLY trying to prove (that Excel is a piece of shit) I could have done that in a minute just sitting down with the user and letting try things (that I knew weren't going to work) to try to prove to me that Excell was an accurate calculator.
Thank God he wasn't trying to use it to store his data. (I'd have ripped him a new ass-hole with a rusty can opener.)
MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
An ideal spreadsheet program would be written in a functional style, but since most spreadsheets do not allow you to have extra variables in your functions (all variables are cell addresses), nearly any complicated spreadsheet application uses numerous temporary or hidden cells for intermediate calculation data. That's not fuctional programming at all, it's tons of global transitive state!
Stop intellectual property from infringing on me