Court Orders Breathalyzer Code Opened, Reveals Mess
Death Metal writes with an excerpt from the website of defense attorney Evan Levow: "After two years of attempting to get the computer based source code for the Alcotest 7110 MKIII-C, defense counsel in State v. Chun were successful in obtaining the code, and had it analyzed by Base One Technologies, Inc. By making itself a party to the litigation after the oral arguments in April, Draeger subjected itself to the Supreme Court's directive that Draeger ultimately provide the source code to the defendants' software analysis house, Base One. ... Draeger reviewed the code, as well, through its software house, SysTest Labs, which agreed with Base One, that the patchwork code that makes up the 7110 is not written well, nor is it written to any defined coding standard. SysTest said, 'The Alcotest NJ3.11 source code appears to have evolved over numerous transitions and versioning, which is responsible for cyclomatic complexity.'" Bruce Schneier comments on the same report and neatly summarizes the take-away lesson: "'You can't look at our code because we don't want you to' simply isn't good enough."
Poorly written code is one thing, but does it ultimately work?
Lint, as a static code analyzer, is bound to have false positives. More so in embedded systems, where you're dealing with registers and occasionally "violating" type safety where no type adequately exists. It's really not super surprising that 60 percent of the code is reported by Lint.
I Browse at +4 Flamebait
Open Source Sysadmin
not written well, nor is it written to any defined coding standard
Ah, so it's like most of the code in the world.
Because the output is used as evidence in court?
Ok, I'm not happy that some people almost certainly were measured inaccurately by these things. I'm not happy that this company was allowed to pull this kind of shit -- when you do government contracting, the government should own what you do.
However, I am very glad that the precedent has been set.
And I am especially glad that not only is there precedent, but there's a real live example of why we need this stuff to be open.
Don't thank God, thank a doctor!
80% of the code in business fits this description. With 20 year old legacy code written by 50 consultants, then upgraded in India, then ported from one platform to another to another, and a database engine switch or two. Code gets senile. What do they expect? Good thing we're all just commodities... human lego bricks easily replaced with cheaper plastic.
Disconnect your television. Do your own research. Draw your own conclusions. They're probably lying. Don't be a sheep.
Just because code is not written to some official standard does not mean it is guaranteed to be buggy. Undisciplined coding is as bad as undisciplined specifications - results can indeed be ugly. It is preferable if the coders follow good practices, and there ideally would be a clear system for specifying program behaviour in testable ways. It is easier to produce good code with robust behaviour if good practices are followed from design through coding to testing and documentation, but it is not impossible to achieve good results in other ways also.
Did they find any coding bugs, or did they just criticize the approach to coding?
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
Er, why would it need or be expected to be? It's a commercial product. I don't think most bank websites are "coded" to any specific standard either.
From the article:
1. The Alcotest Software Would Not Pass U.S. Industry Standards for Software Development and Testing: The program presented shows ample evidence of incomplete design, incomplete verification of design, and incomplete "white box" and "black box" testing. Therefore the software has to be considered unreliable and untested, and in several cases it does not meet stated requirements. The planning and documentation of the design is haphazard. Sections of the original code and modified code show evidence of using an experimental approach to coding, or use what is best described as the "trial and error" method. Several sections are marked as "temporary, for now". Other sections were added to existing modules or inserted in a code stream, leading to a patchwork design and coding style.
Ok. Would you want to be convicted on the word of something that wasn't documented or even fully tested? ("Oh, crap. That constant should have been 0.001, not 0.01. Oops. Blood alcohol level was 0.008, not 0.08. Sorry!")
Common sense (if it WERE common) says there should be full tests across a wide range of values, with the written test cases and expected values verified and available, to prove that the device/software actually detects the proper levels of alcohol.
UPS Sucks
This will not stop the state from using this to make a felon of you.
The Navy Motto "IF it ain't broke Fix It" "A day is wasted if you don't learn something new"
In general, I was under the impression that the standard for criminal cases was weighted heavily toward rejecting any technique, evidence, or device that had any appreciable chance of a false positive.
Have you been touched by his noodly appendage?
The problem in a lot of states is that .01 can make a huge difference between a DUI, a DUI with a "high BAC kicker", a wet-reckless, or nothing at all. The device has to be accurate to at least a few 9's, or those "on the bubble" cases carry a severe level of doubt, because driving with a .07 is not illegal (for the most part), but .08 is. The question in court is not "were you drinking tonight", but "how much did you drink", which is a very specific, very objective, very determinable piece of information.
With a margin of error of .01 or more, you're going to get a lot of overzealous cops in cities with revenue shortfalls taking innocent people in for DUIs. Hopefully more and more of these "border cases" will bring these devices into question more than the over-the-top, blacking-out, pissing-his-pants multiple offender does in court.
As states lower their legal limits to the point where they intersect with non-impaired drinking drivers, especially with a
Forgive my spelling from time to time. I'm often posting during short breaks.
The good: This particular breathalyzer has been proven to be the unreliable POS that it apparently is. This unit, and others like it, will finally start being held to a stronger coding standard.
The bad: every sleazeball, ambulance-chasing, "call lee free", douchebag of a lawyer will use this case to attack the credibility of any and all breathalyzers made in the past, present, or future, spreading enough FUD to juries everywhere that an unacceptable number of drunken idiots get the God-given right to keep their license until they finally end up killing someone.
As a person, I think groups like MADD spend most of their time trying to scaremonger politicians into pushing us as close to prohibition as possible. I believe that alcohol can be used responsibly. But I also know that this case is going to result in DUIs getting overturned for people that damn sure don't deserve it. Borderline cases will get knocked down, cases will get thrown out, and the people that broke the law, that did something wrong, will walk out of a court room 'vindicated.' They didn't do anything wrong when they had six beers and drove home, it was that confounded *machine* that *said* they broke the law. The *machine* was busted, ergo they didn't break the law. In short, this case is going to make a lot of O.J. Simpsons. The jury said they didn't commit a crime, so they didn't. No harm no foul. Technicality? Bah! They're as innocent as the sweet baby Jesus.
I'd like to think things will wash out in the end. This case will probably end up making it harder to get off on this particular technicality in the long term. In the short term? Here come the appeals. Maybe the state is partially at fault for buying shoddy equipment. (Or maybe not. Did they do a code review? Do they have the resources to do one? Probably not. Did you do a code review of the 3com switch in your server room? Their selection criteria can certainly be questioned, but it probably doesn't change the fact that someone drank enough to blow a .22 then decided to drive home.)
But in the end, the drunks are still going to be drunks. And tomorrow some of them will probably get to file appeals, and some of the ones that shouldn't be on the road, or even in public, will get to slip out of this brand new loophole. I'm not sure that that deserves a cork-popping celebration.
(and yes: We all handle our booze differently. Arbitrary limits that determine "drunk" may or may not be the answer. Hardcore drunks will keep driving even after losing their license. DUI's are as much moneymakers for the States as speeding tickets. Yadda yadda yadda.)
There are some people that if they don't know, you can't tell 'em.
Do you really think that you should receive the same consideration as a guy that's 3x over the limit? Blowing .08 and .18 are quite different in terms of state of mind.
It's definitely possible that there are no false negatives, but for a device that can have this level of impact on someone's life, you'd think it would be held to a higher standard.
In embedded systems programming, it is common practice to disable interrupts if they are not used. It is certainly possible that this app simply does not need to handle these interrupts, whether they are enabled or not.
It is also possible that the other flaws mentioned, which clearly reduce accuracy, do not do so sufficiently to change the outcome in a meaningful way.
The problem with drunk driving law is not primarily one of testing. It is that it presumes someone is incapable of driving with even trace amounts of alcohol, while treating other forms of more dangerous driving (such as driving while texting or on the phone) as being OK or far far less severe.
The way the laws themselves are written is a horrible miscarriage of justice. This is the result of the perverse and hypocritical views of MADD and its ilk, the bastard children of the prohibition movement.
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
Perhaps they're coded inelegantly or poorly, but do they actually spit out inaccurate numbers?
Irrelevant - the test is: do they always spit out provably correct numbers?
The code is protected in the US by copyright. It is not protected anywhere else, especially in countries where it is cheap to reproduce the hardware. US Customs has proven over and over they will not block the import of infringing devices.
This means that once the software gets out - and it is - look for cheap copies that will put the original manufacturer out of business. Because law enforcement and just about everyone else in the market for such devices is going to jump on the price difference. Same functionality for 1/10th the price.
Do not believe for a second that there are any safeguards left for this sort of thing. There are not.
An example of a DUI conviction using 5 numbers (assuming a legal limit of 0.10 for simplicity):
Say the breathalyzer gets 5 numbers: [0.0625,0.0625,0.1,0.12,0.15].
If you average the numbers you get 0.099. However if you do a 'rolling' average, you get 0.13. Quite a bit of difference (not to mention, legal vs not legal).
Now imagine that last number was a burped packet of gas that had a ton of alcohol in it and was a 0.25.
Instead of the average being 0.119 (yes still illegal) it's now .18, at which point you hear on the news "he was nearly twice the legal limit!".
(Although playing with the numbers, it looks like the site has it backwards: the method overweights the later numbers, not the first, because one wild late number can instantly pull the result very high, or low.)
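For anyone who wants to check the arithmetic above, here is a minimal sketch. The function names are made up for illustration, and the "rolling" method is the successive-averaging scheme as the report describes it, not the actual Alcotest code.

```python
def plain_mean(readings):
    """Ordinary arithmetic mean: every reading counts equally."""
    return sum(readings) / len(readings)


def successive_average(readings):
    """Average the first two readings, then average each later reading
    with the running result (the scheme described in the report)."""
    result = readings[0]
    for r in readings[1:]:
        result = (result + r) / 2
    return result


normal = [0.0625, 0.0625, 0.1, 0.12, 0.15]
burp = [0.0625, 0.0625, 0.1, 0.12, 0.25]   # last reading replaced by 0.25

for label, data in (("normal", normal), ("burp", burp)):
    print(label, round(plain_mean(data), 3), round(successive_average(data), 3))
```

Rounded to two places, the successive averages come out at 0.13 and 0.18 against plain means of 0.099 and 0.119, which matches the figures quoted in the example.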
The fault is the State using output of a device which is an undocumented, unverified black box in legal proceedings.
Yes, of course, most of the code out there is a similar mess. But if it fails, the worst that can happen is that your desktop crashes, or your iPod hangs... which is bad, of course, but not as bad as getting a criminal conviction for drunk driving.
These things should be held to the same standards as code in military equipment or nuclear reactors - mistakes are inexcusable.
The US's comparative advantage is speed-to-market, for good or bad. Any service or product that becomes a stable commodity flows to lower-wage countries. Thus, churn-and-burn is the order of the day in the 'merica's.
Now if you could show through some kind of statistical analysis that companies that spend more time planning are more profitable, you might get more people to listen. I've worked with some crappy code for big, well-known companies that otherwise are financially successful. Thus, bad code and practices are not dooming them (although it would give a sense of satisfaction if they did).
One required 12 programmers because it was a combinatorial mess of factors being reported. They said they tried to use a meta-programming approach once before, but the programmer, who was otherwise well-regarded, got confused. They just instead switched to an army to produce the copy-and-paste combinations. They should have perhaps recruited somebody better at meta-techniques, but that was outside of their familiarity zone. Rather than experiment in their unfamiliar meta-land, they decided to byte the bullet and hire the army. I almost felt like building a demo at home and then showing them. Sometimes that's what it takes to break thru the status-quo.
Table-ized A.I.
Readings are Not Averaged Correctly: When the software takes a series of readings, it first averages the first two readings. Then, it averages the third reading with the average just computed... There is no comment or note detailing a reason for this calculation, which would cause the first reading to have more weight than successive readings.
Maybe that is the intention? Just because Schneier *thinks* it is an average, doesn't make it so. Maybe the device becomes more accurate as more samples are taken, and therefore gives more weight to the last (not the first!) sample.
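Which reading actually gets the most weight can be worked out directly. This is only a sketch that assumes the report's description of the scheme is accurate (average the first two, then average each new reading with the running result); the function name is hypothetical.

```python
from fractions import Fraction


def successive_average_weights(n):
    """Effective weight of each of n readings under successive averaging."""
    weights = [Fraction(1)]          # a single reading carries all the weight
    for _ in range(n - 1):
        # Averaging with a new reading halves every existing weight
        # and gives the newcomer a weight of 1/2.
        weights = [w / 2 for w in weights] + [Fraction(1, 2)]
    return weights


print([str(w) for w in successive_average_weights(4)])
# ['1/8', '1/8', '1/4', '1/2']
```

Under that assumption the last reading always carries half the total weight and the earliest readings carry the least, so the scheme favors the last sample rather than the first.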
The A/D converters measuring the IR readings and the fuel cell readings can produce values between 0 and 4095. However, the software divides the final average(s) by 256, meaning the final result can only have 16 values to represent the five-volt range (or less), or, represent the range of alcohol readings possible. This is a loss of precision in the data; of a possible twelve bits of information, only four bits are used. Further, because of an attribute in the IR calculations, the result value is further divided in half. This means that only 8 values are possible for the IR detection, and this is compared against the 16 values of the fuel cell.
Who cares if there is loss of precision? In the end, it spits out a value that is essentially binary: drunk or not. You are not less drunk if it says so with 9 extra bits of precision, and in fact the extra accuracy could itself be considered an error as it shows a certainty about the result that may not be warranted by the design of the machine.
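The counts in the quoted finding are easy to verify. A minimal sketch, assuming the division is plain integer division and that the extra halving for the IR path is also an integer divide (the report does not say exactly how it is done):

```python
ADC_MAX = 4095  # 12-bit converter range quoted in the report

# Integer division by 256 keeps only the top four bits of each raw value.
fuel_cell_levels = {raw // 256 for raw in range(ADC_MAX + 1)}

# The report says the IR result is halved again; modeled here as a
# further integer division by 2 (an assumption, not the actual code).
ir_levels = {(raw // 256) // 2 for raw in range(ADC_MAX + 1)}

print(len(fuel_cell_levels), len(ir_levels))  # 16 8
```

Every raw value from 0 to 4095 lands in one of 16 buckets for the fuel cell and one of 8 for the IR path, which is the loss of resolution the report is describing.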
Catastrophic Error Detection Is Disabled: An interrupt that detects that the microprocessor is trying to execute an illegal instruction is disabled, meaning that the Alcotest software could appear to run correctly while executing wild branches or invalid code for a period of time. Other interrupts ignored are the Computer Operating Properly (a watchdog timer), and the Software Interrupt.
If the code is tested properly, there is no need to keep the first two interrupts going. As for the "software interrupt", it may sound ominous that it is disabled, but there is absolutely no way to tell if and why it should be enabled, and in fact disabling it is probably correct. There is absolutely nothing in software engineering that says you should always enable all interrupts because otherwise your code would be less reliable.
Besides, a typical result of executing illegal instructions is that the device hangs or reboots. Since alcohol testing devices don't do that (to the best of my knowledge), it appears that disabling those interrupts does not cause any harm.
So, basically, it's designed to always return some value, even if it's wildly inaccurate, and even if the software is executing garbage at the time.
In other words: It appears to be a very low-level equivalent of Visual Basic's "on error resume next".
That conclusion is entirely unwarranted. It appears to be designed to provide a weighted average, not show undue accuracy, and is sufficiently well-tested that it does not need emergency measures like the illegal instruction interrupt. In other words, even though the software may look messy it is working fine.
The fault is the executive staff that refuse to listen to their experts (programmers) and do what they recommend. Instead we get morons that know nothing about programming making unrealistic deadlines and forcing death march coding marathons to give us the mess we have today.
To some extent, you are correct. However, I also blame the developers. There are many "software engineers" and "computer scientists" I have worked with who didn't understand the basics of algorithms, design, testing, and other topics that are necessary to our field.
With an attitude like that, it's obvious that you have little experience with embedded systems...
> In embedded systems programming, it is common practice to disable interrupts if they are not used. It is certainly possible that this app simply does not need to handle these interrupts, whether they are enabled or not.
There is rarely a good reason to shut off the interrupt for an illegal instruction if it exists on your micro. It is entirely possible for a stray bit of electromagnetic radiation (cosmic rays, electric motors turning on or off, etc.) to flip a bit in the micro, causing an illegal instruction. The illegal instruction interrupt exists for situations like this. It should be caught and handled in an appropriate manner, ESPECIALLY in safety-critical or (as in this case) legal evidence gathering applications. I've seen it happen at work when we do our electromagnetic interference testing.
> Maybe that is the intention? Just because Schneier *thinks* it is an average, doesn't make it so. Maybe the device becomes more accurate as more samples are taken, and therefore gives more weight to the last (not the first!) sample.
It damn well better be an average -- having worked with cheap, 12-bit ADC chips before, I know you're getting trash for data if you aren't taking multiple readings and averaging. You must average the readings because the readings are noisy, particularly "in the field". The point of the averaging is to get rid of the noise. The noise doesn't go away as more samples are taken. The average needs to be done properly across the range because if you're giving greater weight to the last reading, you're failing entirely to eliminate the noise.
I'm certain it's throwing away the least-significant bits from the 12-bit ADC precisely because they're effectively RNG output. The problem is, that's the wrong way to do it. Keeping the entire value and then averaging properly gives a reasonably accurate value, even from a noisy ADC. You can discard the lower bits after the averaging (it's false precision anyhow), but not before, and you do it by rounding, not truncation. Doing what is described in the article gives you trash.
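A small sketch of the difference between the two orderings. The sample values are invented for illustration, and an 8-bit shift stands in for the divide-by-256 the report mentions; none of this is the actual Alcotest code.

```python
SHIFT = 8  # drop the low 8 bits of a 12-bit sample (divide by 256)


def truncate_then_average(samples):
    """Discard the low bits of each sample first, then take an integer mean."""
    coarse = [s >> SHIFT for s in samples]
    return sum(coarse) // len(coarse)


def average_then_round(samples):
    """Average the full-precision samples, then round to the coarse scale."""
    total = sum(samples)
    denom = len(samples) << SHIFT
    # Adding half the divisor before integer division rounds to nearest.
    return (total + denom // 2) // denom


# Hypothetical noisy samples scattered around 2048 (exactly 8 coarse steps).
samples = [2040, 2050, 2060, 2045]
print(truncate_then_average(samples), average_then_round(samples))  # 7 8
```

With these made-up samples the true mean is 2048.75, or about 8.0 coarse steps. Truncating first reports 7, while averaging first and then rounding reports 8.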
"Convictions are more dangerous enemies of truth than lies."
Testimony is subject to cross-examination (at least in the US). Opposing counsel has the opportunity to exploit weaknesses in the witness's testimony. Also, the witness is subject to prosecution for perjury for lying. What penalty does a faulty (if it be faulty) device face?
Assuming the microcontroller has a 10-bit A/D converter to get the reading, I'm pretty sure such a chip could add 32 numbers together. With the speed of 8-bit microcontrollers these days exceeding 1MHz even at ~$1 price points, emulating 16-bit numbers to get your sum is not a problem. Take a power-of-two number of readings and your average can be a simple bit shift. It will take more horsepower to convert to base-10 on the display than to take the average.
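A rough sketch of that idea, written in Python for readability rather than in anything that would run on the micro. The constants and function name are illustrative assumptions; the mask only emulates a 16-bit accumulator.

```python
NUM_READINGS = 32          # power of two, so dividing is a shift by 5
SHIFT = 5                  # 2**5 == 32
ADC_BITS = 10              # converter width assumed in the comment above


def shift_average(readings):
    """Mean of exactly NUM_READINGS samples using a 16-bit accumulator."""
    assert len(readings) == NUM_READINGS
    total = 0
    for r in readings:
        total = (total + r) & 0xFFFF   # emulate a 16-bit register
    return total >> SHIFT


# Worst case: every reading at full scale still fits in 16 bits.
worst_case = [(1 << ADC_BITS) - 1] * NUM_READINGS
print(sum(worst_case) <= 0xFFFF, shift_average(worst_case))  # True 1023
```

Thirty-two full-scale 10-bit readings sum to 32736, so the 16-bit accumulator never overflows. The shift truncates; adding 16 (half the divisor) to the total before shifting would round to nearest instead.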
This is not a cheap child's toy or a toaster, it's a law-enforcement grade breathalyzer going for above $100; there is no excuse for being so lazy. Code that runs on small systems should be *clean* because bugs are harder to find without easy I/O, and the efficiency of it needs to be obvious. Also, code that can put someone in jail should not be spaghetti, regardless of the scale of the system running it.