Toyota's Killer Firmware
New submitter Smerta writes "On Thursday, a jury verdict found Toyota's ECU firmware defective, holding it responsible for a crash in which a passenger was killed and the driver injured. What's significant about this is that it's the first time a jury heard about software defects uncovered by a plaintiff's expert witnesses. A summary of the defects discussed at trial is interesting reading, as well the transcript of court testimony. 'Although Toyota had performed a stack analysis, Barr concluded the automaker had completely botched it. Toyota missed some of the calls made via pointer, missed stack usage by library and assembly functions (about 350 in total), and missed RTOS use during task switching. They also failed to perform run-time stack monitoring.' Anyone wonder what the impact will be on self-driving cars?"
I'm convinced. I'll give up my career as a computer programmer now, and go use my bare hands for subsistence farming now. Sorry, I was wrong.
Those working on self-driving cars and those that are watching the technology already know that any such car would need an absolutely 100% rock solid OS.
This changes nothing.
The owner of a self-driving car will have had to accepted the EULA and accepted not to hold the manufacturer liable for sofware defects. (half joking but I wouldn't rule it out)
Sure, they will be more safe. Just like in the aviation industry, where each incident/crash is investigated meticulously, and flying has become safer ever since 1903. With non-selfdriving cars 99% of the incidents were caused by human error. Now no more, so we can fix it!
2nd link, 5th paragraph:
Anyone wonder what the impact will be on self-driving cars?
A longer chapter on debugging in the first edition of "Programming Self-Driving Cars: The Missing Manual."
Brought to you by Carl's Junior.
If there's no human fall back or ability to overthrow the computer's control of the car I'll never drive it. I don't think this will change anything except maybe give the people that are rushing for self-driving cars some pause. Every developer knows the risks of self-driving computer controlled cars (if they don't, well they're naive). Between human error in programming and human maliciousness, there are two camps. People who think they can overcome the possibilities of putting a semicolon in the wrong place and prevent hackers from comprising the software's integrity. And people who realize the first people are fooling themselves.
Your post demonstrates a complete lack of understanding of what JIT manufacturing (i.e. lean) is and what it tries to accomplish. Hint: it's not about doing more with less. Further, you either willingly fail to mention Kaizen (continuous improvement) or just aren't aware that THIS is the heart and soul of the true Toyota Way.
Whatever the reasons they failed in software engineering, neither JIT nor Kaizen would be to blame because neither of those try to nor should they translate to "engineer badly".
Any engineering project requires that the engineers have to answer for what they've done. The mantra is, "As an engineer, if you fuckup, someone dies." Every engineer, regardless of discipline, needs to understand this and if they don't, should consider going into Liberal Arts or something equally useless where the worst they can do is fuck up my food or drink order.
some karma... and kinda lukewarm about it.
"If there's no human fall back or ability to overthrow the computer's control of the car I'll never drive it."
by definition you wouldn't be driving it.
The Kruger Dunning explains most post on
Vehicle tests confirmed that one particular dead task would result in loss of throttle control, and that the driver might have to fully remove their foot from the brake during an unintended acceleration event before being able to end the unwanted acceleration.
The jury could confirm there was a defect because they were able to reproduce it with a physical car. They could confirm the code quality was poor because it 1) It didn't follow the required code standards: MISRA C, 2) The cyclomatic complexity was too high 3) Toyota didn't track bugs.
Are you sure you are a software engineer, and not some programmer with delusions of grandeur?
The Kruger Dunning explains most post on
There was a time after automated elevators first came out when people refused to use them because they didn't trust them without a "human fall back or ability to overthrow the computer's control". Today, when nearly all the elevators we've ever seen were automated, this seems crazy.
In 50 years, when most people have never seen a manually operated car, we'll seem just as crazy for not trusting them.
Did you read TFA?
In a nutshell, the team led by Barr Group found what the NASA team sought but couldn’t find: “a systematic software malfunction in the Main CPU that opens the throttle without operator action and continues to properly control fuel injection and ignition” that is not reliably detected by any fail-safe.
That's proof, not an argument that they could have tried harder to find the system could fail. The bottom line is that its software that puts people's lives at risk. It's reasonable to hold that type of code to a higher standard. There are millions of other cars, trains, and planes out there with similar software but without this type of problem. At some point you should be responsible for the things you create.
I've been reading the transcript. It's fantastic. The expert explains clearly and lucidly in terms that (I imagine are) understandable by non-techies.
The transcriber made some funny mistakes... Let me tell you about "parody bits" and "pointer D references" :)
Couple of details here:
Toyota had no software testing procedures, no peer review, etc. The secondary backup CPU code was provided by a third party in compiled form, Toyota never examined it.
Their coding standards were ad hoc and they failed to follow them. Simple static analysis tools found massive numbers of errors.
They used over ten thousand global variables, with numerous confirmed race conditions, nested locks, etc.
Their watchdog merely checked that the system was running and did not respond to task failures or CPU overload conditions so would not bother to reset the ECU, even if most of the tasks crashed. Since this is the basic function of a watchdog, they may as well not have had one.
They claimed to be using ECC memory but did not, so anything from single bit errors to whole page corruption were undetected and uncorrected.
A bunch of logic was jammed in one spaghetti task that was both responsible for calculating the throttle position, running various failsafes, and recording diagnostic error codes. Any failure of this task was undetected by the watchdog and disabled most of the failsafes. Due to no ECC and the stack issue below, a single bit error would turn off the runnable flag for this task and cause it to stop being scheduled for CPU time. No error codes would be recorded.
They did not do any logging (eg of OS task scheduler state, number of ECU resets, etc), not even in the event of a crash or ECU reset.
The code contained various recursive paths and no effort was made to prevent stack overflows. Worse, the RTOS kernel data structures were located immediately after the 4K stack, so stack overflows could smash these structures, including disabling tasks from running.
They were supposed to be using mirroring of variables to detect memory smashing/corruption (write A and XOR A to separate locations, then compare them on read to make sure they match). They were not doing this for some critical variables for some inexplicable reason, including the throttle position so any memory corruption could write a max throttle value and be undetected.
Instead of using the certified, audited version of the RTOS like most auto makers, they used an unverified version.
Thanks to not bothering to review the OS code, they had no idea the OS data structures were not mirrored. A single bit flip can start or stop a task, even a life-safety critical one.
These are just some of the massive glaring failures at every level of specifying, coding, and testing a safety-critical embedded system.
I am now confident in saying at least some of the unintended acceleration events with Toyota vehicles were caused by software failures due to gross incompetence and negligence on the part of Toyota. They stumbled into writing software, piling hack on top of hack, never bothering to implement any testing, peer review, documentation, specifications, or even the slightest hint that they even considered the software something worth noticing.
Natural != (nontoxic || beneficial)