Slashdot Mirror


Software Bug Behind Biggest Telephony Outage In US History (bleepingcomputer.com)

An anonymous reader writes: A software bug in a telecom provider's phone number blacklisting system caused the largest telephony outage in US history, according to a report released by the US Federal Communications Commission (FCC) at the start of the month. The telco is Level 3, now part of CenturyLink, and the outage took place on October 4, 2016.

According to the FCC's investigation, the outage began after a Level 3 employee entered phone numbers suspected of malicious activity into the company's network management software. The employee wanted to block incoming phone calls from these numbers and had entered each number in fields provided by the software's GUI. The problem arose when the Level 3 technician left a field empty, without entering a number. Unbeknownst to the employee, the buggy software didn't ignore the empty field, as most software does, but instead treated the empty space as a "wildcard" character. As soon as the technician submitted his input, Level 3's network began blocking all incoming and outgoing telephone calls, over 111 million in total.

3 of 106 comments

  1. Software did what it was supposed to. by pirodude · · Score: 4, Interesting

    I'm 99% sure they were using the Sonus EMS management software (L3 is a huge Sonus shop) to manage the PSX routing engine. The software matches on the longest prefix of the number. Since you always have to select the country, a blank entry would be treated as +1 and block everything after that, i.e. everything in the US.
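
    The behaviour described above can be illustrated with a toy model. This is only a sketch of the commenter's description, not the actual Sonus EMS/PSX logic: the function names (normalize, longest_match, is_blocked, build_blocklist) and the sample numbers are hypothetical, and it assumes the stored pattern is simply the country code followed by whatever was typed in the field.

```python
from typing import List, Optional


def normalize(country_code: str, digits: str) -> str:
    """Build the stored pattern: '+', the country code, then the typed digits."""
    return "+" + country_code + digits.strip()


def longest_match(number: str, patterns: List[str]) -> Optional[str]:
    """Return the longest stored pattern that is a prefix of number, or None."""
    hits = [p for p in patterns if number.startswith(p)]
    return max(hits, key=len) if hits else None


def is_blocked(number: str, patterns: List[str]) -> bool:
    """A number is blocked if any stored pattern is a prefix of it."""
    return longest_match(number, patterns) is not None


def build_blocklist(country_code: str, fields: List[str]) -> List[str]:
    """Guarded variant (an assumption, not the vendor's fix): skip blank fields."""
    return [normalize(country_code, f) for f in fields if f.strip()]


# Hypothetical operator input: two real entries and one field left blank.
fields = ["2025550143", "3125550199", ""]

# Unguarded: the blank field becomes the bare pattern "+1".
blocklist = [normalize("1", f) for f in fields]

print(is_blocked("+12025550143", blocklist))   # intended target
print(is_blocked("+14155550100", blocklist))   # unrelated +1 number, caught by "+1"
print(is_blocked("+442079460000", blocklist))  # other country code, not matched

# Guarded: the blank field is dropped before it can become a pattern.
safe_blocklist = build_blocklist("1", fields)
print(is_blocked("+14155550100", safe_blocklist))
```

    In this model the blank field is not a special wildcard at all; it is just the shortest possible prefix, which is why "the software did what it was supposed to" and still blocked every +1 number.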

  2. Re:Bug or feature? by rtb61 · · Score: 3, Interesting

    Not even close. Under law, a professional is a professional, and that ties to responsibility for actions. That design was professionally, criminally negligent and should be treated as such, with the penalty reflecting the harm caused, and that means a possible custodial sentence along with a massive fine. Let's not get freaky on the custodial sentence though: probably sufficient to let them 'cool their jets' with no more than a 90 day sentence if no one died, but at least 30 days, to sort of put the wind up them, focus their attention, and remind them there are real penalties for being a crap professional, for being in a role you should not be in. If anyone dies though, manslaughter charges.

    Find those individually responsible, fine them, let them feel the weight of a custodial sentence (30 days), and fine the company much more. Custodial sentences should be the norm for criminal negligence as a professional. Start licensing coders because of the harm they can cause: differing grades, low grade licences for low risk work, high grade licences for high risk work. If you do not force them to do a good job, they will continue to do a shitty job, with a "meh, someone else's problem" attitude toward the shitty work they have done.

    --
    Chaos - everything, everywhere, everywhen
  3. The 'Vendor Supplied' 100 second minute by TheRealHocusLocus · · Score: 5, Interesting

    In 1987 I had just taken a job at the local Telco and was hitting a steep learning curve. My experience to that point had been PC computers and networks, assembler, CBASIC, dBase and the like. This was an IBM System/38 and their billing software used RPG/III, which was a real structured language unlike its spaghetti-GOTO RPG/II cousin, but aspects were still position sensitive and opcodes were silly-simple compared to languages with which I was familiar. It was more like assembler than anything else. Most data flows consisted of running commands that generated a relational input stream, sort of like an SQL query, through simple RPG programs.

    We had just installed an ITT 1210 switch and ITT had sent over a block of sample RPG code demonstrating how to parse the various fields and flags appearing on call tapes. My boss provided specs for the internal call ticket system they were using and the simple (!) task was to write a shim that generated a batch of call tickets from each tape. Pretty straightforward, tedious without being intricate. But one part of their code slapped me across the face when I examined it.

    The tape recorded end time and call duration in whole seconds, so call start time would need to be calculated. They had supplied a routine to do this but it didn't make any sense, because I could see no modulo 60 arithmetic in it; they were applying the simple RPG subtraction opcode on the zoned fields. I spent the most mystified HOUR of my LIFE searching the language manuals for the section that surely described RPG's 'magic' ops for manipulating times and dates, which I assumed had to be there because IBM is GREAT and I am STUPID... finding none. Forced to conclude that I was looking at concept code that was dashed off hurriedly in two minutes, I confronted my boss with it (and my solution), but it was a hard sell at first, because my boss was incredulous too.
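
    To make the '100 second minute' concrete, here is a small sketch of the difference between plain decimal subtraction and base-60 arithmetic. It is not the ITT sample code or the poster's fix: the HHMMSS field layout, the function names and the sample call are all assumptions chosen for illustration.

```python
def naive_start(end_hhmmss: int, duration_s: int) -> int:
    """Plain decimal subtraction on the HHMMSS digits, as if a minute had 100 seconds."""
    return end_hhmmss - duration_s


def to_seconds(hhmmss: int) -> int:
    """Convert an HHMMSS number into seconds since midnight."""
    hh, rest = divmod(hhmmss, 10000)
    mm, ss = divmod(rest, 100)
    return hh * 3600 + mm * 60 + ss


def from_seconds(total: int) -> int:
    """Convert seconds since midnight back into an HHMMSS number."""
    hh, rest = divmod(total, 3600)
    mm, ss = divmod(rest, 60)
    return hh * 10000 + mm * 100 + ss


def start_time(end_hhmmss: int, duration_s: int) -> int:
    """Subtract in seconds and wrap past midnight (the date rollover is not handled)."""
    total = (to_seconds(end_hhmmss) - duration_s) % 86400
    return from_seconds(total)


# Hypothetical call: ended at 10:10:05 after lasting 30 seconds.
end, duration = 101005, 30
print(f"{naive_start(end, duration):06d}")  # 100975, i.e. "10:09:75", not a valid time
print(f"{start_time(end, duration):06d}")   # 100935, i.e. 10:09:35
```

    The naive version only goes wrong when the subtraction borrows across a minute or hour boundary, which is the kind of bug that can pass a casual test with a short call in the middle of a minute.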

    --
    <blink>down the rabbit hole</blink>