US Monitoring Database Reaches Limit, Quits Tracking Felons and Parolees
An anonymous reader writes "Thousands of US sex offenders, prisoners on parole and other convicts were left unmonitored after an electronic tagging system shut down because of data overload. BI Incorporated, which runs the system, reached its data threshold — more than two billion records — on Tuesday. This left authorities across 49 states unaware of offenders' movement for about 12 hours."
As the astonished submitter asks, "2 billion records?"
Right? You shouldn't feel safe. Not because of the "criminals" but because of the reason why there are so many "criminals." Have a joint on you? You're a criminal. Do you know how many people are in jail because of simple drug-related offenses? Be afraid. http://www.whitehousedrugpolicy.gov/publications/factsht/crime/index.html Look at that. 25% of federal inmates are in there for drug possession. I bet you a good amount of these people wouldn't rob you at gunpoint. Good luck, America!
You are now manually breathing.
It just stopped once it hit that limit rather than failing over to a backup process.
"Just over 2 billion" is almost certainly 2^31 (2 147 483 648), one past the maximum value representable by a signed 32-bit integer (2^31 - 1 = 2 147 483 647). People usually think of "over 4 billion" (2^32) as the integer limit, but that's for unsigned integers only, which are rarely used, especially in databases. I'm willing to bet that they used an "int" as a primary key in one of their tables, and simply ran past the maximum possible value.
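To make the arithmetic concrete, here is a minimal Python sketch of the limits involved. The names `int_range` and `next_key` are made up for illustration, and the "auto-increment" is a toy stand-in: nobody outside BI Incorporated knows what their schema actually looks like.

```python
def int_range(bits, signed=True):
    """Return (min, max) representable by an integer of the given width."""
    if signed:
        # One bit is spent on the sign, so the positive side tops out at 2^(bits-1) - 1.
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


INT32_MIN, INT32_MAX = int_range(32)          # signed 32-bit "int"
_, UINT32_MAX = int_range(32, signed=False)   # the "over 4 billion" people remember
_, INT64_MAX = int_range(64)                  # what a 64-bit key would allow


def next_key(current, max_key=INT32_MAX):
    """Hypothetical auto-increment: refuse to hand out a key past the column maximum."""
    if current >= max_key:
        raise OverflowError(f"key column exhausted at {current}")
    return current + 1


if __name__ == "__main__":
    print("signed 32-bit max:  ", INT32_MAX)
    print("unsigned 32-bit max:", UINT32_MAX)
    print("signed 64-bit max:  ", INT64_MAX)
```

Calling `next_key(INT32_MAX)` raises instead of returning a value, which is roughly the "it just stopped" behaviour described in the story, assuming the guess about an "int" key is right.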
This kind of bug has impacted lots of systems in the past. If it happens, there's no "fail over" that could possibly save the system. The replica would have the same data, and hence the same issue, and would have failed as well. The usual fix is to extend the key type to 64 bits or longer (e.g., GUIDs), but for a 2 billion row table, that's going to take hours at best, probably days.
Most database systems do not provide a warning when the keys start to approach large values, so it's easy to miss.
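Since the database won't warn you, you have to check yourself. A rough sketch of what such a check could look like is below. The function names and the 10% threshold are my own assumptions, and `current_max_id` is whatever you get back from something like `SELECT MAX(id)` on the table in question.

```python
INT32_MAX = 2 ** 31 - 1  # largest value a signed 32-bit key column can hold


def key_headroom(current_max_id, max_key=INT32_MAX):
    """Fraction of the key space still unused (1.0 = empty, 0.0 = exhausted)."""
    return 1.0 - current_max_id / max_key


def should_alert(current_max_id, max_key=INT32_MAX, threshold=0.10):
    """True once the remaining headroom drops to the threshold or below."""
    return key_headroom(current_max_id, max_key) <= threshold


if __name__ == "__main__":
    # Hypothetical readings; in practice these would come from the database.
    for max_id in (1_000_000_000, 2_000_000_000):
        print(max_id, f"{key_headroom(max_id):.1%} left", should_alert(max_id))
```

Run it from cron once a day and you get weeks or months of notice instead of a 12-hour outage. Passing `max_key=16_777_215` covers an unsigned MEDIUMINT column the same way.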
Anyone remember when Slashdot hit 16,777,215 comments, and overflowed MEDIUMINT? The ALTER TABLE statement that fixed it took hours to run. I shudder to think how long it'll take to fix this, even with the problem diagnosed.
There's no failure quite as dissatisfying as a complete and total solution to the wrong problem.