US Monitoring Database Reaches Limit, Quits Tracking Felons and Parolees
An anonymous reader writes "Thousands of US sex offenders, prisoners on parole and other convicts were left unmonitored after an electronic tagging system shut down because of data overload. BI Incorporated, which runs the system, reached its data threshold — more than two billion records — on Tuesday. This left authorities across 49 states unaware of offenders' movement for about 12 hours."
As the astonished submitter asks, "2 billion records?"
Assuming that's a normal "US" billion (10^9), and assuming it's a journal of historical data going back a few years, I don't think it's unreasonable to think there could be information in there on a couple of hundred thousand people, each of whom has been tracked for an average of at least 6 months. So, approximately and with some guesses, that's around 55 records per prisoner per day. 1 update every 30 minutes? That sounds about right, maybe a little on the low side if anything.
What is surprising is that they were running some sort of database process that maxed out at 2 billion records, and that it just stopped once it hit that limit rather than failing over to a backup process. But then, this is a government IT contract, so maybe it's not too surprising.
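For anyone who wants to check the back-of-the-envelope numbers, here is a quick sketch. The inputs are the guesses from the comment above (200,000 people, roughly 180 days each), not figures from BI Incorporated.

```python
# Rough estimate only: every input below is a guess from the comment, not real data.
TOTAL_RECORDS = 2_000_000_000   # "more than two billion records"
PEOPLE = 200_000                # "a couple of hundred thousand people" (assumed)
DAYS_TRACKED = 180              # "at least 6 months" each (assumed)

records_per_person_per_day = TOTAL_RECORDS / (PEOPLE * DAYS_TRACKED)
minutes_between_updates = 24 * 60 / records_per_person_per_day

print(f"{records_per_person_per_day:.1f} records per person per day")
print(f"one update every {minutes_between_updates:.1f} minutes")
```

With these guesses the interval comes out a bit under 30 minutes, which fits the "maybe a little on the low side" remark.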
http://twitter.com/onion2k
Prisons and other corrections agencies were blocked from getting notifications on about 16,000 people, BI Incorporated spokesman Jock Waldo said on Wednesday.
- interesting number. Anyway, it's not about the number of people in the database, it's about the number of records associated with each person representing their location, so probably GPS coordinates taken at regular time intervals.
Also note that they are still logging the data, they just can't read it, so it's an application for displaying the coordinates that is failing. Quite possible that the actual problem is in filtering the data, maybe they are just trying to view data for an entire time period per person rather than looking at latest records, something like: 'last month only'. But this is, in the words of infamous W, 'speculaaation'.
You can't handle the truth.
2 billion? That's awkwardly close to 2147483647... This is why your ID field should be BIGINT and not INT.... They were probably logging coordinates etc.
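To make the 2147483647 point concrete, here is a small sketch of where a signed 32-bit counter tops out compared with a 64-bit one. It only simulates fixed-width arithmetic with ctypes. Nothing here is known about BI's actual schema, and many databases refuse the insert with an error at the limit rather than wrapping around.

```python
import ctypes

INT32_MAX = 2**31 - 1   # largest value of a signed 32-bit INT
INT64_MAX = 2**63 - 1   # largest value of a signed 64-bit BIGINT

def next_id_int32(current: int) -> int:
    """Increment as a fixed-width signed 32-bit integer would (wraps on overflow)."""
    return ctypes.c_int32(current + 1).value

print(INT32_MAX)                      # 2147483647
print(next_id_int32(INT32_MAX))       # wraps to the most negative 32-bit value
print(INT64_MAX // INT32_MAX)         # how many times more room a BIGINT gives
```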
I'm not sure any data has been lost. Say they have a table with the following columns:
id (auto increment) ...
felonid
gps
timestamp
If the 2 billion number is simply the id column running over, there's still enough data in the database to recreate the felons' whereabouts using the gps and timestamp columns. It might be a problem for the system pulling data (based on id), but probably no data has been lost.
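A minimal sketch of that idea, using an in-memory SQLite table with the four columns guessed at above. The table name, the sample rows and the track() helper are all made up for illustration. The point is only that a person's movements can be rebuilt from felonid and timestamp without looking at id at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE positions ("
    " id INTEGER PRIMARY KEY, felonid INTEGER, gps TEXT, timestamp TEXT)"
)

# Made-up sample rows; the ids are deliberately not in time order.
rows = [
    (3, 7, "40.0150,-105.2705", "2010-10-05 08:00"),
    (1, 9, "39.7392,-104.9903", "2010-10-05 08:10"),
    (2, 7, "40.0160,-105.2710", "2010-10-05 08:30"),
]
conn.executemany("INSERT INTO positions VALUES (?, ?, ?, ?)", rows)

def track(conn: sqlite3.Connection, felonid: int) -> list:
    """Return (timestamp, gps) pairs for one person, ordered by time, ignoring id."""
    cur = conn.execute(
        "SELECT timestamp, gps FROM positions WHERE felonid = ? ORDER BY timestamp",
        (felonid,),
    )
    return cur.fetchall()

for ts, gps in track(conn, 7):
    print(ts, gps)
```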
There seems to have been a period, roughly when hard drive capacity was rising more rapidly than application demands for data, when nobody cared too much. Before that, backing store was limited and we had to worry about data size. Now, application data sets are growing enormous even for quite trivial applications, and we need to worry about keeping data storage in bounds again.
From scarped cliff or quarried stone she cries "A thousand types are gone, I care for nothing, no not one."
Sharding? Partitioning? But most importantly, using 64-bit int types (or bigger) rather than 32-bit ints for primary indexes? I mean, what the hell were they using to store that data anyway? A Visicalc spreadsheet running on a TRS-80?
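For what sharding and partitioning might look like here, a toy sketch: route each record to a shard by person, and to a monthly partition by time, so no single table has to hold the whole history. The function names, the shard count and the table naming scheme are hypothetical, not anything the vendor is known to use.

```python
from datetime import datetime

NUM_SHARDS = 4  # assumed shard count, purely illustrative

def shard_for(felon_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard by person, so one person's records stay together."""
    return felon_id % num_shards

def partition_for(ts: datetime) -> str:
    """Pick a monthly partition name, so old months can be archived separately."""
    return f"positions_{ts:%Y_%m}"

print(shard_for(10))
print(partition_for(datetime(2010, 10, 5, 8, 30)))
```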
It seems to be the crap database of choice these days, especially for consulting companies. Friend of mine got a job not long ago as a consultant for a consultant. Yes really, he consults for a consulting firm. Not like he is someone they hire out, he is a consultant they hire to work on jobs they've been hired to work on. The thing that got him the job was his Quickbase experience. This company loves them some Quickbase for some reason. However they are always bashing into limits it has. Had they used MSSQL or Oracle they'd be fine, but they didn't. So a major thing he does is work around those limits in various creative ways. Retarded, but that's what they want and they'll pay for it.
Yes, that was the joke. See, GP poster is implying that even though the system should have been using something designed for the load, since it is a government contract, they used Access.
Nerd rage is the funniest rage.
If you haven't, rent a copy of the documentary "Hacking Democracy".
Diebold chose to use MS Access as the backend for voting machines.
I don't see those people as criminals, at least not with a capital 'C'. I'm straightedge but I don't see smoking pot as being any worse than alcohol. I would rather have my crime fighting dollars go to jailing thieves, murderers, rapists, *narcotic* dealers, etc. Not someone doing something the equivalent of having or selling a drink.