UK's National Health Service Moves To NoSQL Running On an Open-Source Stack
An anonymous reader sends this news from El Reg:
The U.K.'s National Health Service has ripped the Oracle backbone from a national patient database system and inserted NoSQL running on an open-source stack. Spine2 has gone live following successful redevelopment including redeployment on new, x86 hardware. The project to replace Spine1 had been running for three years with Spine2 now undergoing a 45-day monitoring period. Spine is the NHS’s main secure patient database and messaging platform, spanning a vast estate of blades and SANs. It logs the non-clinical information on 80 million people in Britain – holding data on everything from prescriptions and payments to allergies. Spine is also a messaging hub, serving electronic communications between 20,000 applications that include the Electronic Prescription Service and Summary Care Record. It processes more than 500 complex messages a second.
Is this a big IT project that actually worked? Where's my fainting couch???
I can't help but get the feeling that within a few months they'll be running back to Oracle or some other real database system.
At this point, anyone who works with databases in industry knows that "NoSQL" has come to mean inconsistent data, corrupted data, and silently lost data.
One just can't throw away atomicity, consistency, isolation and durability without running into some serious problems.
And that's totally ignoring how it becomes damn near impossible to effectively query NoSQL databases. Sorry, writing complex queries in some imperative subset of JavaScript is totally the wrong way of doing things. Intentionally not learning SQL takes more effort than learning how to use it!
As a service user, I've always had the impression that the NHS database was a large Excel workbook and a load of VB macros written by interns.
Obviously you have never worked with HL7. One message will have hundreds, if not thousands of pieces of data.
... they actually rolled out something. Didn't a huge replacement project run for years and years, soak up bazillions and then get cancelled? But maybe that's the 'clinical' side of things. Yes, here it is: http://www.theguardian.com/soc...
"The greatest lesson in life is to know that even fools are right sometimes" - Winston Churchill
Obviously you have never worked with HL7. One message will have hundreds, if not thousands of pieces of data.
Yeah - at least in the US and Canada, even parsing HL7 transactions can be a pain. Different rules and practices in different hospitals, inconsistent rules and practices within the same hospital, apparently contradictory transactions, out of order transactions... I predict a royal mess with NoSQL. With Relational they had at least some assurance that what was read out of the DB was an accurate representation of what was put into it.
'The Economy' is a giant Ponzi scheme whose most pitiable suckers are the youngest among us and the yet-unborn.
The "NoSQL means Not Only SQL" crap you're shitting out is nothing more than the NoSQL community frantically backtracking after their "NoSQL means No SQL" ideas were shown to be disastrous bunk.
Instead of owning up to the fact that they were horribly, horribly wrong, and made some really fucking stupid suggestions, the NoSQLers have just decided to change history. They pretend that they weren't saying what they very clearly said in the past. And they obviously need to admit that SQL and relational databases are the only viable option, but can't do this without looking like the fools that they are, so they admit that it's okay to use "sometimes". And this "sometimes" ends up being "all the time", but again, they can't openly say that without looking like the incompetents that they are.
Face it, "NoSQL" does mean "No SQL". It always has, and it always will. No amount of backtracking will change the fact that the NoSQL crowd was full of shit, and still is.
Summary says: "It logs the non-clinical information on 80 million people in Britain"
Well, yes it does hold clinical information. That is a big deal.
From the UK's HSCIC web site there's more (and authoritative) information on SPINE
http://systems.hscic.gov.uk/sp...
"The Summary Care Record:
SCRs provide emergency and out-of-hours healthcare professionals with faster access to key clinical information, including details of allergies, current prescriptions and bad reactions to medicines. The Summary Care Record helps to ensure continuity of care across a variety of care settings, and is provided by the Spine."
Having corrupt information in a clinical record, or losing information from it, is a good way to kill some random person. However, it is a summary, so if a physician suspects a problem in the summary, they can go to the patient's main record. Getting prescriptions crossed can also be problematic for the patient.
Ignoring the NoSQL issue, I wish we had something like SPINE here in the USA.
"It logs the non-clinical information on 80 million people in Britain " when the population of Britain is about 64 million.
I just interviewed with one of the largest healthcare focused tech companies in the US, Epic Systems. One of the more interesting things I learned while I was there was that they use InterSystems Caché, a non-relational system that's built on b-trees instead of tables. The main draw of this system is the speed at which they are able to operate, which is one of the big things they've built their reputation on. They claimed while I was there that roughly 47-49% of Americans are covered by Epic's software at some point. Now, obviously that's not just records stored in databases they designed, implemented, and support, but, especially considering that Epic targets medium to large healthcare companies, with very little involvement with smaller outfits, and the fact that they do their best not to parcel out their software, but to sell integrated top to bottom systems... well, they seem to not only be doing fine without a relational system, but thriving. I don't work for them, so I can't say any more than that since I don't have experience, but I just thought it might be of interest in relation to the relational/non-relational debate in this thread.
That dropping ACID is not hazardous to your health.
I know I shouldn't feed the trolls, but I'm bored and can't help myself.
ACID isn't so important when the unit is a patient's records. There is also no need for a rigid data model.
This is unbelievable. Holy fuck, I sure hope that you don't work with databases professionally. I hope you don't work with them as a hobby! Nobody with an ounce of intelligence and even a minute of working with data would ever consider saying something as utterly stupid as what you just said.
As someone who actually has worked with patient data in hospitals, he is pretty on the money regarding the non-structured nature of some patient records. Full ACID compliance is not that important in many cases, often a proper audit trail will suffice. It is similar to banking transactions, which are almost never ACID (despite being used in so many textbook examples of ACID compliance).
One difference between an amateur and a professional is knowing how to balance a system's requirements and create a design that actually fits the system's needs. Strict adherence to some guidelines is just plain stupid.
-- All that is necessary for the triumph of evil is that good men do nothing. -- Edmund Burke
I use the Electronic Prescription Service (EPS) component of the spine and take issue with the claim of success. The upgrade has been appalling.
It was rolled out over the UK's August bank holiday, with no advance notification. After the holiday, prescriptions pulled down from the spine (they haven't implemented push messaging ... ) had invalid digital signatures, rendering them illegal. Prescriptions that had been completed and payment claimed for in Jan 14 were redownloaded from the spine. Post-dated prescriptions for October also began appearing. These are only supposed to be downloadable on or after the valid date, for obvious reasons.
Not only was this a logistical nightmare, some issues are still broken after two weeks.
I am amazed that so many issues got through testing.
Utter shambles.
Do you even know what ACID means?
Atomicity - either everything is committed or nothing is. I find it crucial to avoid inconsistent states.
Consistency - data is valid in all states.
Isolation - ensure that concurrent transactions work correctly.
Durability - no data is lost due to software crashes or power failures
How these could not be important for banking is beyond me.
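To make the atomicity point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the names and the `transfer` helper are made up for illustration; nothing here comes from Spine or any real banking system. It only shows the "everything is committed or nothing is" behaviour: a transfer that would overdraw the account is rolled back as a whole.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Hypothetical business rule: abort, undoing both updates above.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

Calling `transfer(conn, "alice", "bob", 500)` returns False and leaves both balances untouched, which is the inconsistent half-applied state that atomicity is meant to rule out.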
If you're storing data, you need to use a system that provides atomicity, consistency, isolation and durability. Using anything less is pure idiocy. [etc, etc]
They are using Riak which is currently being used by 25% of the Fortune 50 (fifty, not five-hundred).
The CAP theorem states there is a trade-off between Consistency, Availability, and Partition tolerance. Riak sacrifices consistency (although it does have eventual consistency) in favor of availability and partition tolerance. The people who wrote Riak (in Erlang) actually seem to be very smart. They say they are firmly in the "right-technology-for-the-right-job" camp. They are not crusading to replace all RDBMS with NoSQL.
The availability and partition tolerance of Riak are amazing. For certain applications these strengths greatly outweigh sacrifices in atomicity and consistency. Due to the CAP theorem, there is no one single database architecture that will be optimal for all applications. Granted, a completely different mindset is needed to use Riak if your previous database experience is all RDBMS.
From a cursory look, Riak seems to have some excellent documentation. I suggest you look at their page that explains the trade offs between using Riak and a traditional RDBMS. It also contains links to similar documentation.
We don't see the world as it is, we see it as we are.
-- Anais Nin
I also suggest you read CAP Twelve Years Later: How the "Rules" Have Changed by Eric Brewer. He concludes with:
In general, because of communication delays, the banking system depends not on consistency for correctness, but rather on auditing and compensation. Another example of this is "check kiting," in which a customer withdraws money from multiple branches before they can communicate and then flees. The overdraft will be caught later, perhaps leading to compensation in the form of legal action.
You can claim Eric Brewer is a fucking idiot as much as you want. Eventually all you will do is destroy your own credibility.
I have worked for a health insurer in the UK that treated ACID compliance as a bonus, not a requirement. At the time I left them, they had a whole "data correction team": 12 people working full-time running live SQL queries to fix database inconsistencies. I wish I were making this up, but it's real. If this is considered acceptable practice, I don't want to work in this industry ever again.
There are certain ways ACID compliance is important and certain ways it is not; in fact, sometimes it's a hindrance. In particular the following:
One patient's records must be consistent only with themselves; you don't need the whole patient table to be consistent. It's a problem because you do need to have cross-table consistency (patients, episodes, diagnoses, treatments, medications and so on), which can lead to locking issues, while they're really millions of records living in parallel. Really I'd like to treat them as millions of microtables that happen to share the same structure but never cross-lock.
Perhaps in a hospital you can do synchronization at a database level, but for an exchange or common journal you have to assume records come in asynchronously: your general physician might finish some paperwork while you're in emergency care, at the same time as a lab result you've waited a week for comes in. The actual order they're applied in doesn't matter; there must be rules so that (A,B,C), (C,A,B) and (B,C,A) all end up with the same result. This means you can relax the hard synchronization of, for example, a bank account, where it is essential that the transactions are applied in order and rejected if you're overdrawing your account, but that's hard in SQL.
That doesn't just apply to the ordering of writes but also to querying. If two people at different hospitals try to pull up your medical records, it is important they're not corrupt, but it's not essential that an update being distributed is presented to both or none. In fact, for essential robustness they should be able to continue working independently if the connection is broken, and when the connection is restored the records are reconciled. That kind of shard and merge is generally a problem relational databases don't handle, while the distributed synchronization is rather essential and implicit in NoSQL solutions.
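The "(A,B,C), (C,A,B) and (B,C,A) all end up the same" rule can be sketched in a few lines of Python. This is only one possible rule (last writer wins per field, ordered by a timestamp and a source name), and the field names, timestamps and sources are invented for the example; I am not claiming Spine or Riak resolves conflicts this way.

```python
from itertools import permutations

def apply_update(record, update):
    """Merge one update into a record; the outcome does not depend on arrival order."""
    field, stamp, source, value = update
    current = record.get(field)
    # Keep whichever version carries the larger (timestamp, source) pair.
    if current is None or (stamp, source) > (current[0], current[1]):
        return {**record, field: (stamp, source, value)}
    return record

def merge(updates):
    """Fold a sequence of updates into a single record."""
    record = {}
    for update in updates:
        record = apply_update(record, update)
    return record

# Hypothetical updates: (field, timestamp, source, value).
A = ("current_medication", 3, "gp", "drug X")
B = ("current_medication", 5, "emergency", "drug X, drug Y")
C = ("lab_result", 4, "lab", "negative")

# Apply the same three updates in all six possible orders.
results = [merge(order) for order in permutations([A, B, C])]
```

Every entry in `results` is the same record, so two hospitals that received the updates in different orders, or only reconnected later, still converge once each has seen all three.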
Live today, because you never know what tomorrow brings
If a bank doesn't care about ACID, which means it doesn't care about losing completed transactions, which means losing track of *OUR* money so they can get more profit.
Perhaps this is where you have gone astray. The opposite of ACID is BASE where the "E" stands for eventual consistency. The beauty of this is that it DOES NOT lose completed transactions and at the same time it allows for high availability.
Strict consistency (the "C" in ACID) is a much more stringent requirement than eventual consistency. In particular it conflicts with high availability. This is the essence of the CAP theorem. In many industries, including banking, eventual consistency plus high availability (NoSQL) is preferable to strict consistency plus lower availability (RDBMS). Of course there are many other factors involved in selecting a database architecture.
One way to see this is by noting that the three typical things you can do at an ATM (deposit, withdrawal, and show balance) commute (in a sense) when you are only worried about eventual consistency, but they don't commute when you require strict consistency. This is why relaxing the requirement to eventual consistency gives you higher availability (when the database is partitioned). Transactions can be logged and later merged when the partition has healed. It is true that "show balance" does not strictly commute with deposits and withdrawals but: a) this does not cause the system to lose track of your money, and b) no one expects it to strictly commute. There is usually a warning that it may take X hours or days before a transaction shows up on your balance. IOW the balance will eventually be correct after you stop making transactions.
The strict consistency alternative you think is better will mean that all ATMs have to stop working whenever the database is partitioned. For most customers this is totally unacceptable especially since the only value it adds is ensuring that the "show balance" function always includes all of the latest transactions. Even the average person on the street would tell you this approach is really "stupid". No one wants the ATMs to be broken most of the time just to be sure "show balance" is always perfectly up to date.
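Here is a rough Python sketch of the "log now, merge when the partition heals" idea. The transaction ids, amounts and function names are all hypothetical, and real banking systems are far more involved; the point is only that deposits and withdrawals recorded on both sides of a partition can be merged in any order without losing any of them.

```python
def merge_logs(*logs):
    """Union transaction logs from several replicas, de-duplicating by transaction id."""
    merged = {}
    for log in logs:
        for txn_id, amount in log:
            merged[txn_id] = amount  # the same id always carries the same amount
    return merged

def balance(opening, merged):
    """Opening balance plus every logged deposit (+) and withdrawal (-)."""
    return opening + sum(merged.values())

# Hypothetical logs from two ATMs that could not talk to each other for a while.
atm_a = [("a1", +200), ("a2", -50)]
atm_b = [("b1", -80)]

stale_view = balance(100, merge_logs(atm_b))           # what ATM B shows during the partition
final_view = balance(100, merge_logs(atm_a, atm_b))    # after the partition heals
same_view = balance(100, merge_logs(atm_b, atm_a, atm_a))  # other order, plus a replayed log
```

During the partition ATM B's "show balance" is stale, but no completed transaction is lost, and once the logs are exchanged every replica computes the same figure regardless of merge order or repeated delivery.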