NYT: Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice
First time accepted submitter conoviator writes "The NY Times has just published a piece providing more background on the healthcare.gov software project. One interesting aspect: 'Another sore point was the Medicare agency's decision to use database software, from a company called MarkLogic, that managed the data differently from systems by companies like IBM, Microsoft and Oracle. CGI officials argued that it would slow work because it was too unfamiliar. Government officials disagreed, and its configuration remains a serious problem.'" The story does not say that MarkLogic's software is bad in itself, only that the choice meant increased complexity on the project.
Who owns this company?
how much do they contribute to XXX???
There has got to be some reason that this DB that I've never even heard of (and I work with DBs; it's not my main line of work, but I know my way around them) got the gig over the more established players.
Or perhaps they went with it because it is less well known, which reduces the risk from attacks known to work against other DB systems?
have you seen my sig? there are many others like it but none that are the same
Maybe they should choose better code names for their projects. This reminds me too much of Project Mayhem.
FTA: "An initial assessment identified more than 600 hardware and software defects — 'the longest list anybody had ever seen,' one person involved with the project said. "
It strikes me that none of these people seem to have ever worked on large projects before.
I do not approve my taxes being spent for this.
You could take a handful of proven DB technologies such as Oracle/DB/MSSQL, throw a web (Apache/IIS) and app (.Net/WAS/Jboss) front end to it, and it would work. Why did these guys fuck up the whole thing? It's like the scene in The Fountainhead when the second-rate architects smash up the plans and add their own stuff, "to express their own individuality". This could have been a solved problem - hell, it WAS a solved problem.
I want to delete my account but Slashdot doesn't allow it.
It's a clusterfuck. And it was designed from the start to be a clusterfuck because half the people involved wanted it to fail. So they could blame the other half.
A little googling turns up that MarkLogic's offering is NoSQL. Without getting into the whole SQL/NoSQL debate, I can't help but note that this is clearly relational data with a fairly limited number of records (clearly there can't be more than 300M people listed) and for which ACID is (or should be) a major concern.
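To make the ACID point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables (person, plan, enrollment) and the enroll helper are made up for illustration and have nothing to do with the real healthcare.gov schema. The idea is only that the counter update and the enrollment insert either both happen or neither does.

```python
import sqlite3

def make_db():
    """Build a tiny in-memory schema (hypothetical, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE plan (id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
        "enrolled INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE enrollment ("
        "person_id INTEGER NOT NULL UNIQUE REFERENCES person(id), "
        "plan_id INTEGER NOT NULL REFERENCES plan(id))"
    )
    with conn:  # commit the seed rows
        conn.execute("INSERT INTO person (id, name) VALUES (1, 'Alice')")
        conn.execute("INSERT INTO plan (id, name) VALUES (10, 'Bronze')")
    return conn

def enroll(conn, person_id, plan_id):
    """Bump the plan counter and record the enrollment as one atomic unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE plan SET enrolled = enrolled + 1 WHERE id = ?", (plan_id,)
            )
            conn.execute(
                "INSERT INTO enrollment (person_id, plan_id) VALUES (?, ?)",
                (person_id, plan_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Duplicate enrollment or unknown person/plan: the counter update
        # above has already been rolled back along with the failed insert.
        return False
```

If the same person tries to enroll twice, the second call fails on the UNIQUE constraint and the counter bump is undone with it, so the count never drifts from the actual rows.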
This is the problem every time you try to use some nifty new tech that hasn't matured yet. Heaven forbid your skills lead decides to leave for greener pastures (and believe me, there are a lot of greener IT pastures than government contracting...), you're stuck trying to replace him or her and you soon find out that there's only 10 people in the entire universe that actually are proficient with WizBangDB.
And here is a fantastic example of what happens when hype trumps common sense. NoSQL is the new hawtness, and apparently the dumbasses running the project wanted to be part of that. Now MarkLogic, and NoSQL in general, will have a massive blow to their reputations, and it's unknown how badly this will hurt them.
As someone who has done databases for a long time, I have very little respect for NoSQL, but that is mostly because everyone keeps trumpeting it as a killer of traditional databases. There are scenarios where NoSQL systems are an ideal fit. However, NONE of those scenarios require data to be very reliably stored in a guaranteed and predictable way.
If you don't get your tweets or your friends' Facebook posts as soon as they are posted, no one will really care. But for something as truly important as health insurance coverage? Are you f__king kidding me? And that's just from a reliability standpoint. Never mind the fact that NoSQL is currently at the wild west stage where nobody is compatible with anybody else and there is nothing resembling a standard set of APIs between products, making it very difficult to develop expertise.
Sounds like the Gov was just begging for problems.
"Some people, when confronted with a problem, think 'I know, I'll use XML.' Now they have two problems."
-JWZ
MarkLogic is an XML database, not a relational database, so if your data primarily consists of XML content then it's the right tool for the job. Sounds like the vendor building the system had a favorite hammer and decided that a rather traditional database problem looked like a nail.
MarkLogic itself is fine if your data fits neatly into an XML schema, but with healthcare.gov that tree is probably enormous and hard to optimize for DB activity.
Software Shouldn't Suck
E-mail: frank at jacquette dot spamless com (remove the spamless!)
Have no fear, citizens! All that glorious leader Obama needs to do is declare that the website must get fixed by the end of November and by golly, it WILL get fixed by the end of November. ...And then he'll amend it to say "by the end of December -- PERIOD!" ...And then he'll later say months later when it is still in shambles "In all fairness I didn't say December 2013".
Would make for interesting reading
"...you soon find out that there's only 10 people in the entire universe that actually are proficient with WizBangDB."
And one of those 10 is in the Tardis, asking the Doctor what went wrong with healthcare.gov
Well they did have the flashiest ads which used lots of buzzwords.
Undetectable Steganography? Yep, there's an app fo
Anyone who has worked for anything in the govt knows most contracts are a formality. The deal is done before the job is published seeking bids.
Why not just use MySQL, MongoDB or MariaDB? At least use a database system that has good support, an easy learning curve and loads of followers. That being said, proper testing would easily have mitigated this entire issue.
I can create an account, but I can't fill out the application form. The website gives me an error message when I click the button to go to the next page. arg
CGI = Computer-generated imagery
Just when I think.. wow.. you couldn't screw this up any worse... they step up and prove me wrong.
The guy who Obama appointed the first US CTO with great fanfare? He was supposed to be heavily involved in getting stuff like this right.
Oh right, he resigned in 2011 to pursue a job in the private sector. I don't think that absolves him of responsibility. He should've stayed to get the health care rollout right, then he could leave.
I've lost count of how many projects I've been on where the architect decides to "make his mark" by using unconventional design choices. Then the project gets stuck in a dev hell where the actual developers struggle with either integration headaches or difficulties with the code not acting like they expect. There's something to be said for plain vanilla.
I worked as a contractor developing a system at FDA. It lasted for 5 years. Inside the Beltway, it's pretty much the same all over. Dysfunctional communications and ridiculous breakdown of authority not corresponding to lines of management. No accountability. Project management requirements that have never been followed by any project. No commitment to the output of requirements gathering. No requirements change control. No performance engineering. Inadequate testing. No acceptance process by the government. IT groups with oversight for contractor output that have never written a line of code. All in all, pretty sick and ugly. Prior to my project there were 5 failed attempts. My project followed PMI practices, worked them hard and succeeded.
See: http://assets1.csc.com/innovation/downloads/LEFBriefing_MarkLogic_031512.pdf (slide 23)
My two cents:
From the summary:
"The story does not say that MarkLogic's software is bad in itself, only that the choice meant increased complexity on the project."
Unless that complexity was necessary to solve a problem, then it is in fact bad.
We were looking at what we could figure out about the architecture of healthcare.gov, and one problem is that it looks like it's using Oracle Identity Manager to manage the permissions of what users can/can't see. That means that OIM is burned in - and it's probably brutally slow, since every time you need to check a permission you go through OIM.
I'm not positive that's the case, but it fits given what pieces of the architecture I've seen. It would also explain why the system doesn't perform - permissions checking is always brutal, especially if you don't cache them. Caching permissions has more issues.
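A rough sketch of the trade-off, in Python. Nothing here is Oracle Identity Manager's API: backend_check stands in for whatever slow permission lookup the site really does, and CachedPermissions is a made-up name. It caches each answer for a fixed time-to-live, which is where the "more issues" come from: a revoked permission stays visible until the entry expires or someone invalidates it.

```python
import time

class CachedPermissions:
    """Wrap a slow permission check with a simple time-to-live cache."""

    def __init__(self, backend_check, ttl_seconds=60.0, clock=time.monotonic):
        self._check = backend_check      # hypothetical slow call: (user, resource) -> bool
        self._ttl = ttl_seconds
        self._clock = clock              # injectable so the cache can be tested
        self._cache = {}                 # (user, resource) -> (allowed, expires_at)

    def allowed(self, user, resource):
        key = (user, resource)
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]                # fresh entry: skip the backend entirely
        result = self._check(user, resource)
        self._cache[key] = (result, now + self._ttl)
        return result

    def invalidate(self, user):
        """Drop every cached answer for a user, e.g. after a role change."""
        for key in [k for k in self._cache if k[0] == user]:
            del self._cache[key]
```

A short TTL bounds how stale an answer can be; explicit invalidation fixes it immediately but only works if every place that changes permissions remembers to call it.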
It is NEVER the tool, but who and why it was chosen.
Furthermore, MarkLogic is a legitimate NoSQL vendor with strong ACID.
So the question is "how does MarkLogic screw up the site?" Without the answer to that question, we should all refrain from pointing fingers at what is merely a small piece of a huge software project.
NoSQL is NOT the new hotness. It's been out there for at least 5 years and many successful projects are using it, so for those who think NoSQL shouldn't be used: wake up and breathe some fresh air.
Is MUMPS used by government agencies as a healthcare database?
It's even more basic than that. According to TFA the design goal was: "creating a cutting-edge website that would use the latest technologies to dazzle consumers with its many features. "
In other words, even if they had gotten this thing up and working as they wanted, right on time, it would have been an accessibility nightmare that would never work unless you have a brand new computer configured with no security anyway.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Friends don't let friends enable ecmascript.
"The story does not say that MarkLogic's software is bad in itself, only that the choice meant increased complexity on the project. "
But the subliminal take away is that MarkLogic was used and things went poorly. I feel for the people at MarkLogic in that they may succumb to someone else's stupidity even though their product is fine.
So why that decision? Whose family member is in that company?
"If any question why we died, Tell them because our fathers lied."
So the private-owned company has "officials". Cool...
none
First Lady Michelle Obama’s Princeton classmate is a top executive at the company that earned the contract to build the failed Obamacare website.
Toni Townes-Whitley, Princeton class of ’85, is senior vice president at CGI Federal, which earned the no-bid contract to build the $678 million Obamacare enrollment website at Healthcare.gov. CGI Federal is the U.S. arm of a Canadian company.
Read more: http://dailycaller.com/2013/10/25/michelle-obamas-princeton-classmate-is-executive-at-company-that-built-obamacare-website/#ixzz2laSNJyGs
'I don't know what it's called. I just know the sound it makes, when it takes a man's life.' ~ Four Leaf Tayback
I've used MarkLogic; even took MarkLogic training at their office in bay area. ... WTF .
While it's been a while since I last used MarkLogic, and I wouldn't rate my comfort level with it as more than intermediate, still:
MarkLogic IS quite excellent at handling large XML content repositories. At the last project I worked on, at a publishing company, they used MarkLogic to sift through & transform content on the order of many TB.
But HealthCare.gov ??
If there was ever a case of wrong tool for job ... this appears to be it.
Big Federal government contract for something that hasn't been done before, health care industry lobbyists hovering, little known NoSQL vendor chosen, storing everything as XML.
But how would a traditional relational database scale to the 1 billion, or 1,000 billion users, huh? Did you think about the need to future-proof the application?
Current population of the U.S. is a little over 300 million. That includes children, people who have company provided health insurance, etc. who don't need to access HealthCare.gov so that the number of users of HealthCare.gov is expected to be about 30 to 45 million people.
The only way HealthCare.gov would need to support 1 billion or more users would be if we inflicted it on China or India. That could lead to war or poor technical support and customer service. Although the latter does raise an interesting question: who does someone at a call center call when they need technical support? Likewise, the population of the planet is between 7 and 8 billion. For the 1,000 billion users, are you planning on also providing health insurance to E.T.?
Cheers,
Dave
They that can give up essential liberty to obtain a little temporary safety deserve neither safety nor liberty.
Ben
I get there must be tons of complexity in managing healthcare.gov site interacting with all necessary stakeholders... Must be quite a lot of different databases, systems, operators to say nothing of complexity of working the actual problem space. I can understand how there might be glitches that cause wrong rates and plans to be communicated.
What I am still puzzled by are the "waiting rooms" with 40k people waiting to use a site. What the hell can justify it being so computationally expensive to spit out a list of plans? Does the universe need to be recalculated every time someone signs up? Is there some manner of massive graph problem needing to be solved for each user? If you ran a profiler on the web site what would it most be spending its time doing?
Last I checked the government can't tell contractors what software to use, they can only define requirements. It's up to the contractor to satisfy them. Now that's not to say that they can't get super specific with the requirements so that one product is the only one that fits, but on something of this scale I would be surprised if they did since it would open them to intense scrutiny. One of the contractors chose unwisely and screwed the others.
It uses NoSQL. Which does add a serious amount of complexity to the picture. SQL by comparison, row based, column addressable and relational. Gee, what could be so wrong with that?
The system had to coordinate with multiple independently run systems from multiple agencies and wasn't led by anyone with authority to dictate terms or experience in ugly systems integration.
After 30+ years as a database expert, teacher (I have trained engineers for many major companies in SQL and relational database theory), developer, using a NoSQL database for this sort of application is, in my opinion, just plain stupid! I do use NoSQL/Hadoop databases for other purposes, but this is a classical example of a transaction database, and SQL is very well suited for that purpose. I have developed systems that process millions of queries per hour, joining many tables, and these systems run most major semiconductor, flat-panel display, and disc drive manufacturing plants world-wide today.
The current crop of NoSQL databases are a reversion to the old network databases of the past, but with some "enhancements". They are NOT good for ad-hoc queries and indexed searches. SQL is very good for that. So, IMHO, someone who thought NoSQL was a good idea had the pull to force it to be used as the database access methods for ACA. Whoever they are should be taken out to the woodshed and taught a painful lesson!
As a European living in a country with a fairly well working public health system (Sweden), and with a tech background, I am amazed that Slashdot and other tech media are behind the curve on this story. Why are the NYTimes and other mainstream media the ones running with this and finding the background etc? Damn common sense tech people in the U.S.! Get on top of this to fix it!
Die dulci fruere. Have a nice day.
The real problem with this website is too much ambition. They should have started with the simplest working project, something like Health Sherpa, proved that it worked, and then tried scaling up from there. That's the Lizzie Borden school of web design. Cut your design down to the bone. Just when you think you're done, no matter -- there's always one more whack you can make!
I felt bad for the man who had no signature, until I met a man who had no comment.
Healthcare IT has bigger problems.
The #1 EMR (clinical information system) being deployed across the nation is called EPIC.
It's written in MUMPS and runs on a non-relational database called Caché.
The problem is this:
Government officials like issuing "cost plus" contracts because they allow them to remain in control of the work while not doing any of it themselves. They probably wanted to use it because they had used it in other projects in the past. If they'd just issued a fixed-price contract and said "get it done", the contractor would have selected a database they know and everything would have been much easier for them.
Now they got you arguing about technical details of the issue, making you accept this state-imposed bullshit healthcare implicitly.
Since the days computers first came into common use there must have been tens of thousands of database programs written. There are probably thousands of database projects underway as you read these words. And somewhere today, at least one of them is failing. ... Databases are wheels that have been re-invented so often that many veteran developers could stumble through such projects with their eyes closed. Yet these efforts sometimes still manage to fail.
MarkLogic is a good product when used as intended. Bad use case here. Like driving in wood screws with a hammer - defeats the intended purpose of the screws and causes things to fall apart.
VAT is disproportionally more painful for those without means than those with a lot of money.
I've used it, personally, to implement a public-facing website. That site endured the dreaded 'slashdot effect' several times. No failures.
When implemented properly, Marklogic is damn near unkillable. It will slow down, it will reject connections when queues are full, but it will not fall over. Naturally, this assumes proper underpinnings and capacity calculations. With Marklogic, those are actually documented.
Mandatory disclosure: I do not have and have never had any association with Marklogic other than a paying customer.
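For anyone wondering what "reject connections when queues are full" looks like in practice, here is a toy Python sketch of that kind of load shedding. It is not MarkLogic's implementation; BoundedServer and its methods are names invented for this example. The point is that a fixed-size queue turns overload into fast, explicit rejections instead of unbounded memory growth and an eventual crash.

```python
import queue

class BoundedServer:
    """Accept work only while the pending queue has room; otherwise refuse."""

    def __init__(self, max_pending):
        self._pending = queue.Queue(maxsize=max_pending)
        self.rejected = 0

    def submit(self, request):
        """Return True if the request was queued, False if it was shed."""
        try:
            self._pending.put_nowait(request)  # never blocks
            return True
        except queue.Full:
            self.rejected += 1                 # caller can retry or show "busy"
            return False

    def process_one(self):
        """Take the oldest pending request, or None if nothing is waiting."""
        try:
            return self._pending.get_nowait()
        except queue.Empty:
            return None
```

Sizing max_pending is the capacity calculation: too small and you turn away users you could have served, too large and requests sit in line longer than anyone is willing to wait.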
A few data points: ...
* MarkLogic does support XML and JSON really well, but it does not use either as its on-disk format.
* MarkLogic is fully ACID using MVCC
* MarkLogic is a scale-out cluster architecture that supports replication, failover, HA/DR, and XA transactions
* MarkLogic supports database operations as well as full-text and structured search
* It's often used when the data to be managed comes from lots of different sources, with lots of different schemas (as is the case to the nth degree with this system)
* It takes time to get used to ML, but after about a month, it's highly productive
* It's been around for 10 years and is used in some extremely demanding and mission-critical apps
It's long, but if you want to know more about the database, take a look at "Inside MarkLogic Server": http://developer.marklogic.com/pubs/architecture/inside-marklogic-server-r7.pdf
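Since MVCC comes up in the list above, here is a toy Python illustration of the general idea. It is only a sketch of multi-version concurrency control in the abstract, not a description of how MarkLogic stores data, and MVCCStore is a hypothetical class. Writers append new timestamped versions rather than overwriting, so a reader holding an older snapshot keeps seeing consistent data without blocking anyone.

```python
class MVCCStore:
    """Keep every committed version of a key, tagged with a commit timestamp."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical timestamp of the latest commit

    def snapshot(self):
        """Return a timestamp that fixes what a reader is allowed to see."""
        return self._clock

    def commit(self, writes):
        """Apply a dict of key -> value as one new version; return its timestamp."""
        self._clock += 1
        for key, value in writes.items():
            self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None
```

A reader that took its snapshot before a later commit still gets the old value for the whole of its query, which is what lets reads and writes proceed without locking each other out.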
A few weeks ago, the press did a bunch of reporting that the Gov had reached out to Google and Oracle to come help fix their website. Haven't heard anything else since. Anybody know what happened with that move?
Given what was in this article, I wouldn't be surprised if the Google engineers showed up, looked at the mess and said, "yeeah. good luck with that." and walked out. Maybe even said, "We'll help you if we get to throw out everything and rewrite it clean in a couple of months and host it on the Google-plex."
Bell, CEO of MarkLogic is the Gay Lover de jour of Obama.
NoSQL and Notes types databases are ideal when volume is low, and variety is high and needed values change frequently. Seems to me the data they were collecting and presenting are ideal SQL candidates, and terrible NoSQL candidates. Even if the DB is not bad, it IS a very bad choice for this type of application. Although I'm sure the salesperson/campaign contributor would disagree.
I'm sure it is more difficult to 'hack' simply because it is less familiar to hackers, but validated input and well patched servers are a better way to prevent getting hacked.
(If at first you don't succeed, do it different next time!)
I don't understand why we as Americans keep giving up what's made our country great and what all others wanted to be: our freedoms! I mean, the Federal gov should only have two jobs, 1-security and 2-international relations; otherwise each state should be its own state. We are no longer the United States of America, we're just America, and we're looking less n less like a capitalist nation n more communist over the last 10 years. Are we really living in that much fear that we'd sell our souls?
I used MarkLogic at a customer for over 8 years and it's the best technology we used by far. This is a blame-fest and I never had any problems with the technology. MarkLogic scales really well and our website had over 300 hits per second AVERAGE to it daily and MarkLogic was never slow and always performed. In something like this I would look towards the group doing the finger pointing, that's usually a good place to start!
They went the super cheap unreliable route...
Starting in 2009, the Government's data strategy has emphasized data interoperability across organizations. MarkLogic has a fine product that optimizes XML communications between and among organizations. Because the public sites' data stores would be transmitted among the states' repositories, it's easy to see that data strategy would center on exchanges in the chosen lingua franca. Contrary to many of the posts here, I will not take issue with XML. Using it is complex, and it requires verbose communications, certainly - but the real issue with the healthcare.gov fiasco was too many moving parts. I have never seen a fully functional XML capture-and-exchange DB, but I have seen several long, drawn-out projects that sought to produce such a DB as their end product. For a mission-critical system such as healthcare.gov, they should have built two prototypes, one with XML architecture as the to-be model, and the other with a functional architecture that could evolve into an XML structure - but engineered for risk reduction. The real problem here was ineffective risk management, and the ship just kept running toward the rocks while the helmsman held the wheel steady.
I should make clear that I'm posting on the basis of personal experience. I have no particular ties to MarkLogic (though companies I've worked for do use the technology), or the US government.
Firstly, the NYT article is poorly researched. My sources tell me that MarkLogic Server is far from the only data storage technology and vendor involved, and that while the MarkLogic-powered aspects of the system did require some remedial action, those aspects were not central to the publicised problems. Remember when MS used their press contacts and marketing clout to smear OSS on the server back in 1999? I wouldn't be surprised if that's going on here.
Secondly, I have it on good authority that the primary points of failure upon launch were related to middleware connecting the modules that were in fact using RDBMS technology.
Thirdly, many Slashdotters will be learning of MarkLogic Server for the first time through this article, and I urge them to give the technology a go before making a judgment call. While XML is the (apparent) native storage format and XQuery is the native language, the actuality is somewhat different from what might be assumed. For one thing, the native storage format is not raw XML, but a fully-indexed compressed format which provides a decent compromise between storage requirements and rapid query/retrieval. Additionally, while XQuery is still the native language used by the technology, there has been a significant effort to provide a usable interface to the data through REST, native Java and even basic SQL. Storage is journaled, implemented as an MVCC system and ACID-compliant in a way that no other "NoSQL" platform can offer. As an example of the platform's resilience under load, I am reliably informed that the BBC's real-time online coverage of the 2012 London Olympiad (which left other news and broadcast organisations in the dust) was powered by MarkLogic Server.
Just like Linux and the OSS BSD implementations that were creaming Microsoft's NT server implementations at the turn of the millennium, alternative data storage, query and search technologies are now challenging the old guard, and the old guard are running scared, and I'd bet significant money that the NYT sources are from the established RDBMS vendors. I've been contracting with employers who have been using MarkLogic Server for the best part of a decade, and IMO the technology represents the most viable threat to the RDBMS hegemony and the vendors that rely on it that has ever existed. This opinion is not based on hype, it's based on nearly ten years of experience with the technology and fourteen years of experience with RDBMS development strategies.
I'm not preaching, in fact I urge you all to make your own minds up.
- "How do we do it? Volume!" - The Bursar of Unseen University.
http://www.opensecrets.org/orgs/summary.php?id=D000043843&cycle=A