Giving Doctors Grades Has Backfired
HughPickens.com writes: In the early 1990s, New York State began a quality-improvement program, since adopted by many other states, that issued report cards intended to improve cardiac surgery by tracking surgical outcomes, sharing the results with hospitals and the public, and, when necessary, placing surgeons or surgical programs on probation. But Sandeep Jauhar writes in the NYT that the report cards have backfired. "They often penalized surgeons, like the senior surgeon at my hospital, who were aggressive about treating very sick patients and thus incurred higher mortality rates," says Jauhar. "When the statistics were publicized, some talented surgeons with higher-than-expected mortality statistics lost their operating privileges, while others, whose risk aversion had earned them lower-than-predicted rates, used the report cards to promote their services in advertisements."
Surveys of cardiac surgeons in The New England Journal of Medicine have confirmed that reports like the Consumer Guide to Coronary Artery Bypass Graft Surgery have limited credibility among cardiovascular specialists, little influence on referral recommendations and may introduce a barrier to care for severely ill patients. According to Jauhar, there is little evidence that the public — as opposed to state agencies and hospitals — pays much attention to surgical report cards anyway. A recent survey found that only 6 percent of patients used such information in making medical decisions. "Surgical report cards are a classic example of how a well-meaning program in medicine can have unintended consequences," concludes Jauhar. "It would appear that doctors, not patients, are the ones focused on doctors' grades — and their focus is distorted and blurry at best."
How could no one have foreseen the potential abuse and pitfalls of a system like this? Without even reading any further than "Giving Doctors Grades..." I immediately conjured images of a bunch of doctors huddled around each other saying, "I don't want that one." "Well I don't want that one either. My feedback is back at 85% and I can't risk another death screwing me over."
Bad metrics lead to bad results. Who would've guessed?
Gotta go, must write a million lines of code so I am "productive".
That's the problem with using metrics as incentive. You'll find people caring more about the metrics rather than the outcomes that are actually important.
I think that this Dilbert comic captures the idea quite well.
In NY, where I live, we're now "grading" teachers based on how well their students do on standardized tests. Any teacher who strays from the "prep for the test" subject matter and uses inventive ways of helping their students learn is going to have students who might know more, but who will perform worse on the tests. Teachers who stick to the script and drill test preparation into their students will wind up with better scores even though their students will know less (except how to fill in bubbles).
Just like the Doctors example in the article, the "teacher grading" system is going to backfire. Talented teachers will be kicked out (test scores are tied to their jobs now, your students get low scores and you're out) and mediocre teachers will remain. It's almost like trying to take the jobs that teachers and doctors do and standardize their job functions across every student/patient they see doesn't work. Maybe because their jobs require using their brains and trying different techniques as opposed to an assembly line worker who just needs to perform the same task every time with no variation.
My sci-fi novel, Ghost Thief, is now available from Amazon.com.
Wasn't this addressed by the Scrubs TV show years ago?
Competence isn't being measured here. The altruistic goals, "live" or "dead", are instead supplanting good science in determining which doctors are and are not performing well. Death is not objectively bad in cases where it is an unavoidable consequence of environment or genetics. Quality of care and quality of life, the two metrics doctors have always used, are far better judges of performance. If a 78-year-old chronic smoker dies from emphysema, then it is of little use to chastise a surgical team or doctor for the death.
Good people go to bed earlier.
It's still a good idea, but the metrics need to be better thought-out to account for the patients that are being seen. A proper system will also "grade" each patient based on how bad their condition is, and then combine the mortality rates to come up with a metric that reflects how well the doctor is doing at improving outcomes where it is possible to do so.
This sounds tough, but how much do the high-risk, low-success operations being done contribute to the high cost of health care in the US? Maybe in some of the high-risk situations somebody needs to say no. Sorry, but costs are out of control and we need some realistic assessments. On a similar note, it's been some years since I've heard people say, 'I don't care how much it costs, if it just saved one life it was worth it.'
The doctors are chumps to operate under these rules rigged by the lawyers.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
Painfully obvious that a single metric like this would backfire. A better model is one where we assume (unless demonstrated otherwise) that everyone in the profession at hand is striving in good faith for excellence, then provide mechanisms to self-report errors and close calls without fear of punishment. The body handling this then uses the lessons learned to continually improve the systems and processes that the professionals interact with to lessen the likelihood of impact due to human factor errors in the future. Everyone's weaknesses and experiences in aggregate paint a much better picture of what the ultimate risk mitigation strategy looks like. Check out the airline industry. It works extremely well, and I'm underselling this.
"Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs."- Bill Gates. " Thank god some places don't believe in such madness
Any system that people can game, will be gamed.
That reminds me of a story about a survey ranking healthcare facilities by mortality rates. The best-ranking facility, with 0% mortality, was a dental-surgery center; the lowest-ranking, with 100% mortality, was a hospice.
I imagine that sparked a few interesting conversations with insurance companies about where to send someone for end-of-life care.
When the statistics were publicized, some talented surgeons with higher-than-expected mortality statistics lost their operating privileges, while others, whose risk aversion had earned them lower-than-predicted rates, used the report cards to promote their services in advertisements."
So just grade them on talent and/or risk aversion. The quote above implies that both are identifiable qualities.
^ sarcasm
systemd is Roko's Basilisk.
"They often penalized surgeons, like the senior surgeon at my hospital, who were aggressive about treating very sick patients and thus incurred higher mortality rates," says Jauhar.
It is true, some surgeons who are willing to treat very difficult cases would be adversely graded. But shouldn't there be some mechanism to apply the brakes to aggressive treatment? Some patients, and some of their relatives, will seek treatment even when the situation is utterly hopeless. There are incentives for the doctors and the hospitals to pursue aggressive treatment. So, under these circumstances, is it really bad that these grades are making them reevaluate the cases and be more realistic about the prognosis?
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
With finite resources (credibility), is it more appropriate to treat one very sick patient or three moderately sick ones?
I'm going to admit it -- I first read "Giving Doctors Grades Has Backfired" as "Giving Doctors Grenades Has Backfired."
I think my version was the MUCH better article.
Shut up, sexconker
"If you have nothing to hide, you have nothing to fear." - Every fascist, ever
Doctors don't like bad outcomes. They already balance the possibility of helping the patient against potential bad outcome. Tipping that balance away from the "helping the patient" side seems a little perilous to me.
Sure, but what of the top cardiac surgeon who loses half of his patients but the other half go on to a high quality of life lasting years longer? Sometimes aggressive treatment does offer a fair chance for meaningful recovery.
OTOH, an expensive cancer treatment that buys an extra month of agonized delirium and never results in remission is an example of excess aggression.
How could no one have foreseen the potential abuse and pitfalls of a system like this? Without even reading any further than "Giving Doctors Grades..." I immediately conjured images of a bunch of doctors huddled around each other saying, "I don't want that one." "Well I don't want that one either. My feedback is back at 85% and I can't risk another death screwing me over."
People did foresee it, but that doesn't mean the decision-makers decided against it.
Doctors knew that having hospitals professionally administered as a business would be a nightmare with lots of deleterious effects on patient care, but it's still how the profession has evolved.
The fact is that unless you personally know people who are familiar with a doctor's skill from a medical perspective, it is fundamentally *impossible* to tell if they're any good. And none of those people will talk outside their profession because it's a very private profession in a lot of ways.
I've known about surgeons with world-class reputations who were terrible in the operating room and others who don't have great reputations because they don't publish a lot but are amazing with patients or have amazing surgical skill. You just don't know 99% of the time, so you make the best decision you have with the information you can get. And if it's a major surgery, you don't go to whoever it is your HMO suggests you go to--you actually do some research and ask intelligent questions and consider options and get a second opinion on how best to proceed and reject a doctor who can't answer a basic question or is flustered at being asked and so on.
A neutral rating system is a good idea but it has to be able to normalize for the extent of the diagnosis, and that is a hard problem that apparently wasn't done well here. There are metastasized cancers that are curable and ones that are not, and a whole range of treatability, and lumping them into N3 or N4 after a certain point is going to discourage doctors from operating on harder N4s or harder N3's, for example.
You can also get a really aggressive cancer or the like that a good surgeon can tell under the microscope is incredibly aggressive, that needs to be treated quickly and radically, and the system is really bad about penalizing people for spending the money to do that because the system will say "it just fits into type X," an early-stage small cancer, for example.
These rate-my-doctor sites seem to generally be right on the money. Our first two dentists really sucked, and when I checked them out on these sites the consensus was that they sucked. Then when I read about some doctor losing their licence in my area I will check out their rating, and with a single glaring exception they always have comments such as, "I have no idea where Dr. X got his licence to practice but a crackerjack box would be a good start."
Then when I finally used these sites to find our present doctor and dentist, the sites said they were great, and they were, causing me to add to the chorus of glowing reviews.
'Aggressive treatment', in this context, refers to being willing to operate on a sub-optimal candidate for the surgery because their prospects otherwise are *dire*. (If we don't perform the procedure, they're likely to be dead in weeks.)
Risk-averse surgeons will say, 'I won't do this procedure, because the odds are less than N% that it will work.'
'Aggressive' surgeons will say, 'The odds that this procedure will work are less than N%, but I'm willing to give it a shot if you are.'
Assuming equal skill & competency:
If your surgeons are *too* risk-averse, you'll have a lot of people, who should have survived, dying from surgically treatable problems.
If your surgeons are aggressive in treatment, you'll have some people dying a bit sooner than they would have otherwise (the failures), but you'll also have people surviving who wouldn't have otherwise (the successes).
... and I'm sure many other Slashdotters are, too.
The same effect happens in many industries that use junk metrics. At my old job, at a Sprint store, it led to most reps actively avoiding scenarios that would result in a lower customer survey. That left the more honest people and better problem solvers to deal with a disproportionate amount of "bad" scenarios.
The good problem solvers ended up with "bad" rankings while others received bonuses and promotions for effectively dumping their work. After a couple of years you saw a noticeable exodus of quality reps. Meanwhile, bad reps had been promoted and just took the "avoid it" mentality up the ladder. This led to whole stores or call center departments doing a little dance of avoidance, hoping the problem would be handled by some other sucker outside their hierarchy. Then they compounded the problem by basing a large part of pay on this instead of sales. So you saw the good sales reps leave, and you were stuck with the mediocre bunch who are mostly good at deflecting anything marginally problematic. That looks great for the little surveys, but kills your *real* customer relationships and kills your sales.
The sad thing is that you could almost sympathize if this was a homegrown idea. But, nope, it's a common thing we see from standardized school tests to the examples of medical care in this article, and many other industries. It's as if none of the people in charge of implementing these ideas has ever heard of the concept of perverse incentive or unintended consequences.
Medicare is a cash cow for doctors, so they are very willing to do a lot for older people.
Suppose a patient comes in for a routine checkup and the doctor finds an advanced cancer and the patient dies. The primary care doctor who had the patient "in for a routine checkup" should not be punished for the poor outcome.
I get the feeling that is not what you meant to happen when you said "losing a patient who was just in for a check-up should count HUGE", but that is what you said. It highlights the difficulty of doing this kind of thing correctly.
--PM
I agree that in some cases it is cruel to provide treatment beyond pain management.
Where do we draw the line between little to no hope and a "fair chance"?
I am curious to see how many people who would support not treating hopeless cases, especially for monetary reasons, also support doctor-assisted suicide laws like Oregon has.
HODAD- "Hands of Death And Destruction" - A Hopkins doctor wrote a book about the subject.
From the article:
"At a medical conference Dr. Marty Makary saw one of his Harvard professors who “looked out at a room of 2,000 doctors and asked ‘How many of you know of another doctor who should not be practicing because he is too dangerous?’ Every hand went up.” Yet few report bad doctors and those that do often get fired.
Hospital staff knows they are practicing bad medicine and mostly do nothing. In Makary’s provocative book, Unaccountable, he describes one Ivy League-trained doctor who’s popular with patients yet dubbed Hodad, by his colleagues, for his continuing string of patient deaths. Hodad is their dark humored acronym for “hands of death and destruction.”
Doctors are kind of like cops. They both do a life and death, high stress job, and are under assault from all corners (for different reasons). So they protect their own. But to improve illness survivability, and in the interest of trying to get more information to patients, there has to be some way to get information about doctors to patients.
On the other hand, any metric will be gamed. So - if doctors aren't willing to police themselves... what choice is there but trying to get metrics on them? We're not talking about a good and a bad choice, we're talking about a bad and worse choice - which one is less bad?
And if you think the teachers union is badass, the AMA is made up of doctors, who are smart and relentless and wealthy. They're a big lobby in DC (although smaller than I thought prior to looking them up; in recent election cycles, with Obamacare, I recall seeing them near the top of the list).
Central Planners will never learn. There is no way to objectively measure these things because everyone's standards are different. I know people that are perfectly happy (and surprisingly healthy) visiting all sorts of quacks like chiropractors and acupuncturists.
People have different values for every aspect of these things which is why you just need a free market in medicine.
I love Jesus, except for his foreign policy.
Where do we draw the line between little to no hope and a "fair chance"?
We don't draw that line. That line is drawn by consensus, between the patient's representatives and their doctor. And it's none of anyone else's business.
I'm gonna play devil's advocate here and say that maybe this isn't a problem. Doctors are getting downgraded for choosing to perform risky surgeries, but maybe that's a poor decision on their part: perhaps they should be seeking safer solutions, or suggesting that patients might want to make the best of the time they have rather than dying on the operating table. And that's not even to mention the cost of risky failed surgery.
Hopefully there's an exemption for truly experimental procedures, or medical progress will slow to a crawl, but one could argue that all things considered, medicine could use a little more caution in choosing surgery.
If your surgeons are aggressive in treatment, you'll have some people dying a bit sooner than they would have otherwise (the failures), but you'll also have people surviving who wouldn't have otherwise (the successes).
While this is true, it overlooks a few significant things, like the fact that those last few weeks can often be very meaningful for the family and the patient... time to "say goodbye" and perhaps do some final things with family and friends. A surgeon who "oversells" a risky treatment or doesn't properly weigh the decisions with the patient and family may deprive them of some really important time. It may be "just a few weeks or months," but that is often precious time to lose. And studies have shown that doctor authority carries great weight with people, so they'll likely go along even with a risky procedure if the doctor presents it in a positive light.
Also, your two outcomes (patient dies or patient recovers) are not the only possible ones. Others include: patient experiences severe complications and continues living but in severe pain or disabled, patient goes into a coma or non-responsive state, drawing out the grieving process for families and shouldering them with difficult decisions, patient gains a short time but quality of life is degraded a bit in those last few weeks or months, not allowing the patient to do what he/she wanted to at the end of life... etc
Our medical establishment is very focused on prolonging life at all costs these days. But length of life is not always what's best for the patient overall.
Tipping that balance away from the "helping the patient" side seems a little perilous to me.
Agreed. The problem comes when we take "helping the patient" to be synonymous with "keeping the patient alive at all costs, no matter how much pain, suffering, or disruption in quality of life may occur." Medical practitioners often make that equation, but it's not always true. Sometimes what's best for the patient is to listen to their needs... and sometimes prolonging life no matter what (and sometimes only gaining an extra few weeks or months, often with great suffering) is NOT "helping the patient" overall.
Whenever a program is designed to link quality of medical services to outcomes, allowances must be made for the relative levels of risk that exist before and during treatment. Without some kind of risk adjustment, you will have false signals of less skilled medical personnel self selecting into less risky cases to protect a high score and otherwise brilliant and skilled staff unfairly penalized for taking on the more challenging cases. The goal of risk adjustment is to remove the riskiness of the patient from the outcome scores so that relative quality of services can be more accurately compared, without the noise created by inherent risks that have nothing to do with quality of care.
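To make the risk-adjustment idea concrete, here is a minimal Python sketch of an observed-to-expected (O/E) mortality ratio. Everything in it is an assumption for illustration: the `Case` record, the `observed_to_expected` helper, the surgeon labels, and especially the `predicted_risk` values, which a real program would have to derive from a validated risk model rather than invent. It is not the method used by any actual report card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    surgeon: str
    predicted_risk: float  # assumed pre-operative probability of death (0..1)
    died: bool


def observed_to_expected(cases):
    """Return {surgeon: (observed_deaths, expected_deaths, o_e_ratio)}."""
    totals = {}
    for case in cases:
        observed, expected = totals.get(case.surgeon, (0, 0.0))
        totals[case.surgeon] = (
            observed + int(case.died),
            expected + case.predicted_risk,
        )
    return {
        surgeon: (obs, exp, obs / exp if exp > 0 else float("nan"))
        for surgeon, (obs, exp) in totals.items()
    }


# Made-up data: surgeon A takes very sick patients, surgeon B only easy ones.
cases = (
    [Case("A", 0.40, True)] * 3 + [Case("A", 0.40, False)] * 7
    + [Case("B", 0.02, True)] * 1 + [Case("B", 0.02, False)] * 19
)

for surgeon, (obs, exp, ratio) in sorted(observed_to_expected(cases).items()):
    n = sum(1 for c in cases if c.surgeon == surgeon)
    print(f"{surgeon}: raw mortality {obs / n:.1%}, "
          f"expected deaths {exp:.1f}, O/E {ratio:.2f}")
```

With these invented numbers, surgeon A has the worse raw mortality (3 deaths in 10 cases against 1 in 20) but an O/E ratio below 1, while surgeon B's ratio is well above 1. A ratio below 1 means fewer deaths than the patients' risk predicted, which is the signal an unadjusted score hides.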
This was my first question when I heard about this program. :(
I feel like Cassandra a lot.
Opponents of "Hillarycare" in the 1990s predicted that her plan's rating of doctors based on outcomes would lead to excellent doctors who took on difficult cases and very ill patients getting lower grades than bad doctors who only took on healthy young patients, with the result being that doctors would avoid patients who were very sick or severely injured. Hillary's supporters expressed outrage at the criticism.
When "Obamacare" came up for debate and similarly proposed to reduce costs and increase quality by rating doctors based on outcome, the same criticism arose from opponents, and the same left-wing supporters of "universal healthcare" similarly claimed this was a false argument.
This big lie at the heart of "universal healthcare" plans that promise to do opposing things (limit costs while improving quality) within a big bloated institutional (government and/or corporate) scheme has long been known and predicted, as even the LA Times (hardly a Republican rag) noted in 2005, long before Obama was even running for President. In fact, it has long been known, and there are studies to prove it, that many things about the patient drive the outcomes more than the quality of the doctor (within reason, of course - frauds and quacks in ANY field should not be lumped in with valid practitioners).
Some of the leftist true believers who were architects of both Hillarycare and Obamacare (guys like Rahm Emanuel's brother) actually embrace this sort of effect as a natural form of rationing (some people get denied care, but without a government order and therefore with full deniability for government). This goes hand-in-hand with various plans favored by fans of Eugenics and advocates for "whole life" systems of healthcare resource allocation, where the young and the old and the sick and handicapped are discarded in favor of those who are productive and in their reproductive years. Most people see "Logan's Run" as a cautionary tale, but these people fetishize it as an ideal.
Sadly, no matter how many times anti-Marxists predict things that are absolutely predictable because they result from physical or economic laws, Marxists will deny them and convince the ignorant masses that they have a way to violate the laws of physics or the laws of economics.
Trial lawyers are more important to the Democrats than almost any other constituency. They, along with labor unions and Wall St investment bankers (not the same thing as actual small-town bankers), provide much of the money the party runs on. This is why Democrats always fight any scheme that prefers arbitration over lawsuits, oppose malpractice reform, oppose plans to let non-lawyers do non-critical legal functions, demand ABA ratings for judges (who should not need to be rated by the private lawyers' guild, given that judges are not required to be lawyers), and so forth.
Lawyers are to the Democrats what the NRA is to Republicans.
Evidently, this question is common in engineering (see failure mode and effects analysis), but not so much in regulations.
That's a tough question and I have no bright line test to answer it.
One telling point is surveys of oncologists. In those, most say they would not want the aggressive treatment themselves.
Who would have thought that collecting these kinds of metrics could ever possibly give the wrong impression.
I mean it works so well in the public school system.
A little ignorance goes a long way, especially when the ignorant presume to think everybody else is even more ignorant.
The question of measuring medical provider performance/quality has been around probably as long as there have been doctors, witch or otherwise. A lot of work has been done, a lot of different measures devised; the bad measures didn't get too far, the good ones survived; but rather than just look into the literature, people prefer to reinvent the wheel, then complain that it is useless because of the corners on it.
Obviously, you have to take into account "degree of difficulty" both of the patient, and the actual problem being repaired. Everybody should realize this going in; anybody publishing a quality measure that doesn't address this needs to be laughed out of the business.
Of course the concept of adjusted rates is standard for epidemiology, or even statistics of any form on any subject. "Case mix adjustment" in the general sense is a standard technique that goes into things like Medicare reimbursement; doctors get paid more for taking on the more difficult patients. Been that way for years. The same technique can be used to level the playing field for outcomes, so that you are comparing apples with apples and not with oranges. Of course, it's not always meaningful to just roll everything up into one quality score; by definition you are throwing away a lot of data. Much more informative to find out who is better at specific types of cases; somebody who is great at really tough cases might be "better" but if your case is just routine, you might even do better with somebody who handles only routine cases.
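As a rough illustration of why a single rolled-up score throws data away, here is a small Python sketch that compares overall mortality with mortality broken out by case type. The records, the two case types, and the helper names `mortality_by_stratum` and `overall_mortality` are all hypothetical; real case-mix adjustment uses far more variables than one "routine"/"complex" label.

```python
from collections import defaultdict

# Hypothetical records: (surgeon, case_type, died)
records = (
    [("A", "routine", False)] * 9 + [("A", "routine", True)] * 1
    + [("A", "complex", False)] * 32 + [("A", "complex", True)] * 8
    + [("B", "routine", False)] * 39 + [("B", "routine", True)] * 1
    + [("B", "complex", False)] * 6 + [("B", "complex", True)] * 4
)


def mortality_by_stratum(records):
    """Return {(surgeon, case_type): death_rate} for each surgeon and case type."""
    counts = defaultdict(lambda: [0, 0])  # key -> [deaths, total cases]
    for surgeon, case_type, died in records:
        counts[(surgeon, case_type)][0] += int(died)
        counts[(surgeon, case_type)][1] += 1
    return {key: deaths / total for key, (deaths, total) in counts.items()}


def overall_mortality(records):
    """Return {surgeon: death_rate} with every case type lumped together."""
    counts = defaultdict(lambda: [0, 0])
    for surgeon, _case_type, died in records:
        counts[surgeon][0] += int(died)
        counts[surgeon][1] += 1
    return {surgeon: deaths / total for surgeon, (deaths, total) in counts.items()}


overall = overall_mortality(records)
by_stratum = mortality_by_stratum(records)

for surgeon in sorted(overall):
    print(f"{surgeon}: overall {overall[surgeon]:.1%}")
for (surgeon, case_type), rate in sorted(by_stratum.items()):
    print(f"{surgeon} / {case_type}: {rate:.1%}")
```

In this made-up data, surgeon B has the lower overall rate and the lower rate on routine cases, while surgeon A has the lower rate on complex cases. Which one is "better" depends on which kind of case you are, and the single overall number cannot tell you that.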
I used to make a living cranking out quality reports for various surgeries for hospitals; not only did the hospitals not argue with them, they actually paid large $$ for us to do them for them. Key to the project was getting the hospitals' buy-in from the start, by having their doctors sit down with us in a big meeting right from the design phase and tell us what to take into account: age, gender, diabetic status, smoker status, etc., plus things like "If there's a transfusion on the record, give the doctor a black mark, that means he nicked an artery", etc. Then every year after the new ratings came out, the same panel would sit down with us again and discuss whether the report was reflecting reality, like "Take that transfusion thing out again, all the ones we're seeing are just alcoholics with bleeding ulcers, not the doctor's fault. Just look for actual nicked artery diagnoses."
The proof was that the rankings generally fell into two categories: good, and, occasionally, not so good. We wouldn't see anybody as "way better" because of that patient choice thing above; the doctors and hospitals you might expect to be better restrict themselves to the more difficult cases, so there really isn't much overlap of similar degrees of difficulty to show where one is better than another. Each one was good in the category they chose to work in.
And, when once in a while somebody scored below par, it was always obvious what the problem was; in one case, inadequate preoperative testing by the hospital compared to what everybody else was doing; they happily changed their practice once we pointed out the cause and effect. In another case, it was obvious that every patient from one particular nursing home would get an infection once they had checked out of the hospital. Nothing the doctors could do but hint to patients not to move into that place.
Eventually, the project collapsed when cheap and shoddy rankings like "percent of total patients who died, unadjusted" started to be demanded by big national organizations, and the hospitals couldn't afford the luxury option of paying for both sets of measures; and our management couldn't peddle the program to the folks who wanted the stupid measures.
Bottom line, you can rank medical quality as well as any other repeatable process, even with all the individual differences between cases; you just have to be smart about it. Why people are still approaching it like the world was just created this morning is a mystery.
Star Trek transporters are just 3d printers.
Sure, but what of the top cardiac surgeon who loses half of his patients but the other half go on to a high quality of life lasting years longer? Sometimes aggressive treatment does offer a fair chance for meaningful recovery.
OTOH, an expensive cancer treatment that buys an extra month of agonized delirium and never results in remission is an example of excess aggression.
These dumbass ratings tend to rely on whether the patient left the hospital horizontally or vertically; even if he dropped dead in the parking lot; because that's all there in the record, no added effort required. A much better measure is, obviously, how many are still alive a year later, but that requires something more than just reading the hospital chart. That measure isn't appropriate for the kind of stuff that buys an obviously dying person another month or two but that's a different thing than more or less routine surgery, no matter how difficult, that attempts to actually fix the problem. If the patient is expected to go home and get back to normal within six months, and it turns out he's dead six months later, that's a sign of something suboptimal. I'd estimate that maybe half the cardiac patients released by hospitals as OK are dead within a year. Often, they die in a different hospital than did the surgery, so the feedback never gets to the place that could use it.
Star Trek transporters are just 3d printers.