Doctors Perform Better Than Internet Or App-Based Symptoms Checkers, Says Study (sciencedaily.com)
An anonymous reader quotes a report from Science Daily: Increasingly powerful computers using ever-more sophisticated programs are challenging human supremacy in areas as diverse as playing chess and making emotionally compelling music. But can digital diagnosticians match, or even outperform, human physicians? The answer, according to a new study led by researchers at Harvard Medical School, is "not quite." The findings, published Oct. 10 in JAMA Internal Medicine, show that physicians' performance is vastly superior and that doctors make a correct diagnosis more than twice as often as 23 commonly used symptom-checker apps. The analysis is believed to provide the first direct comparison between human-made and computer-based diagnoses. Diagnostic errors stem from failure to recognize a disease or to do so in a timely manner. Physicians make such errors roughly 10 to 15 percent of the time, researchers say. In the study, 234 internal medicine physicians were asked to evaluate 45 clinical cases, involving both common and uncommon conditions with varying degrees of severity. For each scenario, physicians had to identify the most likely diagnosis along with two additional possible diagnoses. Each clinical vignette was solved by at least 20 physicians. The physicians outperformed the symptom-checker apps, listing the correct diagnosis first 72 percent of the time, compared with 34 percent of the time for the digital platforms. Eighty-four percent of clinicians listed the correct diagnosis in the top three possibilities, compared with 51 percent for the digital symptom-checkers. The difference between physician and computer performance was most dramatic in more severe and less common conditions. It was smaller for less acute and more common illnesses.
I can get twice as many A.I. programs to look at me for free, as opposed to your cartel-controlled ass.
"Eighty-four percent of clinicians listed the correct diagnosis in the top three possibilities, compared with 51 percent for the digital symptom-checkers. The difference between physician and computer performance was most dramatic in more severe and less common conditions. It was smaller for less acute and more common illnesses."
I'm surprised that digital diagnosis is that good already. The era of an "iDoc" app being as good as a gateway practitioner is probably not far off.
In the study the doctors knew they had to perform well. In the real world you're lucky if they even listen to you for two minutes before prescribing whatever the pharma rep recommended at the free lunch yesterday.
There is a hell of a lot more to observe with a patient than simply a checklist of yes/no values to see if someone has a particular diagnosis. For example, years back when I had a severe sore throat, I went in to the doc. She took one look at me, mentioned there is a unique smell associated with strep throat, did the test for it, and handed me a prescription for the antibiotics all within a few short minutes. WebMD, as we all know, diagnoses cancer when you stub your toe!
The answer, according to a new study led by researchers at Harvard Medical School, is "not quite."
Oh, well, that's okay then. Everyone, pack up your computers and smartphones; they're completely useless. Let the medical school researchers diagnose your conditions in the future, because this is the best it's ever going to be.
Ask me about repetitive DNA
ONLY apps can app apps, NOT LUDDITE doctors!
Apps!
It's only a matter of time before that's not the case for the vast majority of diseases. Combine data (both historical and from diagnostics) and machine intelligence, then let someone collect a vast dataset of symptoms, diagnoses, treatments and outcomes and train some algorithms on it, and voila, your $300K med school degree is now mostly worthless. That'll make healthcare quite affordable, though, right up until you need something done that machines can't do yet. I don't believe human intervention is actually necessary for at least 80% of what we currently think of as "healthcare". Machines will do just fine, give it time.
I say that as a physician in training. As AI accuracy improves with learning, doctors will hopefully start utilizing it more and more in their own practices to increase their accuracy and make further improvements in patient care. Regarding job insecurity, logic would say that a combination of AI and physician should still be more accurate than AI alone. If/when AI is so advanced that doctors are no longer needed, I'm sure they (we) can find other jobs.
This study was conducted by medical doctors and published in a journal run by an association of doctors. So it isn't entirely surprising that doctors determined that doctors are really smart.
That's called a reputable peer-reviewed journal, which is the highest standard, and an experiment conducted by rigorously trained experimenters. If you can find an actual flaw, don't just post it here; send it in and they will retract the study. Otherwise, try again.
Of course "they" in the above case being JAMA Internal Medicine, the journal itself.
You might want to reread all of that. "More than twice as often" means roughly 1/3 (~33%) accuracy for the entire group of 23 symptom-checking computer programs vs. roughly 2/3 (~66%) accuracy for the doctors. Machines have the advantages of pure data processing, so this result shows that instant recall and effectively infinite knowledge bases still don't measure up to the cognitive processes performed by trained medical doctors during diagnosis.
And to be pedantic, "more than..." suggests that the group of programs did worse than 1/3 correct diagnoses.
I managed to track down the actual text of the cases. TFA was only adding the human doctors to an analysis already done with the apps. The apps paper is http://www.bmj.com/content/351... and the cases are in the supplementary material ('data supplement') http://www.bmj.com/highwire/fi...
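For what it's worth, the ratios can be checked directly against the percentages quoted in the summary (72 vs. 34 for the first-listed diagnosis, 84 vs. 51 for the top three). A quick sketch in Python; the variable names are made up for illustration and the numbers come from the summary, not from the paper itself:

```python
# Percentages as quoted in the story summary (not read from the paper).
physician_first, app_first = 72, 34   # correct diagnosis listed first
physician_top3, app_top3 = 84, 51     # correct diagnosis within the top three

# How many times more often the physicians were right than the apps.
ratio_first = physician_first / app_first
ratio_top3 = physician_top3 / app_top3

print(f"first-listed ratio: {ratio_first:.2f}")
print(f"top-three ratio:    {ratio_top3:.2f}")
```

On those figures, "more than twice as often" holds for the first-listed diagnosis (a bit over 2.1x), while the top-three gap is closer to 1.6x.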
A 48-year-old woman with a history of migraine headaches presents to the emergency room with altered mental
status over the last several hours. She was found by her husband, earlier in the day, to be acutely disoriented and
increasingly somnolent. On physical examination, she has scleral icterus, mild right upper quadrant tenderness, and
asterixis. Preliminary laboratory studies are notable for a serum ALT of 6498 units/L, total bilirubin of 5.6 mg/dL, and
INR of 6.8. Her husband reports that she has consistently been taking pain medications and started taking additional
500 mg acetaminophen pills several days ago for lower back pain. Further history reveals a medication list with
multiple acetaminophen-containing preparations.
(This one is acute liver failure requiring emergency care).
An 18-month-old toddler presents with 1 week of rhinorrhea, cough, and congestion. Her parents report she is
irritable, sleeping restlessly, and not eating well. Overnight she developed a fever. She attends day care and both
parents smoke. On examination signs are found consistent with a viral respiratory infection including rhinorrhea and
congestion. The toddler appears irritable and apprehensive and has a fever. Otoscopy reveals a bulging,
erythematous tympanic membrane and absent landmarks.
(Acute otitis media - requires 'non-emergent care', i.e. needs professional medical care but is not an emergency)
A 34-year-old woman with no known underlying lung disease has a 12-day history of cough. She initially had nasal
congestion and a mild sore throat, but now her symptoms are all related to a productive cough without paroxysms.
She denies any sick contacts. On physical examination she is not in respiratory distress and is afebrile with normal
vital signs. No signs of URI are noted. Scattered wheezes are present diffusely on lung auscultation.
(Acute bronchitis, self-care appropriate.)
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
They told my uncle he had six months to live, and he lasted for nearly seven. So much for the so-called "experts".
And have you seen their handwriting? My five year old can do better.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
than a motivated patient.
Or, as my Chinese friend says, there is a phrase in China: "a long-time patient is a good doctor."
I was dealing with a few ailments over the past few years. I solved 3, the doctors solved 1. $20k (pre-insurance) operation was unnecessary, and 0% of the medications I was prescribed were necessary. If I trusted them to care for me, I would still be suffering, while on several meds and inhalers.
But I will agree that symptom checkers are bullshit as well, as they rarely get anything right.
This: 'Increasingly powerful computers using ever-more sophisticated programs are challenging human supremacy in areas as diverse as playing chess and making emotionally compelling music.', is highly subject to personal opinion.
I'm sorry that this is way off topic, but some fucker took the last two oreos from my kitchen and I damn well know that Russia had something to do with it. Those sneaky bastards left the bag empty to make me believe I was going to have a couple later, who the hell does that? Obvious Kremlin plot. Can someone tell me how I report this? It's no emergency for sure, but we need to document these things they're doing to us for future retaliation.
Hope you enjoyed my cookies Putin!
I'm a doctor, though not a diagnostician. Diagnosis is rarely hard - there are some hard cases, but they really mostly aren't. Do you have a persistently elevated blood glucose level? You have diabetes. Do you have consistently high blood pressure? You have hypertension. Etc. It's hardly surprising that computers are just as good as humans at diagnosing diseases that are mostly defined by strict, objective criteria.
What is harder is management - finding the right collection of drugs that will effectively treat a patient's diseases without introducing too many side effects. And what's even harder is anything procedural - we have no computers that can actually do procedures at all. Those aren't what most people think of as "going to the doctor", but it's what most doctors do - either manage disease, or do procedures, both of which are either mostly or severely beyond the ken of computers. Show me a computer that can do something as simple as put in an IV, and I'll be greatly impressed. So many subtleties boil down to "well, I saw something once that looked just like this, and the solution was X..." that it's worth trying X before going on to Y and Z.
My wife is a diagnostician - a neurologist. She sees stuff on a daily basis that would flummox any non-neurologist (really, I barely know what she's talking about half the time, and my peers would be much, much worse at that), let alone a computer. As the old joke goes, it's like being a car mechanic - who has to work on the car while it's doing 70 miles per hour down the highway, with zero downtime acceptable.
That's called a reputable peer-reviewed journal
... and all the peers are also doctors.
If you can find an actual flaw ...
Here is a flaw: The entire study was done with contrived "vignettes" rather than actual cases. The vignettes were written by human doctors, so just because other human doctors were better than apps at reading between the lines and figuring out the intended diagnosis, does not mean that they would be better at diagnosing actual patients.
I think there is only one clear conclusion from this study: Doctors really don't like these apps.
How well do doctors compare to a patient with access to the internet and a good dose of motivation?
Don't waste your vote! Vote for whoever you want, unless you live in a swing state it won't matter anyways
Being able to take the "well, I saw something once that looked just like this, and the solution was X..." part and scale it up will be where AI wins out.
If you have something outside the common GP experience then it is hit and miss finding one who has the right experience to check for the right thing. Or even finding one who can point you to the appropriate specialist.
With access to sufficient data, then an AI based system could aggregate the data from not just patients it has diagnosed itself (and subsequently whether the diagnoses was correct or not), but other patients it also has the data for. Add to that the ability to ingest all the relevant papers published each year, weighting them against other studies (whether they support or contradict) and the resulting calculated reputations of the authors and journals, then the AI system has the potential to better diagnose the uncommon and rare conditions that elude the majority of GPs and even specialists.
I also note that from the article, they seem to be only comparing to simplistic consumer level symptom checkers. A more interesting test would be against powerful AI systems such as Watson.
This study was conducted by medical doctors and published in a journal run by an association of doctors....
Damn "eggsperts". The same idiots who, invented gravity, claim that water is made up of "molelcules" and said that Brexit might reduce the value of the pound. We should ban them all.
A Dr in a country practice will soon get to understand what most of their active patients present with.
A Dr enjoying the normal, safe, happy suburbs will soon find their way around the overfed, middle-class lifestyle ailments.
Having to serve in the inner city slums will present the vast complexity of poverty, lifestyle, work related exposure and drug related conditions combined with poor nutrition.
Combine that with a flood of very sick people who bring with them rare, regionally eradicated and contagious conditions as they enter a nation illegally. Other patients present with inherited congenital conditions thanks to way too many generations of inward-looking tribal and faith-based marriages.
What will an average AI designer do? Overload a database with rare poverty conditions that bemuse wealthy doctors and their middle class patients? Would such a broad system sell well?
Have upfront questions about the "origins" of the patient to offer a drop down GUI of very common regional conditions?
It's not that the AI is lacking, it's that its database has been set for expected conditions found in normal hospitals that turn a profit.
Skin conditions and contagious conditions are being imported, and the local AI, expecting an average regional count of expected conditions, can't keep up.
The US software designers have to spend time in charity hospitals, free clinics, border states and areas where uncontrolled migration related conditions present.
What used to be called "tropical medicine" https://en.wikipedia.org/wiki/... in most advanced and healthy nations, reserved for the rare cases where returning tourists needed a specialist, now needs to be hardcoded in.
US database designers should also contact the smarter pathologists and epidemiologists and talk in person about what they are seeing nationwide. Pack the database with conditions they are seeing more and more of but that go underreported in official federal digital statistics.
The US may only track certain types of TB and list many other contagious conditions as no longer needing active tracking. So official federal numbers are worthless, as they are no longer being collected as fully as they once had to be.
Once the database has been corrected to what is actually presenting daily, the AI will function as expected.
Domestic spying is now "Benign Information Gathering"
How did the humans do against the *best* of the algorithmic diagnosers?
Sheesh, evil *and* a jerk. -- Jade
A symptom-checker app isn't the right benchmark. A Watson-type AI, based on 'deep learning' and trained on numerous cases, will most certainly outperform any rule-based programmed solution of the kind that was probably used in that app.
if you kill yourself, you'll help me more than you can imagine.
Way to bring Brexit into an entirely unrelated discussion. You're a fucking idiot.
I'm going to call this Spunky's law - As an online discussion grows longer, the probability of a comparison involving Brexit approaches 1.
If this follows the same trend as we saw in computer vision in the last few years, then doctors will be outperformed by machines in less than a decade in all the simpler tasks. The thing is, we truly are only in the beginning of the era of machine learning, and currently, there is no upper bound to what it can possibly do.
Video of some good progressive thrash music
Well, actually, in the latest story the main witness is saying that she was taken severely out of context. Trump's lawyers have already fired off a letter to the NYT demanding a retraction or he'll sue.
We'll see where it falls out in the next few days.
You mean humerus?
I forgot, bad attempt at a thread hijack.
I'm a doctor, though not a diagnostician. Diagnosis is rarely hard - there are some hard cases, but they really mostly aren't. Do you have a persistently elevated blood glucose level? You have diabetes. Do you have consistently high blood pressure? You have hypertension
You do realise that diabetes and hypertension are not real diagnoses?
Diabetes does not describe the type of diabetes a person has: Type I, Type II, or other rarer types. All diabetes means is that you have elevated blood glucose levels.
Ditto for hypertension.
When doctors are no longer necessary a lot of other people will no longer be necessary. I believe at that point we can all relax more and have 10 hour work weeks.
A person with years of intense medical training plus years of additional real-world medical experience beats a couple of Mountain Dew chugging pot smoking Javascript (or Swift, or Java...) coders with BA degrees in Computer Science working for an internet startup and using Wikipedia while dreaming of the "big bucks" of an IPO...
Say it aint So!
Fact: There's no such thing as artificial intelligence; there is only simulated intelligence which is FAR from the same thing. It's all just a glorified "magic 8 ball" plus a feedback mechanism.
Fact: Humans actually KNOW and UNDERSTAND things, and as such are able to actually REASON rather than simulating these things enough to fool an uneducated reporter or two or a scifi/robots fanboy. This makes humans superior in solving complex problems, and is why even computer "solutions" are actually just instances of humans figuring out a solution and then asking a computer to do the heavy trial-and-error simulations and/or data processing.
None of this makes doctors perfect - I actually dislike them rather intensely. For any problem that requires reasoning, however, the human will nearly always be superior. There will always be the occasional exception, like a drunk/incompetent human or a computer hitting the right answer by random chance, etc. There are many tasks where computers appear to upend this, like flying a plane, but they are not really in this category of tasks that require reasoning. These tasks, like flying, tend to be in fields that are more about physics than about reasoning, and as soon as an emergency arises it's time to ditch the autopilot and let the human sort things out.
It would be interesting to see how IBM's Watson would perform in the same test. I suspect (some) doctors would really, really dislike those results. It must also be said that Watson is not intended to be a "diagnosis app"; it is supposed to be a research assistant for human doctors.
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
Clinton is a weasel, but there's no credible evidence that he considered abusing his power to make his conquests. I only know of one person who complained that he didn't know when to stop, and she has overtones of Fatal Attraction.
Trump, on the other hand, has bragged that when you have what he has, you can just grab it without any consequences.
So I think that poor Bill ends up in third place.
The cognitive healthcare medical applications of Watson are still so expensive to use they are niche products more useful as hedges against malpractice from misaligned treatment. Imagine Azure prices for diagnosing a runny nose....
When caught, just claim it's all lies from the Liberal Press. The faithful will believe.
You quoted it yourself, he's a doctor. He's dumbing things down to very simple examples for the consumption of you and me.
I didn't catch the name of the apps they evaluated. In machine learning, the difference between cutting edge and older algorithms could be very large. If they compared AI to humans, they should at least compare to the best medical diagnosis AI.
That's because it is really lupus, but he didn't want to scare you.
"Wait. Something's happening. It's opening up! My God, it's full of apricots!"
Where's the second premise and the conclusion?
Do you have consistently high blood pressure? You have hypertension.
That's not really a diagnosis. That's just a different name for the symptoms. Bonus points for diagnosing "Primary Hypertension" which of course means "yeah dunno".
SJW n. One who posts facts.
You must not be a scientist if you can't understand how this could be a conflict of interest.
Why is it the medical field gets paid for an incorrect diagnosis and its treatment just as it does for correct ones? I think performance would increase if they knew they wouldn't get paid or would have to refund it. Right now they can walk in and just say "It's this" and walk out knowing they'll still get paid.
How much taxpayer money was wasted to prove this little bit of obviousness?
But my doctor has a much slower startup time than my health app.
My app needs around 3 seconds while visiting my doctor needs an hour long drive and a 2 hour waiting room wait.
Broaddrick accused Clinton of full up rape, Willey accused him of unwelcome groping (a la Trump), Jones accused him of indecent exposure and other forms of sexual harassment.
OTOH, the latest Trump accusers conveniently came out after that video tape, so who knows if they just made up their stories? (I personally have doubts about the one who claimed a younger billionaire groped her without consent in the middle of a first class air cabin, mid-flight. But whatever.)
If only you people were this sceptical whenever St. Kurzweil blathered on about his imminent singularity bullshit as well.
I think there is only one clear conclusion from this study: Doctors really don't like these apps.
I think there are other possibilities that are maybe a little less obvious:
My takeaway from this is that the machines aren't going to replace doctors, but they may some day help the docs come up with an obscure diagnosis... or maybe train the newbies.
I'm a doctor, though not a diagnostician. Diagnosis is rarely hard - there are some hard cases, but they really mostly aren't. Do you have a persistently elevated blood glucose level? You have diabetes. Do you have consistently high blood pressure? You have hypertension. Etc. It's hardly surprising that computers are just as good as humans at diagnosing diseases that are mostly defined by strict, objective criteria.
Excellent point. My lay viewpoint is that what most doctors do is not diagnose but treat symptoms, especially at the primary care level. Someone comes in displaying X, the thought process is that X is treated most commonly by doing Y, and hence we will try Y. If the symptoms disappear then the patient is cured and all is well. The doctor may also offer a diagnosis, often because the patient expects it, but finding the cause is less important than deciding what will make the symptoms go away. To extend your car analogy, a low oil light can be caused by many things, but is fixed by adding oil. If it doesn't come back too quickly the cause is largely irrelevant since the vehicle is performing satisfactorily. If it comes back quickly, then it's time to find the cause and take it to a mechanic, much as a primary care doctor would refer a patient to a specialist if symptoms don't improve and indicate a more severe problem.
I'm a consultant - I convert gibberish into cash-flow.
They are comparing doctor diagnosis vs. self diagnosis. It doesn't surprise me at all that doctors are better.
However if we compare doctor vs. doctor&software the latter wins by a mile. The best diagnosis software out there is Isabel HealthCare with proven, peer-reviewed results.
These were internet diagnosis apps, designed essentially as novelties to get ad revenue.
Both Google and IBM are designing diagnostic systems for real. It will be interesting to see how they do.
Assuming that the greatest skill of a doctor is asking the right questions? Doing a vlookup from symptoms is going to be trivial for an app, but perhaps not easy to narrow down without extra info? In 'House' this would involve breaking into the patient's house to look for evidence!
The computer AI doesn't have the heart, compassion and ability to think outside the box that the HUMAN brain does.
That's because it is really lupus, but he didn't want to scare you.
It's never lupus. (pops a vicodin)
The listed authors are someone with a Bachelor's of Arts, someone else with a Masters of Arts and a couple of medical doctors. The first MD appears to have completed a research fellowship (probably six months to a year). The senior author appears to be the most scientifically qualified, with an MSc in epidemiology. An MSc isn't exactly highly trained in science, although it is pretty good for an MD.
I have to write my own abstract this morning, but a quick scan of this thing brings up some concerns.
First, it's a "research letter" which is basically an abstract. There's very little detail about what they actually did.
Second, and perhaps most important, the responses from the humans were free text, which was evaluated (non blinded) by the study authors to decide whether or not the respondents had listed the correct diagnosis; there's no discussion of what the evaluation criteria were, what they did if the top three couldn't be established, how partial answers were handled, or what they did if more than three diagnoses were listed or not ranked.
Third, they have repeated responses from some physicians and not others, but their simple chi squared test of proportion doesn't take that into account.
Fourth, there's no discussion of how the online programs were used: how did they input the case histories? What did they do if a question couldn't be answered? Was all the information in the case histories used by each of the programs?
Lastly, they list several limitations themselves: the vignettes they used are very simplified, the human respondents weren't controlled and may not be a representative sample (they were doctors who routinely use a volunteer diagnosis web site), and online symptom checkers are not the only type of diagnostic system and others may have superior performance.
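To illustrate the third point, here is a minimal sketch of a plain Pearson chi-squared statistic for a 2x2 table, written out by hand. The counts are hypothetical, picked only so the proportions match the 72% and 34% quoted in the summary; the letter's real counts and exact test setup are not reproduced here, and the function name is made up for illustration.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for the table
        [[a, b],
         [c, d]]
    where rows are groups and columns are correct / incorrect counts.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# HYPOTHETICAL counts: 100 responses per group, 72 vs. 34 correct.
stat = chi_squared_2x2(72, 28, 34, 66)
print(f"chi-squared statistic: {stat:.2f}")
```

A calculation like this treats all 200 hypothetical responses as independent. If several responses come from the same physician they are likely to be correlated, so the effective sample size is smaller than the raw count and a naive test of this kind tends to overstate the evidence.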
Yeah right. That's how IBM is describing it now. The demo they did here a few months ago included an ER system that diagnosed a patient, ordered a CT, evaluated it, and prescribed treatment.
RTFA: "were asked to evaluate 45 clinical cases"
Clinical cases ass-hole not something made up. Read the study before you comment. You'll look like less of an ass.
This is BS. No human can match machine learning algorithms. But if you put garbage in, you get garbage out. This just shows how uninformed this so-called research was.
"Eighty-four percent of clinicians listed the correct diagnosis in the top three possibilities, compared with 51 percent for the digital symptom-checkers. The difference between physician and computer performance was most dramatic in more severe and less common conditions. It was smaller for less acute and more common illnesses."
I'm surprised that digital diagnosis is that good already. The era of an "iDoc" app being as good as a gateway practitioner is probably not far off.
I was surprised that the doctor difference was smaller for common illnesses. I would have guessed the opposite. I would have expected a properly written AI to be much better at identifying obscure illnesses than a doctor that has never seen that particular illness. Also, they are using off the shelf symptom checker apps. Some of them might be ok but many are likely crap. If they are already at 51% then if someone like google seriously tried to tackle it with a decent budget then likely they could do much much better.
I'm a doctor, though not a diagnostician. Diagnosis is rarely hard - there are some hard cases, but they really mostly aren't. Do you have a persistently elevated blood glucose level? You have diabetes. Do you have consistently high blood pressure? You have hypertension. Etc.
You could not be more wrong. I'm a scientist (bio + math), and all I see is doctors unable to use their brains, as shown in your opening paragraph. My intention is not to insult, just to highlight that doctors just follow the book as opposed to the symptoms. For example, just in the UK there's something like 10%(ish) of obese people with thyroid problems that have been misdiagnosed (apparently you only get fat eating and hormones do not play any role). Another example, antibiotics. I have been prescribed antibiotics after a doctor saw just my face (I'm not alone, several clinicians also told me this at a conference about antibiotics). Bacterial infection? Don't know. Gram positive/negative? Don't know. But there you go lad, penicillin.
So, the bottom line, diagnosis is hard. If it isn't, the doctor is doing something wrong.
Peer review is highly overrated. Remember that it is essentially people giving their opinion about a piece of work, and scientists are also people, with all their biases.
If by only beginning you mean they've been working on it since at least the 70s and though things are better than they were then, they're still extremely error prone, then yes, I agree.
Machine learning isn't a new field in computer science. It's nearly as old as transistor based computers. And the current upper bounds of what it can possibly do is that which can be summed up by a statistical model, as that's all that machine learning is. Creating a statistical model.
Within our lifetime computers will replace the GP. That's where the lowest hanging fruit is.
1. Cheaper for insurance. This will drive the implementation, much as Uber pushes self driving cars.
2. 24/7/365 access in both good and bad neighborhoods.
3. GP bot only has to weed out to the base of the problem. If the case and treatment doesn't fall within the first line of medical defense, then GP bot just gives a referral to a human specialist.
4. One central aggregate database will lead to better care.
The next advance after that will be the specialists. Humans will still have a role in the long term, but only for the outlying cases.
No you don't understand. He is a software developer, which means he is an expert at everything except programming. /s...maybe
Your response is anecdotal and doesn't match your claimed background.
The other problem is that some patients will lie about their symptoms, either exaggerating or denying them. Doctors know this and have the problem of seeing through what the patient says without pissing the patient off so much that the patient then refuses to cooperate or comply.
It will be interesting to see how an AI will deal with that aspect of medical care. And yes, I'm aware that there are tests, but you can't run every test just because.
So doctors have maybe a decade left of usefulness before machines can beat them and it becomes 80% machine, 72% doctors?
You find it surprising that a system based on statistical analysis has trouble identifying things that are statistically unlikely? It is likely to assume everything is going to be that which is statistically the most likely. Common illnesses are the most statistically likely, thus AIs are going to perform best at those.
I'm reminded of my machine learning class. If you write a machine learning tool that when asked if you have cancer always says no, you'll have an incredibly high accuracy.
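The "always says no" point is easy to check with a few lines of Python. This is only a sketch with made-up numbers (a hypothetical population of 1,000 people, 10 of whom have cancer), not data from the study:

```python
# Hypothetical population: 1,000 people, 10 of whom have cancer.
# These counts are invented purely for illustration.
labels = [1] * 10 + [0] * 990   # 1 = has cancer, 0 = does not


def always_no(_patient):
    """Trivial 'classifier' that never predicts cancer."""
    return 0


predictions = [always_no(p) for p in labels]

# Accuracy: fraction of all predictions that match the true label.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall: fraction of the actual cancer cases that were caught.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(f"accuracy = {accuracy:.1%}")
print(f"recall   = {recall:.1%}")
```

The accuracy looks great while the recall shows that not a single real case was found, which is why accuracy alone says little about rare conditions.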
An AI can't make mistakes, and only ignorant Luddites with severe paranoia issues would retard progress and a better world without human error and the terrible death tolls that follow when doctors are texting or drunk or ... Oh wait. That's self-driving cars. Only SDCs are perfect, I guess, though one would think SD AI doctors would be far better than humans, given the premise of SDCs. If you trust an AI to drive a car, you should trust it to diagnose your cancer. Mistakes on either's part will kill you.
If we could get every programmer coding up health care AI apps and educational teaching apps, there is not a doubt in my mind that we could stop 99% of disease and educate 99% of students without the need or cost for doctors and teachers.
What's stopping them? Flappy bird is more profitable to program.
I didn't read the article, so I don't know what apps they used for comparison, but it would be interesting to see how doctors fare in comparison to IBM's Watson. I'd guess it would do better than a phone app.
It actually sounds like people to me.
You know maybe you're retarded, or have aspergers. You should get some ritalin. I can tell because you disagree with me and a lot of aspies disagree with me. /inarguable Fark diagnosis
Seriously, though, people who have nfc about medicine (or any field) will recognize a set of familiar things and tie it to their experience. Their experience is going to be the most common thing anyone encounters, so they'll identify what looks vaguely like that as that.
Support my political activism on Patreon.
>My takeaway from this is that the machines aren't going to replace doctors, but they may some day help the docs come up with an obscure diagnosis... or maybe train the newbies.
Way back in the days of the Apple ][, there were diagnostic programs that produced results on a par with what is reported in this study.
On the flipside, I've read about clinics in Japan where a bot gets the patient history, does the evaluation interview, and looks at a lot of other stuff before settling on a diagnosis and proposed treatment plan. The doctor spends about one minute reading the history, diagnosis, and proposed treatment plan. If it "looks right", that is the treatment plan. The downside is that one spends about three hours with the bot. The upside is that doctors can see 500 patients a day, and be fairly confident that each patient is both correctly diagnosed and has an appropriate treatment plan.
You find it surprising that a system based on statistical analysis has trouble identifying things that are statistically unlikely? It is likely to assume everything is going to be that which is statistically the most likely. Common illnesses are the most statistically likely, thus AIs are going to perform best at those.
I'm reminded of my machine learning class. If you write a machine learning tool that when asked if you have cancer always says no, you'll have an incredibly high accuracy.
That's the wrong kind of statistics to use for diagnosis. It's not how many other people have those particular symptoms. It is how closely those particular symptoms match a known illness. Now you could always have a disclaimer that the best match is rare and therefore unlikely, but it should still be included as the best match. There should also be a question that differentiates between the "common" illness and the "unique" illness, even if that question is just something like "have you been to Africa recently". Preferably the question would be about an actual symptom, though, because a rare illness doesn't mean you can't have it; you might just be unlucky.
Your IV example is not a good example; that is actually something people are working on. Here is a quick list of links:
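One way to picture the tension between "how common is it" and "how well does it match" is Bayes' rule, where the ranking depends on prevalence multiplied by how well the findings fit. The sketch below is not how any of the tested apps work; the disease names and every probability in it are invented for illustration:

```python
def posterior(priors, likelihoods):
    """Bayes' rule: P(disease | findings) is proportional to
    P(disease) * P(findings | disease), normalized over the candidates."""
    unnormalized = {d: priors[d] * likelihoods[d] for d in priors}
    total = sum(unnormalized.values())
    return {d: v / total for d, v in unnormalized.items()}


# Hypothetical prevalence: the rare illness is 1,000x less common.
priors = {"common_illness": 0.999, "rare_illness": 0.001}

# Hypothetical fit of the symptoms alone: the rare illness matches better.
symptoms_only = {"common_illness": 0.30, "rare_illness": 0.90}

# Same symptoms plus one discriminating answer (e.g. recent travel) that is
# assumed to be far more likely under the rare illness.
with_question = {"common_illness": 0.30 * 0.001, "rare_illness": 0.90 * 0.50}

before = posterior(priors, symptoms_only)
after = posterior(priors, with_question)

for name, result in (("symptoms only", before), ("plus question", after)):
    ranked = sorted(result.items(), key=lambda kv: kv[1], reverse=True)
    print(name, [(d, round(p, 3)) for d, p in ranked])
```

With these made-up numbers the common illness wins on symptoms alone despite the poorer match, and the single differentiating question is what flips the ranking.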
https://www.sciencedaily.com/r...
https://www.youtube.com/watch?...
https://web.stanford.edu/group...
http://www.yissum.co.il/techno...
New things are always on the horizon
"Eighty-four percent of clinicians listed the correct diagnosis in the top three possibilities, compared with 51 percent for the digital symptom-checkers. The difference between physician and computer performance was most dramatic in more severe and less common conditions. It was smaller for less acute and more common illnesses."
I'm surprised that digital diagnosis is that good already. The era of an "iDoc" app being as good as a gateway practitioner is probably not far off.
It's not that surprising when you consider that most of the diagnosis was done by the humans that chose what questions to ask and what tests to run before the data was presented to the AI. When the AI correctly chooses what questions to ask and which tests to run, then we'll have something.
No f***kin s**t
I usually have the opposite problem. I can't identify or understand my symptoms, and often tell the doctors I need to find counseling and dig around a bit; they try to diagnose me anyway. That's why I have a diagnosis for anhedonia as a symptom of ADHD (as opposed to depression), which is correct; but I also have no idea how to measure treatment, which is bad. The drugs I'm on (amphetamine) have made me more-responsive, more outgoing, more engaged, and generally identifiable as a happier and more-emotionally-active person; but I don't feel any of it internally. I can see my behavior, but I don't feel anything about it.
So my rewards system is broken. This is kind of cool, because I respond to drug problems by discontinuing the drug instead of struggling with addiction (btw, sudden-discontinuation of Adderall hurts). It sucks because, lacking a sensation of pleasure and an activation of the rewards mechanism, it's really hard for me to set up rewards structures for reframing--my dorsolateral prefrontal cortex has to work to get me to do things, and eventually the load is beyond the effort I'm willing to put out, and I can't reduce the load by making something I want (i.e. a thing that causes a sense of pleasure indicating to my brain that I should repeat that behavior) a recognized outcome of something I have to do (i.e. a thing that demands effort and thus should be avoided to conserve energy).
Amphetamine was the first attempt, and the dose I'm on now is great: no symptoms, broader span of mood, weird side-effects at 10mg and 20mg are gone (15mg all side effects vanish, lower or higher starts to hurt, what?!), and the attention issues went away. I can now focus. I don't feel bad, but I also don't feel awesome; my mood seems to be slightly better (general feeling of well-being?). Maybe if I stay on this long enough, get some physical exercise in, and address some sleep issues (which aren't caused by amphetamine), it'll work itself out.
So here's the thing.
My doctor is talking about putting me on Wellbutrin. He wants to fix the anhedonia. I'm not entirely comfortable with Wellbutrin; it's more-complex, with prescribing information instructing to take it at the same time each day, and to skip it if you forgot to take it (do not take the medication when you remember later). MAS ER says to take in the morning, and possibly after breakfast if it causes appetite problems; you can float your MAS ER dose around by an hour or so (probably 4 hours) safely, and nobody is instructing anyone to keep it on an absolutely-strict schedule or to not take it a little later one day. That's more risk and more effort (see the pattern?).
*I* want to try adding Atomoxetine to the amphetamine. It won't fix the anhedonia; it'll either make me vomit a lot, or it'll raise serotonin, norepinephrine, and dopamine levels in the prefrontal cortex (it's an SNRI, but dopamine uptake is primarily mediated by NET in that region instead of DAT), which should lower the threshold of effort required for me to self-start and stay on the tasks I want to be on. I can focus, and now I want to tweak motivation; with no useful rewards system--with no impulse to pursue a sensation of pleasure from accomplishing goals--I can't lower the load on my dlPFC, so I want to increase its load capacity. This has worked for some people; it may or may not work for me, and could be a huge mistake and result in hilariously-shitty side-effects for a couple weeks, which I'm okay with because it might work and it won't actually injure me in the attempt.
I don't lie to my doctors; I just argue with them a lot, and sometimes leave out details (nobody asked me if I was schizotypal...). Sometimes I tell them I'm not even sure wtf is going on in my head, and that I don't know if I can reliably answer questions about a certain condition; an actual psychiatrist will usually respond to this by probing for detailed information instead of laying out a standard questionnaire. I still feel like there's some problem here, and I'm not sure where; I've probably got different concerns than the doctors.
Actually my Universal Social Security proposal has a slight side-effect in that regard: it relieves a complex system of inefficiencies in the U.S. economy, and ends up raising consumer buying power beyond labor supply. That is to say: we end up with negative unemployment (-18% to -23% by my simple models).
The immediate remediation is to make everyone poorer by reducing productivity by 20%. To do this, you'd reduce working hours to 32/week (4 days). That means each human being is only applying 32 hours of productive time rather than 40 hours, and so the amount of purchasing power per capita goes down. That lands you back to ~5.6% unemployment.
That remediation may be incomplete: salaried office workers have a lot of slack time, and many services workers are part-time. Full-time defined as 26-32 hours would still leave all service workers of 3 or fewer working days with no change in productivity. When you couple these effects, my remediation may only be half-effective; we may actually need a 4-day, 28-hour work week with full-time defined as 25-28 hours to reach remediation. (As you reduce working hours, the slack time and underemployment impact decreases; thus decreasing to a 3-day week would overshoot.)
So it's possible to have four 7-hour days or a 3.5 day work week as standard under current United States economic conditions simply to prevent an economic collapse caused by remediating our welfare system, ending homelessness and hunger, and relieving the U.S. taxpayer of $1 trillion of burden.
I don't think I can get below 28 hours as-is. There's a Federal Reserve policy and a mortgage system adjustment that would eliminate inefficiency in debt management and extend the purchasing power of the average consumer by about 11% in total, which might require a 3-day work week; but our productive capacity would also fall at that point, and it might hobble the U.S. economy. Even the model I described would require a slower transition (which it has built-in); and the alternative is to control the change by taxing the American consumer (that is, middle- and upper-middle class) enough to minimize their financial benefit and divert that tax money to paying off the National Debt (which gives government flexibility for future major efforts). Then you could slowly adjust working hours down, more slowly than productivity gains, to prevent a loss of national wealth (GDP-per-capita).
Given 20 years' time, I could adjust the United States's full-time working hours to a 24-hour, 3-day work week. At this time, I cannot predict a shorter work week.
When comparing percentages, you need to compare them in the manner which has the biggest consequences. A good example is OCR software (optical character recognition). If one has an accuracy of 99.99% and another cheaper one has an accuracy of 99.95%, you might think there's very little difference between the two and you should buy the cheaper one. But the cheaper one has 5x the failure rate (.05% vs .01%), meaning you'll have to spend 5x as much time fixing errors in the scanned text.
Likewise, an 84% vs 51% success rate is not a mere 33-point difference. It's a 3x higher failure rate (16% vs 49%).
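The arithmetic is simple enough to sanity-check in a few lines of Python. The helper name `failure_ratio` is just something made up for this sketch; the rates are the ones quoted above:

```python
def failure_ratio(success_a, success_b):
    """How many times more often B fails than A, given success rates in [0, 1]."""
    return (1 - success_b) / (1 - success_a)


# OCR example: 99.99% vs 99.95% accuracy.
ocr = failure_ratio(0.9999, 0.9995)

# Study figures for "correct diagnosis in the top three": 84% vs 51%.
dx = failure_ratio(0.84, 0.51)

print(f"OCR failure ratio:       {ocr:.1f}x")
print(f"Diagnosis failure ratio: {dx:.1f}x")
```

Comparing failure rates instead of success rates is what makes a gap that looks small on paper show up as a multiple.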
A beautiful example of Homo homini lupus, if I've ever seen one.
Ezekiel 23:20
But three years from now and five years from now, what will the differential error rate be?
In fact, when we didn't have insurance or steady work, the samples were very valuable. Dad and I repaired the Porsche, Doc Dave repaired us.
Now, of course, we are forced to pay the insurance companies instead of getting health care, Doc Dave said "Fuck it" and retired, and health care got worse, but at least we have insurance and Obama got re-elected.
This is an important point, these are just simple apps and they still perform decently. The real test for doctors will be against something like Watson:
"The artificial intelligence machine correctly diagnosed a 60-year-old woman’s rare form of leukemia within 10 minutes — a medical mystery that doctors had missed for months at the University of Tokyo."
Mechanics are better at replacing car tires than mobile phones, notebooks and desktops. Allegedly being able to actually touch the subject matter proves a significant advantage.
I hadn't the slightest objection to his spending his time planning massacres for the bourgeoisie... (P.G. Wodehouse)
Of course there is. Machine learning can't change the laws of physics, for example, so we can use that as a starting point, and then use linear regression and K-means clustering to.. oh god damnit.
https://www.eff.org/https-everywhere
What is harder is management - finding the right collection of drugs that will effectively treat a patient's diseases without introducing too many side effects. And what's even harder is anything procedural - we have no computers that can actually do procedures at all. Those aren't what most people think of as "going to the doctor", but it's what most doctors do - either manage disease, or do procedures, both of which are either mostly or severely beyond the ken of computers. Show me a computer that can do something as simple as put in an IV, and I'll be greatly impressed. So many subtleties boil down to "well, I saw something once that looked just like this, and the solution was X..." that it's worth trying X before going on to Y and Z.
The short answer is that a nurse with an expert system can do a better job of all that than a doctor without one... if the system is trained by competent doctors and statisticians.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Contrary to the claim in this "study", articles have been coming out over the past 6 years or so claiming just the opposite.
I guess results depend on who is funding the study. Also, I'm sure there's zero bias from the AMA either way on this topic.
http://caledonianmercury.com/2...
another from 2013: http://io9.gizmodo.com/5983991...
Yet another from 2013
https://gigaom.com/2013/02/11/...
I don't really understand this study.
If the doctor says I have hayfever and the computer says I have a cold, which one is correct?
The statistical significance is more important than the difference, regardless of how that difference is expressed. If the confidence level for the test is +/-0.05%, then the difference is utterly meaningless anyway.
...at least for now.
Except that twice as often was biased because half the sample was extremely rare things yet given the same weight as common things. The pro-doctor bias in reporting these results is unsurprising.
It's nearly as old as transistor based computers.
I hear you're still working with polymers, laddie.
No, the other Clinton. "Bill" Jefferson William Clinton.
The one who famously contested what the meaning of "is" is.
Is sorely needed to interpret this study.
The study reports on what fraction of the time the diagnosis given by the computer is what the real ailment is. But let's take a closer look at "what the real ailment is"... and more importantly, who determines it.
Corrected summary:
Diagnoses produced by human doctors correlate better with diagnoses produced by other human doctors than diagnoses produced electronically do.
Just saw something on Watson and its achievements in the field of Medicine. Certainly an interesting tool that may change how medicine is practiced.
A very specific diagnosis is, after all, just another name for a list of the symptoms; you're just complaining that the list isn't precise enough. How much are you willing to spend to try to figure out precisely what's causing it? It's not so much "yeah, dunno" as "yeah, not worth trying to figure it out".
I mean, there are lots of genetic mutations with "variable penetrance". Why do some people get just a touch, and others get slapped down hard? Could be auxiliary genes, could be genetic mosaicism, could be something else. Likewise with common diseases: there are many things that could cause symptoms XYZ, but once you've ruled out the ones that are going to kill you right soon, there's not much point in going on a long hunt for the exact causative agent, because the tests cost a lot and have false positives and negatives. Treat symptomatically. If it doesn't get better, look deeper. But most of the time, it does.
I understand the POV you're espousing here - but the reality is that although nurses and doctors work in what seems like the same field to the layman, the reality is that the training regimens and skill sets are completely different. I've lost track of how many times I've had to explain things to very good, very experienced nurses that I would absolutely crucify a third-year medical student for not knowing.
In theory, you're totally right. In practice, the nurses don't know what questions to ask, or how to ask them, or how to evaluate their validity. If you had a computer that was as good at interrogation as a doctor, we wouldn't have TSA agents.
other rarer types
I'm always up for some continuing medical education. Type I is lack of insulin. Type II is insulin resistance. What are the other, rarer types?
Wrong. This report was based on a previous self-selected sample in a study advocating the computer diagnostic systems. This simply added the human comparison.
Try reading the article preview and stop being stupid: http://jamanetwork.com/journal...
If you had a computer that was as good at interrogation as a doctor, we wouldn't have TSA agents.
The TSA has never caught a terrorist and middle-aged crazy women manage to get through security and get on planes. The TSA is a jobs and harassment program, it is not about security.
Way to miss the point.
It doesn't help if you think diagnostics is easy most of the time. Let's say you have a lot of patients with a headache. Only in rare cases is it a brain tumor; the rest is something relatively harmless (just as an example). If you do good diagnostics you have to find the really threatening stuff, otherwise you fail and someone has a lower chance of surviving. And as it is probably rare among your patients to have a brain tumor (or insert anything you think is rare), you are not used to it. I would say good diagnostics is damn hard!
A very specific diagnosis is, after all, just another name for a list of the symptoms; you're just complaining that the list isn't precise enough.
Not really. I was complaining about your *specific* choice. High blood pressure and hypertension mean the same. All you've done is given the symptom a different name.
Compare that to this: you go in with a persistent, nasty sore throat and the doctor tells you that you have strep throat. There are many underlying causes of sore throats, such as rhinovirii [*], smoke, excessive voice use etc etc. Once you know the underlying cause you can then treat or not depending on that cause, e.g. giving strep throat antibiotics, or simply telling an excessive-voice user to give it a rest.
Simply telling you that you have oresay oatthray doesn't actually achieve anything at all.
[*]Suck it latin pedants.
How much are you willing to spend to try to figure out precisely what's causing it? It's not so much "yeah, dunno" as "yeah, not worth trying to figure it out".
Well you could literally describe anything that way. Any "dunno" is essentially "too expensive to figure out" if you're willing to assume you can figure it out given, say, 10 years of research into hypertension with a budget of 100bn per year.
But basically, saying it's primary hypertension means that they don't know the underlying cause.
there's not much point in going on a long hunt for the exact causative agent, because the tests cost a lot and have false positives and negatives.
Sounds like you're agreeing that the cause is unknown.
Treat symptomatically. If it doesn't get better, look deeper. But most of the time, it does.
Yikes! Well, sure, treat any symptoms which have to change RIGHT NOW, but then how about, you know, trying to remove anything obvious the patient's doing which might be causing their symptoms? Which is, incidentally, how hypertension is usually treated here. Prescribe something to drop BP quickly if it's too high, then deal with any lifestyle factors, see if there's a change, then look at deeper causes (adrenal tumour, kidney problems etc), then if that fails, call it "primary hypertension".
Which means "sorry don't know".
SJW n. One who posts facts.
Way to miss the point.
Uh no. See, you spewed some crazy in the middle of your comment there, and I corrected it. If you want me to believe you're working on logic, you're going to have to pick an example that isn't literally insane.
Who do you think determined what the correct answers were? Was it a computer? Oh, wait, it was other doctors.
TL;DR: Misleading news article - Computers have been making things better for a long time if you control who is inputting the data/answering questions. We're not getting rid of doctors any time soon, however.
We've known that M.D.s are really bad at diagnosing themselves, and that it's not because they have different knowledge when diagnosing themselves vs. others; it's because humans are really crappy at self-evaluation and at generating unbiased information, and spectacularly crappy when it comes to themselves. Apps don't help with that, at least not yet. So this study compared less biased evaluators with experience to extraordinarily biased evaluators with no experience using an app. Was the app enough to counter crappier input and no experience? Looks like a cliffhanger....
Yes, the outcome of this study is not surprising (patients using apps solo vs. doctors), but the snippet and news article are misleading.
"Computers" have been helping to outperform physician-only interactions when it comes to arriving at the correct differential diagnosis, minimizing the number of tests asked for to arrive at the correct differential diagnosis, minimizing adverse events, and maximizing treatment impact for some time (first publication out of MIT around 1999-2001 if anyone wants to find it for me), but never "in the wild" with patients doing it "solo".
So, having a nurse or a physician assistant walk a patient through a differential diagnosis and treatment program results in significantly better desired outcomes than having an M.D. do it without the help of such a program, and has for quite some time; however, both patients and healthcare providers are unwilling to accept how this can be currently implemented, so it just does not work in real life. Note to doctors: you're still worth something here, since an M.D. walking patients through does better than an RN and much better than someone not trained in healthcare at all.
Having patients walk through themselves, however, shows pretty abysmal performance and has for 15 plus years, so "The analysis is believed to provide the first direct comparison between human-made and computer-based diagnoses" is probably the author saying that's what they believe, because folks in Biomedical Informatics know better.
... and all the peers are also doctors.
Probably, but not certainly. I'm not a physician, but am asked to review studies like this from time to time even by tier one journals. I've never been asked by JAMA, however, and they don't publish reviewers, so you're probably right. Even so, not all M.D.s who do research like this practice. Skepticism in this case seems reasonable, but far from a reason to toss the whole thing.
If you can find an actual flaw ...
Here is a flaw: The entire study was done with contrived "vignettes" rather than actual cases. The vignettes were written by human doctors, so just because other human doctors were better than apps at reading between the lines and figuring out the intended diagnosis, does not mean that they would be better at diagnosing actual patients.
This is a very valid critique. We've known for decades that question bias has strong effects across different populations, so I would expect that doctors would perform better than random individuals if the questions were about planting flowers but written by doctors. I would, however, be surprised if they seemed like reasonable questions to the average person but doctors scored 80% and others 40%, as the effect size is usually significant but small. Since this is doctors writing about doctory stuff, maybe it could account for most of the variance, who knows.
I think there is only one clear conclusion from this study: Doctors really don't like these apps.
No comment except, no, that's not a conclusion.
So just another hare-brained "it's big in japan" comment, aimed at being both rose-colored anecdote and obscure enough to be difficult to fact-check. Useless.
https://en.wikipedia.org/wiki/...