Google's Featured Snippets Are Worse Than Fake News (theoutline.com)
Adrianne Jeffries, reporting for The Outline: Peter Shulman, an associate history professor at Case Western Reserve University in Ohio, was lecturing on the reemergence of the Ku Klux Klan in the 1920s when a student asked an odd question: Was President Warren Harding a member of the KKK? Shulman was taken aback. He confessed that he was not aware of that allegation, but that Harding had been in favor of anti-lynching legislation, so it seemed unlikely. But then a second student pulled out his phone and announced that yes, Harding had been a Klan member, and so had four other presidents. It was right there on Google, clearly emphasized inside a box at the top of the page. "I understand what Google is trying to do, and it's work that perhaps requires algorithmic aid," Shulman said in an email. "But in this instance, the question its algorithm scoured the internet to answer is simply a poorly conceived one. There have been no presidents in the Klan." Google needs to invest in human experts who can judge what type of queries should produce a direct answer like this, Shulman said. "Or, at least in this case, not send an algorithm in search of an answer that isn't simply 'There is no evidence any American president has been a member of the Klan.' It'd be great if instead of highlighting a bogus answer, it provided links to accessible, peer-reviewed scholarship."
If sites like Google and Facebook want to let algorithms decide which information to highlight, they will need to spend more time doing human-assisted ranking of various information sources. Crowdsourcing will be very helpful here, but you will still need some human moderators who can perform real research to help determine which information has credibility. I know too many otherwise intelligent people who are becoming so disillusioned they just don't believe anything they read anymore, which is the ultimate goal of these misinformation campaigns.
-- All that is necessary for the triumph of evil is that good men do nothing. -- Edmund Burke
I wonder if Prof took the time to review with the students the difference between a search result and a fact.
This posting is provided 'AS IS' without warranty of any kind, implied or otherwise.
"It'd be great if instead of highlighting a bogus answer, it provided links to accessible, peer-reviewed scholarship."
That scholarship is behind a paywall.
I'm more bothered by the implication that the results of an internet search engine should not return results representing what's on the internet.
B) Eliminate all the stupid users. This is frowned upon by society.
Typing the same search "presidents in the klan" into DDG also puts the same fake news result at the top (at least it's at the top if I set my location to UK. If I set it to worldwide it comes in second). Bing also puts the same story at the top. So this is not just Google's problem. It's a problem that all search vendors need to tackle collectively.
Tabloid trash used to be contained within that special group of "news" providers, and quarantined near grocery store cash registers.
Unfortunately, the quest to extract revenue derived from clicks has pushed damn near everyone to publish and aggregate a similar flavor of clickbait bullshit.
Hey Capitalism, stop rewarding Bullshit. Otherwise, You Reap what You Sow.
Maybe the problem is that the incorrect information is free and the peer-reviewed article costs $30 to read.
Warren Harding was also alleged to be black,
That was made up by his political rivals to discredit him during an era when to have "black blood" would be scandalous.
DNA tests show that Harding did not have any detectable black ancestors. Also, Obama is only half black. Genetically he's more "white" than "black" (men get slightly more DNA from their mothers than their fathers, plus all their mitochondrial DNA).
Not that that really means anything. Race has always been more cultural than genetic, and having any noticeably darker skin makes society push you towards the "black" culture. From a genetic perspective, we still haven't had our first "mostly black" President. Given the current political climate, I suspect we may have to wait a few decades for that.
"That's the way to do it" - Punch
I know quite a number of machine learning researchers and they're not just aware of that, they're also aware of the implicit bias that gets built into machine learning systems based on the training sets. It's a huge problem and it's hard to solve. While I feel like Google has both the resources and responsibility to be a better actor in this regard, only by exposing their system to real world challenges can they actually suss out what needs to be fixed. It's a bit of a catch-22--you don't want to release unless the data is accurate, but the data can't be accurate unless you release it to be stress-tested. Hopefully the turnaround here will be quick.
Truth cannot be determined by consensus, of course. However, you can get close (high probability of truth), and the interesting thing is, it's basically just another application of the PageRank algorithm which made Google.
Suppose I showed you sources written by two people who won Nobel prizes in chemistry both saying the same thing about some chemistry fact, and a Google search revealed no similarly credible sources who disagree. We'd say the laureates are very likely telling us the truth.
If you look at all of the sources cited in Encyclopedia Britannica, that'll give you a list of pretty credible sources; not perfect but pretty good. The second-order list of sources which are in turn referenced by two or more of the Britannica sources is a much larger list of pretty credible sources. If two or three or four of these sources agree on some statement, AND none disagree, the statement is very likely true.
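To make the idea above concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the source names are made up, `second_order_sources` and `likely_true` are hypothetical helpers, and the thresholds (cited by two or more first-order sources, at least two agreeing and none disagreeing) are just the numbers from the comment, not anything Google is known to do.

```python
from collections import Counter

def second_order_sources(first_order_citations, min_citers=2):
    """Return sources cited by at least `min_citers` first-order sources.

    first_order_citations maps each first-order source (e.g. one cited by
    an encyclopedia) to the set of sources it cites in turn.
    """
    counts = Counter()
    for cited in first_order_citations.values():
        counts.update(set(cited))  # count each citing source once
    return {s for s, n in counts.items()
            if n >= min_citers and s not in first_order_citations}

def likely_true(claim_votes, trusted, min_agree=2):
    """claim_votes maps source -> True (agrees) or False (disagrees).

    Only trusted sources count. The claim passes if enough of them agree
    AND none of them disagree.
    """
    agree = sum(1 for s, v in claim_votes.items() if s in trusted and v)
    disagree = sum(1 for s, v in claim_votes.items() if s in trusted and not v)
    return agree >= min_agree and disagree == 0

# Hypothetical data: three first-order sources and what they cite.
first_order = {
    "source_a": {"journal_x", "journal_y"},
    "source_b": {"journal_x", "blog_z"},
    "source_c": {"journal_x", "journal_y"},
}
trusted = set(first_order) | second_order_sources(first_order)

# blog_z is cited only once, so its dissent is ignored.
accepted = likely_true(
    {"journal_x": True, "journal_y": True, "blog_z": False}, trusted)
# A single trusted dissenter is enough to reject the claim.
rejected = likely_true({"journal_x": True, "journal_y": False}, trusted)
```

Note that this only gets you "high probability", as the comment says: one trusted source that is wrong, or one that disagrees for bad reasons, changes the outcome.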
On the other hand, Google already managed something similar in the past:
its PageRank system.
Back when Google was simply a keyword search engine,
it didn't simply return *all* webpages (that it knew of) where the query keywords appeared.
It returned *the top* webpages, using a whole ranking system to assess the quality of each page.
Whereas other, more primitive search engines could be easily fooled by link farming (e.g. forum and wiki spamming),
it required quite some art to pull off a Google bomb successfully, with the developers at Google constantly refining their algorithm
(pages containing the same keywords aren't given the same importance; it depends on their rank, the rank of the pages linking to them, and/or the quality of the links).
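For anyone who hasn't seen it, the ranking idea described above can be sketched as the textbook power-iteration form of PageRank. This is a simplified illustration, not Google's production ranking: the `pagerank` function, the damping factor of 0.85, the fixed iteration count, and the tiny link graph are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict page -> rank; ranks sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical link graph: "c" is linked to by both "a" and "b".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"], "d": []})
```

In this toy graph "c" ends up ranked above "b", because "c" has incoming links and "b" has none. The same mechanism is why a link from a highly ranked page is worth more than a link from a spam page nobody links to.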
The same applies here: instead of feeding the "featured snippets" AI whatever the Google crawler finds, the snippets AI will eventually need a concept of "confidence level" associated with each piece of information.
It should be able to react differently when it parses information from a reputable source (a peer-reviewed scientific article), and ideally process retractions of such sources autonomously, than when it parses information from some "dubious" site (some extremist's site with an agenda).
That won't save Google from a well thought-out and coordinated Google bomb (snippet bomb?), but it will at least keep the AI from blindly believing every bit of bullshit it reads online.
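A minimal sketch of what a "confidence level" could look like, in Python. This is purely illustrative and not how featured snippets actually work: `pick_snippet`, the per-source confidence numbers, and both thresholds are hypothetical. The point is only that the system can decline to show a box when its best answer rests on weak sources.

```python
def pick_snippet(candidates, source_confidence,
                 min_support=0.5, min_share=0.7, default_confidence=0.1):
    """Pick an answer to feature, or None if nothing is trustworthy enough.

    candidates: list of (source, answer) pairs found by the crawler.
    source_confidence: dict source -> confidence in [0, 1]; unknown
    sources get default_confidence. A retracted source can be set to 0.
    """
    totals = {}
    for source, answer in candidates:
        weight = source_confidence.get(source, default_confidence)
        totals[answer] = totals.get(answer, 0.0) + weight
    total = sum(totals.values())
    if total == 0:
        return None
    best = max(totals, key=totals.get)
    # Require both enough absolute support and a clear lead over rivals.
    if totals[best] >= min_support and totals[best] / total >= min_share:
        return best
    return None

# Hypothetical confidence table and crawl results.
confidence = {"peer_reviewed_journal": 0.9, "agenda_site": 0.05}
with_journal = pick_snippet(
    [("peer_reviewed_journal", "no presidents"),
     ("agenda_site", "five presidents"),
     ("random_blog", "five presidents")],
    confidence)
only_dubious = pick_snippet(
    [("agenda_site", "five presidents"),
     ("random_blog", "five presidents")],
    confidence)
```

With the journal present, its answer wins. With only the dubious sources, the function returns `None`, meaning no snippet is shown at all, which is roughly what Shulman was asking for.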
(On the other hand, given the gullibility of the average person, the current snippet AI isn't reacting much differently than that weird uncle who is always blabbering about some conspiracy theory he read online somewhere. Successfully "passing the Turing test through stupidity"?)
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]