Is Google's Comment Filtering Tool 'Vanishing' Legitimate Comments? (vortex.com)

Posted by EditorDavid on Sunday February 26, 2017 @11:37AM from the toxicity-reports dept.

Slashdot reader Lauren Weinstein writes: Google has announced (with considerable fanfare) public access to their new "Perspective" comment filtering system API, which uses Google's machine learning/AI system to determine which comments on a site shouldn't be displayed due to perceived high spam/toxicity scores. It's a fascinating effort. And if you run a website that supports comments, I urge you not to put this Google service into production, at least for now.

The bottom line is that I view Google's spam detection systems as currently too prone to false positives -- thereby enabling a form of algorithm-driven "censorship" (for lack of a better word in this specific context) -- especially by "lazy" sites that might accept Google's determinations of comment scoring as gospel... as someone who deals with significant numbers of comments filtered by Google every day -- I have nearly 400K followers on Google Plus -- I can tell you with considerable confidence that the problem isn't "spam" comments that are being missed, it's completely legitimate non-spam, non-toxic comments that are inappropriately marked as spam and hidden by Google.
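
To make the "scores as gospel" concern concrete, here is a minimal sketch of a site routing comments by score instead of hiding everything above a single cutoff. It assumes the filtering service returns a score between 0 and 1; the threshold values and the names `route_comment` and `false_positive_rate` are hypothetical and are not part of any Google API.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Hypothetical thresholds (assumptions): tune them against your own labeled comments.
REVIEW_THRESHOLD = 0.60  # at or above this, a human looks at the comment
HIDE_THRESHOLD = 0.95    # at or above this, the comment is hidden automatically


@dataclass
class Decision:
    action: str   # "publish", "review", or "hide"
    score: float


def route_comment(score: float) -> Decision:
    """Map a score in [0, 1] to an action, keeping a human-review band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= HIDE_THRESHOLD:
        return Decision("hide", score)
    if score >= REVIEW_THRESHOLD:
        return Decision("review", score)
    return Decision("publish", score)


def false_positive_rate(samples: Iterable[Tuple[float, bool]], threshold: float) -> float:
    """Share of legitimate comments that would be flagged at `threshold`.

    Each sample is (score, is_actually_bad), labeled by a human moderator.
    """
    legit_scores = [score for score, is_bad in samples if not is_bad]
    if not legit_scores:
        return 0.0
    flagged = sum(1 for score in legit_scores if score >= threshold)
    return flagged / len(legit_scores)
```

Measuring `false_positive_rate` on a hand-labeled sample of a site's own comments before enabling automatic hiding is one way to check whether legitimate comments would be "vanished" at a given threshold.
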
Lauren is also collecting noteworthy experiences for a white paper about "the perceived overall state of Google (and its parent corporation Alphabet, Inc.)" to better understand how internet companies are now impacting our lives in unanticipated ways. He's inviting people to share their recent experiences with "specific Google services (including everything from Search to Gmail to YouTube and beyond), accounts, privacy, security, interactions, legal or copyright issues -- essentially anything positive, negative, or neutral that you are free to impart to me, that you believe might be of interest."

Really? by Webs+101 · 2017-02-26 11:41 · Score: 5, Insightful

There are 400,000 users of Google+?

--
"Even for Slashdot, that was a very obscure reference!" - Anonymous Coward

Thing about spam by buss_error · 2017-02-26 13:06 · Score: 5, Interesting

The thing about spam is that for as long as I can remember (at least back to 1997), people have insisted upon a technical solution for it. The issue is that spam is not a technical problem. It's a human problem. As with any other problem/response cycle, if you are solving for the wrong issue, don't be shocked if the solution is as bad as or worse than the problem.
Another issue, not directly on point, is Google email and anti-spam. I know of several organizations that have completely shut down their email infrastructure in favor of Google email services. An unaddressed problem is that these organizations have also laid off their email folks since "Google takes care of it all," so subtle and not-so-subtle issues often go not simply unaddressed, but unknown to the organization. The result has been a high rate of false positives, including for mail from senders without DKIM. I once got into an argument with John Lavine about DKIM, in which he got pretty passionate. I argue that DKIM is:
1. Needlessly opaque
2. Prone to abuse from overzealous admins
3. Google does it wrong (Checking the header chain all the way back instead of the last system the recipient does not run)
4. Breaks email standards
5. Doesn't solve any issue that SPF does not solve more directly, more simply, and without possible abuse; SPF also requires far fewer CPU resources and less skill, and does not break email standards in the process.
I'm told that "I'm too stupid" to know how it works and "I should get out of computers since you obviously are too stupid to know your f'ing job!" (both quotes from right here on Slashdot). I won't try to prove otherwise, but one question I've asked over and over again is how DKIM, checked back further than the last untrusted relay, does not break email standards for list or forwarded mail. SPF won't break those; DKIM will, every time.
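One piece of the list-mail concern can be sketched in code. A DKIM signature covers a hash of the message body, so a mailing list that appends a footer changes that hash, while a relay that passes the body through untouched does not. The sketch below is a simplified approximation of the "simple" body canonicalization with SHA-256; it verifies no signatures, ignores the "relaxed" mode and the body-length tag, and the helper names are hypothetical.

```python
import base64
import hashlib


def simple_body_canon(body: bytes) -> bytes:
    """Approximate 'simple' body canonicalization (an assumption-level sketch):
    normalize line endings, drop trailing empty lines, end with one CRLF."""
    lines = body.replace(b"\r\n", b"\n").split(b"\n")
    while lines and lines[-1] == b"":
        lines.pop()
    return b"\r\n".join(lines) + b"\r\n"


def body_hash(body: bytes) -> str:
    """Base64 SHA-256 of the canonicalized body, in the style of a DKIM body hash."""
    digest = hashlib.sha256(simple_body_canon(body)).digest()
    return base64.b64encode(digest).decode("ascii")


original = b"Hello list,\r\nSee you Thursday.\r\n"
forwarded_unchanged = original
list_with_footer = original + b"--\r\nTo unsubscribe, mail the list owner.\r\n"

# An untouched body still matches; a list footer does not.
print(body_hash(original) == body_hash(forwarded_unchanged))  # True
print(body_hash(original) == body_hash(list_with_footer))     # False
```

Whether a given forwarded or list message survives verification therefore depends on whether any hop modified the signed content.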
So getting back to our muttons, I'm not surprised that Google's spam engine (or anyone's, for that matter) has a high false positive rate, or a lower than desired true positive rate. That issue is simple - they are attempting to solve a problem with technology that isn't technical in nature. Stop using a hammer to try to screw in a light bulb. Doesn't work well.

--
Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves.