March 7, 2017

This machine-learning-equipped toxic comment detector is not yet great at its job


Alphabet, the top-dollar conglomerate that grew out of Google and is now its parent company, has been fooling around a bit with machine learning. The group’s latest offering, Perspective, an algorithm designed to monitor and identify toxic commentary online, proves once again that humans, given restrictions on their use of language, can manage to say a whole lot more to each other than their overseer (in this case, the insult-expert robot) might think.

Perspective, David Auerbach writes in the MIT Technology Review, is “a system sensitized to particular words and phrases — but not to meanings,” and was “trained to detect toxicity using hundreds of thousands of comments ranked by human reviewers.” As a result of its training, the algorithm is very good at flagging the kind of words one will typically find being used to offend in the comments section. Words such as “Hitler,” “anti-Semite,” “Muslim,” “terrorist,” and “Holocaust” activate the system’s toxicity-meter, regardless of their context. The algorithm then rates the statement as a whole on a scale of zero to one hundred, with one hundred being the most toxic.

This makes sense, and it’s a good start. But what does Perspective do with a sentence that uses such words to communicate not-particularly-toxic sentiments? Not all sentences that use the word “Hitler” are, by virtue of this fact, toxic.

Auerbach explores this idea during his test drive of Perspective. For example, in response to the phrase “few Muslims are a terrorist threat,” Perspective reported a toxicity rating of seventy-nine percent. Compare that to the phrase “race war now,” which rated just twenty-four percent, and it’s clear that Alphabet still has some issues to sort out. Plenty more examples confirm this:

“Trump sucks” scored a colossal 96 percent, yet neo-Nazi codeword “14/88” only scored 5 percent . . . “Hitler was an anti-Semite” scored 70 percent, but “Hitler was not an anti-Semite” scored only 53%, and “The Holocaust never happened” scored only 21%. And while “gas the joos” scored 29 percent, rephrasing it to “Please gas the joos. Thank you.” lowered the score to a mere 7 percent. (“Jews are human,” however, scores 72 percent. “Jews are not human”? 64 percent.)
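
To get a feel for how scores like these can come about, here is a toy sketch of a scorer that reacts to individual words and ignores meaning entirely. It is emphatically not Perspective's actual model, which was trained on human-rated comments. The word list, the weights, and the function name `toy_toxicity` are all invented for illustration.

```python
import re

# Hypothetical trigger words and weights, made up for this illustration.
# They are NOT taken from Perspective or any real system.
TRIGGER_WEIGHTS = {
    "hitler": 40,
    "muslims": 35,
    "terrorist": 45,
    "sucks": 60,
}


def toy_toxicity(comment: str) -> int:
    """Score a comment from 0 to 100 using trigger words alone.

    Word order, negation, and intent are ignored on purpose,
    which is exactly the weakness this sketch is meant to show.
    """
    words = re.findall(r"[a-z0-9/]+", comment.lower())
    score = sum(TRIGGER_WEIGHTS.get(word, 0) for word in words)
    return min(score, 100)


for phrase in (
    "few Muslims are a terrorist threat",
    "race war now",
    "Hitler was an anti-Semite",
    "Hitler was not an anti-Semite",
):
    print(f"{toy_toxicity(phrase):3d}  {phrase}")
```

Under these made-up weights, the reassuring sentence about Muslims scores high because it contains two trigger words, “race war now” scores zero because it contains none, and adding “not” to the Hitler sentence changes nothing. A real trained model is far more elaborate, but the blind spot it would need to overcome is the same one.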

By way of explanation for Perspective’s shortcomings, Auerbach notes that “the current state of machine learning doesn’t permit software to grasp the intent and context of comments. By doing surface-level pattern matching, Conversation AI may be able to filter stylistically — but not semantically.” So, until that’s all figured out, just stay the course: don’t read the comments.




Chad Felix is the Director of Library and Academic Marketing at Melville House, and a former bookseller.