
AI for Content Moderation – what can we expect?

31st December 2019 Marie Schröter

Algorithms will save the Internet. Machine learning or deep learning applications will identify malicious content on social media and deal with it accordingly. Technical developments will allow free and informative debate again, after all the stir about disinformation campaigns, electoral influence and even the dissemination of terrorist content. So goes the narrative of techno-centric optimists, which borders on religious conviction. If anything, it shows the deep wish for a magical solution that would identify bad activity of any kind and free the Internet from it.

The ‘bad guys’ use the online space in many ways: terrorists recruit and raise funds, spread ideology, facilitate radicalisation and maintain a network of supporters. Although this is general knowledge, we still lack a comprehensive understanding of it. Structural disinformation and agenda setting have been on the radar since the Cambridge Analytica scandal, yet research is still in its infancy. Nevertheless, malicious content must be tackled without further delay.

An obvious place to start is online content moderation: preventing terrorist content from spreading puts a barrier in the way of terrorists’ use of social media platforms. Already 98% of the malicious content on Facebook is filtered out by machine learning algorithms, as stated in the company’s latest self-assessment report under the EU’s Code of Practice on Disinformation. Users flag the remaining 2%. Twitter reports that it challenges 10 accounts per second, and Google, the owner of YouTube, says it removes 80% of such content before it gets any views. That sounds like a successful effort, and content moderation has certainly improved over recent years.

The loopholes remain enormous, though, especially outside the major language areas. At the moment only 14 out of Europe’s 26 official languages are covered by Facebook’s fact-checking repertoire. Thanks to third-party contracting, 15 African countries are now being monitored; that is less than a third of the continent’s countries. It remains unclear whether that applies only to official languages or whether it includes dialects as well. Omitting languages, especially those spoken by minorities, is a known phenomenon in other highly populated and diverse regions and countries, such as India. The question comes to mind whether Facebook should be allowed to operate in areas where content moderation cannot be guaranteed. After all, Facebook offers a service to connect people, and content moderation seems vital to it.

Examples from Sri Lanka show that, in trying to understand hate speech, Facebook’s algorithms not only run into language barriers but also fail to identify the multi-layered cultural context. In 2018, and again in the aftermath of the Easter bombings of 2019, Sri Lanka blocked Facebook nationwide. It was simply not possible to get on top of the fake news and hate speech that led to violence and riots, especially against the Muslim population. The algorithms were not able to understand the complexity of hate speech. They would first have to understand the ethnicities of the parties involved in order to classify, in a second step, the slang used as hate speech, a skill they currently lack. The problem goes wider still: languages that do not use the Latin alphabet are sometimes “translated” into the Latin alphabet for the convenience of the user. Some languages have no grammatically explicit future tense, so what would a future threat even look like? If automated filtering is to be scaled up, it has to confront those design failures.

Algorithms need a huge amount of training and testing data to function properly, and still there are false negatives and false positives: terrorist content that does not get flagged and non-terrorist content that gets banned. In a recent example, the messenger Telegram’s crackdown on Islamists using the service also blocked numerous researchers and experts who monitor the scene and gather open source intelligence. Such cases have to be dealt with immediately; otherwise crucial hints as to where the scene is migrating get lost. Research has also shown that algorithms can be tricked by simply adding the word “love” to a hate post. The question that must ultimately be asked about the technology is: where are we willing to tolerate mistakes, and how tolerant are we?
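
To make these failure modes concrete, here is a deliberately naive sketch in Python. It is not how any platform's classifier works; the word lists, weights and threshold are invented for illustration only. It scores a post by averaging per-word weights, which is enough to show how appending a benign word can pull a hateful post below the flagging threshold (a false negative) and how a researcher's harmless sentence can be flagged (a false positive).

```python
# Toy keyword scorer. All weights and the threshold are made-up assumptions,
# chosen only to illustrate false positives and false negatives.
TOXIC_WEIGHTS = {"hate": 1.0, "vermin": 0.9}
BENIGN_WEIGHTS = {"love": -1.0}
THRESHOLD = 0.3


def score(text: str) -> float:
    """Average word weight of a post; unknown words count as 0."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    total = sum(
        TOXIC_WEIGHTS.get(tok, 0.0) + BENIGN_WEIGHTS.get(tok, 0.0)
        for tok in tokens
    )
    return total / len(tokens)


def is_flagged(text: str) -> bool:
    """Flag a post when its average score reaches the threshold."""
    return score(text) >= THRESHOLD


hateful = "i hate those vermin"
evasive = hateful + " love"            # same message with one benign word appended
research = "researchers monitor hate"  # harmless sentence that mentions a trigger word

for post in (hateful, evasive, research):
    print(f"{post!r}: score={score(post):.2f}, flagged={is_flagged(post)}")
```

In this toy setup the hateful post is flagged, the version with “love” appended slips through, and the researcher's sentence is flagged by mistake. Moving the threshold only trades one kind of error for the other.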

Not everyone is convinced that unwanted content online should be blocked, especially when governments do the blocking. The current US administration finds itself on the same side as some civil rights organisations, united in concern about free speech online. One argument is that merely deleting content does not prevent the ideas from popping up on other platforms. Another controversial point is that deletion can be considered censorship. The narrative of unpopular messages being banned by ‘those at the top’ can be counterproductive and work as a recruitment pull factor for right-wing extremist groups, said Dr. Sharri Clark, a US Government Senior Advisor for Cyber and CVE, at the Internet Governance Forum in November.

As long as there is no universal regulation, social media companies have to define for themselves what constitutes illegitimate content online, especially when it comes to blurry concepts such as hate speech and fake news. Germany’s Network Enforcement Act, which requires social media platforms to delete supposedly illegal content within 24 hours, is to be extended to require platforms to pass on details of the account that posted the content. The amendment of the law, enacted in 2018, is a direct consequence of the terror attack in Halle in October 2019. In cases of systematic failure to take down illegal content, companies can face fines of up to 50m EUR. The intentions of the law are noble; however, its execution in its current form can have the opposite effect. It can lead companies to overreact and delete questionable content in order to avoid paying fines. From a different point of view, it allows bigger companies to pay their way out of trouble, whereas it can break the neck of smaller ones.

The human brain is ill-suited to making decisions based on large data sets, multiple conditions and uncertainty, which is in turn a speciality of algorithms. On the other hand, humans surpass technical solutions when there is not enough data. Algorithms are not able to decide whether one single piece of information is terrorist content, but they can do so when faced with large data sets. Machine learning can help to search for patterns and keywords and to organise vast amounts of data, but at some point in the chain this always requires human analysis. A human can detect the nuances of sarcasm, irony and hate speech, for example. Thus, technical solutions must always be supervised. The danger that something gets lost in translation from the real world into the virtual world is simply too big. Without a doubt, the enabling aspects of the Internet for society should be enhanced and strengthened, and therefore malicious content must be met with determination.