Platforms committed to making their online communities inhospitable to extremism require pragmatic, layered approaches to policy, detection, moderation, and enforcement, grounded in established knowledge of how extremist communities behave online. Essential to this systematic management is a diverse workforce: subject matter experts, data labelers with expertise in multiple languages and cultural contexts, and analysts and moderators trained in extremist tactics and terminology. This diverse, well-resourced workforce in turn needs dynamic systems that adapt to evolving extremist content and techniques. Effective deterrence of extremist actors, and of the sympathetic users drawn into their networks, is achievable on platforms that equip personnel with adequate training and data analysis tools and that deploy systems built around targeted data collection and analysis of platform trends.
Detection that identifies the repeated behaviours of extremist actors lets moderators direct greater effort towards higher-risk users, whose activity tends to signal pockets of additional toxic behaviour on a platform. A portion of these higher-risk users act in coordination, and some rely on inauthentic accounts. Comment-based detection alone has only limited success, whereas systems that incorporate metadata and track repeated behaviours identify extremist users far more efficiently. The harmful, repeated behaviours of these users include hate speech, insults, violent rhetoric, threats, and disinformation. Comparing the violations of violent extremist users whose conduct escalates through these related behaviours against the activity of users who show no signs of violent extremism offers an invaluable window into extremist tactics.
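To make this concrete, the sketch below shows one way behavioural and metadata signals could be blended into a single review-priority score; the field names, weights, and thresholds are illustrative assumptions, not a description of any specific platform's system.

```python
from dataclasses import dataclass

# Hypothetical per-account signals; a real platform would derive these from
# its own logging, classifier output, and prior moderation decisions.
@dataclass
class AccountSignals:
    flagged_comments_30d: int       # comments flagged by text-based detection
    confirmed_violations_90d: int   # violations upheld by human moderators
    coordination_score: float       # 0-1 similarity to known coordinated clusters
    account_age_days: int
    uses_disposable_email: bool

def priority_score(s: AccountSignals) -> float:
    """Blend text-based flags with behavioural metadata so repeat and
    coordinated actors rise above one-off rule breakers."""
    score = 0.0
    score += min(s.flagged_comments_30d, 10) * 0.5   # cap so raw flag volume cannot dominate
    score += s.confirmed_violations_90d * 2.0        # repeated, verified behaviour weighs most
    score += s.coordination_score * 3.0              # coordination is a strong escalation signal
    if s.account_age_days < 14:
        score += 1.0                                 # very young accounts warrant a closer look
    if s.uses_disposable_email:
        score += 0.5
    return score

if __name__ == "__main__":
    # Illustrative accounts only; weights would be tuned against verified moderation data.
    accounts = [
        ("user_a", AccountSignals(1, 0, 0.0, 900, False)),
        ("user_b", AccountSignals(6, 2, 0.7, 20, True)),
    ]
    for name, signals in sorted(accounts, key=lambda a: priority_score(a[1]), reverse=True):
        print(name, round(priority_score(signals), 2))
```

Accounts that score above a review threshold would be routed to trained moderators ahead of the general queue, which is how smaller, higher-risk sets of data can be assembled for closer analysis.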
Extremist actors, and the sympathetic users those actors turn into propagandists, exploit gaps and weaknesses on platforms, including policy ambiguity and detection limitations, to ensure that their content persists and proliferates. Subject matter expertise can fill those gaps with insights into current tactics, content, and emerging language, making extremist deterrence far more precise. Emerging extremist language is invariably used across multiple platforms, and the propagation of extremist content is closely tracked by researchers who specialise in extremist use of the Internet. Clarity about where and how a platform is being exploited by violent extremist actors and their supporters allows moderators to focus on smaller, higher-risk sets of data. Content verified as violative by trained moderators can in turn strengthen detection frameworks and provide on-the-ground insights to the analysts who evaluate problematic trends.
A sometimes overlooked element of shoring up detection and moderation is ensuring that employees and contractors who are not subject matter experts, including moderation staff, data labelers, and related personnel, receive regular training and resources explaining how extremists navigate platforms, connect, mask their intentions, and update their language to evade moderation. Giving essential staff timely insight into the language, goals, catchphrases, jokes, and beliefs of violent extremist movements and groups enables efficient and accurate responses to changing tactics and obscured propaganda. Skilled open-source researchers and journalists with specialised knowledge in this area regularly publish on current trends in online violent extremism, so training materials are readily available and open source. Training that includes examples of violative and borderline violative content from a given platform allows moderators to refresh their understanding of policy in the context of changing violent extremist language and tactics. It also gives the data scientists who refine detection approaches the means to identify where the line should be drawn when training detection models, the first line of defense against extremist actors.
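As a rough illustration of that hand-off from labelled examples to detection models, the sketch below fits a simple three-class text classifier on moderator-reviewed samples; the placeholder strings, class labels, and scikit-learn pipeline are assumptions made for demonstration, not a production design.

```python
# A minimal sketch, assuming moderator-labelled samples in three classes
# (violative, borderline, benign); the texts below are neutral placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

texts = [
    "placeholder reviewed comment alpha", "placeholder reviewed comment bravo",
    "placeholder reviewed comment charlie", "placeholder reviewed comment delta",
    "placeholder reviewed comment echo", "placeholder reviewed comment foxtrot",
]
labels = ["violative", "violative", "borderline", "borderline", "benign", "benign"]

model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(texts, labels)

# The probability assigned to "borderline" is the useful routing signal:
# such content goes to trained moderators rather than being auto-actioned.
probs = model.predict_proba(["placeholder new comment"])[0]
print(dict(zip(model.classes_, probs.round(3))))
```

Keeping borderline content as its own class, rather than folding it into either side, preserves the line-drawing judgement that trained moderators make and keeps that judgement visible to the model.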
It benefits platforms to move away from heavy reliance on keywords as a stand-alone means of detecting problematic content and to build systems in which violations and near-violations refine detection and moderation approaches and serve as a predictive mechanism for identifying worrisome behavioural patterns. An increase in specific types of veiled hate speech, for example, is a strong indicator of an emerging trend that can be addressed before it becomes normalised on a platform. Directing subject matter experts to analyse and characterise near-violations gives platforms the data samples needed to improve moderation before small trends become significant problems, as is observed around major events such as mass violence, polarising elections, or the conditions created by the COVID-19 pandemic.
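One way to operationalise this early-warning use of near-violations is a simple baseline comparison over expert-labelled daily counts, sketched below; the category names, window sizes, ratio threshold, and counts are illustrative assumptions.

```python
# A minimal sketch: flag a veiled-speech category when its recent rate of
# near-violations outpaces its trailing baseline. All counts are illustrative.
def flag_emerging_trends(daily_counts, baseline_days=28, recent_days=7, ratio=2.0):
    """daily_counts maps category -> list of daily counts, oldest first."""
    flags = {}
    for category, counts in daily_counts.items():
        baseline = counts[-(baseline_days + recent_days):-recent_days]
        recent = counts[-recent_days:]
        baseline_rate = sum(baseline) / max(len(baseline), 1)
        recent_rate = sum(recent) / max(len(recent), 1)
        if baseline_rate == 0:
            flags[category] = recent_rate > 0
        else:
            flags[category] = (recent_rate / baseline_rate) >= ratio
    return flags

history = {
    "veiled_speech_variant": [2, 3, 2, 2, 3, 2, 3] * 4 + [6, 7, 9, 8, 10, 9, 11],
    "stable_category":       [4, 5, 4, 5, 4, 5, 4] * 4 + [5, 4, 5, 4, 5, 4, 5],
}
print(flag_emerging_trends(history))  # {'veiled_speech_variant': True, 'stable_category': False}
```

Categories that trip the threshold would be handed to subject matter experts for characterisation, so that policy and detection can be adjusted before the trend normalises.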
Finally, aligning policy with the ways violent extremist actors exploit online communities gives platforms a verifiable basis for planning and gauging proper coverage. Outlined in the table below is a sample of violent extremist tactics with associated functions and policy overlap; a minimal configuration sketch of that mapping follows the table.
| Violent Extremist Tactics & Content | How the Tactics Function Online | Policy / Moderation Category |
| --- | --- | --- |
| Promotion and glorification of violent extremism, including references to and descriptions, photos, and videos of extremist violence and hate crime. | Enables kinship among fellow believers and sympathisers while encouraging hate and novel identities for people not yet fully committed to a violent extremist cause. | Violence, Graphic, Crime, Weapons, and Regulated Goods |
| Promotion of hate-based beliefs, ideologies, and discrimination. | Facilitates scapegoating and false attribution of societal ills. Empowers sympathisers with a misguided sense of superiority. | Hate Speech, Bullying, Harassment, and Threats |
| Exploitation of current events. | Provides false but appealingly simplified explanations amid circumstances that create fear and societal uncertainty. | Mis- and Disinformation |
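As a lightweight illustration of how such a mapping might be encoded for coverage audits, the sketch below expresses the table as configuration; the tactic keys and policy labels are assumed names that would need to mirror a platform's actual policy taxonomy.

```python
# Assumed tactic keys and policy labels, mirroring the table above for illustration only.
TACTIC_POLICY_MAP = {
    "glorification_of_violent_extremism": [
        "violence", "graphic_content", "crime", "weapons_and_regulated_goods",
    ],
    "promotion_of_hate_based_ideology": [
        "hate_speech", "bullying", "harassment", "threats",
    ],
    "exploitation_of_current_events": [
        "misinformation", "disinformation",
    ],
}

def policy_coverage(detected_tactics):
    """Return the policy categories implicated by tactics seen in a content sample,
    so gaps between observed tactics and enforced policy can be audited."""
    covered = set()
    for tactic in detected_tactics:
        covered.update(TACTIC_POLICY_MAP.get(tactic, []))
    return sorted(covered)

print(policy_coverage(["promotion_of_hate_based_ideology", "exploitation_of_current_events"]))
```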
While the table above outlines basic examples of how extremist behaviour and content map onto platform policy, platforms should proactively sample and analyse ongoing content to ensure that policy coverage is sufficient and that moderation properly addresses the toxic trends violent extremists create. Successfully managed platforms use nuanced detection and moderation informed by up-to-date content analysis. As with all behaviour online, violent extremist behaviour constantly adapts and evolves, and only regular, clear-eyed evaluation of data and trends allows a platform to ensure it has the resources in place to mitigate harm.