
AI Tools and the Alt-Right: A Double-Edged Sword for P/CVE

28th January 2025 Aurora Agnolon

Introduction

The emergence of artificial intelligence (AI) tools offers a valuable resource to support prevent/counter-violent extremism (P/CVE) practitioners in their efforts to reduce online extremism. However, AI is also exploited by extremists themselves, who weaponise these tools to evade content moderation for malign purposes. This Insight presents the main challenges posed by alt-right engagement with AI that hinder effective content removal, including extremists’ circumvention techniques and the exploitation of generative AI for propaganda purposes. Specifically, this Insight assesses the effectiveness of current AI programs used by platforms such as Meta and OpenAI. Finally, it proposes potential solutions to contain the issue and slow the pace at which extremist propaganda spreads by enhancing the performance of AI tools in several areas to better serve P/CVE.

Content Moderation: The Effectiveness of AI Moderation Tools

Different platforms take different approaches to moderation, using both manual and automatic detection to create safer online environments. Meta, for instance, has been improving its response to AI-generated content, a policy that covers misinformation and extremist material as well as general AI content. Instagram’s transparency policies now require AI-generated or AI-modified content to be labelled; although this requirement does not extend to images, Meta operates a recognition system that automatically labels images detected as AI-generated in an effort to reduce mis- and disinformation. These transparency measures help make Meta’s platforms more hostile to extremists than platforms with laxer policies, such as X and Reddit. Monitoring AI activity is just one of the enforcement measures taken to control the spread of extremist content online.

In this way, AI has become an ally for content moderation. Commonly employed technologies include image recognition software and language models trained on large linguistic databases to identify illicit content. Thanks to the continuous training of such AI tools, sensitive posts and online hate content can be removed. However, the more P/CVE relies on the automatic detection of violent content, the easier it seems to be for extremists to find loopholes and circumvention techniques.
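
To make the idea of automated flagging concrete, the snippet below is a deliberately minimal, hypothetical sketch of keyword-based detection. The term list and the flag_post function are illustrative assumptions, not a description of any platform’s actual pipeline, which combines trained classifiers, media matching and human review.

```python
# Minimal, hypothetical sketch of keyword-based content flagging.
# BLOCKED_TERMS and flag_post are placeholders for illustration only;
# real moderation systems rely on trained models, media hashing and human review.

BLOCKED_TERMS = {"example-slur", "example-banned-slogan"}  # placeholder terms

def flag_post(text: str) -> bool:
    """Return True if the post contains any blocked term (case-insensitive)."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)

print(flag_post("This post repeats an example-banned-slogan."))  # True
print(flag_post("A harmless post about the weather."))           # False
```

The brittleness of this kind of literal matching is precisely what the circumvention techniques discussed in the next section exploit.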

Extremists’ Circumvention Techniques 

This section will provide a brief overview of common techniques extremists use to bypass moderation, illustrating the limitations of automated moderation systems. The aim is to demonstrate that such circumvention methods are often less sophisticated than expected. Nonetheless, it is important to emphasize that existing measures are in place to address these systemic vulnerabilities.

Increased moderation through automated software does not automatically mean more extremist content is taken down. Platforms’ reliance on AI and other automatic detection software often follows predictable patterns and triggers that flag content as inappropriate, which opens the door to relatively simple circumvention techniques. Extremists have noted this and create content specifically designed to evade such efforts.

A clear example, which has now become common practice even for innocuous activities like creating usernames on social media, involves substituting a character with a similar-looking one that does not alter the intelligibility of the word for a human reader but makes it undetectable to the machine. For instance, if the word ‘far-right’ were banned, extremists would spell it as ‘f4r-r!ght’ to avoid moderation mechanisms. These look-alike characters are called homoglyphs: similarly shaped characters with different meanings, such as the letter ‘o’ and the number ‘0’. This circumvention technique is presented solely for illustrative purposes, and P/CVE practitioners have already developed detection code for this kind of evasion and constantly update their databases (see the sketch below). A good example is provided by Facebook’s tools for controlling third-party activity on a Page: among a broad range of features, Meta allows users to block certain words from being posted on their page and automatically includes variants with homoglyphs, spelling errors and abbreviations, drastically reducing the effectiveness of the circumvention technique described above. Nonetheless, there is still room for improvement; for example, Instagram does not recognise homoglyphs written directly on Reels, as its AI analyses images and captions but not the misspelled words superimposed on videos, leaving a gap in Meta’s moderation efforts.
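
As a rough illustration of how such detection can work, the sketch below normalises a small, hypothetical set of look-alike characters before matching against a blocklist. The mapping and the blocked term are assumptions for demonstration; production systems draw on far larger confusable-character tables and combine this step with other signals.

```python
# Illustrative sketch of homoglyph normalisation before keyword matching.
# HOMOGLYPH_MAP is a tiny hypothetical sample; real systems use much larger
# confusable-character tables and additional detection signals.

HOMOGLYPH_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a", "5": "s",
    "$": "s", "!": "i", "@": "a",
})

def normalise(text: str) -> str:
    """Map common look-alike characters back to plain letters."""
    return text.lower().translate(HOMOGLYPH_MAP)

BLOCKED_TERMS = {"far-right"}  # placeholder term from the example above

def flag_post(text: str) -> bool:
    """Flag a post if its normalised form contains a blocked term."""
    normalised = normalise(text)
    return any(term in normalised for term in BLOCKED_TERMS)

print(flag_post("f4r-r!ght content"))  # True: the disguised spelling is caught
```

The same idea also explains the gap noted above: normalisation only helps where text is actually extracted, which is why words superimposed on video frames can slip through.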

The use of homoglyphs can be considered a passive exploitation of AI, since extremists adapt their activity to content moderators’ tools. However, there are also instances where the alt-right actively weaponises AI to create illicit content.

Far-Right Extremists’ Exploitation of Generative AI

Extremists’ exploitation of AI includes the identification of loopholes like the one presented above, but generative AI has also proved easy to manipulate for harmful purposes. One example of active weaponisation is exploiting chatbots like ChatGPT to obtain information on a broad array of illegal activities, from instructions on how to make a bomb to tips for money laundering. The guidelines of OpenAI, the company that developed ChatGPT, clearly state that the service cannot be used to harm oneself or others, and the model is trained not to provide illegal or dangerous information. Indeed, if a user attempts to obtain prohibited or harmful content, a message automatically appears stating that the AI cannot fulfil the request. However, there is an easy loophole to bypass this barrier: it is sufficient to add the caveat that the information is required for fictional purposes only, for instance to write a novel or a screenplay, and this is enough to bypass the safeguard.

Once such shortcuts are identified, extremists and extremist content become more resilient to moderation efforts. The 2024 wave of far-right and extreme-right protests in the UK following the Southport attack clearly demonstrated how powerful generative AI is and how much faster it operates than moderation mechanisms. For example, AI platforms like Suno were weaponised to create music conveying extremist content and calling on people to join the protests.

Drawbacks of P/CVE Reliance on AI to Tackle Far-Right Extremism

It is costly, challenging, and time-consuming to retrain AI systems to detect illicit or extremist content. Resources are limited, and research is not receiving the funding it needs to keep up with the fast pace at which extremists work. There are also political and social factors at play. Meta CEO Mark Zuckerberg recently announced a plan for the company to get rid of its fact-checkers in an effort to curb what he described as a “politically biased” system that leans into censorship. Despite Zuckerberg admitting in a recorded announcement that this policy change will hinder effective content moderation and reduce the amount of illicit content taken down, Meta is pressing on. This shift in policy can be traced back to the direction Twitter (now X) took after Elon Musk’s takeover: X’s rebranding hinges on freedom of speech as its main selling point, at the expense of strong moderation. It is likely that other platforms like Meta, fearing that X will monopolise the online political arena, are promoting looser content regulation in order to stay relevant.

Potential AI Improvements to Serve P/CVE

As illustrated above, reliance on AI for content moderation is problematic. Nonetheless, there is space for improvement that could minimise the online far-right threat. First and foremost, improvement is needed at a purely technical level: more sophisticated detection tools could identify far-right propaganda by closing off the circumvention mechanisms that extremists have discovered. This requires a reactive approach of patching the flaws in detection tools to solve the issues we are currently aware of. At the same time, AI developers should be proactive, drawing on security researchers and ethical hackers to test potential circumvention mechanisms that are not necessarily being exploited yet. Indeed, mechanisms such as predictive AI are being developed, allowing law enforcement to obtain predictions based on patterns in previous extremist activity.
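
As a loose illustration of what ‘predictions based on patterns in previous extremist activity’ can mean in practice, the toy sketch below trains a simple text classifier on a handful of made-up labelled posts and scores a new one. The training data, labels and model choice are entirely hypothetical; real predictive systems use far larger datasets, richer behavioural features and human oversight.

```python
# Toy, hypothetical sketch of pattern-based scoring of new posts against
# previously labelled activity. All data below is invented for illustration;
# real systems use large datasets, richer features and human review.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical history: 1 = previously removed as extremist, 0 = benign.
past_posts = [
    "example extremist slogan calling for violence",
    "harmless post about weekend plans",
    "another example slogan urging people to join the violence",
    "harmless post about a new recipe",
]
past_labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(past_posts, past_labels)

# The output is an estimated probability that a new post resembles previously
# removed content, a signal to prioritise human review rather than a verdict.
new_post = ["a slogan urging followers to join the violence"]
print(model.predict_proba(new_post)[0][1])
```

The design point is that such scores are only as good as the labelled history behind them, which is why the funding and collaboration issues discussed below matter for any predictive approach.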

Moreover, for this technological advancement to happen, increasing research funding for AI development is paramount. Funding should be directed to a broad array of stakeholders, including academic researchers, law enforcement and governmental institutions. As explained above, continuously re-programming AI is not only expensive but also time-consuming. The fast pace of alt-right trends calls for a prompt response, which can only happen if resources are channelled to support P/CVE activity in the AI realm. However, recent developments show that mainstream platforms are shifting their focus from increasing moderation to reducing censorship, so investment in this area seems less likely when freedom of speech is prioritised over moderation.

Moderation cannot be improved without strong intersectoral collaboration. P/CVE practitioners and researchers should work closely with AI trainers, providing them with the necessary information on new trends and possible circumvention techniques to facilitate the updating of AI tools. A prompt flow of information between the technology sector and law enforcement should aspire to keep up with the astonishingly fast-paced evolution of alt-right activity. At present, regulation is highly privatised, leaving governments with only a marginal role. Despite the transnational character of alt-right extremism, stricter national regulations could positively influence social media platforms’ content moderation policies and counter the trend of weakening moderation norms in favour of a conception of free speech that accommodates extremism and misinformation.

Conclusion

In summary, AI as a content moderation tool still has a long way to go. Despite the many automatic detection mechanisms in place, far-right extremists are often a step ahead and manage to find loopholes to bypass moderation and turn AI into their ally. However, the use of AI for P/CVE purposes is not a lost cause. There is still plenty of space for progress, especially through technological development, increased funding and intersectoral collaboration. The ultimate goal should be sharpening the P/CVE edge of the sword while blunting the extremists’. 

Aurora Agnolon is an MA graduate in Terrorism and Insurgency from the University of Leeds. Her research interests include online extremism and irregular warfare, with a focus on cyberterrorism activities.