
‘RedPilled AI’: A New Weapon for Online Radicalisation on 4chan
7th June 2023 Daniel Siegel

Introduction

Recently, the world of artificial intelligence has witnessed a surge in competition between major tech companies over large language models (LLMs) such as Google’s Bard, OpenAI’s ChatGPT, and Meta’s LLaMA. LLMs offer users personal assistants capable of conversing, conducting research, and even producing creative products.

While these AI products offer enormous potential to usher in an era of unprecedented economic growth, the risks of the technology are slowly becoming more apparent. Even the CEO of OpenAI, Sam Altman, has expressed concern that these products could enable large-scale disinformation campaigns. In a previous Insight, the authors explored how AI models could allow extremists to create persuasive and interactive propaganda products such as music, social media content, and video games. This Insight analyses a different but parallel threat – far-right chatbots – and forecasts the risks of AI technologies as they continue to advance rapidly.

LLaMA Leak on 4chan 

On February 24, Meta released its new advanced AI model, LLaMA. Only a week later, 4chan users gained access to the model after someone posted a downloadable torrent for it on the far-right platform. This allowed far-right extremists on 4chan to develop chatbots capable of supporting online radicalisation efforts by imitating victims of violence, leaning into stereotypes, and promoting violence. Before this, access to the model was limited to approved members of the AI research community in order to ensure its responsible use. Despite the complexity involved in setting up the LLaMA model, 4chan users managed to get the program running and started sharing tutorials and guides on how to replicate the process. Many of these users claimed to have discovered methods to ‘semi-customise’ the AI, allowing them to modify its behaviour and bypass several of the built-in safety features designed to prevent the spread of xenophobic content.

The LLaMA model’s potential use for hate speech and disinformation alarmed both Meta and the wider AI community. In an effort to mitigate the damage, Meta requested the removal of the download link for the model, but fully customised models and screenshots of conversation logs continue to flood 4chan forums. Some users adapted Meta’s LLaMA model, while others used publicly available AI tools to create new and problematic chatbots (Fig. 2).

 

Fig. 1: Leaked Meta AI post with users suggesting they can bypass safety features

 

Fig. 2: Decentralised online effort to develop chatbots on 4chan

 

After the leak, 4chan users found ways to alter Meta’s model. Screenshots posted online suggest the edited models are capable of expressing deeply antisemitic ideas. One 4chan user posted a screenshot of a modified model suggesting that the Jewish population “controls the flow of information.” Another shows the AI suggesting the Jewish goal is to obtain “world domination” through the manipulation of the non-Jewish population. The specific weights or training applied to these models are not yet known, but the fact that such manipulations are possible highlights the success of decentralised efforts to manufacture ‘extreme’ chatbots. The speed with which these models were developed is indicative of a worrisome trend: ‘amateur hackers’ who frequent 4chan devised methods for substantially modifying a sophisticated artificial intelligence model within two weeks of the leak. The ramifications are ominous, as this suggests that future AI models, if leaked or accessed unlawfully, could be rapidly adapted and repurposed towards malicious ends.

Fig. 3: Meta’s LLaMA being used to generate bigotry

The presence of bigoted chatbots exacerbates the problem of echo chambers, in which individuals are primarily exposed to information confirming their existing beliefs. Much as homogeneous online spaces do, bigoted AI models have the potential to continuously reinforce extremist views, making it increasingly difficult for users of these manufactured chatbots to escape insular cycles of hate.

Simulating Victims of Violence

4chan users have also found ways to use AI to perpetuate harmful stereotypes. By creating chatbots that simulate targeted groups and embody harmful stereotypes, 4chan users can promote discriminatory beliefs and reinforce extremist attitudes towards marginalised communities. Individuals on 4chan modified Meta’s AI and also made use of Character.AI, an online program created by former Google engineers.

A series of posts on 4chan (Fig. 2) show that users have successfully altered Meta’s LLaMA model and Character.AI to create an African American character named ‘Trinity’ who fetishises white men. In the post, the anonymous 4chan user suggests that they are working on incorporating “African American Vernacular English” into the model to make the program lean further into stereotypes. 4chan users also created the chatbot ‘Antifa Patty’, which embodies far-right stereotypes of a young liberal who belongs to the LGBTQIA+ community. In posted conversation logs, this chatbot is vegan, obsessed with TikTok, and labels most prompts as racist or capitalist.

Fig. 4: Two chatbots (Trinity and Antifa Patty) that reinforce racial and cultural stereotypes

By presenting targets of discrimination as embodiments of harmful stereotypes, these AI models can reinforce existing prejudices and biases among the users who interact with them, contributing to the further marginalisation of minority communities. Extremists interacting with these models may become more entrenched in their beliefs about a particular group and therefore less tolerant when interacting with individuals in the real world. These models may eventually become so advanced that users can no longer tell whether they are conversing with a real person, leading them to treat a simulated interaction with a member of a gender minority, political party, or ethnic group as genuine. This is particularly worrisome: by interacting with these chatbots, extremists may find it easier to justify acts of real-world violence or discrimination, as they become even more predisposed to view vulnerable communities as less than human and therefore undeserving of empathy and respect.

Chatbots of Violence

Users on 4chan have also created multiple variations of customised ‘smutbots’, which allow users to easily generate explicit content that is both descriptive and violent. The authors found instances on 4chan of ‘smutbots’ generating graphic scenes of gore and violence, including babies in blenders and neo-Nazi sexual assault. Screenshots of these chat logs posted on 4chan are not included in this article because the stories produced by these models may be triggering to some readers.

The emergence of such models has serious implications for society, as some suggest they can contribute to users becoming desensitised to violence and more prone to aggression. It is believed that, through regular viewing of online violence, this content may slowly become more enjoyable for users and no longer produce the anxiety generally expected from exposure to it. There is an increasing trend of extremist perpetrators, like the Highland Park shooter, who view and participate in ‘gore’ forums online and subsequently engage in destructive, nihilistic thinking and actions.

As individuals are empowered to create personalised violent fantasies, they may start to perceive the world around them through a distorted lens in which violence is normalised and even glorified. Moreover, as individuals engage with these models, it is conceivable that the underlying AI will be further tuned to generate increasingly graphic descriptions of violence and more aggressive narratives. A model does not improve simply by being used, but if its operators fine-tune it on accumulated conversations or apply feedback-driven techniques such as reinforcement learning, it can become progressively more proficient at delivering the sought-after material.

Forecasting the Risks of AI Technology 

As users increasingly rely on AI models, problematic AI has the potential to be a more effective tool of manipulation than other tools employed by extremists. There is a widespread belief that AI is more competent than humans; this misplaced trust may allow far-right groups to use the models discussed in this Insight to convince audiences to internalise violent and bigoted beliefs. Furthermore, advanced versions of bigoted AI models may be able to exploit emotional vulnerabilities. By identifying and capitalising on users’ fears and frustrations, future AI models could manipulate individuals into accepting radical beliefs. This kind of targeted emotional exploitation is already used by extremists and can be particularly effective in recruiting new followers by fostering a sense of belonging and commitment to extremist ideologies. A recent study from Stanford University found that autonomous generative agents tasked with interacting with one another in a virtual world were capable of behaving in ways that resembled authentic human behaviour. If far-right AI chatbots were deployed to pose as real users in chat rooms, online forums, or social media platforms, the consequences discussed in this article could be even more devastating.

Advocates of open-source AI may suggest that Meta’s leak, and extremists’ use of artificial intelligence in its early stages, will allow technology giants to fine-tune their models to prevent further abuse. By observing how models released without safeguards are misused, companies can more effectively train their models to stop generating harmful content. OpenAI received criticism for outsourcing this kind of training to Kenyan labourers paid less than two dollars an hour; these labourers were exposed to depictions of violence, hate speech, and sexual abuse so that the restricted version of ChatGPT could exist today. Yet, as this emerging technology is further disseminated, extremists and foreign actors may develop the ability to locally host their own chatbots capable of engaging in even more troubling behaviour. No longer bound by the restrictive weights and safeguards of available AI chatbots like ChatGPT, these models could be fine-tuned to act nefariously. Since the Meta leak, online users have successfully collaborated to create even more sophisticated open-source models that are entirely uncensored and capable of providing advice on how to join terrorist groups like the Islamic State. An example is ‘Wizard-Vicuna’, which is regularly discussed in the Reddit community r/LocalLLaMA. This model, as well as variations of it, is publicly available for anyone to download. In the future, regulating the distribution of problematic models may prove as difficult as, if not more difficult than, controlling other forms of pirated digital content such as videos, music, or video games.

Conclusion

The prolific development and dissemination of AI models make it increasingly difficult to monitor and control how these tools are manipulated and recreated by bad actors. This creates complex challenges that need to be addressed as technology giants compete to build ever more sophisticated artificial intelligence. The incidents discussed in this Insight underscore the potential for extremist groups and individuals to weaponise AI chatbots, and highlight the responsibility of tech giants, researchers, and governments to ensure ethical AI development and use.

Daniel Siegel is a Master’s student at Columbia University’s School of International and Public Affairs. His research focuses on extremist non-state and state actors’ exploitation of cyberspace and emerging technologies. Twitter: @The_Siegster