
Exploring Extreme Language in Gaming Communities

20th January 2022 Sam Andrews

There is ongoing speculation that extremism might be prevalent in video game communities, or that these communities are especially vulnerable to extremism. In a previous article for GNET, we outlined the current state of the literature on extremism and video games. While there is disagreement in that literature, there is an overarching concern that video game communities have a latent hostility to feminism and minorities, and that this makes them vulnerable to reactionary and far-right ideologies. This concern is shared by counter-terrorism police in the United Kingdom and by the United Nations Office of Counter-Terrorism. However, the evidence base for this is small, and the diversity of gaming communities calls into question whether we can consider gaming communities as a whole to be vulnerable.

To contribute to this conversation, this Insight uses a dataset of 300,000 comments from three video game forums (subreddits) on Reddit to understand how prevalent extreme language is in these communities.

Subreddits and Gaming Communities

Reddit is a well-established and popular site that hosts a variety of interest communities, organised as ‘subreddits’. Each subreddit is a unique community of users that generates conversation and shares media, and subreddits are generally created and maintained by users. While multiple subreddits might cover the same interest, each is unique in how it is run and how users interact. Subreddits discussing gaming are some of the most popular on the site. For instance, the subreddit /r/Gaming has 31.4 million subscribers, and also has 981 affiliated subreddits related to more specific gaming interests. There are many more not affiliated with this subreddit.

Three forums were selected to get a broad understanding of how language is used in gaming spaces. The subreddit /r/VideoGames currently has 81.1k subscribers, and restricts discussion to video game-related posts. ‘Seriously offensive’ language, such as racism and sexism, is not allowed and serial rule-breakers are banned. /r/HeartsOfIron has 13.8k subscribers and no explicit rules. The forum is dedicated to discussing the game Hearts of Iron IV by Paradox Interactive. This game is supposedly popular among the extreme right; the game gives players control of a country between 1933 and 1949, allowing them to choose political, military and economic strategy during World War II. Alternative histories can be played out, including Nazi victory. Finally, /r/KotakuInAction calls itself “The almost-official GamerGate subreddit!” and has 129k subscribers. It has an extensive ruleset which includes banning harassment, doxing and threatening or violent language. It is however considered to be a subreddit saturated with a “geek masculinity” that encourages and allows “networked misogyny”, ultimately fostering extremist beliefs.

These three subreddits represent a small slice of gamer culture: general gaming discussion, spaces for fans of individual games, and wider discussions about the culture surrounding gaming. While /r/VideoGames was selected as an example of a mainstream video game forum, /r/HeartsOfIron and /r/KotakuInAction were selected as illustrative problem cases. Because of the popularity of Hearts of Iron IV with the extreme right, it is here that we might find examples of extreme right language. Because of the association between /r/KotakuInAction and GamerGate, we would expect to find misogynistic, anti-Feminist and extreme-right language in this subreddit. Thus we can see how closely aligned the mainstream is with a space already identified as problematic.

To gather data from these subreddits, a custom R function was written to get the 100 most recent comments per day for the past 1,000 days for each subreddit, from the pushshift.io API. A total of 100,000 comments per subreddit, posted between 14 March 2019 and 8 December 2021, were retrieved. The comments were cleaned and a corpus of words was created, totalling 9,253,196 words.
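The sampling schedule described above can be sketched as follows. This is a minimal illustration in Python rather than the R function actually used, and it makes no API calls: the helper name `daily_windows`, the constants, and the choice to treat 8 December 2021 as an exclusive end boundary are all assumptions made for the example. It only shows how 1,000 daily windows at 100 comments each yield the stated totals.

```python
from datetime import date, timedelta

# Hypothetical names for illustration; not taken from the original R code.
SUBREDDITS = ["VideoGames", "HeartsOfIron", "KotakuInAction"]
COMMENTS_PER_DAY = 100


def daily_windows(end: date, days: int):
    """Return (start, stop) date pairs, one per day, oldest first.

    `stop` is exclusive, so the windows tile the period without overlap.
    """
    first = end - timedelta(days=days)
    return [
        (first + timedelta(days=i), first + timedelta(days=i + 1))
        for i in range(days)
    ]


# Assumption: the sampling period ends at the start of 8 December 2021.
windows = daily_windows(date(2021, 12, 8), 1000)

# Upper bound on comments per subreddit if every day yields a full 100.
max_per_subreddit = len(windows) * COMMENTS_PER_DAY
max_total = max_per_subreddit * len(SUBREDDITS)
```

Under these assumptions the first window opens on 14 March 2019, matching the date range reported above, and the upper bounds come to 100,000 comments per subreddit and 300,000 overall.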

A custom lexicon was used for dictionary analysis, based on that of Farrell et al. This lexicon was developed to detect extreme language in web forums, including misogynistic, sexually explicit, violent, racist and homophobic language, and was therefore useful for detecting both extreme racialised language and the kind of anti-feminist and misogynistic language expected from /r/KotakuInAction. The lexicon was expanded to include a wider range of harassment-related words from Rezvan’s Harassment corpus. The final lexicon is classified by word type, including the categories homophobia, hostile language, Incel and anti-Feminism, political, racism, sexual, and violent language.

Language Prevalence

/r/VideoGames
word       type               n
shit       hostile language   1403
beat       hostile language   906
fuck       sexual             716
fucking    sexual             706
smash      violent language   532
kill       violent language   482
hit        violent language   450
finish     violent language   441
ass        sexual             428
dumb       hostile language   240

 

/r/HeartsOfIron
word       type               n
attack     violent language   3387
border     racism             1125
force      violent language   1081
shit       hostile language   1057
kill       violent language   788
fuck       sexual             660
chinese    racism             638
beat       hostile language   630
fucking    sexual             543
hit        violent language   434

 

/r/KotakuInAction
word       type                      n
shit       hostile language          5227
fucking    sexual                    3092
fuck       sexual                    2759
sjw        incel and antiFeminism    1383
racist     racism                    1379
sjws       incel and antiFeminism    1273
gay        homophobic                1036
chinese    racism                    1029
ass        sexual                    794
kill       violent language          755

Dictionary analysis allows for a simple comparison of the prevalence of words within a sample. The above tables show the ten most frequent lexicon words in each subreddit. In /r/VideoGames and /r/HeartsOfIron, most of these are classified as hostile or violent language. While in some contexts this could be considered problematic, it is unlikely to be so here. Taking the /r/HeartsOfIron subreddit as an example, the high prevalence of attack, kill and force does not necessarily indicate a propensity to violence within the community. This is because the game itself is about war; one would expect such words to appear. The word beat has a similar function within the /r/VideoGames forum, likely referring to beating or finishing a game. The counts and frequencies of these terms are also relatively low.
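A dictionary analysis of this kind can be sketched in a few lines. The example below is an illustration only, not the analysis code used for this Insight: the four-word `LEXICON`, the function names and the sample comments are hypothetical, and the tokeniser is deliberately crude.

```python
import re
from collections import Counter

# Hypothetical miniature lexicon: word -> category.
LEXICON = {
    "attack": "violent language",
    "kill": "violent language",
    "beat": "hostile language",
    "dumb": "hostile language",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_counts(comments, lexicon):
    """Count lexicon words across comments; return (word, type, n) rows,
    most frequent first."""
    counts = Counter()
    for comment in comments:
        for token in tokenize(comment):
            if token in lexicon:
                counts[token] += 1
    return [(word, lexicon[word], n) for word, n in counts.most_common()]


# Invented sample comments, for illustration only.
comments = [
    "I finally beat the last boss",
    "Attack from the border, then attack again",
    "That was a dumb move",
]
rows = lexicon_counts(comments, LEXICON)
```

Note that the first sample comment is counted as hostile language even though it simply describes finishing a game. Matching words against a fixed list cannot see context, which is why the counts above need to be read alongside what each forum is actually about.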

However, the data from /r/KotakuInAction reveals a high incidence of language hostile to feminist or progressive politics. The pejorative term sjw appears commonly and with comparatively high incidence, indicating a hostility to feminism and the kind of dalliance with far-right politics that many commentators have been concerned about. Overall, /r/KotakuInAction also appears to be a much more hostile place than the other two subreddits.

What is most notable, however, is the low incidence of anti-Feminist or misogynistic terminology in the other two subreddits. The term sjw appears only 42 times in the /r/VideoGames subcorpus, and the Incel and anti-Feminism classified term chad appears 117 times in the /r/HeartsOfIron subcorpus. Additionally, the first Incel and anti-Feminism classified term to appear in the /r/VideoGames corpus is beta, which appears 112 times. As with the violent language, however, this can reflect the medium, since early releases of unfinished games are referred to as beta versions. Insofar as extremism is indicated by the presence of extreme language, this data does not indicate that either forum has a problem with extremism.

Distinctive Terms

Other measures can be used to better understand the differences between these spaces. TF-IDF (term frequency-inverse document frequency) measures how distinctively frequent a word is in one document when that document is compared to others. A higher TF-IDF indicates a more comparatively distinct term. Exploring text using TF-IDF can help with gaining a better understanding of what is unique about a particular corpus.
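To make the measure concrete, the sketch below implements one common formulation of TF-IDF: a term's share of the tokens in a document, multiplied by the natural logarithm of the number of documents divided by the number of documents containing the term. This formulation is an assumption; the scores reported in this Insight are on a different scale, so it should not be read as the exact computation used here. The function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps a document name to its list of tokens.

    Returns a mapping: name -> {term: tf-idf score}.
    """
    n_docs = len(docs)
    counts = {name: Counter(tokens) for name, tokens in docs.items()}

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())

    scores = {}
    for name, term_counts in counts.items():
        total = sum(term_counts.values())
        scores[name] = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in term_counts.items()
        }
    return scores


# Three toy "subreddits", each treated as a single document.
docs = {
    "a": ["game", "game", "patch"],
    "b": ["game", "tank", "tank"],
    "c": ["game", "press"],
}
scores = tf_idf(docs)
```

In this toy example the word game occurs in every document, so its score is zero everywhere, while tank, which occurs only in document b, receives the highest score there. The same logic means that a word found in only one subreddit can rank as distinctive even when its raw count is small.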

A raw measure of TF-IDF across all the subreddits confirms that each subreddit is distinct. Each is concerned with its own specific interests: video games, Hearts of Iron IV, and video game politics. Running TF-IDF through the lexicon reveals which kinds of extreme language are most distinctive in each forum.

/r/VideoGames
word           n     tf_idf      type
bounty         31    4,97E+08    racism
boobs          25    4,01E+08    sexual
faggot         19    3,05E+08    homophobic
jackass        11    1,76E+08    hostile language
nigga          11    1,76E+08    racism
moss           9     1,44E+08    racism
barbie         8     1,28E+08    sexual
butthead       8     1,28E+08    hostile language
heterosexual   8     1,28E+08    sexual
turd           8     1,28E+08    hostile language

 

/r/HeartsOfIron
word         n     tf_idf      type
japs         34    4,99E+08    racism
jap          31    4,55E+08    racism
anglo        28    4,11E+08    racism
arab         25    3,67E+08    racism
annihilate   19    2,79E+08    violent language
zog          11    1,62E+08    racism
mra          9     1,32E+08    incel and antiFeminism
niger        9     1,32E+08    racism
huns         3     1,19E+08    racism

 

/r/KotakuInAction
word      n      tf_idf      type
boobs     121    1,24E+09    sexual
sjw’s     110    1,13E+09    incel and antiFeminism
weasel    32     8,91E+08    hostile language
smear     69     7,09E+08    hostile language
faggot    60     6,17E+08    homophobic
arab      57     5,86E+08    racism
vagina    44     4,52E+08    sexual
mgtow     16     4,45E+08    incel and antiFeminism
whore     42     4,32E+08    sexual
bounty    40     4,11E+08    racism

Again, this analysis reveals a split between the samples. The forums /r/VideoGames and /r/HeartsOfIron are not typically associated with extreme language. With the exception of nigga and faggot, most of the distinctive terms within the /r/VideoGames subcorpus are classified as hostile or sexual language, associated with competition and a possible hostility to the challenges of others.

A difference between /r/VideoGames and /r/HeartsOfIron does occur, however, with the term ZOG, or Zionist Occupied Government, which refers to a neo-Nazi belief in Jewish world control. While this term is of low incidence in the subreddit, it is nonetheless revealing that it is distinctive. This supports some of the findings of Vaux et al.’s study. Indeed, most terms appearing as distinctive within the /r/HeartsOfIron data are classified as derogatory terms for racial, ethnic or religious groups. While the top four terms are likely related to strategic objectives within the game, high TF-IDF scores were also found for the terms subhuman (n = 3, TF-IDF = 4,40E+07) and goy (n = 1, TF-IDF = 3,62E+07), alongside similar derogatory terms. While these terms are generally of low incidence, their distinctiveness does indicate that certain gaming spaces can attract extreme ideologies.

The data relating to /r/KotakuInAction is more revealing, with Incel-related, anti-Feminist and sexualised terms being distinctive. Most high TF-IDF terms in this subcorpus were related to derogation of women, anti-Feminism, and Incel-related terminologies. The terms redpill (n = 207, TF-IDF = 2,78E+08) and blackpill (n = 185, TF-IDF = 1,95E+08) appeared with high scores. Misogynist terms such as slut and harem, indicating hostility towards women, also scored highly.

Discussion

The exploration presented above, while not generalisable, raises some important questions about our understanding of extremism within gaming forums. What is most prevalent in mainstream forums is not extremism, but hostility. While this can manifest in homophobic or sexist language and can indeed result in an unwelcoming environment for those gamers who are not white, male, and straight, there is little evidence that these spaces are home to extremists.

However, it is also apparent that some gaming spaces are characterised by anti-Feminist, misogynist and Incel-related language, and that others are attractive to extremists. /r/KotakuInAction is notably high in the former language – compared to the other two subreddits, this language is prevalent and uniquely present. It is also apparent that while the unique presence of neo-Nazi related language on /r/HeartsOfIron is of concern, and reflects the attraction of this game to extremists, it is not generally a space for extremists. Indeed that language is rare.

While it is worth reiterating that this study is exploratory, not representative, there are some lessons that can be drawn. Some of our ideas about gaming communities hold true; they can be places hostile to minorities, and some games are attractive to extremists. Communities associated with the GamerGate movement are also misogynistic, anti-Feminist and share language with Incel-affiliated communities.

However, this study also highlights that mainstream gaming communities are not especially extremist. Even those games that extremists are attracted to do not have a high prevalence of extremist language in their communities. Additionally, while GamerGate characterised some of the worst of gaming communities, it does not represent gaming communities as a whole. We must therefore be more nuanced when talking about gaming and extremism; there is little indication here that gaming is prima facie a place where extremism is found.

There are, however, several limits to this study. Dictionary-based analysis is a rather blunt tool, and relies on knowledge of context; how can we know whether the word gay, for instance, is being used as a pejorative or simply as a descriptor? The fluid nature of online text also presents problems for static dictionaries, especially as innocuous terms are adopted by extremists. Such problems can be overcome using mixed methods, such as qualitative analysis, which would allow for further nuance in our explorations. Such a study is forthcoming here.

Note: Outside of this dataset, most occurrences of zog on the /r/HOI4 subreddit do not refer to Zionist Occupied Government. Instead they are a reference to King Zog I of Albania. Users on the subreddit posting about playing Albania usually receive replies like “Hail Zog!” While this instance and other instances of neo-Nazi or White nationalist comments can be uniquely found on the subreddit when compared to /r/VideoGames, most of the time Zog is posted as a joke and a reference to Albania.