Radical message boards, large- and small-scale social media platforms, and other Internet fringes are thought to have played a role in several recent terrorist attacks. As a result, increasing attention is being paid to understanding the language and behaviour on platforms that provide a free exchange of ideas. However, most previous research has focused on detection (e.g. of radical users and hate speech), while research on how extreme language use develops over time is lacking. Our recent paper investigates the temporal development of user activity and the use of extremist language on the white nationalist forum Stormfront.org between 2001 and 2015. Stormfront was one of the early online extremist discussion forums, launched in 1995 by a white nationalist and former Ku Klux Klan leader. Over the years, Stormfront has become a breeding ground for right-wing extremists worldwide.
Why should we care about how extremist language and online behaviour develop over time?
First, the examination of a large time span may help to better understand emerging radical platforms in the future. Second, appreciating the temporal nature of a forum may aid (tech) policymakers and law enforcement in identifying points of intervention, for example when language use suddenly becomes more extreme.
Our study examined both engagement with the forum and the language used on it over a period of 14 years and a total of 1,009,986 posts. Forum engagement was measured through the number of posts and post lengths. Extremist language was measured through a composite measure consisting of the frequency of profane words, the frequency of racial slurs, and the negative overall sentiment of the post. Using these measures, we fitted various time-series models to the data, aggregated by month, in order to inspect how forum activity and language developed over time. Specifically, we compared models which represented the development as a stationary process (only seasonal changes, but no change in the mean), as a linear process (continuous increase over time), and as a process with discrete breakpoints (step-wise increases or decreases).
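To make the model comparison more concrete, here is a minimal sketch in Python. It is not the paper's implementation: it ignores seasonality, allows only a single breakpoint, and uses the Bayesian information criterion (BIC) as the selection rule, all of which are simplifying assumptions. The monthly series is synthetic and the function names are hypothetical.

```python
import math

def rss_constant(y):
    """Residual sum of squares around a single constant mean."""
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y)

def rss_linear(y):
    """Residual sum of squares around an ordinary least-squares trend line."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return sum((v - (intercept + slope * i)) ** 2 for i, v in enumerate(y))

def best_single_break(y, min_segment=6):
    """Grid search for the one split that minimises the two-segment RSS."""
    best = None
    for k in range(min_segment, len(y) - min_segment + 1):
        rss = rss_constant(y[:k]) + rss_constant(y[k:])
        if best is None or rss < best[1]:
            best = (k, rss)
    return best

def bic(rss, n, n_params):
    """Gaussian BIC up to an additive constant; lower is better."""
    return n * math.log(max(rss, 1e-12) / n) + n_params * math.log(n)

def compare_models(y):
    n = len(y)
    break_index, rss_step = best_single_break(y)
    scores = {
        "stationary": bic(rss_constant(y), n, 1),  # mean only
        "linear": bic(rss_linear(y), n, 2),        # intercept + slope
        "breakpoint": bic(rss_step, n, 3),         # two means + break location
    }
    return min(scores, key=scores.get), break_index, scores

# Synthetic monthly series (assumption): a level shift after month 60,
# plus a small deterministic wobble standing in for noise.
wobble = [((i * 7) % 5 - 2) * 0.3 for i in range(120)]
series = [(10 if i < 60 else 15) + wobble[i] for i in range(120)]

best_model, break_index, scores = compare_models(series)
print(best_model, break_index)
```

On real data one would add seasonal terms and allow several breakpoints, but the logic stays the same: fit each candidate process and keep the one that explains the series best after penalising extra parameters.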
Core findings
In terms of forum activity, the temporal development of post frequency and post length was best represented by a structural breakpoint model. Forum activity increased suddenly in 2003, 2005 and 2007, followed by a lasting decrease starting in 2009 (see figures in our paper). Post length reached a maximum plateau between 2004 and 2006 (average 85 words), after which it decreased until the end of the timeline (average 69 words).
In terms of extremist language, breakpoint models were also the best fit for both the proportion of extremist posts and the intensity of extremist language. The proportion of extremist posts increased in a step-wise manner between 2002 and 2011, after which it decreased slightly. Nevertheless, at that point 23.81% of posts were still of an extremist nature. For extremist language intensity, we observed a sudden increase in March 2004, resulting in a plateau that lasted until May 2011, after which the intensity decreased to a level comparable to the forum’s early period.
Lastly, we examined whether a select number of users dominated forum activity and the number of extremist posts. Indeed, a mere 10% of users were responsible for more than 80% of posting activity, and 20% of users accounted for almost 90% of posts. We then compared super users (in the 99th percentile of posting activity) to the rest of the forum users. Super users wrote longer and more extreme posts, used more profane language, and were less positive than other users on the forum. We also assessed how these two user types progressed through the forum (i.e. from their start on the forum to the end of the dataset or of their posting activity). The findings suggest that both normal and super users showed step-wise decreases in activity over time, but super users increased their activity after about 15% of their time on the platform.
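The concentration figures above can be illustrated with a short Python sketch. The per-user post counts are made up, the function names are hypothetical, and the nearest-rank percentile is only one of several possible definitions of the 99th-percentile cut-off; the paper may have computed it differently.

```python
import math

def top_share(post_counts, top_fraction):
    """Share of all posts written by the most active `top_fraction` of users."""
    ranked = sorted(post_counts, reverse=True)
    # Small epsilon guards against floating-point error in len * fraction.
    n_top = max(1, math.ceil(len(ranked) * top_fraction - 1e-9))
    return sum(ranked[:n_top]) / sum(ranked)

def super_user_threshold(post_counts, percentile=99):
    """Nearest-rank percentile of per-user post counts (assumed definition)."""
    ranked = sorted(post_counts)
    rank = max(1, math.ceil(percentile / 100 * len(ranked) - 1e-9))
    return ranked[rank - 1]

# Hypothetical toy data: number of posts for each of 20 users.
counts = [400, 300, 60, 40] + [10] * 6 + [5] * 10

share_top_10 = top_share(counts, 0.10)
share_top_20 = top_share(counts, 0.20)
threshold = super_user_threshold(counts)
super_users = [c for c in counts if c >= threshold]

print(f"top 10% of users wrote {share_top_10:.1%} of posts")
print(f"top 20% of users wrote {share_top_20:.1%} of posts")
print(f"super-user cut-off: {threshold} posts")
```

The same two functions applied to the real per-user post counts would give the kind of concentration figures reported above.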
So what? What do these results mean?
We draw three core conclusions from our analysis. First, both forum activity and extremist language developed in discrete steps, rather than in a seasonal or continuous linear fashion. Second, the analyses suggest that the language on Stormfront did not become more extreme over time, at least in the timeframe studied. Third, a small number of users can be responsible for the bulk of forum engagement and extremist rhetoric.
Future research should examine whether and how the progression of individual users (or new forums) can be forecast, and whether phase transitions can be predicted. The discrete nature of behaviour and language development suggested by our analysis could allow forecasting not only of the overall development but also of the occurrence of critical phase transitions (e.g., from normal posting to sudden, more extreme content). Ultimately, because the passing of critical thresholds could be predicted, this could help mitigate the trade-off between allowing free speech and preventing hate speech online. In a wider sense, this work could be a step towards the transparent and evidence-based decision-making that policymakers need for risk management and threat assessment in such online spaces.
This piece was co-authored with Bennett Kleinberg and Paul Gill.