All posts by Anh V. Vu

ExtremeBB: Supporting Large-Scale Research into Misogyny and Online Extremism

Online anonymous platforms such as forums enable freedom of speech, but also facilitate misogyny, extremism, and political polarisation. We have collected tens of millions of postings to such forums and created a new tool for social scientists to study how these phenomena are linked.

Far-right extremism has been associated with a growing number of mass killings, overtaking Islamist terrorism in about 2018. Examples include the Wisconsin Sikh temple shooting (2012), the riots in Charlottesville (2017), the Pittsburgh synagogue shooting (2018), the Christchurch mosque shootings (2019), the US Capitol riots (January 2021), and recently the Buffalo shooting (May 2022). Misogyny has been explicitly linked with terror attacks including the Isla Vista killings (2014), the Toronto Van attack (2018), the Hanau shootings (early 2020), and most recently, the Plymouth shooting in the UK (August 2021).

Are extremism and misogyny linked? Joan Smith documented how the great majority of the men who committed terrorist killings in Europe since 9/11, whether far-right or Islamist, display strongly misogynistic attitudes. Most also have a history of physically abusing women — often in their own families — before committing acts of violence against strangers. The Womanstats database, created by Val Hudson and colleagues, has uncovered many statistically significant relationships between the physical security of women and the security of states: authoritarian patriarchal attitudes undermine good government in multiple ways.

Social scientists often lack the technical skills needed to collect and compile a meaningful database at scale, and so lack quantitative measurements at a finer granularity. The case studies collected by Smith and the macroeconomic data collected in Womanstats are compelling in their own ways. However, there are few high-quality datasets that support quantitative analysis at scales between individuals and whole societies. The existing resources tend to be small, difficult to access, or poorly maintained.

We have therefore created ExtremeBB, a longitudinal structured textual database of nearly 50M posts made by around 400K registered active members on 12 online extremist forums that promote misogyny and far-right extremism (as of September 2022). Its goal is to facilitate both qualitative and quantitative research on historical trends going back two decades. Our data can help researchers trace the evolution of extremist ideology, extremist behaviours, external political movements and relationships between online subcultures; measure hate speech and toxicity; and explore how misogyny and far-right extremism are linked and correlated. A better understanding of extremist subcultures may lead to more effective interventions, while ExtremeBB may also help monitor the effectiveness of any interventions that are undertaken.

This database is being actively maintained and developed, with special attention to ensuring data completeness and making it a reliable resource. Academic researchers can request access through the Cambridge Cybercrime Centre, subject to a standard licence to ensure lawful and ethical use. Since the database was first opened to external researchers in 2021, access has been granted to 49 researchers from 16 groups in 12 universities. The paper describing this powerful new resource, and some of the things we have discovered so far using it, can be found here.

How an Illicit Cybercrime Market Evolves: A Longitudinal Study

Online underground marketplaces are an essential part of the cybercrime economy. They often act as a cash-out market, enabling the trade in illicit goods and services between pseudonymous members. To understand their characteristics, previous research mostly uses vendor ratings, public feedback, sometimes private messages, friend status, and post content. However, most research lacks comprehensive (and important) data about transactions made by the forum members.

Our recent paper (original talk here) published at the Internet Measurement Conference (IMC’20) examines how an online illicit marketplace evolves over time (especially its performance as an infrastructure for trust), including a significant shift through the COVID-19 pandemic. This study draws insights from a novel, rich and powerful dataset containing hundreds of thousands of contractual transactions made by members of HackForums, the most popular online cybercrime community. The data includes a two-year historical record of the contract system, originally adopted in June 2018 as an attempt to mitigate scams and frauds occurring between untrusted parties. As well as contractual arrangements, the dataset includes thousands of associated members, threads, and posts on the forum, which provide additional context. To study the longitudinal maturation of this marketplace, we split the timespan into three eras: Set-up, Stable, and COVID-19. These eras are defined by two important external milestones: the enforcement of the forum’s new policy in March 2019, and the declaration of the global pandemic in March 2020.
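The era split described above can be sketched in a few lines. This is only an illustration: the post gives the milestones by month, so the boundary dates below (the first day of March 2019 and March 2020) are placeholder assumptions rather than the exact cut-offs used in the paper, and the function name `era_of` is hypothetical.

```python
from datetime import date

# Assumed boundaries: the post only names the months, so the first day of
# each month is a placeholder, not the paper's exact cut-off.
STABLE_START = date(2019, 3, 1)  # forum's new policy enforced (March 2019)
COVID_START = date(2020, 3, 1)   # global pandemic declared (March 2020)


def era_of(created: date) -> str:
    """Return the era label for a contract created on the given date."""
    if created < STABLE_START:
        return "Set-up"
    if created < COVID_START:
        return "Stable"
    return "COVID-19"


# Example: label a few hypothetical contract creation dates.
for d in (date(2018, 6, 15), date(2019, 8, 1), date(2020, 6, 1)):
    print(d.isoformat(), era_of(d))
```

Keeping the two boundaries as named constants makes it easy to swap in the exact milestone dates once they are known.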

We applied a range of analysis and statistical modelling approaches to outline the maturation of economic and social characteristics of the market since the day it was introduced. We find the market has centralised over time, with a small proportion of ‘power users’ involved in the majority of transactions. In terms of trading activities, currency exchange and payments account for the largest proportion of both contracts and users involved, followed by giftcards and accounts/licenses. Other popular products include automated bots, hacking tutorials, remote access tools (RATs), and eWhoring packs. Contracts are settled faster over time, with the completion time dropping from around 70 hours in the early months to less than 10 hours during the COVID-19 Era in June 2020.
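One simple way to picture this kind of centralisation is to ask what share of contracts involve at least one of the most active users. The sketch below is not the paper's method; it assumes each contract is just a pair of user identifiers, and both the 5% threshold and the name `power_user_share` are illustrative choices.

```python
import math
from collections import Counter


def power_user_share(contracts, top_fraction=0.05):
    """Fraction of contracts with at least one party among the most active users.

    `contracts` is a list of (party_a, party_b) pairs; this shape is an
    assumption made for illustration only.
    """
    if not contracts:
        return 0.0
    # Count how many contracts each user takes part in.
    activity = Counter()
    for a, b in contracts:
        activity[a] += 1
        activity[b] += 1
    # Take the top fraction of users by activity (at least one user).
    n_top = max(1, math.ceil(len(activity) * top_fraction))
    top_users = {user for user, _ in activity.most_common(n_top)}
    involved = sum(1 for a, b in contracts if a in top_users or b in top_users)
    return involved / len(contracts)


# Hypothetical toy data: user "a" is party to three of the four contracts.
toy = [("a", "b"), ("a", "c"), ("a", "d"), ("e", "f")]
print(power_user_share(toy))
```

Computing this share separately for each month or era would show whether the market is concentrating around fewer users over time.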

We estimate a lower bound on total trading value of over 6 million USD for public and private transactions. With regard to the payment methods preferred within the market, Bitcoin and PayPal dominate the others at all times in terms of both trading value and number of contracts involved. A subset of new members joining the market face the ‘cold start’ problem: the difficulty of establishing and building up a reputation when starting with none. We find that the majority of these build up their profile by participating in low-level currency exchanges, while some instead establish themselves by offering products and services.

To examine the behaviours of members over time, we use Latent Transition Analysis to discover hidden groups among the forum’s members, including how members move between groups and how they change across the lifetime of the market. In the Set-up Era, we see users gradually shift to the new system with a large number of ‘small scale’ users involved in one-off transactions, and few ‘power-users’. In the Stable Era, we see a shift in the composition and scale of the market when contracts become compulsory, with a growth of ‘business-to-consumer’ trades by ‘power-users’. In the COVID-19 Era, the market further concentrates around already-existing ‘power-users’, who are party to multiple transactions with others.
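To give a feel for what movement between groups looks like, the sketch below tabulates how users shift between groups from one era to the next. It is deliberately not Latent Transition Analysis: LTA estimates the hidden groups and the transition probabilities jointly from behavioural data, whereas this example assumes each user already carries a group label in each era. The function name, labels, and data shape are all hypothetical.

```python
from collections import Counter, defaultdict


def transition_proportions(before, after):
    """Row-normalised proportions of users moving between labelled groups.

    `before` and `after` map user id -> group label for two consecutive eras.
    Only users who appear in both eras are counted.
    """
    counts = defaultdict(Counter)
    for user, source in before.items():
        if user in after:
            counts[source][after[user]] += 1
    return {
        source: {dest: n / sum(row.values()) for dest, n in row.items()}
        for source, row in counts.items()
    }


# Hypothetical labels for four users across two eras; u4 leaves the market.
setup_era = {"u1": "small", "u2": "small", "u3": "power", "u4": "small"}
stable_era = {"u1": "small", "u2": "power", "u3": "power"}
print(transition_proportions(setup_era, stable_era))
```

A full analysis would replace the fixed labels with the class memberships and transition probabilities estimated by the latent model.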

Overall, the marketplace provides a range of trust capabilities to facilitate trade between pseudonymous parties, with control becoming further centralised as administrators act as third-party arbitrators. The platform is clearly being used as a cash-out market, with most trades involving the exchange of currencies. In terms of the three eras, the big picture shows two significant rises in the market’s activities in response to two major events that happened at the beginning of the Stable and COVID-19 eras. In particular, we observe a stimulus (rather than transformation) in trading activities during the pandemic: the same kinds of transactions, users, and behaviours, but at increased volumes. By looking at the context of forum posts at that time, we see a period of mass boredom and economic change, when some members are no longer at school while others have become unemployed or are unable to go to work. A need to make money and the time on their hands to do so may be factors behind the increase in trading activity seen at this time.

One limitation of our dataset is the absence of ground truth: we have no way to verify whether transactions actually proceeded as set out in the contractual agreements. Furthermore, the dataset contains a large number of private contracts (around 88%), for which we can only observe minimal information. The dataset is available to academic researchers through the Cambridge Cybercrime Centre’s data-sharing agreements.

Three Paper Thursday: Attacking the Bitcoin Peer-to-Peer Network

People have developed many different attack vectors against cryptocurrencies, targeting everything from codebase flaws, cryptographic algorithms, mining processes, consensus protocols and block propagation mechanisms to the underlying network layer. Most attacks can be patched quickly by modifying the source code, but preventing attacks that exploit the network layer remains a non-trivial problem, as the network layer relies heavily on the existing Internet infrastructure, which is impractical to change. Network-layer attacks can therefore be dangerous, powerful and hard to mitigate.

In this post, I would like to introduce some recent attacks against the Bitcoin P2P network.