Detecting Influence Campaigns on X with AI and Network Science

by Bernice Chan


In the age of generative AI and large language models (LLMs), massive amounts of inauthentic content can be rapidly broadcast on social media platforms. Malicious actors are becoming more sophisticated, hijacking hashtags, artificially amplifying misleading content, and mass-resharing propaganda.

These actions are often orchestrated by state-sponsored information operations (IOs), which attempt to sway public opinion during major geopolitical events such as US elections and the COVID-19 pandemic.

Combating these IOs has never been more crucial. Identifying influence campaigns with high-precision technology will significantly reduce the misclassification of legitimate users as IO drivers, ensuring social media providers or regulators do not mistakenly suspend accounts while trying to curb illicit activities. 

In light of this, USC Information Sciences Institute (ISI) researcher Luca Luceri is co-leading an effort funded by the Defense Advanced Research Projects Agency (DARPA) to identify and characterize influence campaigns on social media. His most recent paper, “Unmasking the Web of Deceit: Uncovering Coordinated Activity to Expose Information Operations on Twitter,” was presented at the Web Conference on May 13, 2024.

“My team and I have worked on modeling and identifying IO drivers such as bots and trolls for the past five to ten years,” Luceri said. “In this paper, we’ve advanced our methodologies to propose a suite of unsupervised and supervised machine learning models that can detect orchestrated influence campaigns from different countries within the platform X (formerly Twitter).” 

A fused network of similar behaviors 

Drawing from a comprehensive dataset of 49 million tweets from verified campaigns originating in six countries – China, Cuba, Egypt, Iran, Russia, and Venezuela – Luceri and his team homed in on five sharing behaviors on X that IO drivers engage in.

These include co-retweeting (sharing identical tweets), co-URL (sharing the same links or URLs), hashtag sequence (using an identical sequence of hashtags within tweets), fast retweeting (quickly re-sharing content from the same users), and text similarity (tweets with resembling textual content). 
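The paper's exact similarity formulas aren't reproduced here, but the idea behind a behavioral trace like co-retweeting can be sketched as a pairwise similarity over accounts' retweet histories. The sketch below uses Jaccard similarity on toy data; the account names, tweet IDs, and the choice of Jaccard are all illustrative assumptions, not details from the paper.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical retweet log: (account, retweeted_tweet_id) pairs.
retweets = [
    ("acct_a", "t1"), ("acct_a", "t2"), ("acct_a", "t3"),
    ("acct_b", "t1"), ("acct_b", "t2"),
    ("acct_c", "t9"),
]

def co_retweet_similarity(log):
    """Jaccard similarity of retweeted-tweet sets for every account pair."""
    sets = defaultdict(set)
    for acct, tweet_id in log:
        sets[acct].add(tweet_id)
    sims = {}
    for u, v in combinations(sorted(sets), 2):
        inter = len(sets[u] & sets[v])
        union = len(sets[u] | sets[v])
        sims[(u, v)] = inter / union if union else 0.0
    return sims

sims = co_retweet_similarity(retweets)
# acct_a and acct_b share 2 of the 3 distinct tweets they touched -> 2/3
```

Each of the other four behaviors (co-URL, hashtag sequence, fast retweeting, text similarity) would define its own pairwise similarity in an analogous way, yielding one network per behavior.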

Previous research focused on building networks that mapped out each type of behavior, examining the similarities between individual users on X. However, Luceri and his team noticed that these accounts often employ many strategies at the same time, which meant that monitoring one behavioral trace was not enough. 

“We found that co-retweeting was massively used by campaigns in Cuba and Venezuela,” Luceri explained. “However, if we only examine co-retweeting without considering other behaviors, we would perform well in identifying some campaigns, such as those originating from Cuba and Venezuela, but poorly where co-retweeting was used less, such as in Russian campaigns.” 

To capture a broader range of coordinated sharing behaviors, the researchers constructed a unified similarity network called a Fused Network. Then, they applied machine learning algorithms fed by topological properties of the fused network to classify these accounts’ similarities and predict their future participation in IOs.
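How the paper combines the behavior-specific networks and which topological features it feeds to its classifiers aren't detailed here, but one minimal way to fuse weighted similarity networks is to sum edge weights across them and then derive simple node-level features. Everything below (the user names, weights, summation rule, and the weighted-degree feature) is an illustrative assumption, not the paper's method.

```python
from collections import defaultdict

# Hypothetical per-behavior similarity networks: {(user_u, user_v): weight}.
co_retweet = {("u1", "u2"): 0.9, ("u2", "u3"): 0.4}
co_url     = {("u1", "u2"): 0.7, ("u1", "u3"): 0.2}

def fuse(*networks):
    """Sum edge weights across behavior-specific similarity networks
    to build a single fused network."""
    fused = defaultdict(float)
    for net in networks:
        for pair, w in net.items():
            fused[pair] += w
    return dict(fused)

def node_strengths(fused):
    """One simple topological feature: weighted degree (strength) per node,
    usable as input to a downstream classifier."""
    strength = defaultdict(float)
    for (u, v), w in fused.items():
        strength[u] += w
        strength[v] += w
    return dict(strength)

fused = fuse(co_retweet, co_url)
strengths = node_strengths(fused)
```

In practice, richer topological properties (e.g. centrality or clustering measures computed on the fused graph) could serve as the feature vector for each account.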

Luceri and his team found that this method applies to campaigns across the world. X users within the same campaign, no matter where they are from, exhibited remarkable collective similarity in their actions.

“I consider our work a paradigm shift in research methods, giving a new perspective in the identification of influence campaigns and their drivers,” said Luceri. 

Unlocking new opportunities 

The unsupervised machine learning model leverages well-known yet underutilized network features, achieving 42% higher precision than traditional approaches in detecting influence campaigns. Luceri views this paper as a starting point that could open further avenues of research.

“We can train models on the topological features of this similarity network, and make them work in complex scenarios: for instance, if different users from different countries interacted with each other, or more challenging situations where we have limited information about the campaigns,” Luceri remarked. 

Luceri also presented another paper at the Web Conference, “Leveraging Large Language Models to Detect Influence Campaigns in Social Media,” which received the best paper award from the International Workshop on Computational Methods for Online Discourse Analysis (BeyondFacts’24). The paper examines the potential of using LLMs to recognize the signs of AI-driven influence campaigns, which is particularly crucial in the current climate, where AI-created media is pervasive.

“These coordinated activities have consequences in real life,” said Luceri. “They have the power to spread misinformation and conspiracy theories that might lead to protests or attacks on our democracy, such as the interference of Russian trolls in the 2016 US election.” 

Luceri and his team are committed to continuing the search for alternative strategies to identify influence campaigns and protect users susceptible to influence.

Published on May 14th, 2024

Last updated on May 16th, 2024
