Attacking Fake News Detectors via Manipulating News Social Engagement

🔬 Research Summary by Haoran Wang, a doctoral student at Illinois Institute of Technology, with an interest in building trustworthy AI systems to verify natural language information.

[Original paper by Haoran Wang, Yingtong Dou, Canyu Chen, Lichao Sun, Philip S. Yu, and Kai Shu]

Overview: Although recent works have exploited the vulnerability of text-based misinformation detectors, the robustness of social-context-based detectors has not yet been extensively studied. In light of this, we propose a multi-agent reinforcement learning framework to probe the robustness of existing social-context-based detectors. We offer valuable insights to enhance misinformation detectors’ reliability and trustworthiness by evaluating our method on two real-world misinformation datasets.

Introduction

The popularity of social media platforms as sources of news consumption, particularly among the younger generation, has led to a significant increase in misinformation. To address this issue, several text- and social-context-based fake news detectors have been proposed. However, recent research has started to uncover the vulnerabilities of these detectors.

In this paper, we introduce an adversarial attack framework designed to assess the robustness of Graph Neural Network (GNN)-based fake news detectors. Our approach uses multi-agent reinforcement learning (MARL) to simulate the adversarial behavior of fraudsters on social media platforms. Real-world evidence suggests that fraudsters coordinate their actions to share different news articles, aiming to evade detection by fake news detectors. To capture this behavior, we model our MARL framework as a Markov Game, incorporating bot, cyborg, and crowd worker agents with distinct costs, budgets, and influence levels. We use deep Q-learning to search for the policy that maximizes the agents' rewards in this adversarial setting.

Through extensive experiments on two real-world datasets of fake news propagation, we demonstrate that the proposed framework can sabotage the performance of GNN-based fake news detectors. The results highlight the vulnerability of these detectors when faced with coordinated adversarial attacks. They also underscore the need for more robust and resilient detection mechanisms to counter the sophisticated tactics fraudsters use to spread misinformation.

Key Insights

Probing the Robustness of Social Misinformation Detectors via Manipulating News Social Engagement

We draw inspiration from previous research on GNN robustness to analyze the robustness of social-engagement-based misinformation detectors. We propose attacking GNN-based detectors by simulating the adversarial behaviors exhibited by fraudsters in real-world misinformation campaigns.

This simulation presents three significant challenges.

The first challenge concerns the evasion tactics that malicious actors use to promote fake news on social media. These actors typically manipulate the user accounts they control to share various social posts while attempting to evade detection. Most existing GNN adversarial attack methods, however, assume the ability to perturb all nodes and edges, which is impractical in this scenario. An effective attack must account for this limited control over nodes and edges.

The second challenge stems from many deployed GNN-based fake news detectors being gray-box models with diverse architectures tailored to the heterogeneous user-post graph. As a result, the gradient-based optimization methods used in previous works cannot be directly applied to devise attacks, and alternative approaches are needed.

The third challenge arises from the different types of coordinated malicious actors in real-world misinformation campaigns, which vary in capabilities, budgets, and risk appetites. Key opinion leaders, for example, have stronger influence than social bots but require more resources to cultivate. Attack strategies must account for this diversity and adapt to different types of malicious actors.
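The constraints behind the first and third challenges can be illustrated with a small sketch. It is not the paper's implementation: the class name `AttackerAgent`, the helper functions, and all costs, budgets, and node ids are hypothetical and chosen only for illustration. The sketch assumes that an agent may only add edges from the user accounts it controls, and that each added edge consumes part of a fixed budget.

```python
from dataclasses import dataclass


@dataclass
class AttackerAgent:
    """Hypothetical attacker type (e.g. bot, cyborg, crowd worker)."""
    kind: str
    controlled_users: list   # user node ids this agent is allowed to manipulate
    cost_per_edge: float     # assumed cost of making one user share one post
    budget: float            # assumed total budget for this agent
    spent: float = 0.0

    def can_afford_action(self):
        return self.spent + self.cost_per_edge <= self.budget


def candidate_actions(agent, news_ids, existing_edges):
    """Edges (user, news) the agent may add: controlled users only, no duplicates."""
    if not agent.can_afford_action():
        return []
    return [
        (user, news)
        for user in agent.controlled_users
        for news in news_ids
        if (user, news) not in existing_edges
    ]


def apply_action(agent, edge, existing_edges):
    """Add one user-news edge to the graph and charge the agent for it."""
    existing_edges.add(edge)
    agent.spent += agent.cost_per_edge


# Illustrative usage with made-up ids and numbers.
bot = AttackerAgent("bot", ["u1", "u2"], cost_per_edge=1.0, budget=2.0)
edges = {("u1", "n1")}
print(candidate_actions(bot, ["n1", "n2"], edges))
```

Restricting the action space this way is what separates the setting from attacks that assume every node and edge can be perturbed. A more expensive but more influential agent type would simply use a higher `cost_per_edge`.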

To address these challenges, we propose a dedicated Multi-Agent Reinforcement Learning (MARL) framework, distinct from previous GNN robustness research. The framework is designed to simulate the real-world behavior of fraudsters sharing different posts. We use deep reinforcement learning to flip the classification results of target news nodes by modifying the connections of users who have shared the post. The framework is formulated as a Markov Game in which multiple agents work together to flip the classification results. Our experiments lead to the three findings below.
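The idea of several agents jointly learning to flip a classification can be sketched in a heavily simplified form. The example below is an illustrative toy, not the paper's method. It replaces the deep Q network with small tabular Q-values, reduces the Markov Game to a single step, and stands in for the GNN detector with a hypothetical rule that labels the target as real when the mean "reputation" of its sharers reaches 0.5. All news ids, user histories, and hyperparameters are assumptions.

```python
import random

# Toy data (hypothetical): news id -> True if real, False if fake.
NEWS_IS_REAL = {"target": False, "real_1": True, "fake_1": False}
ACTIONS = ["real_1", "fake_1", "noop"]  # what a controlled user may share next


def reputation(history):
    """Fraction of real news among the posts a user has shared."""
    return sum(NEWS_IS_REAL[n] for n in history) / len(history)


def detector_says_real(histories):
    """Toy stand-in for a detector: mean reputation of the target's sharers."""
    sharers = [h for h in histories if "target" in h]
    return sum(reputation(h) for h in sharers) / len(sharers) >= 0.5


def play(joint_action):
    """One-step game: users 0 and 1 are controlled, user 2 is an ordinary user."""
    histories = [["target"], ["target"], ["target", "real_1"]]
    for user, action in enumerate(joint_action):
        if action != "noop":
            histories[user].append(action)
    # Shared reward: 1 if the fake target is now classified as real.
    return 1.0 if detector_says_real(histories) else 0.0


def greedy(q):
    """Pick the highest-valued action, breaking ties at random."""
    best = max(q.values())
    return random.choice([a for a, v in q.items() if v == best])


def train(episodes=500, alpha=0.1, epsilon=0.2, seed=0):
    """Independent epsilon-greedy Q-learners, one per controlled user."""
    random.seed(seed)
    q_tables = [{a: 0.0 for a in ACTIONS} for _ in range(2)]
    for _ in range(episodes):
        joint = [
            random.choice(ACTIONS) if random.random() < epsilon else greedy(q)
            for q in q_tables
        ]
        reward = play(joint)
        for q, action in zip(q_tables, joint):
            q[action] += alpha * (reward - q[action])
    return q_tables


q_tables = train()
print([greedy(q) for q in q_tables])
```

In this toy, neither controlled user can flip the label alone. The reward only appears when both share real news, which is a small-scale analogue of agents needing to act collectively.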

Finding 1: Existing social-context-based detectors are vulnerable to adversarial attacks

Our experimental results on two real-world misinformation datasets show that the proposed method (MARL) is effective at attacking fake news in both datasets. Among the popular GNN-based misinformation detectors we tested, the Graph Convolutional Network (GCN) is the most vulnerable: MARL achieves a 92% success rate in attacking fake news in the Politifact dataset. This is likely due to the low breakdown point of GCN's weighted mean aggregation method.
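The remark about a low breakdown point can be illustrated numerically. The sketch below is a simplification under stated assumptions: it uses a plain unweighted mean rather than GCN's degree-normalized weighted mean, the neighbor values are made up, and the median appears only as a familiar contrast from robust statistics, not as a defense evaluated in the paper.

```python
from statistics import mean, median


def aggregate_after_attack(clean_values, adversarial_value, n_adversarial, aggregator):
    """Aggregate neighbor features after adding n adversarial neighbors."""
    return aggregator(clean_values + [adversarial_value] * n_adversarial)


# Hypothetical one-dimensional neighbor features of a news node.
clean = [0.1, 0.2, 0.15, 0.1, 0.2]

for name, aggregator in [("mean", mean), ("median", median)]:
    before = aggregator(clean)
    after = aggregate_after_attack(clean, 10.0, 1, aggregator)
    print(f"{name}: before={before:.3f} after={after:.3f}")
```

A single extreme neighbor is enough to drag the mean far from its clean value, while the median barely moves. This is the sense in which mean-style aggregation has a low breakdown point.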

Finding 2: Pay attention to “seemingly good” users when defending against social attacks

We divide users into “good” and “bad” groups based on the number of real news items they have tweeted historically. Our experimental results show that the seemingly “good” users have more influence on flipping the classification label of fake news during attacks. In practice, social media companies should therefore pay equal attention to all users when building more robust detectors, since seemingly “good” users exert greater influence during attacks if they become compromised.

Finding 3: More “viral” news is inherently more robust than the news that receives little or no attention

We categorize news posts by their degree, which represents how often users retweet them. Our experimental results show that news with a higher degree, that is, more “viral” news, is harder to attack than news with a lower degree. This suggests that news receiving little attention could be an easy target for attackers.
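One intuition for this finding, offered as an assumption rather than the paper's analysis, is that an aggregate over many sharers is harder to shift. The sketch below uses a hypothetical detector that averages per-user scores and labels a news item as real once the average reaches a threshold. It counts how many of the item's sharers an attacker would need to control to cross that threshold. The function name and all scores are illustrative.

```python
def controlled_sharers_needed(degree, clean_score=0.2, attacker_score=1.0, threshold=0.5):
    """Smallest number of sharers to control so the mean score reaches the threshold.

    Assumes degree >= 1 and that a controlled sharer's score can be raised
    from clean_score to attacker_score.
    """
    for controlled in range(degree + 1):
        total = (degree - controlled) * clean_score + controlled * attacker_score
        if total / degree >= threshold:
            return controlled
    return None  # unreachable when attacker_score >= threshold


for degree in (5, 50, 500):
    print(degree, controlled_sharers_needed(degree))
```

Under these assumptions the required number of controlled accounts grows roughly in proportion to the degree, so an attacker with a fixed budget can flip low-degree news but not highly shared news.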

Between the lines

Our experimental findings demonstrate that Multi-Agent Reinforcement Learning (MARL) significantly improves attack performance compared to our baseline methods and is particularly effective against GCN-based detectors. While these results are promising, it is important to acknowledge two major limitations of this paper:

1) This work only employs a simple heuristic to select users for action aggregation.

2) The search space of the Q network is considerably large and results in a high computational cost on larger datasets like Gossipcop.

Several directions therefore need further investigation. The first is to automate the selection of optimal agents for action aggregation. The second is to reduce the deep Q network’s search space effectively. Finally, we used a vanilla MARL framework in this paper, and it would be interesting to explore a more complex MARL framework for this task.
