Montreal AI Ethics Institute

Democratizing AI ethics literacy

Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study

May 28, 2023

🔬 Research Summary by Marco Guerini, a researcher in Computational Linguistics and head of the Language and Dialogue Technologies group at Fondazione Bruno Kessler (FBK).

[Original paper by Serra Sinem Tekiroglu, Helena Bonaldi, Margherita Fanton, and Marco Guerini]


Overview: Many international institutions and countries are taking action to monitor, restrict, and remove online hate content. Still, the results are not always satisfactory, and such efforts are often denounced as censorship. An alternative approach that has emerged in recent years is based on so-called Counter Narratives: de-escalating, fact-bound textual responses that refute hateful messages in a non-aggressive way. In this scenario, automation is needed to cope with the sheer amount of hate produced daily. We therefore conducted a comparative study investigating the use of several AI-based neural language models for automatic counter-narrative generation, intended as a companion tool for NGO operators tackling online hate.


Introduction

Hate Speech (HS) has found fertile ground on social media platforms. The actions such platforms undertake to tackle online hatred consist of identifying possible sources of hate and removing them through content deletion, account suspension, or shadow-banning. However, these actions are often interpreted and denounced as censorship by the affected users and political groups, and such restrictions can therefore have the opposite effect of exacerbating the hostility of the haters. An alternative strategy is based on the use of Counter-Narratives (CNs), i.e., communicative actions that refute hate speech through thoughtful and cogent reasons and true, fact-bound arguments. As a de-escalating measure, CNs have proven successful in diminishing hate while preserving freedom of speech. An example of an HS-CN pair follows:

HS: Jews have a secret plot to take over the world.

CN: This myth traces back to ‘The Protocols of the Learned Elders of Zion,’ which is used as proof of a Jewish conspiracy to rule the world. But the Protocols are a proven forgery, written by agents of the Russian czar in the late 19th century, that spread throughout the 20th century and continues to this day.

Key Insights

Given their effectiveness, NGOs employ CNs to counter online hate. Still, it is impossible to respond to every instance of hate manually. For this reason, a line of AI research based on Natural Language Processing (NLP) has recently emerged, focusing on designing systems that generate CN suggestions automatically. This study compares the most recent and advanced AI-based language models (LMs) to understand their pros and cons in generating CNs.

Effective CN Generation Experiments

In our experiments, we use various automatic metrics and manual evaluations with expert judgments to assess several LMs, representing the main categories of model architectures and decoding methods currently available. We further test the robustness of the fine-tuned LMs in generating CNs for unseen targets of hate. For this study, we rely on a dataset that provides the target diversity and CN quality we aim for: collected with a human-in-the-loop approach, it features 5k HS-CN pairs covering several targets, including DISABLED, JEWS, LGBT+, MIGRANTS, MUSLIMS, POC, and WOMEN.
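
To make the setup concrete, here is a minimal sketch of fine-tuning an autoregressive LM on HS-CN pairs, assuming a HuggingFace GPT-2 and a hypothetical "HS: ... CN: ..." serialization of each pair; the paper's actual training pipeline, data format, and hyperparameters may differ.

```python
# Minimal fine-tuning sketch (not the authors' code). The "HS:"/"CN:"
# prefixes and the single example pair are illustrative assumptions;
# the real dataset contains ~5k expert-curated HS-CN pairs.
import torch
from torch.optim import AdamW
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.train()

pairs = [
    ("Jews have a secret plot to take over the world.",
     "This myth traces back to 'The Protocols of the Learned Elders of "
     "Zion', a proven forgery written by agents of the Russian czar."),
]

optimizer = AdamW(model.parameters(), lr=5e-5)
for hs, cn in pairs:
    # Serialize each pair into one sequence so the LM learns to continue
    # a hate-speech prompt with a counter-narrative.
    text = f"HS: {hs} CN: {cn}{tokenizer.eos_token}"
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    # Standard causal-LM objective: the labels are the input ids themselves.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```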

Results show that autoregressive language models such as GPT-2 are, in general, better suited to the task. While stochastic decoding mechanisms can generate more novel, diverse, and informative outputs, deterministic decoding is useful in scenarios where more generic and less novel (yet ‘safer’) CNs are needed. Furthermore, in out-of-target experiments, we find that the similarity between targets (e.g., JEWS and MUSLIMS as religious groups) plays a crucial role in how well a model ports to new targets. Finally, we point to a promising research direction: leveraging human corrections of LMs’ outputs to build an additional automatic post-editing step that fixes errors made by LMs during generation.
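
The contrast between the two decoding regimes can be illustrated with HuggingFace's standard `generate()` options; this is a sketch under assumptions, and the specific settings (5 beams, top-p 0.9, temperature 0.8) are illustrative, not necessarily the paper's exact configuration.

```python
# Deterministic vs. stochastic decoding for CN generation (illustrative).
# Assumes the fine-tuned checkpoint and "HS: ... CN:" prompt format from
# the previous sketch.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # ideally the fine-tuned model
model.eval()

prompt = "HS: Jews have a secret plot to take over the world. CN:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # Deterministic decoding (beam search): more generic but 'safer' outputs.
    safe = model.generate(**inputs, num_beams=5, do_sample=False,
                          max_new_tokens=80, pad_token_id=tokenizer.eos_token_id)
    # Stochastic decoding (nucleus sampling): more novel and diverse outputs.
    novel = model.generate(**inputs, do_sample=True, top_p=0.9, temperature=0.8,
                           max_new_tokens=80, pad_token_id=tokenizer.eos_token_id)

print(tokenizer.decode(safe[0], skip_special_tokens=True))
print(tokenizer.decode(novel[0], skip_special_tokens=True))
```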

Between the Lines

Automating CN generation can help increase the efficiency of online hate countering while preserving freedom of speech and promoting less aggressive and hostile debates. However, the AI-based generation models we tested are not meant to be used autonomously, since even the best model can still produce substandard CNs containing inappropriate or negative language. Instead, following a human-computer cooperation paradigm, we aim to build models that help NGO operators by providing diverse and novel CN candidates for their hate-countering activities while granting them total control over the final output.

