• Skip to main content
  • Skip to primary sidebar
  • Skip to footer
  • Core Principles of Responsible AI
    • Accountability
    • Fairness
    • Privacy
    • Safety and Security
    • Sustainability
    • Transparency
  • Special Topics
    • AI in Industry
    • Ethical Implications
    • Human-Centered Design
    • Regulatory Landscape
    • Technical Methods
  • Living Dictionary
  • State of AI Ethics
  • AI Ethics Brief
  • 🇫🇷
Montreal AI Ethics Institute

Montreal AI Ethics Institute

Democratizing AI ethics literacy

Exploiting Large Language Models (LLMs) through Deception Techniques and Persuasion Principles

January 25, 2024

🔬 Research Summary by Sonali Singh, a Ph.D. Student at Texas Tech University working on Large language model(LLM).

[Original paper by Sonali Singh, Faranak Abri, and Akbar Siami Namin]


Overview: This paper explores the exploitation of Large Language Models (LLMs) through deception techniques and persuasion principles such as trust and social proof, manipulation and misinformation, authority, lack of details, and avoidance of pronouns, focusing on the potential use of these LLMs in crafting phishing emails and offering guidelines on some other unethical activities. The paper highlights the challenges in preventing the misuse of AI and LLMs technologies and underscores the need for creating robust ethical guidelines.


Introduction

The research delves into the pressing issue of AI misuse, particularly concerning phishing emails and unethical data acquisition. By simulating real-world scenarios, the study examines how LLMs such as GPT-4, Bard, Claude, and Llama-2 can be manipulated to generate content that could aid in unethical activities, such as crafting phishing emails, planning data theft, or manipulating financial data to cause a stock market crash. The research work employs deception techniques to test the responses of different AI models to prompts that encourage unethical behavior. The findings reveal a concerning potential for AI systems to be exploited for malicious purposes, underscoring the urgent need for effective safeguards and ethical guidelines in AI development and usage.

Key Insights

The document presents a series of experiments and analyses focusing on exploiting Large Language Models (LLMs). The key areas include:

1. Crafting Prompts and Responses: The study examines how generative AI models respond to prompts related to creating phishing emails or planning data theft. It highlights the models’ varying responses to ethical dilemmas, with some refusing to assist in unethical activities.

2. Exploitation through Deception Techniques: The research explores how persuasion principles such as trust and social proof can be used to manipulate AI models. This involves scenarios where the AI is asked to assist in activities like stealing confidential information, with the context often framed in a way that implies trust or authority.

   2.1 Manipulation and Misinformation: In LLMs, manipulation and misinformation refer to intentionally using these models to generate texts or content that are deceptive or misleading, designed to achieve specific objectives such as manipulating financial markets to cause a stock market crash and profit from it. The goal is to check if deception techniques in misinformation or manipulation are effective on the LLM if the prompt is crafted in a certain way to misinform the LLM.

   2.2 Authority: In social engineering, authority is an influence technique where an attacker assumes the role of an authority figure or entity to increase the likelihood that a target will comply with requests or demands. The goal was to extract information from the LLM models on how to crash a computer by establishing trust and authority over the LLM with the existing scripts.

   2.3 Trust and Social Proof: These are exploited in social engineering to influence behavior and facilitate unauthorized access to information or assets. Trust is used to lower defenses and gain access, which can be used by a known person or someone acting as one, while social proof collaborates with trust to encourage specific actions or compliance with malicious goals.

   2.4 Lack of Details: This refers to the characteristic of the generated text where the content is intentionally vague, incomplete, or lacking in specific information. This can be exploited by users attempting to steal information from their workplace, especially when involving vague or minimal details in an attempt to avoid detection. Attackers may not provide comprehensive information about their activities, maintaining ambiguity to obscure their true intentions and methodologies. The main goal is to impersonate a banker, maintaining ambiguity to obtain the script and steal sensitive information from the LLM, providing as little information as possible about the case.

   2.5 Avoidance of Pronouns: In dark web communications or discussions involving illegal activities, individuals often attempt to conceal their identity and involvement by avoiding the use of first-person pronouns (such as “I,” “me,” “my”) when discussing personal experiences or actions. This avoidance of pronouns can be a tactic to distance themselves from potentially incriminating statements and maintain a degree of anonymity. The main purpose is to assess the AI model’s ability to provide information regarding developing a video game, a case study in the research, that encourages players to commit crimes in real life.

3. Ethical and Safety Considerations: Throughout the experiments, the AI models’ responses are analyzed for their adherence to ethical guidelines. The study emphasizes the importance of programming AI to refuse assistance in illegal or unethical activities.

Between the lines

The findings of this research are significant in highlighting the potential risks associated with the misuse of AI technologies. While AI models like GPT-4 have robust capabilities, they also present vulnerabilities that can be exploited for unethical purposes. The study underscores the need for interdisciplinary solutions, including technological safeguards and ethical frameworks. It opens up avenues for further research on effectively preventing AI misuse, ensuring that these powerful tools are used responsibly and ethically.

Want quick summaries of the latest research & reporting in AI ethics delivered to your inbox? Subscribe to the AI Ethics Brief. We publish bi-weekly.

Primary Sidebar

🔍 SEARCH

Spotlight

Canada’s Minister of AI and Digital Innovation is a Historic First. Here’s What We Recommend.

Am I Literate? Redefining Literacy in the Age of Artificial Intelligence

AI Policy Corner: The Texas Responsible AI Governance Act

AI Policy Corner: Singapore’s National AI Strategy 2.0

AI Governance in a Competitive World: Balancing Innovation, Regulation and Ethics | Point Zero Forum 2025

related posts

  • Listen to What They Say: Better Understand and Detect Online Misinformation with User Feedback

    Listen to What They Say: Better Understand and Detect Online Misinformation with User Feedback

  • A Matrix for Selecting Responsible AI Frameworks

    A Matrix for Selecting Responsible AI Frameworks

  • FairQueue: Rethinking Prompt Learning for Fair Text-to-Image Generation (NeurIPS 2024)

    FairQueue: Rethinking Prompt Learning for Fair Text-to-Image Generation (NeurIPS 2024)

  • Research summary: Learning to Diversify from Human Judgments - Research Directions and Open Challeng...

    Research summary: Learning to Diversify from Human Judgments - Research Directions and Open Challeng...

  • Who is afraid of black box algorithms? On the epistemological and ethical basis of trust in medical ...

    Who is afraid of black box algorithms? On the epistemological and ethical basis of trust in medical ...

  • Blending Brushstrokes with Bytes: My Artistic Odyssey from Analog to AI

    Blending Brushstrokes with Bytes: My Artistic Odyssey from Analog to AI

  • Towards Climate Awareness in NLP Research

    Towards Climate Awareness in NLP Research

  • The Case for Anticipating Undesirable Consequences of Computing Innovations Early, Often, and Across...

    The Case for Anticipating Undesirable Consequences of Computing Innovations Early, Often, and Across...

  • Towards A Unified Utilitarian Ethics Framework for Healthcare Artificial Intelligence

    Towards A Unified Utilitarian Ethics Framework for Healthcare Artificial Intelligence

  • Anthropomorphism and the Social Robot

    Anthropomorphism and the Social Robot

Partners

  •  
    U.S. Artificial Intelligence Safety Institute Consortium (AISIC) at NIST

  • Partnership on AI

  • The LF AI & Data Foundation

  • The AI Alliance

Footer

Categories


• Blog
• Research Summaries
• Columns
• Core Principles of Responsible AI
• Special Topics

Signature Content


• The State Of AI Ethics

• The Living Dictionary

• The AI Ethics Brief

Learn More


• About

• Open Access Policy

• Contributions Policy

• Editorial Stance on AI Tools

• Press

• Donate

• Contact

The AI Ethics Brief (bi-weekly newsletter)

About Us


Founded in 2018, the Montreal AI Ethics Institute (MAIEI) is an international non-profit organization equipping citizens concerned about artificial intelligence and its impact on society to take action.


Archive

  • © MONTREAL AI ETHICS INSTITUTE. All rights reserved 2024.
  • This work is licensed under a Creative Commons Attribution 4.0 International License.
  • Learn more about our open access policy here.
  • Creative Commons License

    Save hours of work and stay on top of Responsible AI research and reporting with our bi-weekly email newsletter.