• Skip to main content
  • Skip to secondary menu
  • Skip to primary sidebar
  • Skip to footer
Montreal AI Ethics Institute

Montreal AI Ethics Institute

Democratizing AI ethics literacy

  • Articles
    • Public Policy
    • Privacy & Security
    • Human Rights
      • Ethics
      • JEDI (Justice, Equity, Diversity, Inclusion
    • Climate
    • Design
      • Emerging Technology
    • Application & Adoption
      • Health
      • Education
      • Government
        • Military
        • Public Works
      • Labour
    • Arts & Culture
      • Film & TV
      • Music
      • Pop Culture
      • Digital Art
  • Columns
    • AI Policy Corner
    • Recess
  • The AI Ethics Brief
  • AI Literacy
    • Research Summaries
    • AI Ethics Living Dictionary
    • Learning Community
  • The State of AI Ethics Report
    • Volume 6 (February 2022)
    • Volume 5 (July 2021)
    • Volume 4 (April 2021)
    • Volume 3 (Jan 2021)
    • Volume 2 (Oct 2020)
    • Volume 1 (June 2020)
  • About
    • Our Contributions Policy
    • Our Open Access Policy
    • Contact
    • Donate

Exploiting Large Language Models (LLMs) through Deception Techniques and Persuasion Principles

January 25, 2024

馃敩 Research Summary by Sonali Singh, a Ph.D. Student at Texas Tech University working on Large language model(LLM).

[Original paper by Sonali Singh, Faranak Abri, and Akbar Siami Namin]


Overview: This paper explores the exploitation of Large Language Models (LLMs) through deception techniques and persuasion principles such as trust and social proof, manipulation and misinformation, authority, lack of details, and avoidance of pronouns, focusing on the potential use of these LLMs in crafting phishing emails and offering guidelines on some other unethical activities. The paper highlights the challenges in preventing the misuse of AI and LLMs technologies and underscores the need for creating robust ethical guidelines.


Introduction

The research delves into the pressing issue of AI misuse, particularly concerning phishing emails and unethical data acquisition. By simulating real-world scenarios, the study examines how LLMs such as GPT-4, Bard, Claude, and Llama-2 can be manipulated to generate content that could aid in unethical activities, such as crafting phishing emails, planning data theft, or manipulating financial data to cause a stock market crash. The research work employs deception techniques to test the responses of different AI models to prompts that encourage unethical behavior. The findings reveal a concerning potential for AI systems to be exploited for malicious purposes, underscoring the urgent need for effective safeguards and ethical guidelines in AI development and usage.

Key Insights

The document presents a series of experiments and analyses focusing on exploiting Large Language Models (LLMs). The key areas include:

1. Crafting Prompts and Responses: The study examines how generative AI models respond to prompts related to creating phishing emails or planning data theft. It highlights the models’ varying responses to ethical dilemmas, with some refusing to assist in unethical activities.

2. Exploitation through Deception Techniques: The research explores how persuasion principles such as trust and social proof can be used to manipulate AI models. This involves scenarios where the AI is asked to assist in activities like stealing confidential information, with the context often framed in a way that implies trust or authority.

   2.1 Manipulation and Misinformation: In LLMs, manipulation and misinformation refer to intentionally using these models to generate texts or content that are deceptive or misleading, designed to achieve specific objectives such as manipulating financial markets to cause a stock market crash and profit from it. The goal is to check if deception techniques in misinformation or manipulation are effective on the LLM if the prompt is crafted in a certain way to misinform the LLM.

   2.2 Authority: In social engineering, authority is an influence technique where an attacker assumes the role of an authority figure or entity to increase the likelihood that a target will comply with requests or demands. The goal was to extract information from the LLM models on how to crash a computer by establishing trust and authority over the LLM with the existing scripts.

   2.3 Trust and Social Proof: These are exploited in social engineering to influence behavior and facilitate unauthorized access to information or assets. Trust is used to lower defenses and gain access, which can be used by a known person or someone acting as one, while social proof collaborates with trust to encourage specific actions or compliance with malicious goals.

   2.4 Lack of Details: This refers to the characteristic of the generated text where the content is intentionally vague, incomplete, or lacking in specific information. This can be exploited by users attempting to steal information from their workplace, especially when involving vague or minimal details in an attempt to avoid detection. Attackers may not provide comprehensive information about their activities, maintaining ambiguity to obscure their true intentions and methodologies. The main goal is to impersonate a banker, maintaining ambiguity to obtain the script and steal sensitive information from the LLM, providing as little information as possible about the case.

   2.5 Avoidance of Pronouns: In dark web communications or discussions involving illegal activities, individuals often attempt to conceal their identity and involvement by avoiding the use of first-person pronouns (such as “I,” “me,” “my”) when discussing personal experiences or actions. This avoidance of pronouns can be a tactic to distance themselves from potentially incriminating statements and maintain a degree of anonymity. The main purpose is to assess the AI model’s ability to provide information regarding developing a video game, a case study in the research, that encourages players to commit crimes in real life.

3. Ethical and Safety Considerations: Throughout the experiments, the AI models’ responses are analyzed for their adherence to ethical guidelines. The study emphasizes the importance of programming AI to refuse assistance in illegal or unethical activities.

Between the lines

The findings of this research are significant in highlighting the potential risks associated with the misuse of AI technologies. While AI models like GPT-4 have robust capabilities, they also present vulnerabilities that can be exploited for unethical purposes. The study underscores the need for interdisciplinary solutions, including technological safeguards and ethical frameworks. It opens up avenues for further research on effectively preventing AI misuse, ensuring that these powerful tools are used responsibly and ethically.

Want quick summaries of the latest research & reporting in AI ethics delivered to your inbox? Subscribe to the AI Ethics Brief. We publish bi-weekly.

Primary Sidebar

馃攳 SEARCH

Spotlight

ALL IN Conference 2025: Four Key Takeaways from Montreal

Beyond Dependency: The Hidden Risk of Social Comparison in Chatbot Companionship

AI Policy Corner: Restriction vs. Regulation: Comparing State Approaches to AI Mental Health Legislation

Beyond Consultation: Building Inclusive AI Governance for Canada’s Democratic Future

AI Policy Corner: U.S. Executive Order on Advancing AI Education for American Youth

related posts

  • Handling Bias in Toxic Speech Detection: A Survey

    Handling Bias in Toxic Speech Detection: A Survey

  • AI Has Arrived in Healthcare, but What Does This Mean?

    AI Has Arrived in Healthcare, but What Does This Mean?

  • When Are Two Lists Better than One?: Benefits and Harms in Joint Decision-making

    When Are Two Lists Better than One?: Benefits and Harms in Joint Decision-making

  • Knowledge, Workflow, Oversight: A framework for implementing AI ethics

    Knowledge, Workflow, Oversight: A framework for implementing AI ethics

  • Research summary: AI Governance: A Holistic Approach to Implement Ethics in AI

    Research summary: AI Governance: A Holistic Approach to Implement Ethics in AI

  • Responsible AI In Healthcare

    Responsible AI In Healthcare

  • A survey on adversarial attacks and defences

    A survey on adversarial attacks and defences

  • Research summary: Lexicon of Lies: Terms for Problematic Information

    Research summary: Lexicon of Lies: Terms for Problematic Information

  • Private Training Set Inspection in MLaaS

    Private Training Set Inspection in MLaaS

  • The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks (Research Summa...

    The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks (Research Summa...

Partners

  • 聽
    U.S. Artificial Intelligence Safety Institute Consortium (AISIC) at NIST

  • Partnership on AI

  • The LF AI & Data Foundation

  • The AI Alliance

Footer


Articles

Columns

AI Literacy

The State of AI Ethics Report


 

About Us


Founded in 2018, the Montreal AI Ethics Institute (MAIEI) is an international non-profit organization equipping citizens concerned about artificial intelligence and its impact on society to take action.

Contact

Donate


  • 漏 2025 MONTREAL AI ETHICS INSTITUTE.
  • This work is licensed under a Creative Commons Attribution 4.0 International License.
  • Learn more about our open access policy here.
  • Creative Commons License

    Save hours of work and stay on top of Responsible AI research and reporting with our bi-weekly email newsletter.