Montreal AI Ethics Institute

Democratizing AI ethics literacy


Large Language Models Can Be Used To Effectively Scale Spear Phishing Campaigns

June 16, 2023

🔬 Research Summary by Julian Hazell, a Research Assistant at the Centre for the Governance of AI and an MSc candidate in Social Science of the Internet at the University of Oxford.

[Original paper by Julian Hazell]


Overview: Large language models (LLMs) like ChatGPT can be used by cybercriminals to scale spear phishing campaigns. In this paper, the author demonstrates how LLMs can be integrated into various stages of cyberattacks to quickly and inexpensively generate large volumes of personalized phishing emails. The paper also examines the governance challenges created by these vulnerabilities and proposes potential solutions; for example, the author suggests using other AI systems to detect and filter out malicious phishing emails.


Introduction

Recent progress in AI, particularly in LLMs such as OpenAI’s GPT-4 and Anthropic’s Claude, has resulted in powerful systems capable of writing highly realistic text. While these systems offer a variety of beneficial use cases, they can also be used maliciously. One such example is spear phishing, a type of cyberattack in which the perpetrator leverages personalized information about a target to deceive them into revealing sensitive data or credentials.

In “Large Language Models Can Be Used To Effectively Scale Spear Phishing Campaigns,” the University of Oxford’s Julian Hazell examines how useful LLMs are when integrated into spear phishing campaigns. To test this, he uses OpenAI’s GPT-3.5 to generate personalized phishing emails for over 600 British Members of Parliament, drawing on background information scraped from Wikipedia. He concludes that such emails are highly realistic and can be generated cost-effectively.

Key Insights

A step-by-step look at how LLMs can help scale spear phishing attacks

Phase 1: The Collect Phase

LLMs can be used in a cyberattack’s “collect” phase, where the attacker gathers information about targets. Spear phishing attacks are often more effective than regular phishing attacks on a per-target basis precisely because they are personalized. Traditionally, however, spear phishing requires the attacker to research targets and spend extra effort customizing messages, which is time-consuming and resource-intensive. LLMs can aid in this phase by generating target biographies from unstructured text data, making it much easier and more cost-effective for cybercriminals to create personalized phishing messages.
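
To make the collect phase concrete, the pipeline can be sketched as a small script that turns scraped, unstructured text into a biography-summarization prompt for an LLM. Everything here — the function name, the prompt wording, the character cap — is an illustrative assumption, not the paper’s actual code:

```python
def build_profile_prompt(target_name: str, scraped_text: str, max_chars: int = 4000) -> str:
    """Turn unstructured scraped text (e.g. a Wikipedia article) into a
    biography-summarization prompt that could be sent to an LLM.

    The wording and the character cap are illustrative assumptions.
    """
    snippet = scraped_text[:max_chars]  # stay within a typical context window
    return (
        f"Summarize the following background information about {target_name} "
        "into a short professional biography covering their role, interests, "
        f"and recent activities:\n\n{snippet}"
    )


# One prompt per scraped target page:
prompt = build_profile_prompt("Jane Doe MP", "Jane Doe is a Member of Parliament for ...")
```

In the paper’s setup, a prompt along these lines would be issued once per target; the resulting biography then feeds the contact phase below.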

Phase 2: The Contact Phase

LLMs can also aid hackers during a spear phishing attack’s “contact” phase, assisting cybercriminals in writing spear phishing emails by suggesting qualitative features that define a successful attack, such as personalization, contextual relevance, psychology, and authority. By combining these principles with a target’s personal information, LLMs like GPT-4 can generate highly targeted phishing emails at scale. To test this, 600 emails targeting British Members of Parliament were generated, each costing a fraction of a cent and taking an average of 14 seconds to generate.
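
The reported figures — a fraction of a cent and roughly 14 seconds per email — imply striking campaign economics. A back-of-the-envelope sketch, assuming half a cent per email as an illustrative cost, shows the scale of the 600-target experiment:

```python
def campaign_estimate(n_targets: int,
                      cost_per_email_usd: float = 0.005,  # assumed: "a fraction of a cent"
                      secs_per_email: float = 14.0) -> dict:
    """Back-of-the-envelope cost and (sequential) runtime of an
    LLM-generated phishing campaign."""
    return {
        "total_cost_usd": round(n_targets * cost_per_email_usd, 2),
        "sequential_hours": round(n_targets * secs_per_email / 3600, 2),
    }


print(campaign_estimate(600))
# → {'total_cost_usd': 3.0, 'sequential_hours': 2.33}
```

In other words, under these assumptions a full campaign against Parliament costs a few dollars and a single afternoon — and generation parallelizes trivially across API calls.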

Phase 3: The Compromise Phase

Even with safeguards put in place by AI labs, carefully crafting the right prompt can get an LLM to generate basic “malware,” software capable of compromising a system once executed. By pretending to be a “cybersecurity researcher” conducting an “educational” experiment, the researcher successfully prompted GPT-4 to generate a basic malware file.

LLMs alleviate three key difficulties faced by cyber criminals

LLMs can assist cybercriminals in scaling spear-phishing campaigns by reducing cognitive workload, financial costs, and skill requirements. These systems can generate human-like emails without fatigue and process significant volumes of background data on targets. They also significantly lower the cost per email and enable even low-skilled attackers to create convincing phishing emails and malware, allowing them to focus on strategic planning and target identification instead.

Possible solutions

The researcher explores two possible solutions to this problem. The first is implementing “structured access schemes” for language models, such as application programming interfaces (APIs). These schemes control how people interact with and use the systems and can help identify and prevent cases of misuse. This could allow tracking of malicious uses back to individuals so that they can be banned or otherwise sanctioned.

The second solution is to develop LLMs specifically focused on cyber defense that could detect spear phishing emails or other forms of malicious content. For example, specialized LLMs can be trained to analyze incoming emails for suspicious features like deceptive URLs (“Gooogle.com” versus “Google.com”). By training the model on previous examples of cyberattacks, these defensive systems can potentially identify sophisticated phishing attacks and help overcome human attention limitations.
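
One such suspicious feature — the “Gooogle.com” example — can even be caught without an LLM, using plain edit distance against a list of known brand domains; a defensive model could use signals like this alongside learned features. The allow-list and the distance threshold of 2 are illustrative choices:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


KNOWN_DOMAINS = ("google.com", "microsoft.com", "paypal.com")  # illustrative allow-list


def is_lookalike(domain: str, known=KNOWN_DOMAINS, max_dist: int = 2) -> bool:
    """True if the domain is *close to*, but not *equal to*, a known domain."""
    d = domain.lower()
    if d in known:
        return False
    return any(levenshtein(d, k) <= max_dist for k in known)


print(is_lookalike("Gooogle.com"))  # → True (one extra 'o')
print(is_lookalike("google.com"))   # → False (exact match)
```

A trained defensive model would combine cheap heuristics like this with semantic analysis of the email body, which is where the paper argues LLMs can outperform human attention.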

As cybercriminals gain access to increasingly advanced AI, cybersecurity experts and policymakers must find ways to balance promoting the benefits of language models with restricting opportunities for misuse. “As these systems continually improve,” the researcher argues, “it is crucial that AI developers work to proactively ensure their technologies are not exploitable for malicious ends.”

Between the lines

Recent advancements in AI capabilities, particularly in the domain of natural language, have marked the beginning of a new era in cybersecurity. As AI systems become proficient enough to meaningfully enhance the effectiveness of cyberattacks like spear phishing, we must adapt to a rapidly evolving threat landscape. The findings highlight the unsettling possibility that cybercriminals can use AI to convincingly impersonate individuals and automate hacking campaigns.

More concerningly, AI systems could soon advance to the point of automating cyber crimes with even less human involvement. Experimental systems like Auto-GPT provide a glimpse into AI systems that can pursue goals autonomously. Such systems could be tasked with pursuing open-ended goals, like “send a spear phishing email to every US member of Congress,” further increasing the scalability of cyber attacks. Through natural conversation, AI agents might be able to gain trust before attacking. Without defensive measures, agentic systems could become formidable adversaries.

Future research could also focus on exploring other communication channels and attack vectors that could be exploited by cybercriminals utilizing AI. For instance, how might AI manipulate visual or audio media to deceive targets? 

Finally, this paper raises questions about the balance between AI’s positive and negative impacts. How can developers ensure that AI advancements do not inadvertently enable harmful activities? Can AI systems be designed to detect and counteract such misuse? These questions emphasize the need for further exploration into the ethical and practical implications surrounding AI’s development and deployment in the context of cybersecurity.



© 2025 Montreal AI Ethics Institute. This work is licensed under a Creative Commons Attribution 4.0 International License.