Judging the algorithm: A case study on the risk assessment tool for gender-based violence implemented in the Basque country

June 19, 2022

🔬 Research Summary by Ana Valdivia, Cari Hyde-Vaamonde, Julián García-Marcos.

Ana Valdivia is a Research Associate in Artificial Intelligence at King’s College London (Department of War Studies – ERC Security Flows). Her research has explored a critical perspective towards algorithmic systems and the design of ethical, transparent and fair machine learning classifiers.

Cari Hyde-Vaamonde is an experienced lawyer and court advocate who has practised in diverse fields including technology. She recently turned her focus to research in this area, culminating in a four-year UKRI award to study the impacts of AI in justice settings at King’s College London, where she is also a Visiting Lecturer.

Julián García-Marcos graduated in Law and entered the judicial career in 2008. He served as an Investigating Judge in Irun and Donostia-San Sebastián until 2021 and is now a Magistrate of the 1st Section (Criminal) of the Provincial Court of Guipúzcoa (Spain).

[Original paper by Ana Valdivia, Cari Hyde-Vaamonde, Julián García-Marcos]


Overview: Algorithms designed for use by the police have been introduced in courtrooms to assist the decision-making process of judges in the context of gender-based violence. This paper examines a risk assessment tool implemented in the Basque Country (Spain) from a technical and legal perspective and identifies its risks, harms and limitations.


Introduction

In 2018, M reported her ex-husband to police authorities in the Basque Country (Spain). She had suffered gender-based violence. During the report, the police authority asked her several questions: ‘Is the male aggressor or victim an immigrant?’, ‘Very intense jealousy or controlling behaviours?’, ‘What is your perception of danger of death in the past month?’. After this interview, the police used a risk assessment tool that evaluates the severity of the case. In M’s case, the algorithmic output assessed that her husband posed a low risk of further gender-based violence. The report of M’s case, together with the algorithmic evaluation, was sent to the courtroom. However, the judge assessed that M was at high risk, contradicting the result of the algorithm. In this paper, we propose an interdisciplinary analysis to examine the impact of this risk assessment tool in this context. Through an exhaustive analysis of publicly available documents and a conversation with a judge who is himself a user of this tool, we unveil the risks, benefits and limitations of using these algorithmic tools for assessing cases of gender-based violence in courtrooms.

Key Insights

Risk assessment tools to predict violence

The use of algorithms such as risk assessment tools to predict violence has a long-standing history. Statistical prediction, which involves predicting an individual’s violent behaviour on the basis of how others have acted in similar situations, began in the eighties through the analysis of risk factors. To predict violence, scholars have proposed several statistical strategies intended to overtake human judgment and do it better, often citing greater efficiency. Yet statistical outputs do not always outperform human judgements. Recently, some limitations regarding the use of these tools have become the focus of the fields of fairness in machine learning and critical data studies. Part of this scholarship relates to the demystification of the neutrality and objectivity of algorithmic and statistical tools, the unintended discrimination and disparate impact of these tools due to statistical bias, and the influence that algorithmic tools have on human decisions.

The intimate partner femicide and severe violence assessment tool: the EPV

In the Basque Country (Spain), police officers use an algorithm to automatically assess the risk of gender-based violence. This risk assessment tool, the EPV, is based on 20 items evaluating several aspects of aggressors and victims in order to classify the risk of gender-based violence recidivism. It was developed by a research team of clinical psychologists who used their expertise to ‘propose a brief, easy-to-use scale [tool] that is practical for use by the police, social workers, forensic psychologists, and judges in their decision-making process’.
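To make concrete how a checklist-style instrument of this kind turns an interview into a risk label, here is a minimal, purely illustrative sketch of an additive scoring scheme. The item names, weights and cut-offs below are hypothetical and do not reproduce the actual EPV items or its published calibration.

```python
# Illustrative sketch of a checklist-style risk scale.
# NOTE: item names, weights and thresholds are hypothetical,
# not the actual EPV items or calibration.

HYPOTHETICAL_ITEMS = {
    "previous_violence_reported": 2,
    "threats_of_death": 3,
    "very_intense_jealousy_or_control": 2,
    "victim_perceives_danger_of_death": 3,
    # ... a real instrument would list all 20 items
}

# Hypothetical cut-offs mapping a total score to a risk label.
HYPOTHETICAL_THRESHOLDS = [(0, "low"), (4, "medium"), (8, "high")]

def score_case(answers: dict[str, bool]) -> tuple[int, str]:
    """Sum the weights of the items answered 'yes' and map the total to a label."""
    total = sum(w for item, w in HYPOTHETICAL_ITEMS.items() if answers.get(item))
    label = "low"
    for cutoff, name in HYPOTHETICAL_THRESHOLDS:
        if total >= cutoff:
            label = name
    return total, label

# Example: a case flagged on two items.
print(score_case({"threats_of_death": True, "victim_perceives_danger_of_death": True}))
# -> (6, 'medium') under these made-up weights and cut-offs
```

The point of the sketch is only that the interview is reduced to a single number and label; how the real items were weighted, thresholded and validated is precisely the kind of information that, as the paper argues, remains opaque to the judge.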

The efficacy of this risk assessment tool was assessed through an analysis of the trade-off between true positives (TP) and true negatives (TN). However, in this context, the two error types do not carry the same weight: an error in a negative case (a false positive, FP) means the risk is overestimated, resulting in more protection for the woman, whereas an error in a positive case (a false negative, FN) means the risk is underestimated, putting at risk a woman who suffers gender-based violence. It is therefore preferable to obtain higher rates of FP than FN, which implies that in the worst-case scenario, cases with a non-severe risk of violence are categorised as high risk, implying perhaps greater attention.
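This asymmetry between error types can be made explicit with a small confusion-matrix calculation. The counts and cost weights below are made up for illustration (they are not EPV evaluation data); the sketch simply shows why a higher false-positive rate is tolerated when false negatives are considered the costlier error.

```python
# Generic confusion-matrix sketch with made-up counts (not EPV data).
# Positive = severe risk; negative = non-severe risk.
tp, fn = 40, 10   # actually severe cases: correctly vs. wrongly flagged
tn, fp = 120, 30  # actually non-severe cases: correctly vs. wrongly flagged

sensitivity = tp / (tp + fn)          # share of severe cases the tool catches
specificity = tn / (tn + fp)          # share of non-severe cases correctly cleared
false_negative_rate = fn / (tp + fn)  # severe cases labelled low risk (the costly error)
false_positive_rate = fp / (tn + fp)  # non-severe cases over-protected

# If a missed severe case (FN) is judged far more harmful than an
# over-protected non-severe case (FP), an asymmetric cost makes that explicit.
COST_FN, COST_FP = 10, 1  # hypothetical relative costs
expected_cost = COST_FN * fn + COST_FP * fp

print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
print(f"FNR={false_negative_rate:.2f}, FPR={false_positive_rate:.2f}, cost={expected_cost}")
```

Under a cost ratio like this, a threshold that sacrifices some specificity for higher sensitivity is preferable, which matches the argument above that over-estimating a non-severe case is the lesser harm.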

Is judicial reasoning aided by EPV-R?

This paper considers real examples of judicial decision-making behaviour. A judge will hear representations from lawyers, documentary evidence, and sometimes oral evidence from witnesses, the accuser and the defendant. At the end of the hearing, when deciding what measures to take, the individual in the judicial role is also presented with the EPV-R score as evidence, suggesting the risk level of the defendant. This score is presented without a narrative. It is left to the judge to decide how to weigh it, while the score itself is impossible to interrogate at court, either factually or technically. It may contrast strongly with the judge’s own assessment. How this conflict is resolved will vary according to the individual judge, but it will have serious consequences for the parties to the case. The impact of errors, and concerns regarding the balance of false positives to false negatives, are rarely if ever aired, and yet they are crucial for the judge’s reasoning.

Technical and legal risks and harms of the EPV-R

From a technical perspective, current debates on the risks, harms and limitations of socio-technical systems have focused on bias and the disparate impact that algorithms might have on different demographic groups, inspired by several publications and journalistic investigations. However, we seek to move the analysis beyond the critique of bias by examining three factors: (1) opaque implementation, (2) the efficiency paradox, and (3) the feedback loop.
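On the third factor, a feedback loop here refers to a general dynamic discussed in the fairness literature rather than a mechanism documented for the EPV-R specifically: when a tool’s output shapes which cases receive follow-up, and therefore which outcomes get recorded, later data partly reflect the tool’s own influence. The toy simulation below, with entirely invented probabilities, illustrates that generic dynamic.

```python
import random

random.seed(0)

# Toy simulation of a generic feedback loop (not a model of the EPV-R itself):
# follow-up is concentrated on flagged cases, so repeat incidents are mostly
# *recorded* for flagged cases, and the recorded data then appears to confirm
# the tool's predictions.
N = 1000
cases = []
for _ in range(N):
    truly_severe = random.random() < 0.3                        # unobserved ground truth
    flagged = random.random() < (0.6 if truly_severe else 0.3)  # imperfect tool
    followed_up = flagged                                       # attention follows the flag
    reoffended = truly_severe and random.random() < 0.5
    recorded = reoffended and followed_up                       # only followed-up cases get recorded
    cases.append((flagged, reoffended, recorded))

n_unflagged = sum(1 for f, _, _ in cases if not f)
actual_rate_unflagged = sum(r for f, r, _ in cases if not f) / n_unflagged
recorded_rate_unflagged = sum(rec for f, _, rec in cases if not f) / n_unflagged

print(f"actual reoffence rate among unflagged cases:   {actual_rate_unflagged:.2%}")
print(f"recorded reoffence rate among unflagged cases: {recorded_rate_unflagged:.2%}")
# The recorded rate for unflagged cases is 0%, so the 'low risk' labels look
# perfectly accurate in the recorded data even though they are not.
```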

From a legal perspective, there is a lack of appropriate legal guidelines for the tool’s use in a court scenario. In the case of the EPV-R, we are told by a judge, a first-hand user of this information, that no warning regarding the reliability of the data is given. The legal framework requires real deliberation by the judge, but the status of the algorithm as a quasi-expert closes down enquiry. Principles of the rule of law and due process require that individuals are aware of the case against them, yet the right of free movement of the individual in the case may be restricted on the basis of the score. Equally, a victim’s word may be doubted on the basis of a “low-risk” score.

Between the lines

This risk assessment tool goes to the very core of the judge’s function. If we do not accept that the EPV-R is the best overall measure of risk under the legal framework, a judge must weigh its assessment against their own judgment, based on the facts of the case. To do so, they must consider how reliable the EPV-R assessment is (compared to the other evidence) and what ‘risk’ means in this tool. Yet, at present, the judge does not have the proper means to do this. This paper has been prepared by authors from diverse disciplines (law and computer science) to highlight a practice that has gone widely unreported, was not fully anticipated by the designers of the initial software, and is so far unsupported by empirical research. In fact, we identify several elements that lead us to recommend against the use of this algorithm in judges’ decision-making. In considering further steps, we also recommend reviewing the work of Costanza-Chock on Design Justice, as well as that of D’Ignazio and Klein and of Peña and Varon on Data Feminism. Bringing together many perspectives on risk assessment tools in the context of gender-based violence will lead us to build better algorithms, promoting technologies and practices that have a real impact on algorithmic social justice.

