Montreal AI Ethics Institute

Democratizing AI ethics literacy


Using attention methods to predict judicial outcomes

May 28, 2023

🔬 Research Summary by Vithor Bertalan, a Computer Engineering PhD Student at Polytechnique Montréal.

[Original paper by Vithor Bertalan and Evandro Ruiz]


Overview: We developed a model to classify judicial outcomes by analyzing textual features of legal orders. We then used the weights of one of our networks, a Hierarchical Attention Network, to detect the words that matter most when defendants are absolved or convicted.


Introduction

What can we find in a text to convict or absolve defendants? Do words matter in a legal order?

With those research questions in mind, we used AI classifiers to predict judicial outcomes from legal orders. For this purpose, we developed a text crawler to extract data from public electronic legal systems. The texts it collected formed a dataset of second-degree murder and active corruption cases. We then applied different classifiers to predict judicial outcomes from textual features of the dataset. Regression Trees, Gated Recurrent Units (GRUs), and Hierarchical Attention Networks each achieved the best scores on different subsets.

Finally, to accomplish our main goal, we explored the attention weights of one of the algorithms used, the Hierarchical Attention Network, to find a sample of the most important words used to absolve or convict defendants. In this way, we identified the words that matter most in legal orders for both outcomes.

Key Insights

How we got our documents

For the research, we collected a corpus of judicial outcomes from eSAJ, the electronic system of the São Paulo Justice Court, Brazil. To restrict the documents captured, we selected a few judicial subjects in advance, choosing only subjects with very well-defined outcomes: second-degree murder and active corruption. We implemented a web text crawler to capture the data from eSAJ and used it to gather documents from different periods. In total, we collected 2,467 cases: 1,681 homicide cases and 786 corruption cases.

Using professional guidance

We relied on the professional guidance of Brazilian lawyers to better understand the texts. The language of the legal field is notoriously obscure worldwide. We therefore decided that professional consulting was necessary to fully understand the outcome of each judicial case.

Transforming documents into numbers

After preprocessing the texts, we transformed our dataset into numerical vectors using two methods, Term Frequency–Inverse Document Frequency (TF-IDF) and Word Embeddings, so that we could compare their performance.
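To make the TF-IDF step concrete, here is a minimal sketch in plain Python. It is not the code used in the study: the toy sentences are invented, tokenization is a simple whitespace split, and the weighting is the textbook form (term frequency times log of inverse document frequency), whereas library implementations usually add smoothing.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = log(N / df).
    Libraries often apply smoothed variants, so exact values may differ.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Toy legal orders, invented for illustration only.
docs = [
    "the defendant offered money to the officer",
    "the defendant fired gunshots at the victim",
    "the jury absolved the defendant",
]
vectors = tfidf(docs)
```

Terms that occur in every document, such as "the" and "defendant" here, get a weight of zero, while terms specific to one document, such as "jury", get a positive weight. That is the property that makes TF-IDF useful as a first numerical representation of legal texts.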

Applying Artificial Intelligence methods

We used several different methods to predict the outcomes: Logistic Regression, Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Regression Trees, Naïve Bayes, Support Vector Machines (SVM), Multilayer Perceptrons (MLP), Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTM), Gated Recurrent Units (GRU), and Hierarchical Attention Networks (HAN).

We split the results into non-neural network methods and neural network methods. Among the non-neural methods, SVM performed best on the homicide dataset and Regression Trees performed best on the corruption dataset. Among the neural methods, GRUs performed best on both datasets.

Analyzing the words and sentences

After the classification, we ordered all the words in each dataset by their attention weights. Each occurrence of a word receives a value ranging from 0 (the word has no importance in the classification of the document) to 1 (the word has maximum importance in the classification of the document).

It is worth noting that a word might have different attention weights in distinct sentences. As a short example, the sentence “The defendant robbed a bank” and the sentence “The defendant did not participate in the robbery because he was going to a blood bank” both contain the word bank, but in very different contexts. In the first sentence, the word would be a vital contributor to the conviction, while in the second it would contribute to the absolution. As a result, the same word could appear more than once in our final calculations, each time with a different attention weight.
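The idea of scoring words by attention can be sketched as follows. This is a minimal illustration of word-level attention as it is commonly formulated for Hierarchical Attention Networks (a tanh projection of each word's hidden state, scored against a learned context vector, then normalized with a softmax). It is not the trained model from the study: the hidden states, the projection `W`, `b`, and the context vector `u_w` are hypothetical values chosen by hand.

```python
import math

def word_attention(hidden, W, b, u_w):
    """Word-level attention in the style of Hierarchical Attention Networks.

    hidden: list of per-word vectors h_t (e.g. recurrent layer outputs)
    W, b:   weights of a one-layer projection u_t = tanh(W h_t + b)
    u_w:    context vector that scores each projected word
    Returns softmax weights, each between 0 and 1, summing to 1.
    """
    scores = []
    for h in hidden:
        u = [math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
             for row, b_i in zip(W, b)]
        scores.append(sum(u_i * c_i for u_i, c_i in zip(u, u_w)))
    # Numerically stable softmax over the words of one sentence.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 2-dimensional hidden states for the words of one sentence.
words = ["the", "defendant", "robbed", "a", "bank"]
hidden = [[0.1, 0.0], [0.4, 0.2], [0.9, 0.8], [0.0, 0.1], [0.7, 0.5]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity projection, for readability
b = [0.0, 0.0]
u_w = [1.0, 1.0]

alphas = word_attention(hidden, W, b, u_w)
# Rank the words of the sentence from most to least attended.
ranking = sorted(zip(words, alphas), key=lambda pair: pair[1], reverse=True)
```

Because the hidden state of a word depends on its sentence, running the same function on a different sentence containing "bank" would generally give that word a different weight, which is why one word can appear several times in a ranking built this way.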

Conclusion: Words do matter

To give a few examples of our findings: in the corruption dataset, verbs that indicate the giving of goods (like “ofereceu,” offered; “apresentou,” presented; “oferecendo,” offering) are signs of conviction.

Words that indicate physical harm are prominent in homicide convictions: “infração” (infraction); “disparos” (gunshots); “golpes” (physical blows); “socos” (punches); “lesões” (lesions). On the other hand, the word “júri” (jury) is a fundamental word in absolving homicide defendants.

Some curious words also appear in the homicide absolution list, such as “infância” (childhood) and “social” (social), indicating that some defendants can be absolved of homicide by appealing to social and emotional topics.

Between the lines

The first key finding – Machine learning can effectively process law texts

We demonstrated that algorithms can predict the outcome of judicial cases from the text of their court decisions. Our results exceeded 95% accuracy in most cases.

The second key finding – Regression Trees are good methods for Law texts

As other authors have also found, our research shows that Regression Trees are effective and computationally efficient methods for processing legal texts. The reason is not yet established, but some authors suggest that Regression Trees can capture legal concepts, revealing patterns that other methods cannot reproduce as effectively.

The third key finding – Words are primary to absolve or convict defendants

As our main research hypothesis predicted, words are significant in absolving or convicting defendants. Since our method is language-agnostic, it could be adapted to other languages to help legal workers understand which words most influence a given outcome, or to help humanities researchers perform textual analysis to find the underlying characteristics of each situation.


  • © 2025 MONTREAL AI ETHICS INSTITUTE.
  • This work is licensed under a Creative Commons Attribution 4.0 International License.