Montreal AI Ethics Institute

Democratizing AI ethics literacy


Energy and Policy Considerations in Deep Learning for NLP

May 30, 2021

🔬 Research summary by Abhishek Gupta (@atg_abhishek), our Founder, Director, and Principal Researcher.

[Original paper by Emma Strubell, Ananya Ganesh, and Andrew McCallum]


Overview: As we inch towards ever-larger AI models, we have entered an era where achieving state-of-the-art results has become a function of access to huge compute and data infrastructure in addition to fundamental research capabilities. This is leading to inequity and impacting the environment due to high energy consumption in the training of these systems. The paper provides recommendations for the NLP community to alter this antipattern by making energy and policy considerations central to the research process.


Introduction

We’ve seen astonishing numbers detailing the size of recent large-scale language models. For example, GPT-3 clocked in at 175 billion parameters, the Switch Transformer at 1.6 trillion parameters, amongst many others. The environmental impact of the training and serving of these models has also been discussed widely, especially after the firing of Dr. Timnit Gebru from Google last year. In this paper, one of the foundational papers analyzing the environmental impact of AI, the researchers take a critical look at the energy consumption of BERT, Transformer, ELMo, and GPT-2 by capturing the hardware that they were trained on, the power consumption of that hardware, the duration of training, and finally, the CO2eq emitted as a result along with the financial cost for that training. 

The researchers found that enormous financial costs make this line of research increasingly inaccessible to those who don’t work at well-funded academic and industry research labs. They also found that the environmental impact is quite severe and the trend of relying on large-scale models to achieve state-of-the-art is exacerbating these problems. 

GPU power consumption

Prior research has shown that more computationally intensive models tend to achieve higher scores. Arriving at those results, though, requires iterating over different architectures and hyperparameter values, which multiplies this already high cost thousands of times over. For some large models, the carbon footprint rivals that of several car lifetimes.

To calculate the power consumption while training large models on GPUs, the researchers use manufacturer-provided system management interfaces, which report these values in real time. Total power consumption is estimated as the power consumed by the CPU, GPU, and DRAM of the system, multiplied by the Power Usage Effectiveness (PUE) factor, which accounts for the additional energy consumed for auxiliary purposes such as cooling. These calculations are done for Transformer, BERT, ELMo, and GPT-2 based on the hardware and training durations reported in the original papers by the authors of those models.
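The estimate described above can be sketched as a few lines of Python. The PUE of 1.58 and the 0.954 lbs CO2eq per kWh conversion are the constants reported in the original paper; the hardware draws in the example call are illustrative, not figures from the paper.

```python
PUE = 1.58               # Power Usage Effectiveness: datacenter overhead (cooling, etc.)
CO2_LBS_PER_KWH = 0.954  # average lbs CO2eq per kWh for US power generation

def training_footprint(hours, cpu_watts, gpu_watts, dram_watts, num_gpus=1):
    """Estimate total energy (kWh) and CO2eq (lbs) for one training run."""
    avg_draw_watts = cpu_watts + dram_watts + num_gpus * gpu_watts
    kwh = PUE * hours * avg_draw_watts / 1000.0
    return kwh, kwh * CO2_LBS_PER_KWH

# Example: 8 GPUs at ~250 W each, plus CPU and DRAM, for 72 hours (illustrative)
kwh, co2 = training_footprint(hours=72, cpu_watts=100,
                              gpu_watts=250, dram_watts=50, num_gpus=8)
```

Note that the per-component draws would in practice be sampled repeatedly from the system management interfaces (e.g. the GPU vendor's query tool) and averaged over the run, rather than assumed constant as in this sketch.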

While prior research has captured the energy and cost of training such models, it has typically focused on just the final configuration of the model rather than the full search used to arrive at that configuration, which can be significant in its own right. Through the experiments conducted in this paper, the authors find that TPUs are more energy-efficient than GPUs, especially when they are well-suited to the model being trained, as in the case of BERT.

Iteratively fine-tuning models

This process of fine-tuning a model through iterative searches over architectures and hyperparameter values adds up to massive financial and energy costs. As shown in the paper, while a single training iteration might cost only ~USD 200, the entire R&D process of arriving at that model, which required ~4,800 iterations, cost ~USD 450k, which can easily put this work out of the reach of those without access to significant resources.
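The dynamic above is simple multiplication, but it is worth making explicit: the search, not the single final training run, dominates the bill. The per-run cost and run count below are invented for the sketch, not the paper's figures.

```python
def search_cost(cost_per_run_usd: float, num_runs: int) -> float:
    """Total R&D cost when every architecture/hyperparameter trial
    repeats a full (or partial) training run."""
    return cost_per_run_usd * num_runs

one_run = search_cost(150.0, 1)         # a single training iteration
whole_search = search_cost(150.0, 4000) # the full search that produced it
# The search costs num_runs times the single run that gets reported.
```

The same multiplier applies to energy and emissions, which is why the paper argues that reporting only the final configuration's footprint understates the true cost of a state-of-the-art result.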

Thus, the researchers propose that when a model is meant to be fine-tuned downstream, its sensitivity to different hyperparameters should be reported to guide future developers. An emphasis on large-scale models furthers inequity by promoting a rich-get-richer cycle: only organizations with substantial resources are able to do this kind of research, publish results, and thereby attract more funding, further entrenching their advantage. Tooling that enables more efficient architecture searches is limited in its adoption at the moment because of a lack of accessible tutorials and of compatibility with the most popular deep learning libraries, such as TensorFlow and PyTorch. Changing this would also improve the state of carbon accounting in the field of AI.

Between the lines

This paper kickstarted a reflection in the field of NLP on carbon accounting and on the overreliance on accuracy as the metric for evaluating the value of results in the AI research community. Efforts such as carbon-efficiency workshops at top-tier NLP conferences have further boosted awareness of these issues in the community. The hope is that there will be sustained momentum around this as we seek to build more eco-socially responsible AI systems. Follow-on research is required, especially to make tooling more compatible with existing deep learning frameworks. Making such reporting a standardized part of the research lifecycle will also help. Work done at the Montreal AI Ethics Institute titled SECure: A Social and Environmental Certificate for AI Systems provides further recommendations on how we can do better when it comes to building more eco-socially responsible AI systems.

Want quick summaries of the latest research & reporting in AI ethics delivered to your inbox? Subscribe to the AI Ethics Brief. We publish bi-weekly.

