Montreal AI Ethics Institute

Democratizing AI ethics literacy


LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models

December 2, 2023

🔬 Research Summary by Ahmad Faiz, Master’s in Data Science student at Indiana University Bloomington.

[Original paper by Ahmad Faiz, Sotaro Kaneda, Ruhan Wang, Rita Osi, Parteek Sharma, Fan Chen, and Lei Jiang]


Overview: This paper delves into estimating the carbon footprint of Large Language Models (LLMs), spanning their complete lifecycle, including training, inference, experimentation, and storage phases, encompassing both operational and embodied carbon emissions. It specifically focuses on the challenge of accurately projecting the carbon impact of emerging LLMs during GPU-intensive training. To address this, the paper introduces LLMCarbon, a framework for modeling carbon footprints in dense and Mixture-of-Experts (MoE) LLMs. LLMCarbon markedly improves the precision of carbon footprint estimations for various LLMs, overcoming significant limitations in existing tools like mlco2. This enables efficient design space exploration by considering the trade-off between carbon footprint and test loss across LLM configurations.


Introduction

Patterson et al.’s paper “Carbon Emissions and Large Neural Network Training” reports 552.1 metric tons of gross CO2-equivalent emissions for training the GPT-3 model on NVIDIA V100 GPUs. To put this in context, it is roughly the emissions of 3.05 round-trip jet flights between San Francisco and New York. As these large models proliferate in everyday life, their substantial carbon footprint is a critical concern. This paper introduces LLMCarbon, a comprehensive modeling tool to predict and evaluate the carbon impact of LLMs at different stages of their lifecycle.

LLMCarbon considers various inputs, including architectural details of the LLM, data center specifications, and hardware configurations. It employs a series of models to process this information, including a parameter model to estimate the LLM’s parameters, the neural scaling law to predict test loss, a FLOP model to estimate processing volume, and a hardware efficiency model to calculate actual computing throughput. Furthermore, LLMCarbon incorporates operational and embodied carbon models to provide a holistic view of an LLM’s carbon footprint.
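As a rough illustration of the parameter and FLOP models, the sketch below uses two widely used approximations that this summary does not spell out, so treat them as assumptions rather than LLMCarbon's exact formulas: about 12 × hidden_size² weights per transformer layer plus a token-embedding table, and about 6 FLOPs per parameter per training token. The function names and the GPT-3-like figures (96 layers, hidden size 12288, vocabulary 50257, 300 billion tokens) are illustrative and are not taken from the summary.

```python
def dense_param_count(hidden_size: int, num_layers: int, vocab_size: int) -> int:
    """Approximate parameter count of a dense decoder-only transformer.

    Assumption: each layer holds ~12 * hidden_size^2 weights
    (4h^2 for attention, 8h^2 for a feed-forward block with 4x expansion).
    Biases, layer norms, and positional embeddings are ignored.
    """
    per_layer = 12 * hidden_size ** 2
    embedding = vocab_size * hidden_size
    return num_layers * per_layer + embedding


def training_flops(param_count: int, num_tokens: int) -> int:
    """Approximate training compute.

    Assumption: ~6 FLOPs per parameter per token
    (forward pass plus backward pass).
    """
    return 6 * param_count * num_tokens


if __name__ == "__main__":
    # Illustrative GPT-3-like configuration (not from the summary).
    params = dense_param_count(hidden_size=12288, num_layers=96, vocab_size=50257)
    flops = training_flops(params, num_tokens=300_000_000_000)
    print(f"parameters: {params / 1e9:.1f} billion")
    print(f"training FLOPs: {flops:.3e}")
```

With these inputs the estimate comes out near 175 billion parameters, which is why this simple rule of thumb is often a reasonable first pass for dense models. MoE models need an extra term for the expert layers, which is omitted here.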

Through rigorous validation, we confirm LLMCarbon’s accuracy in estimating both the operational and embodied carbon footprints of LLMs. In operational phases, our tool demonstrated disparities of 8.2% or less when compared to actual data, surpassing existing tools in precision. Moreover, LLMCarbon’s estimations of embodied carbon footprints closely align with publicly available data, showcasing an error margin of less than 3.6%. These findings highlight LLMCarbon’s invaluable role in guiding the development and usage of LLMs towards a more sustainable and environmentally conscious AI future.

Key Insights

To comprehend the carbon footprint of LLMs, it’s essential to consider emissions at different phases, including training, inference, experimentation, and storage. This includes both operational carbon emissions generated during usage and embodied carbon emissions associated with producing hardware components. LLMCarbon relies on a series of models to process a range of inputs, including architectural information about the LLM, data center specifications, and hardware configurations. These models are listed below:

1.     Parameter Model: Estimates the parameter count of dense and MoE models from architectural parameters such as hidden size, number of layers, vocabulary size, and number of experts.

2.     Neural Scaling Law: The neural scaling law predicts an LLM’s test loss based on its parameter count and the training dataset size. This allows for consistent comparison of test losses across various LLMs.

3.     FLOP Model: The FLOP model calculates the number of floating-point operations (FLOPs) required during LLM processing, using the parameter count and the number of tokens processed, which is used to understand computational requirements.

4.     Hardware Efficiency Model: LLMCarbon identifies the optimal parallelism settings across the data, tensor, pipeline, and expert dimensions needed to achieve peak throughput and resource utilization.

5.     Operational Carbon Model: LLMCarbon quantifies the carbon emissions generated during LLM processing. It considers the FLOP count, hardware efficiency, and the number of computing devices used. Additionally, it factors in variables like the data center’s power usage effectiveness (PUE) and carbon intensity, ensuring a comprehensive assessment of carbon impact.

6.     Embodied Carbon Model: The embodied carbon model quantifies the carbon footprint associated with hardware components. It calculates the carbon emissions for each unit, considering factors like chip area and Carbon emitted Per unit Area (CPA).

The total equivalent carbon emission is the sum of operational and embodied carbon emissions.
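The way the operational and embodied models combine can be sketched as follows. This is a simplified illustration under stated assumptions, not LLMCarbon's actual implementation: operational carbon is taken as energy × PUE × carbon intensity, embodied carbon as chip area × CPA scaled by the fraction of the hardware lifetime consumed, and a single device type stands in for the whole system. The `Hardware` class, all function names, and every number used in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hardware:
    """Hypothetical description of one device type in the cluster."""
    count: int               # number of devices
    peak_flops: float        # peak FLOP/s per device
    efficiency: float        # fraction of peak actually achieved (0..1)
    power_kw: float          # average power draw per device, in kW
    area_cm2: float          # chip area per device, in cm^2
    cpa_kg_per_cm2: float    # carbon emitted per unit area, kgCO2e/cm^2
    lifetime_hours: float    # assumed service lifetime of the device


def run_hours(total_flops: float, hw: Hardware) -> float:
    """Wall-clock hours needed at the achieved (not peak) throughput."""
    achieved_flops_per_s = hw.count * hw.peak_flops * hw.efficiency
    return total_flops / achieved_flops_per_s / 3600.0


def operational_kg(hours: float, hw: Hardware, pue: float,
                   carbon_intensity_kg_per_kwh: float) -> float:
    """Device energy, scaled by data center PUE, times grid carbon intensity."""
    energy_kwh = hw.count * hw.power_kw * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


def embodied_kg(hours: float, hw: Hardware) -> float:
    """Manufacturing carbon (area * CPA), amortized over the hours used."""
    per_device = hw.area_cm2 * hw.cpa_kg_per_cm2
    return hw.count * per_device * (hours / hw.lifetime_hours)


def total_kg(total_flops: float, hw: Hardware, pue: float,
             carbon_intensity_kg_per_kwh: float) -> float:
    """Total footprint = operational + embodied."""
    hours = run_hours(total_flops, hw)
    return (operational_kg(hours, hw, pue, carbon_intensity_kg_per_kwh)
            + embodied_kg(hours, hw))


if __name__ == "__main__":
    # Made-up numbers chosen only to keep the arithmetic easy to follow.
    hw = Hardware(count=10, peak_flops=1e12, efficiency=0.5, power_kw=0.3,
                  area_cm2=8.0, cpa_kg_per_cm2=1.0, lifetime_hours=1000.0)
    print(total_kg(1.8e16, hw, pue=1.1, carbon_intensity_kg_per_kwh=0.4))
```

Keeping hardware efficiency as an explicit input makes the design trade-off visible: a lower achieved efficiency lengthens the run, which raises both the operational term and the amortized embodied term.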

The validation results reveal a remarkable alignment between LLMCarbon’s projections and real-world data for diverse LLMs, surpassing the performance of existing tools like mlco2. LLMCarbon’s adaptability to various data center specifications and its capability to pinpoint optimal parallelism settings enhance overall operational efficiency. This adaptability, combined with its ability to accurately gauge the environmental impact of LLMs, positions LLMCarbon as a pragmatic tool in assessing and mitigating the carbon footprint associated with LLMs, offering an indispensable resource for the future of sustainable AI development.

Between The Lines:

In the broader context of the ML community’s growing concern about the carbon footprint of computationally intensive models, it is crucial to underline the significance of tools like LLMCarbon. We suggest three key areas for improvement: explicitly reporting energy consumption and CO2eq, rewarding efficiency improvements alongside traditional metrics in ML conferences, and providing insights into the time and number of processors used during training.

By employing LLMCarbon or similar tools, researchers and developers can report the carbon footprint more accurately and consider it a competitive factor in model training. This shift in perspective could promote a virtuous cycle where efficiency and reduced emissions become paramount. Integrating power metrics into benchmarks like MLPerf is a positive step in the right direction, fostering a more sustainable approach to AI development. Further research is needed in these areas to solidify these goals and push for ongoing improvements in the industry’s carbon footprint.


  • © 2025 MONTREAL AI ETHICS INSTITUTE.
  • This work is licensed under a Creative Commons Attribution 4.0 International License.