A recent study published in Frontiers in Communication has cast a critical light on the environmental impact of artificial intelligence, revealing that not all AI prompts are created equal when it comes to carbon emissions. The research shows that the more complex “reasoning models” among Large Language Models (LLMs) can generate significantly more CO₂ than their “concise” counterparts, raising concerns among researchers and climate advocates about AI’s escalating energy demands.
The study, which meticulously evaluated 14 different LLMs using a standardized set of 500 questions across diverse subject areas, found a direct correlation between the number of “thinking tokens” generated by a model per query and its associated CO₂ emissions. Maximilian Dauner, a PhD student at Hochschule München University of Applied Sciences and a lead author of the paper, emphasized that “the environmental impact of questioning trained LLMs is strongly determined by their reasoning approach, with explicit reasoning processes significantly driving up energy consumption and carbon emissions.”
Specifically, the findings indicate that reasoning models, which possess larger training sets and require more processing time, produced substantially higher CO₂ outputs. In some instances, these sophisticated models generated up to 50 times the emissions of concise models. This disparity is further exacerbated by the complexity of the questions posed; open-ended or intricate queries, such as those involving advanced algebra or philosophical concepts, resulted in a larger carbon footprint compared to simpler prompts like high school history questions.
Reasoning models, sometimes referred to as “thinking models,” are optimized for tackling complex tasks that require logic, step-by-step breakdowns, or detailed instructions. These models, exemplified by OpenAI’s o1 and o3-mini, employ what LLM researchers term “chain-of-thought” processing: they work through intermediate steps before producing an answer. This allows them to respond more deliberately and generate more human-like responses, albeit at the cost of longer processing time and, consequently, higher energy consumption. Conversely, general-purpose models prioritize speed and clarity for more straightforward tasks.
The researchers conducted their testing in two phases: initially with multiple-choice questions, followed by free-response prompts. On average, reasoning models generated an astonishing 543.5 tokens per question, a stark contrast to the mere 37.7 tokens produced by concise models. For example, “Cogito,” identified as the most accurate reasoning model examined, produced three times as much CO₂ as similarly sized models optimized for concise responses. The paper explicitly states that “from an environmental perspective, reasoning models consistently exhibited higher emissions, driven primarily by their elevated token production.”
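Because emissions in the study track token output almost linearly, the gap between the two model classes can be sketched with simple arithmetic. In the sketch below, the grams-of-CO₂-per-token constant is a placeholder assumption for illustration, not a figure from the paper; only the average token counts come from the study.

```python
# Back-of-envelope comparison of per-question emissions, assuming emissions
# scale linearly with generated tokens. CO2_PER_TOKEN_G is a hypothetical
# illustrative value, NOT a number reported by the study.
CO2_PER_TOKEN_G = 0.005   # assumed grams of CO2 per generated token

reasoning_tokens = 543.5  # study's average tokens per question (reasoning models)
concise_tokens = 37.7     # study's average tokens per question (concise models)

reasoning_co2 = reasoning_tokens * CO2_PER_TOKEN_G
concise_co2 = concise_tokens * CO2_PER_TOKEN_G
ratio = reasoning_tokens / concise_tokens

print(f"Reasoning model: {reasoning_co2:.3f} g CO2 per question")
print(f"Concise model:   {concise_co2:.3f} g CO2 per question")
print(f"Token ratio:     {ratio:.1f}x")  # roughly 14.4x more tokens per question
```

Whatever the true per-token figure turns out to be, the ratio between the two model classes is what drives the emissions gap.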
While the difference in emissions per individual prompt may appear marginal, the cumulative effect at scale is significant. The study projects that asking DeepSeek’s R1 model 600,000 questions would generate approximately the same amount of CO₂ as a round-trip flight from London to New York. In comparison, the non-reasoning Qwen 2.5 model could answer three times as many questions before reaching an equivalent emission level. This highlights a critical trade-off between LLM accuracy and environmental sustainability, as “as model size increases, accuracy tends to improve,” but “this gain is also linked to substantial growth in both CO₂ emissions and the number of generated tokens.”
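The scale comparison above reduces to straightforward arithmetic. The sketch below uses only the two figures the study reports; the variable names are illustrative, and the flight itself is the study's chosen yardstick rather than a computed quantity.

```python
# Rough scale arithmetic for the London-New York flight comparison.
# Per the study: ~600,000 DeepSeek R1 queries emit about as much CO2
# as one round-trip flight, and the non-reasoning Qwen 2.5 model can
# answer roughly three times as many questions for the same emissions.
QUESTIONS_PER_FLIGHT_R1 = 600_000

questions_per_flight_qwen = QUESTIONS_PER_FLIGHT_R1 * 3

print(f"R1:       {QUESTIONS_PER_FLIGHT_R1:,} questions per flight-equivalent")
print(f"Qwen 2.5: {questions_per_flight_qwen:,} questions per flight-equivalent")
# Qwen 2.5 reaches 1,800,000 questions before matching the flight's emissions.
```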
These findings emerge amidst a fierce global competition among tech giants to develop increasingly advanced AI models. The escalating demand for AI-driven infrastructure is poised to place considerable strain on existing energy grids. Over the past year, Apple announced plans to invest a staggering $500 billion in manufacturing and data centers over the next four years. Similarly, Project Stargate, a collaborative initiative involving OpenAI, SoftBank, and Oracle, has pledged an equivalent $500 billion toward AI-focused data centers. A recent report in the MIT Technology Review indicates that since 2017, data centers have increasingly incorporated energy-intensive hardware specifically designed for complex AI computations, leading to a surge in energy consumption.
The Electric Power Research Institute (EPRI) estimates that data centers supporting advanced AI models could account for up to 9.1 percent of the United States’ total energy demand by the end of the decade, a significant increase from approximately 4.4 percent today. To meet this burgeoning energy demand, major tech companies are exploring diverse power generation strategies. Meta, Google, and Microsoft have all forged partnerships with nuclear power plants. Notably, Microsoft has signed a 20-year agreement to source energy from the Three Mile Island nuclear facility in Pennsylvania to power its growing data center fleet. Meta is also making substantial investments in geothermal technology, while OpenAI CEO Sam Altman is reportedly investing in experimental nuclear fusion, acknowledging that the coming age of AI will necessitate an “energy breakthrough.” Despite these efforts, recent research suggests that it is almost certain that more fossil fuels, particularly natural gas, will be required to fully meet AI’s massive energy requirements.
However, the researchers believe their findings can empower everyday AI users to mitigate their carbon impact. By understanding the significantly higher energy intensity of reasoning models, users could opt to use them more sparingly, relying on concise models for general daily tasks such as web searches and basic question answering. Dauner emphasized this point, stating, “If users know the exact CO₂ cost of their AI-generated outputs, such as casually turning themselves into an action figure, they might be more selective and thoughtful about when and how they use these technologies.” This proactive user behavior, coupled with ongoing advancements in energy-efficient AI design, will be crucial in navigating the environmental challenges posed by the rapid expansion of artificial intelligence.