CarbonScaling: Extending Neural Scaling Laws for Carbon Footprint in Large Language Models
- URL: http://arxiv.org/abs/2508.06524v1
- Date: Sat, 02 Aug 2025 00:41:45 GMT
- Title: CarbonScaling: Extending Neural Scaling Laws for Carbon Footprint in Large Language Models
- Authors: Lei Jiang, Fan Chen
- Abstract summary: CarbonScaling is an analytical framework that extends neural scaling laws to incorporate both operational and embodied carbon. Results show that while a power-law relationship between accuracy and carbon holds, real-world inefficiencies significantly increase the scaling factor.
- Score: 8.082036109277468
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural scaling laws have driven the development of increasingly large language models (LLMs) by linking accuracy improvements to growth in parameter count, dataset size, and compute. However, these laws overlook the carbon emissions that scale exponentially with LLM size. This paper presents \textit{CarbonScaling}, an analytical framework that extends neural scaling laws to incorporate both operational and embodied carbon in LLM training. By integrating models for neural scaling, GPU hardware evolution, parallelism optimization, and carbon estimation, \textit{CarbonScaling} quantitatively connects model accuracy to carbon footprint. Results show that while a power-law relationship between accuracy and carbon holds, real-world inefficiencies significantly increase the scaling factor. Hardware technology scaling reduces carbon emissions for small to mid-sized models, but offers diminishing returns for extremely large LLMs due to communication overhead and underutilized GPUs. Training optimizations, especially aggressive critical batch size scaling, help alleviate this inefficiency. \textit{CarbonScaling} offers key insights for training more sustainable and carbon-efficient LLMs.
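The composition the abstract describes, a neural scaling law joined to operational- and embodied-carbon models, can be illustrated with a minimal sketch. All constants below (Chinchilla-style loss coefficients, GPU throughput and power, utilization, PUE, grid carbon intensity, embodied-carbon amortization, GPU count) are placeholder assumptions for illustration, not values or equations taken from CarbonScaling:

```python
# Minimal sketch (not the paper's implementation): compose a Chinchilla-style
# loss scaling law with simple operational- and embodied-carbon models to
# trace an accuracy-vs-carbon curve. Every constant here is an assumption.

def loss(params, tokens, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Chinchilla-style loss: L(N, D) = E + A / N^alpha + B / D^beta."""
    return E + A / params**alpha + B / tokens**beta

def training_carbon_kg(params, tokens,
                       flops_per_gpu_s=3.0e14,         # assumed sustained FLOP/s per GPU
                       utilization=0.4,                # assumed; falls for very large models
                       gpu_power_w=700.0,              # assumed board power
                       pue=1.2,                        # assumed data-center PUE
                       grid_kgco2_per_kwh=0.4,         # assumed grid carbon intensity
                       embodied_kg_per_gpu_year=150.0, # assumed amortized embodied carbon
                       num_gpus=1024):
    """Operational + embodied carbon (kg CO2e) for one training run."""
    train_flops = 6.0 * params * tokens                   # standard 6*N*D estimate
    gpu_seconds = train_flops / (flops_per_gpu_s * utilization)
    energy_kwh = gpu_seconds * gpu_power_w * pue / 3.6e6  # W*s -> kWh
    operational = energy_kwh * grid_kgco2_per_kwh
    wallclock_years = gpu_seconds / num_gpus / (365 * 24 * 3600)
    embodied = embodied_kg_per_gpu_year * num_gpus * wallclock_years
    return operational + embodied

# Sweep model sizes at a fixed tokens-per-parameter ratio to see how loss
# (a proxy for accuracy) trades off against total training carbon.
for n in (1e9, 1e10, 1e11, 1e12):
    d = 20 * n  # compute-optimal-style tokens/parameter ratio (assumed)
    print(f"N={n:.0e}  loss={loss(n, d):.3f}  carbon={training_carbon_kg(n, d):,.0f} kg CO2e")
```

In this sketch the fixed utilization constant is where the real-world inefficiencies the abstract mentions would enter: letting it drop as model and cluster size grow would mimic the communication overhead and GPU underutilization that erode the benefit of hardware scaling for extremely large LLMs.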
Related papers
- Compute-Optimal Scaling for Value-Based Deep RL [99.680827753493]
We investigate compute scaling for online, value-based deep RL. Our analysis reveals a nuanced interplay between model size, batch size, and the update-to-data (UTD) ratio. We provide a mental model for understanding this phenomenon and build guidelines for choosing batch size and UTD.
arXiv Detail & Related papers (2025-08-20T17:54:21Z) - Scaling DRL for Decision Making: A Survey on Data, Network, and Training Budget Strategies [66.83950068218033]
Scaling Laws demonstrate that scaling model parameters and training data enhances learning performance. Despite their potential to improve performance, the integration of scaling laws into deep reinforcement learning has not been fully realized. This review addresses this gap by systematically analyzing scaling strategies in three dimensions: data, network, and training budget.
arXiv Detail & Related papers (2025-08-05T08:03:12Z) - Diffusion-Modeled Reinforcement Learning for Carbon and Risk-Aware Microgrid Optimization [48.70916202664808]
DiffCarl is a diffusion-modeled carbon- and risk-aware reinforcement learning algorithm for intelligent operation of multi-microgrid systems. It outperforms classic algorithms and state-of-the-art DRL solutions, with 2.3-30.1% lower operational cost. It also achieves 28.7% lower carbon emissions than its carbon-unaware variant and reduces performance variability.
arXiv Detail & Related papers (2025-07-22T03:27:07Z) - CEGI: Measuring the trade-off between efficiency and carbon emissions for SLMs and VLMs [0.0]
This paper analyzes the performance of Small Language Models (SLMs) and Vision Language Models (VLMs). To quantify the trade-off between model performance and carbon emissions, we introduce a novel metric called CEGI (Carbon Efficient Gain Index). Our findings suggest that the marginal gains in accuracy from larger models do not justify the substantial increase in carbon emissions.
arXiv Detail & Related papers (2024-12-03T17:32:47Z) - LLMCO2: Advancing Accurate Carbon Footprint Prediction for LLM Inferences [7.137654106298203]
Estimating the carbon footprint of large language model (LLM) inference is more complex than that of training.
LLMCO2 is a graph neural network (GNN)-based model that greatly improves the accuracy of LLM inference carbon footprint predictions.
arXiv Detail & Related papers (2024-10-03T19:48:45Z) - Generative AI for Low-Carbon Artificial Intelligence of Things with Large Language Models [67.0243099823109]
Generative AI (GAI) holds immense potential to reduce the carbon emissions of Artificial Intelligence of Things (AIoT).
In this article, we explore the potential of GAI for carbon emissions reduction and propose a novel GAI-enabled solution for low-carbon AIoT.
We propose a Large Language Model (LLM)-enabled carbon emission optimization framework, in which we design pluggable LLM and Retrieval Augmented Generation (RAG) modules.
arXiv Detail & Related papers (2024-04-28T05:46:28Z) - LACS: Learning-Augmented Algorithms for Carbon-Aware Resource Scaling with Uncertain Demand [1.423958951481749]
This paper studies the online carbon-aware resource scaling problem with unknown job lengths (OCSU).
We propose LACS, a theoretically robust learning-augmented algorithm that solves OCSU.
LACS achieves a 32% reduction in carbon footprint compared to the deadline-aware carbon-agnostic execution of the job.
arXiv Detail & Related papers (2024-03-29T04:54:22Z) - ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models [70.45441031021291]
Large Vision-Language Models (LVLMs) can understand the world comprehensively by integrating rich information from different modalities.
However, deploying LVLMs is often problematic due to their massive computational/energy costs and carbon footprint.
We propose Efficient Coarse-to-Fine LayerWise Pruning (ECoFLaP), a two-stage coarse-to-fine weight pruning approach for LVLMs.
arXiv Detail & Related papers (2023-10-04T17:34:00Z) - LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models [7.132822974156601]
The carbon footprint of large language models (LLMs) is a significant concern, encompassing emissions from their training, inference, experimentation, and storage processes.
We introduce LLMCarbon, an end-to-end carbon footprint projection model designed for both dense and MoE LLMs.
arXiv Detail & Related papers (2023-09-25T14:50:04Z) - Real-time high-resolution CO$_2$ geological storage prediction using nested Fourier neural operators [58.728312684306545]
Carbon capture and storage (CCS) plays an essential role in global decarbonization.
Scaling up CCS deployment requires accurate and high-resolution modeling of the storage reservoir pressure buildup and the gaseous plume migration.
We introduce Nested Fourier Neural Operator (FNO), a machine-learning framework for high-resolution dynamic 3D CO2 storage modeling at a basin scale.
arXiv Detail & Related papers (2022-10-31T04:04:03Z) - Measuring the Carbon Intensity of AI in Cloud Instances [91.28501520271972]
We provide a framework for measuring software carbon intensity and propose to measure operational carbon emissions.
We evaluate a suite of approaches for reducing emissions on the Microsoft Azure cloud compute platform.
arXiv Detail & Related papers (2022-06-10T17:04:04Z) - Curb Your Carbon Emissions: Benchmarking Carbon Emissions in Machine Translation [0.0]
We study carbon efficiency and look for alternatives to reduce the overall environmental impact of training models.
In our work, we assess the performance of models for machine translation, across multiple language pairs.
We examine the various components of these models to analyze aspects of our pipeline that can be optimized to reduce these carbon emissions.
arXiv Detail & Related papers (2021-09-26T12:30:10Z)