Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters
- URL: http://arxiv.org/abs/2505.23554v1
- Date: Thu, 29 May 2025 15:31:28 GMT
- Title: Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters
- Authors: Hayden Moore, Sirui Qi, Ninad Hogade, Dejan Milojicic, Cullen Bash, Sudeep Pasricha
- Abstract summary: Large Language Models (LLMs) such as ChatGPT, CoPilot, and Gemini have been widely adopted in different areas. Recent studies estimate that the costs of operating LLMs in their inference phase can exceed training costs by 25x per year. We propose a novel framework called SLIT to co-optimize LLM quality of service (time-to-first-token), carbon emissions, water usage, and energy costs.
- Score: 2.391483506190989
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, Large Language Models (LLMs) such as ChatGPT, CoPilot, and Gemini have been widely adopted in different areas. As the use of LLMs continues to grow, many efforts have focused on reducing the massive training overheads of these models. But it is the environmental impact of handling user requests to LLMs that is increasingly becoming a concern. Recent studies estimate that the costs of operating LLMs in their inference phase can exceed training costs by 25x per year. As LLMs are queried incessantly, the cumulative carbon footprint for the operational phase has been shown to far exceed the footprint during the training phase. Further, estimates indicate that 500 ml of fresh water is expended for every 20-50 requests to LLMs during inference. To address these important sustainability issues with LLMs, we propose a novel framework called SLIT to co-optimize LLM quality of service (time-to-first-token), carbon emissions, water usage, and energy costs. The framework utilizes a machine learning (ML)-based metaheuristic to enhance the sustainability of LLM hosting across geo-distributed cloud datacenters. Such a framework will become increasingly vital as LLMs proliferate.
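The abstract describes co-optimizing time-to-first-token, carbon emissions, water usage, and energy cost when placing LLM inference requests across geo-distributed datacenters. The sketch below is a minimal, hypothetical illustration of that kind of multi-objective routing decision; it is not the SLIT framework or its ML-based metaheuristic, and the datacenter names, metric values, weights, normalizers, and per-request energy figure are assumptions made only for illustration.

```python
# Minimal illustrative sketch of carbon-, water-, and cost-aware request routing
# across geo-distributed datacenters. This is NOT the authors' SLIT framework or
# its ML-based metaheuristic; datacenter names, metric values, and weights below
# are hypothetical placeholders chosen only to show the shape of the trade-off.

from dataclasses import dataclass

@dataclass
class Datacenter:
    name: str
    carbon_intensity: float   # gCO2e per kWh of the local grid mix
    wue: float                # liters of water per kWh (water usage effectiveness)
    energy_price: float       # $ per kWh
    est_ttft_s: float         # estimated time-to-first-token under current load, seconds

def score(dc: Datacenter, energy_kwh: float, weights=(0.3, 0.3, 0.2, 0.2)) -> float:
    """Lower is better: weighted sum of carbon, water, cost, and latency terms.

    Each term is divided by a rough reference value so the weights are comparable;
    a real scheduler would learn or tune these normalizers rather than hard-code them.
    """
    w_c, w_w, w_p, w_l = weights
    return (w_c * (dc.carbon_intensity * energy_kwh) / 500.0   # vs. ~500 gCO2e reference
            + w_w * (dc.wue * energy_kwh) / 2.0                # vs. ~2 L reference
            + w_p * (dc.energy_price * energy_kwh) / 0.10      # vs. ~$0.10 reference
            + w_l * dc.est_ttft_s / 1.0)                       # vs. ~1 s TTFT reference

def route(request_energy_kwh: float, candidates: list[Datacenter]) -> Datacenter:
    """Greedily pick the datacenter with the lowest combined sustainability/QoS score."""
    return min(candidates, key=lambda dc: score(dc, request_energy_kwh))

if __name__ == "__main__":
    fleet = [
        Datacenter("us-west", carbon_intensity=250, wue=1.8, energy_price=0.08, est_ttft_s=0.4),
        Datacenter("eu-north", carbon_intensity=40, wue=0.5, energy_price=0.12, est_ttft_s=0.9),
        Datacenter("ap-south", carbon_intensity=600, wue=2.5, energy_price=0.06, est_ttft_s=0.3),
    ]
    chosen = route(request_energy_kwh=0.004, candidates=fleet)  # ~4 Wh per request (assumed)
    print(f"Route request to {chosen.name}")
```

A production scheduler of the kind the paper proposes would replace the fixed weights and hand-set normalizers with its ML-based metaheuristic and would also account for queueing, model placement, and migration costs, none of which this greedy sketch models.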
Related papers
- What Factors Affect LLMs and RLLMs in Financial Question Answering? [4.42417272193095]
This study explores the impact of various methods on large language models (LLMs) and reasoning large language models (RLLMs) in the financial domain. We utilize five LLMs and three RLLMs to assess the effects of prompting methods, agentic frameworks, and multilingual alignment methods on financial question-answering tasks.
arXiv Detail & Related papers (2025-07-11T06:37:44Z)
- Large Language Model-enhanced Reinforcement Learning for Low-Altitude Economy Networking [71.83640290222928]
Low-Altitude Economic Networking (LAENet) aims to support diverse flying applications below 1,000 meters. Complex decision-making, resource constraints, and environmental uncertainty pose significant challenges to the development of the LAENet.
arXiv Detail & Related papers (2025-05-27T11:25:42Z)
- LLM-Explorer: Towards Efficient and Affordable LLM-based Exploration for Mobile Apps [12.95765484886394]
Large language models (LLMs) have opened new opportunities for automated mobile app exploration. We argue that such extensive usage of LLMs is neither necessary nor effective, since many actions during exploration do not require, or may even be biased by, the abilities of LLMs. We introduce LLM-Explorer, a new exploration agent designed for efficiency and affordability.
arXiv Detail & Related papers (2025-05-15T05:28:35Z)
- Bridging AI and Carbon Capture: A Dataset for LLMs in Ionic Liquids and CBE Research [3.2995359570845912]
Large Language Models (LLMs) have demonstrated exceptional performance in general knowledge and reasoning tasks. Their effectiveness in specialized scientific fields like Chemical and Biological Engineering (CBE) remains underexplored. We release an expert-curated dataset of 5,920 examples designed to benchmark LLMs' reasoning in CBE.
arXiv Detail & Related papers (2025-05-11T12:32:57Z)
- LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch [77.02136168850532]
We detail the training of the LLM360 K2-65B model, scaling up our 360-degree OPEN SOURCE approach to the largest and most powerful models under project LLM360.
arXiv Detail & Related papers (2025-01-13T08:26:43Z)
- LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs [11.664088080448593]
"LlamaDuo" is a pipeline for migrating knowledge and abilities from service-oriented large language models to smaller, locally manageable models.
Our pipeline is crucial for ensuring service continuity in the presence of operational failures, strict privacy policies, or offline requirements.
arXiv Detail & Related papers (2024-08-24T05:03:08Z)
- A Comprehensive Survey of Contamination Detection Methods in Large Language Models [68.10605098856087]
With the rise of Large Language Models (LLMs) in recent years, abundant new opportunities are emerging, but also new challenges. LLMs' performance may not be reliable anymore, as the high performance may be at least partly due to their previous exposure to the data. This limitation jeopardizes real capability improvement in the field of NLP, yet there remains a lack of methods on how to efficiently detect contamination.
arXiv Detail & Related papers (2024-03-31T14:32:02Z)
- LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement [79.31084387589968]
Pretrained large language models (LLMs) are currently state-of-the-art for solving the vast majority of natural language processing tasks.
We propose LLM2LLM, a data augmentation strategy that uses a teacher LLM to enhance a small seed dataset.
We achieve improvements up to 24.2% on the GSM8K dataset, 32.6% on CaseHOLD, 32.0% on SNIPS, 52.6% on TREC and 39.8% on SST-2 over regular fine-tuning in the low-data regime.
arXiv Detail & Related papers (2024-03-22T08:57:07Z)
- MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT [87.4910758026772]
"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development.
This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource constrained devices.
arXiv Detail & Related papers (2024-02-26T18:59:03Z)
- Large Language Models: A Survey [66.39828929831017]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks. LLMs' ability of general-purpose language understanding and generation is acquired by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
- LLM360: Towards Fully Transparent Open-Source LLMs [89.05970416013403]
The goal of LLM360 is to support open and collaborative AI research by making the end-to-end training process transparent and reproducible by everyone.
As a first step of LLM360, we release two 7B parameter LLMs pre-trained from scratch, Amber and CrystalCoder, including their training code, data, intermediate checkpoints, and analyses.
arXiv Detail & Related papers (2023-12-11T17:39:00Z)
- FATE-LLM: A Industrial Grade Federated Learning Framework for Large Language Models [18.65547577691255]
Large Language Models (LLMs) have exhibited remarkable performances across various tasks in recent years.
FATE-LLM is an industrial-grade federated learning framework for large language models.
We release the code of FATE-LLM to facilitate the research of FedLLM and enable a broad range of industrial applications.
arXiv Detail & Related papers (2023-10-16T04:17:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.