Hierarchical Multi-Agent Framework for Carbon-Efficient Liquid-Cooled Data Center Clusters
- URL: http://arxiv.org/abs/2502.08337v1
- Date: Wed, 12 Feb 2025 12:00:58 GMT
- Title: Hierarchical Multi-Agent Framework for Carbon-Efficient Liquid-Cooled Data Center Clusters
- Authors: Soumyendu Sarkar, Avisek Naug, Antonio Guillen, Vineet Gundecha, Ricardo Luna Gutierrez, Sahand Ghorbanpour, Sajad Mousavi, Ashwin Ramesh Babu, Desik Rengarajan, Cullen Bash,
- Abstract summary: This paper introduces Green-DCC, which proposes a Reinforcement Learning (RL) based hierarchical controller to optimize both workload and liquid cooling dynamically in a DCC.
We demonstrate how the system optimizes multiple data centers synchronously, enabling the scope of digital twins, and compare the performance of various RL approaches based on carbon emissions and sustainability metrics.
- Score: 5.335496791443277
- License:
- Abstract: Reducing the environmental impact of cloud computing requires efficient workload distribution across geographically dispersed Data Center Clusters (DCCs) and simultaneously optimizing liquid and air (HVAC) cooling with time shift of workloads within individual data centers (DC). This paper introduces Green-DCC, which proposes a Reinforcement Learning (RL) based hierarchical controller to optimize both workload and liquid cooling dynamically in a DCC. By incorporating factors such as weather, carbon intensity, and resource availability, Green-DCC addresses realistic constraints and interdependencies. We demonstrate how the system optimizes multiple data centers synchronously, enabling the scope of digital twins, and compare the performance of various RL approaches based on carbon emissions and sustainability metrics while also offering a framework and benchmark simulation for broader ML research in sustainability.
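To make the workload-distribution idea in the abstract concrete, here is a minimal sketch of a carbon-aware placement baseline. It is not the Green-DCC RL controller: it is a simple greedy heuristic that fills the data centers with the lowest grid carbon intensity first. The `DataCenter` class, the `place_workload` function, the `kwh_per_unit` parameter, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    # All fields are illustrative assumptions, not values from the paper.
    name: str
    capacity: float          # remaining compute capacity (arbitrary units)
    carbon_intensity: float  # grid carbon intensity in gCO2/kWh


def place_workload(demand, centers, kwh_per_unit=1.0):
    """Greedy baseline: assign demand to the lowest-carbon sites first.

    Returns (allocation, emissions), where allocation maps a data center
    name to the workload units placed there and emissions is in gCO2.
    """
    allocation = {}
    emissions = 0.0
    remaining = demand
    # Visit data centers in order of increasing carbon intensity.
    for dc in sorted(centers, key=lambda d: d.carbon_intensity):
        if remaining <= 0:
            break
        share = min(remaining, dc.capacity)
        if share > 0:
            allocation[dc.name] = share
            # Emissions = workload units * energy per unit * carbon intensity.
            emissions += share * kwh_per_unit * dc.carbon_intensity
            remaining -= share
    if remaining > 1e-9:
        raise ValueError("demand exceeds total capacity")
    return allocation, emissions


if __name__ == "__main__":
    cluster = [
        DataCenter("A", capacity=50, carbon_intensity=400),
        DataCenter("B", capacity=30, carbon_intensity=100),
        DataCenter("C", capacity=40, carbon_intensity=250),
    ]
    print(place_workload(60, cluster))
```

A heuristic like this ignores weather, cooling dynamics, and time shifting of workloads, which are the interdependencies the abstract says the hierarchical RL controller is meant to handle; it is useful mainly as a reference point when comparing learned policies.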
Related papers
- Cluster-Based Multi-Agent Task Scheduling for Space-Air-Ground Integrated Networks [60.085771314013044]
Low-altitude economy holds significant potential for development in areas such as communication and sensing.
We propose a Clustering-based Multi-agent Deep Deterministic Policy Gradient (CMADDPG) algorithm to address the multi-UAV cooperative task scheduling challenges in SAGIN.
arXiv Detail & Related papers (2024-12-14T06:17:33Z)
- SustainDC: Benchmarking for Sustainable Data Center Control [4.159959816797259]
We introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DC).
SustainDC supports custom DC configurations and tasks such as workload scheduling, cooling optimization, and auxiliary battery management.
We evaluate various MARL algorithms on SustainDC, showing their performance across diverse DC designs, locations, weather conditions, grid carbon intensity, and workload requirements.
arXiv Detail & Related papers (2024-08-14T22:43:52Z)
- Sustainability of Data Center Digital Twins with Reinforcement Learning [2.4971633082970377]
Machine learning (ML) has led to an increased demand for computational power, resulting in larger data centers (DCs) and higher energy consumption.
To address this issue and reduce carbon emissions, intelligent design and control of DC components such as IT servers, cabinets, HVAC cooling, flexible load shifting, and battery energy storage are essential.
DCRL-Green is a multi-agent RL environment that empowers the ML community to design data centers and research, develop, and refine RL controllers for carbon footprint reduction in DCs.
arXiv Detail & Related papers (2024-04-16T18:22:30Z)
- Carbon Footprint Reduction for Sustainable Data Centers in Real-Time [2.794742330785396]
We propose a Data Center Carbon Footprint Reduction (DC-CFR) multi-agent Reinforcement Learning (MARL) framework to optimize data centers for the objectives of carbon footprint reduction, energy consumption, and energy cost.
The results show that the DC-CFR MARL agents effectively resolved the complex interdependencies in optimizing cooling, load shifting, and energy storage in real-time for various locations under real-world dynamic weather and grid carbon intensity conditions.
arXiv Detail & Related papers (2024-03-21T02:59:56Z)
- CAFE: Carbon-Aware Federated Learning in Geographically Distributed Data Centers [18.54380015603228]
Training large-scale artificial intelligence (AI) models demands significant computational power and energy, leading to increased carbon footprint with potential environmental repercussions.
This paper delves into the challenges of training AI models across geographically distributed (geo-distributed) data centers, emphasizing the balance between learning performance and carbon footprint.
We propose a new framework called CAFE (short for Carbon-Aware Federated Learning) to optimize training within a fixed carbon footprint budget.
arXiv Detail & Related papers (2023-11-06T23:59:22Z)
- SHIELD: Sustainable Hybrid Evolutionary Learning Framework for Carbon, Wastewater, and Energy-Aware Data Center Management [2.9699290794642366]
Geo-distributed data centers (GDDCs) have a significant associated environmental impact.
This paper proposes a novel framework to co-optimize carbon emissions, water footprint, and energy costs of GDDCs.
arXiv Detail & Related papers (2023-08-24T21:11:55Z)
- Sustainable AIGC Workload Scheduling of Geo-Distributed Data Centers: A Multi-Agent Reinforcement Learning Approach [48.18355658448509]
Recent breakthroughs in generative artificial intelligence have triggered a surge in demand for machine learning training, which poses significant cost burdens and environmental challenges due to its substantial energy consumption.
Scheduling training jobs among geographically distributed cloud data centers unveils the opportunity to optimize the usage of computing capacity powered by inexpensive and low-carbon energy.
We propose an algorithm based on multi-agent reinforcement learning and actor-critic methods to learn the optimal collaborative scheduling strategy through interacting with a cloud system built with real-life workload patterns, energy prices, and carbon intensities.
arXiv Detail & Related papers (2023-04-17T02:12:30Z)
- Measuring the Carbon Intensity of AI in Cloud Instances [91.28501520271972]
We provide a framework for measuring software carbon intensity, and propose to measure operational carbon emissions.
We evaluate a suite of approaches for reducing emissions on the Microsoft Azure cloud compute platform.
arXiv Detail & Related papers (2022-06-10T17:04:04Z)
- Collaborative Intelligent Reflecting Surface Networks with Multi-Agent Reinforcement Learning [63.83425382922157]
Intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks.
In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z)
- Dual-Cross Central Difference Network for Face Anti-Spoofing [54.81222020394219]
Face anti-spoofing (FAS) plays a vital role in securing face recognition systems.
Central difference convolution (CDC) has shown its excellent representation capacity for the FAS task.
We propose two Cross Central Difference Convolutions (C-CDC), which exploit the difference of the center and surround sparse local features.
arXiv Detail & Related papers (2021-05-04T05:11:47Z)
- A Framework for Energy and Carbon Footprint Analysis of Distributed and Federated Edge Learning [48.63610479916003]
This article breaks down and analyzes the main factors that influence the environmental footprint of distributed learning policies.
It models both vanilla and decentralized FL policies driven by consensus.
Results show that FL allows remarkable end-to-end energy savings (30%-40%) for wireless systems characterized by low bit/Joule efficiency.
arXiv Detail & Related papers (2021-03-18T16:04:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.