Full Scaling Automation for Sustainable Development of Green Data Centers
- URL: http://arxiv.org/abs/2305.00706v2
- Date: Sat, 01 Mar 2025 15:57:31 GMT
- Title: Full Scaling Automation for Sustainable Development of Green Data Centers
- Authors: Shiyu Wang, Yinbo Sun, Xiaoming Shi, Shiyi Zhu, Lin-Tao Ma, James Zhang, Yifei Zheng, Jian Liu,
- Abstract summary: The rapid rise in cloud computing has resulted in an alarming increase in data centers' carbon emissions. Our proposed Full Scaling Automation (FSA) mechanism is an effective method of dynamically adapting resources to accommodate changing workloads. FSA harnesses the power of deep representation learning to accurately predict the future workload of each service and automatically stabilize the corresponding target CPU usage level.
- Score: 13.448126025186538
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid rise in cloud computing has resulted in an alarming increase in data centers' carbon emissions, which now account for more than 3% of global greenhouse gas emissions, necessitating immediate steps to combat their mounting strain on the global climate. An important focus of this effort is to improve resource utilization in order to reduce electricity usage. Our proposed Full Scaling Automation (FSA) mechanism is an effective method of dynamically adapting resources to accommodate changing workloads in large-scale cloud computing clusters, enabling the clusters in data centers to maintain their desired CPU utilization target and thus improve energy efficiency. FSA harnesses the power of deep representation learning to accurately predict the future workload of each service and automatically stabilize the corresponding target CPU usage level, unlike previous autoscaling methods, such as Autopilot or FIRM, that need to adjust computing resources with statistical models and expert knowledge. Our approach achieves significant performance improvement over existing work on real-world datasets. We also deployed FSA on large-scale cloud computing clusters in industrial data centers, and according to the certification of the China Environmental United Certification Center (CEC), a reduction of 947 tons of carbon dioxide, equivalent to a saving of 1,538,000 kWh of electricity, was achieved during the Double 11 shopping festival of 2022, marking a critical step for our company's strategic goal towards carbon neutrality by 2030.
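The abstract describes scaling resources so that a service stays at a desired CPU utilization target given a workload forecast. The following is a minimal sketch of that general idea, not the FSA method itself: the moving-average forecaster is a plain stand-in for the deep representation learning model, and all function names, parameters, and units (workload expressed in CPU cores) are hypothetical.

```python
import math
from typing import Sequence


def naive_forecast(history: Sequence[float], window: int = 3) -> float:
    """Stand-in forecaster (assumption): mean of the last `window` observations.

    The paper uses a deep representation learning model here; this
    moving average only keeps the sketch self-contained.
    """
    if not history:
        raise ValueError("history must not be empty")
    recent = list(history)[-window:]
    return sum(recent) / len(recent)


def instances_for_target(
    predicted_cores: float,
    cores_per_instance: float,
    target_utilization: float,
) -> int:
    """Smallest instance count that keeps expected CPU utilization at or below the target."""
    if not 0.0 < target_utilization <= 1.0:
        raise ValueError("target_utilization must be in (0, 1]")
    if cores_per_instance <= 0:
        raise ValueError("cores_per_instance must be positive")
    if predicted_cores < 0:
        raise ValueError("predicted_cores must be non-negative")
    needed = predicted_cores / (cores_per_instance * target_utilization)
    # Keep at least one instance so the service stays reachable.
    return max(1, math.ceil(needed))


def expected_utilization(
    predicted_cores: float, instances: int, cores_per_instance: float
) -> float:
    """CPU utilization implied by the forecast and the chosen instance count."""
    return predicted_cores / (instances * cores_per_instance)


if __name__ == "__main__":
    demand = naive_forecast([90.0, 100.0, 110.0])
    count = instances_for_target(demand, cores_per_instance=4.0, target_utilization=0.5)
    print(count, expected_utilization(demand, count, 4.0))
```

Rounding the instance count up means the expected utilization never exceeds the target, at the cost of slight over-provisioning when the forecast does not divide evenly.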
Related papers
- MAIZX: A Carbon-Aware Framework for Optimizing Cloud Computing Emissions [0.7127829790714169]
Cloud computing poses significant environmental challenges due to its high energy consumption and carbon emissions. Data centers account for 2-4% of global energy usage, and the ICT sector's share of electricity consumption is projected to reach 40% by 2040. This study evaluates the MAIZX framework, designed to optimize cloud operations and reduce carbon footprint.
arXiv Detail & Related papers (2025-06-24T19:40:09Z) - Joint Resource Management for Energy-efficient UAV-assisted SWIPT-MEC: A Deep Reinforcement Learning Approach [50.52139512096988]
6G Internet of Things (IoT) networks face challenges in remote areas and disaster scenarios where ground infrastructure is unavailable. This paper proposes a novel unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) system enhanced by directional antennas to provide both computational and energy support for ground edge terminals.
arXiv Detail & Related papers (2025-05-06T06:46:19Z) - Hierarchical Multi-Agent Framework for Carbon-Efficient Liquid-Cooled Data Center Clusters [5.335496791443277]
This paper introduces Green-DCC, which proposes a Reinforcement Learning (RL) based hierarchical controller to optimize both workload and liquid cooling dynamically in a DCC.
We demonstrate how the system optimizes multiple data centers synchronously, enabling the scope of digital twins, and compare the performance of various RL approaches based on carbon emissions and sustainability metrics.
arXiv Detail & Related papers (2025-02-12T12:00:58Z) - Towards Robust Stability Prediction in Smart Grids: GAN-based Approach under Data Constraints and Adversarial Challenges [53.2306792009435]
We introduce a novel framework to detect instability in smart grids by employing only stable data.
It relies on a Generative Adversarial Network (GAN) where the generator is trained to create instability data that are used along with stable data to train the discriminator.
Our solution, tested on a dataset composed of real-world stable and unstable samples, achieves accuracy up to 97.5% in predicting grid stability and up to 98.9% in detecting adversarial attacks.
arXiv Detail & Related papers (2025-01-27T20:48:25Z) - Beyond Efficiency: Scaling AI Sustainably [4.711003829305544]
Modern AI applications have driven ever-increasing demands in computing.
This paper characterizes the carbon impact of AI, including both operational carbon emissions from training and inference as well as embodied carbon emissions from hardware manufacturing.
arXiv Detail & Related papers (2024-06-08T00:07:16Z) - Spatio-temporal load shifting for truly clean computing [0.5857582826810999]
We study the impact of shifting computing jobs and associated power loads both in time and between locations.
We isolate three signals relevant for informed use of load flexibility.
The costs of 24/7 CFE are reduced by 1.29 ± 0.07 EUR/MWh for every additional percentage of flexible load.
arXiv Detail & Related papers (2024-03-26T13:36:42Z) - Carbon Footprint Reduction for Sustainable Data Centers in Real-Time [2.794742330785396]
We propose a Data Center Carbon Footprint Reduction (DC-CFR) multi-agent Reinforcement Learning (MARL) framework to optimize data centers for the objectives of carbon footprint reduction, energy consumption, and energy cost.
The results show that the DC-CFR MARL agents effectively resolved the complex interdependencies in optimizing cooling, load shifting, and energy storage in real-time for various locations under real-world dynamic weather and grid carbon intensity conditions.
arXiv Detail & Related papers (2024-03-21T02:59:56Z) - A Safe Genetic Algorithm Approach for Energy Efficient Federated Learning in Wireless Communication Networks [53.561797148529664]
Federated Learning (FL) has emerged as a decentralized technique in which, contrary to traditional centralized approaches, devices perform model training collaboratively.
Despite the existing efforts made in FL, its environmental impact is still under investigation, since several critical challenges regarding its applicability to wireless networks have been identified.
The current work proposes a Genetic Algorithm (GA) approach, targeting the minimization of both the overall energy consumption of an FL process and any unnecessary resource utilization.
arXiv Detail & Related papers (2023-06-25T13:10:38Z) - Sustainable AIGC Workload Scheduling of Geo-Distributed Data Centers: A Multi-Agent Reinforcement Learning Approach [48.18355658448509]
Recent breakthroughs in generative artificial intelligence have triggered a surge in demand for machine learning training, which poses significant cost burdens and environmental challenges due to its substantial energy consumption.
Scheduling training jobs among geographically distributed cloud data centers unveils the opportunity to optimize the usage of computing capacity powered by inexpensive and low-carbon energy.
We propose an algorithm based on multi-agent reinforcement learning and actor-critic methods to learn the optimal collaborative scheduling strategy through interacting with a cloud system built with real-life workload patterns, energy prices, and carbon intensities.
arXiv Detail & Related papers (2023-04-17T02:12:30Z) - Green Federated Learning [7.003870178055125]
Federated Learning (FL) is a machine learning technique for training a centralized model using data of decentralized entities.
FL may leverage as many as hundreds of millions of globally distributed end-user devices with diverse energy sources.
We propose the concept of Green FL, which involves optimizing FL parameters and making design choices to minimize carbon emissions.
arXiv Detail & Related papers (2023-03-26T02:23:38Z) - Measuring the Carbon Intensity of AI in Cloud Instances [91.28501520271972]
We provide a framework for measuring software carbon intensity, and propose to measure operational carbon emissions.
We evaluate a suite of approaches for reducing emissions on the Microsoft Azure cloud compute platform.
arXiv Detail & Related papers (2022-06-10T17:04:04Z) - HUNTER: AI based Holistic Resource Management for Sustainable Cloud Computing [26.48962351761643]
We propose an artificial intelligence (AI) based holistic resource management technique for sustainable cloud computing called HUNTER.
The proposed model formulates the goal of optimizing energy efficiency in data centers as a multi-objective scheduling problem.
Experiments on simulated and physical cloud environments show that HUNTER outperforms state-of-the-art baselines in terms of energy consumption, SLA violation, scheduling time, cost and temperature by up to 12, 35, 43, 54 and 3 percent respectively.
arXiv Detail & Related papers (2021-10-11T18:11:26Z) - Power Modeling for Effective Datacenter Planning and Compute Management [53.41102502425513]
We discuss two classes of statistical power models designed and validated to be accurate, simple, interpretable and applicable to all hardware configurations and workloads.
We demonstrate that the proposed statistical modeling techniques, while simple and scalable, predict power with less than 5% Mean Absolute Percent Error (MAPE) for more than 95% of diverse Power Distribution Units (more than 2000) using only 4 features.
arXiv Detail & Related papers (2021-03-22T21:22:51Z) - A Framework for Energy and Carbon Footprint Analysis of Distributed and Federated Edge Learning [48.63610479916003]
This article breaks down and analyzes the main factors that influence the environmental footprint of distributed learning policies.
It models both vanilla and decentralized FL policies driven by consensus.
Results show that FL allows remarkable end-to-end energy savings (30%-40%) for wireless systems characterized by low bit/Joule efficiency.
arXiv Detail & Related papers (2021-03-18T16:04:42Z) - Multi-Agent Meta-Reinforcement Learning for Self-Powered and Sustainable Edge Computing Systems [87.4519172058185]
An effective energy dispatch mechanism for self-powered wireless networks with edge computing capabilities is studied.
A novel multi-agent meta-reinforcement learning (MAMRL) framework is proposed to solve the formulated problem.
Experimental results show that the proposed MAMRL model can reduce non-renewable energy usage by up to 11% and the energy cost by 22.4%.
arXiv Detail & Related papers (2020-02-20T04:58:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.