Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale
- URL: http://arxiv.org/abs/2402.18593v1
- Date: Sun, 25 Feb 2024 02:22:34 GMT
- Title: Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale
- Authors: Dan Zhao, Siddharth Samsi, Joseph McDonald, Baolin Li, David Bestor,
Michael Jones, Devesh Tiwari, Vijay Gadepally
- Abstract summary: Recent large language models require considerable resources to train and deploy.
With the right amount of power-capping, we show significant decreases in both temperature and power draw.
Our work is the first to conduct and make available a detailed analysis of the effects of GPU power-capping at the supercomputing scale.
- Score: 20.30679358575365
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As research and deployment of AI grows, the computational burden to support
and sustain its progress inevitably does too. To train or fine-tune
state-of-the-art models in NLP, computer vision, etc., some form of AI hardware
acceleration is virtually a requirement. Recent large language models require
considerable resources to train and deploy, resulting in significant energy
usage, potential carbon emissions, and massive demand for GPUs and other
hardware accelerators. However, this surge carries large implications for
energy sustainability at the HPC/datacenter level. In this paper, we study the
aggregate effect of power-capping GPUs on GPU temperature and power draw at a
research supercomputing center. With the right amount of power-capping, we show
significant decreases in both temperature and power draw, reducing power
consumption and potentially improving hardware life-span with minimal impact on
job performance. While power-capping reduces power draw by design, the
aggregate system-wide effect on overall energy consumption is less clear; for
instance, if users notice job performance degradation from GPU power-caps, they
may request additional GPU-jobs to compensate, negating any energy savings or
even worsening energy consumption. To our knowledge, our work is the first to
conduct and make available a detailed analysis of the effects of GPU
power-capping at the supercomputing scale. We hope our work will inspire
HPCs/datacenters to further explore, evaluate, and communicate the impact of
power-capping AI hardware accelerators for more sustainable AI.
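GPU power caps of the kind studied here are commonly applied through NVIDIA's management interface (NVML, also exposed via nvidia-smi). The sketch below, using the pynvml bindings, shows how such a cap could be applied per device and how the two quantities the paper aggregates, power draw and temperature, could then be sampled. The 250 W value, the device loop, and the sampling cadence are illustrative assumptions, not the paper's experimental configuration.

```python
# Hedged sketch: apply a GPU power cap and sample power/temperature via NVML.
# Assumes NVIDIA GPUs, the pynvml package, and root privileges for setting limits.
# The 250 W cap is an illustrative value, not the paper's setting.
import time
import pynvml

CAP_WATTS = 250  # assumed cap for illustration

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)

        # NVML reports limits in milliwatts; clamp the cap to the supported range.
        min_mw, max_mw = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)
        cap_mw = max(min_mw, min(CAP_WATTS * 1000, max_mw))
        pynvml.nvmlDeviceSetPowerManagementLimit(handle, cap_mw)  # requires root

        # Sample the quantities analyzed in the paper: power draw and temperature.
        power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0
        temp_c = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
        print(f"GPU {i}: cap={cap_mw / 1000:.0f} W, draw={power_w:.1f} W, temp={temp_c} C")
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()
```

Operationally, the same per-device limit can be set from the command line with `nvidia-smi -pl <watts>`, which a center could script across nodes (e.g., in a scheduler prologue); whether the authors used exactly this mechanism is not stated in the abstract.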
Related papers
- Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach [15.28157695259566]
Energy consumption has become a critical design metric and a limiting factor in the development of future computing architectures.
This paper studies a novel and practical online energy optimization problem for GPUs in HPC scenarios.
EnergyUCB is designed to dynamically adjust GPU core frequencies in real time, reducing energy consumption with minimal impact on performance (a minimal bandit sketch in this spirit appears after this list).
arXiv Detail & Related papers (2024-10-03T17:05:34Z)
- On the Opportunities of Green Computing: A Survey [80.21955522431168]
Artificial Intelligence (AI) has achieved significant advances in technology and research over several decades of development.
The need for high computing power brings higher carbon emissions and undermines research fairness.
To tackle the challenges of computing resources and environmental impact of AI, Green Computing has become a hot research topic.
arXiv Detail & Related papers (2023-11-01T11:16:41Z)
- FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs [57.12856172329322]
We envision a decentralized system unlocking the vast, untapped potential of consumer-level GPUs.
This system faces critical challenges, including limited CPU and GPU memory, low network bandwidth, and the variability and heterogeneity of peers and devices.
arXiv Detail & Related papers (2023-09-03T13:27:56Z)
- Energy Concerns with HPC Systems and Applications [0.0]
Energy has become a critical concern in all relevant activities and technical designs.
For the specific case of computer activities, the problem is exacerbated with the emergence and pervasiveness of the so-called intelligent devices.
There are mainly two contexts where energy is one of the top priority concerns: embedded computing and supercomputing.
arXiv Detail & Related papers (2023-08-31T08:33:42Z)
- Non-Intrusive Electric Load Monitoring Approach Based on Current Feature Visualization for Smart Energy Management [51.89904044860731]
We employ AI computer vision techniques to design a non-invasive load monitoring method for smart electric energy management.
We propose to recognize all electric loads from color feature images using a U-shaped deep neural network with multi-scale feature extraction and an attention mechanism.
arXiv Detail & Related papers (2023-08-08T04:52:19Z)
- Precise Energy Consumption Measurements of Heterogeneous Artificial Intelligence Workloads [0.534434568021034]
We present measurements of the energy consumption of two typical applications of deep learning models on different types of compute nodes.
One advantage of our approach is that the information on energy consumption is available to all users of the supercomputer.
arXiv Detail & Related papers (2022-12-03T21:40:55Z)
- Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models [8.927248087602942]
We investigate techniques that can be used to reduce the energy consumption of common NLP applications.
These techniques can lead to significant reductions in energy consumption when training language models or using them for inference.
arXiv Detail & Related papers (2022-05-19T16:03:55Z)
- The Ecological Footprint of Neural Machine Translation Systems [2.132096006921048]
This chapter focuses on the ecological footprint of neural MT systems.
It starts from the power drain during the training of and inference with neural MT models and moves towards the environmental impact.
The overall CO2 offload is calculated for Ireland and the Netherlands.
arXiv Detail & Related papers (2022-02-04T14:56:41Z)
- Compute and Energy Consumption Trends in Deep Learning Inference [67.32875669386488]
We study relevant models in the areas of computer vision and natural language processing.
For a sustained increase in performance we see a much softer growth in energy consumption than previously anticipated.
arXiv Detail & Related papers (2021-09-12T09:40:18Z)
- JUWELS Booster -- A Supercomputer for Large-Scale AI Research [79.02246047353273]
We present JUWELS Booster, a recently commissioned high-performance computing system at the Jülich Supercomputing Center.
We detail its system architecture, parallel and distributed model training, and benchmarks indicating its outstanding performance.
arXiv Detail & Related papers (2021-06-30T21:37:02Z)
- The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems [45.479582612113205]
We show how to improve the performance and power efficiency of RL training on CPU-GPU systems.
We quantify the overall hardware utilization on a state-of-the-art distributed RL training framework.
We also introduce a new system design metric, CPU/GPU ratio, and show how to find the optimal balance between CPU and GPU resources.
arXiv Detail & Related papers (2020-12-08T04:50:05Z)
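Since the EnergyUCB entry above frames GPU frequency selection as a multi-armed bandit, the sketch below illustrates the general idea with a standard UCB1 loop over a small set of candidate core clocks. The frequency list, reward model, and measurement stub are stand-in assumptions for illustration, not the paper's algorithm or data.

```python
# Hedged sketch: UCB1 bandit over candidate GPU core frequencies.
# The frequencies and the simulated reward are illustrative assumptions only.
import math
import random

FREQS_MHZ = [900, 1100, 1300, 1500]  # assumed candidate core clocks

def measure_reward(freq_mhz: int) -> float:
    """Stand-in for a real measurement such as throughput per joule at the chosen
    clock; here we simulate a noisy response that peaks at a mid-range frequency."""
    base = 1.0 - abs(freq_mhz - 1200) / 1000.0
    return base + random.gauss(0.0, 0.05)

counts = [0] * len(FREQS_MHZ)    # pulls per arm (frequency)
values = [0.0] * len(FREQS_MHZ)  # running mean reward per arm

for t in range(1, 201):
    if 0 in counts:
        arm = counts.index(0)    # play every arm once before using confidence bounds
    else:
        # UCB1: mean reward plus an exploration bonus that shrinks as an arm is pulled more
        ucb = [values[i] + math.sqrt(2.0 * math.log(t) / counts[i])
               for i in range(len(FREQS_MHZ))]
        arm = ucb.index(max(ucb))

    reward = measure_reward(FREQS_MHZ[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean update

best = FREQS_MHZ[values.index(max(values))]
print(f"frequency with best estimated reward: {best} MHz")
```

In a real deployment the reward would come from measured energy and work completed (e.g., via NVML counters), and the selected frequency would be applied through the corresponding clock-control interface rather than simulated.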