A Robust Power Model Training Framework for Cloud Native Runtime Energy Metric Exporter
- URL: http://arxiv.org/abs/2407.00878v1
- Date: Wed, 10 Apr 2024 02:20:29 GMT
- Title: A Robust Power Model Training Framework for Cloud Native Runtime Energy Metric Exporter
- Authors: Sunyanan Choochotkaew, Chen Wang, Huamin Chen, Tatsuhiro Chiba, Marcelo Amaral, Eun Kyung Lee, Tamar Eilam
- Abstract summary: Estimating power consumption in modern Cloud environments is essential for carbon quantification toward green computing.
This paper examines multiple challenges associated with this goal.
The first challenge is that multiple customers are sharing the same hardware platform.
The second challenge is the overhead in power consumption that the Cloud platform control plane induces.
- Score: 2.661555989363678
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Estimating power consumption in modern Cloud environments is essential for carbon quantification toward green computing. Specifically, it is important to properly account for the power consumed by each of the running applications, which are packaged as containers. This paper examines multiple challenges associated with this goal. The first challenge is that multiple customers are sharing the same hardware platform (multi-tenancy), where information on the physical servers is mostly obscured. The second challenge is the overhead in power consumption that the Cloud platform control plane induces. This paper addresses these challenges and introduces a novel pipeline framework for power model training. This allows versatile power consumption approximation of individual containers on the basis of available performance counters and other metrics. The proposed model utilizes machine learning techniques to predict the power consumed by the control plane and associated processes, and uses it for isolating the power consumed by the user containers, from the server power consumption. To determine how well the prediction results in an isolation, we introduce a metric termed isolation goodness. Applying the proposed power model does not require online power measurements, nor does it need information on the physical servers, configuration, or information on other tenants sharing the same machine. The results of cross-workload, cross-platform experiments demonstrated the higher accuracy of the proposed model when predicting power consumption of unseen containers on unknown platforms, including on virtual machines.
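The isolation idea in the abstract can be illustrated with a short sketch: subtract the predicted control-plane/background power from the measured node power, then attribute the remainder to user containers in proportion to their usage metrics. The feature choices, the linear regressor, and the usage-proportional split below are assumptions for illustration, not the paper's exact pipeline.

```python
# Minimal sketch of isolating user-container power (illustrative, not the
# paper's exact pipeline). A regressor trained offline on control-plane
# performance counters predicts the background power to subtract.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training intervals: control-plane counters vs. measured power
# while no user containers run (all values are placeholders).
ctrl_features_train = np.array([[1.0e9, 2.0e6], [1.5e9, 2.4e6], [0.8e9, 1.7e6]])
background_power_train = np.array([55.0, 62.0, 50.0])  # watts

ctrl_model = LinearRegression().fit(ctrl_features_train, background_power_train)

def isolate_container_power(node_power_w, ctrl_features, container_cpu_share):
    """Split measured node power between control plane and user containers.

    container_cpu_share maps container name -> CPU usage share in the interval.
    The proportional split by CPU share is an assumption for this sketch.
    """
    background = float(ctrl_model.predict([ctrl_features])[0])
    user_power = max(node_power_w - background, 0.0)
    total = sum(container_cpu_share.values()) or 1.0
    return {name: user_power * share / total
            for name, share in container_cpu_share.items()}

print(isolate_container_power(
    node_power_w=120.0,
    ctrl_features=[1.2e9, 2.1e6],
    container_cpu_share={"app-a": 0.6, "app-b": 0.3}))
```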
Related papers
- Green AI: A Preliminary Empirical Study on Energy Consumption in DL
Models Across Different Runtime Infrastructures [56.200335252600354]
It is common practice to deploy pre-trained models in environments distinct from their native development settings.
This led to the introduction of interchange formats such as ONNX, together with its runtime infrastructure, which serve as standard formats for deploying models across different environments.
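As a generic illustration of the interchange-format workflow mentioned above, the sketch below exports a toy PyTorch model to ONNX and runs it with ONNX Runtime; the model and file name are placeholders, not artifacts from the paper.

```python
# Generic ONNX export/inference sketch (placeholder model, not from the paper).
import torch
import onnxruntime as ort

model = torch.nn.Linear(4, 2)          # toy model standing in for a real DL model
model.eval()
dummy_input = torch.randn(1, 4)

# Export to the ONNX interchange format.
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["x"], output_names=["y"])

# Run the exported model on a different runtime infrastructure (ONNX Runtime).
session = ort.InferenceSession("model.onnx")
outputs = session.run(None, {"x": dummy_input.numpy()})
print(outputs[0].shape)  # (1, 2)
```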
arXiv Detail & Related papers (2024-02-21T09:18:44Z) - Power Hungry Processing: Watts Driving the Cost of AI Deployment? [74.19749699665216]
Generative, multi-purpose AI systems promise a unified approach to building machine learning (ML) models into technology.
This ambition of "generality" comes at a steep cost to the environment, given the amount of energy these systems require and the amount of carbon they emit.
We measure deployment cost as the amount of energy and carbon required to perform 1,000 inferences on a representative benchmark dataset using these models.
We conclude with a discussion around the current trend of deploying multi-purpose generative ML systems, and caution that their utility should be more intentionally weighed against the increased costs in terms of energy and emissions.
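A minimal sketch of this deployment-cost measurement, under the assumption that node power can be sampled while 1,000 inferences run and converted to carbon with an assumed grid intensity; the power-reading function and the intensity value are placeholders.

```python
# Sketch: energy and carbon for 1,000 inferences (illustrative only).
import time

GRID_CARBON_INTENSITY = 0.4  # kgCO2e per kWh, assumed placeholder value

def read_node_power_watts():
    """Placeholder for a real power reading (e.g. from RAPL or a smart PDU)."""
    return 150.0

def deployment_cost(run_inference, n=1000, sample_period_s=0.1):
    samples, start, done = [], time.time(), 0
    while done < n:
        run_inference()
        done += 1
        # Sample node power roughly every sample_period_s seconds.
        if time.time() - start >= len(samples) * sample_period_s:
            samples.append(read_node_power_watts())
    elapsed_h = (time.time() - start) / 3600.0
    avg_power_w = sum(samples) / max(len(samples), 1)
    energy_kwh = avg_power_w * elapsed_h / 1000.0
    return energy_kwh, energy_kwh * GRID_CARBON_INTENSITY

energy, carbon = deployment_cost(lambda: sum(range(10000)))  # stand-in "inference"
print(f"{energy:.6f} kWh, {carbon:.6f} kgCO2e per 1,000 inferences")
```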
arXiv Detail & Related papers (2023-11-28T15:09:36Z) - WattScope: Non-intrusive Application-level Power Disaggregation in
Datacenters [0.6086160084025234]
WattScope is a system for non-intrusively estimating the power consumption of individual applications.
WattScope adapts and extends a machine learning-based technique for disaggregating building power.
arXiv Detail & Related papers (2023-09-22T04:13:46Z) - A Reinforcement Learning Approach for Performance-aware Reduction in
Power Consumption of Data Center Compute Nodes [0.46040036610482665]
We use Reinforcement Learning to design a power capping policy on cloud compute nodes.
We show how a trained agent running on actual hardware can take actions by balancing power consumption and application performance.
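One way such a trade-off can be encoded is a reward that penalizes both power draw and slowdown; the weights and the runtime-based slowdown term below are assumptions for illustration, not the paper's actual reward.

```python
# Illustrative reward for a power-capping agent (assumed formulation).
def reward(power_watts, runtime_s, baseline_runtime_s,
           power_weight=0.01, perf_weight=1.0):
    """Reward lower power draw, penalize slowdown vs. the uncapped runtime."""
    slowdown = max(0.0, runtime_s / baseline_runtime_s - 1.0)
    return -power_weight * power_watts - perf_weight * slowdown

# Hypothetical outcomes for three power caps: (average watts, runtime seconds).
outcomes = {80: (78.0, 12.5), 120: (115.0, 10.8), 150: (140.0, 10.2)}
for cap_w, (p, t) in outcomes.items():
    print(cap_w, round(reward(p, t, baseline_runtime_s=10.0), 3))
```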
arXiv Detail & Related papers (2023-08-15T23:25:52Z) - Non-Intrusive Electric Load Monitoring Approach Based on Current Feature
Visualization for Smart Energy Management [51.89904044860731]
We employ AI-based computer vision techniques to design a non-invasive load monitoring method for smart electric energy management.
We propose to recognize all electric loads from color feature images using a U-shaped deep neural network with multi-scale feature extraction and an attention mechanism.
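A toy stand-in for such an architecture: a small U-shaped network with a skip connection and a simple channel-attention block. Layer sizes and the attention form are assumptions, not the paper's design.

```python
# Toy U-shaped network with channel attention (illustrative stand-in only).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x).view(x.size(0), -1, 1, 1)
        return x * w  # reweight channels

class TinyUNet(nn.Module):
    def __init__(self, in_ch=3, classes=10):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.attn = ChannelAttention(32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                     # full-resolution features
        e2 = self.attn(self.enc2(e1))         # downsampled features + attention
        d = self.dec(torch.cat([self.up(e2), e1], dim=1))  # skip connection
        return self.head(d)                   # per-pixel class scores

print(TinyUNet()(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 10, 64, 64])
```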
arXiv Detail & Related papers (2023-08-08T04:52:19Z) - Assessing VoD pressure on network power consumption [0.0]
We propose a new methodology to evaluate the impact of video streaming on the network infrastructure.
At the core of our methodology is a parametric model of a simplified network and Content Delivery Network (CDN) infrastructure.
Our results show that classical efficiency indicators do not reflect the power consumption increase of more intensive Internet usage.
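A toy example of a parametric model in this spirit: each network or CDN element contributes a static power share plus a traffic-proportional share, summed along the delivery path; all coefficients below are placeholders.

```python
# Toy parametric power model for a simplified network + CDN path (assumed values).
def path_power_watts(traffic_gbps, hops):
    """Sum static and traffic-proportional power over the elements on a path.

    Each hop is (static_watts, watts_per_gbps); values below are placeholders.
    """
    return sum(static + per_gbps * traffic_gbps for static, per_gbps in hops)

cdn_path = [(200.0, 5.0), (150.0, 3.0)]                  # CDN cache + edge router
origin_path = cdn_path + [(300.0, 8.0), (250.0, 6.0)]    # plus core + origin server

for load_gbps in (1.0, 10.0, 50.0):  # video traffic levels
    print(load_gbps,
          path_power_watts(load_gbps, cdn_path),
          path_power_watts(load_gbps, origin_path))
```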
arXiv Detail & Related papers (2023-04-06T15:34:14Z) - Precise Energy Consumption Measurements of Heterogeneous Artificial
Intelligence Workloads [0.534434568021034]
We present measurements of the energy consumption of two typical applications of deep learning models on different types of compute nodes.
One advantage of our approach is that the information on energy consumption is available to all users of the supercomputer.
arXiv Detail & Related papers (2022-12-03T21:40:55Z) - MetaNetwork: A Task-agnostic Network Parameters Generation Framework for
Improving Device Model Generalization [65.02542875281233]
We propose a novel task-agnostic framework, named MetaNetwork, for generating adaptive device model parameters from cloud without on-device training.
The MetaGenerator is designed to learn a mapping function from samples to model parameters, and it can generate and deliver the adaptive parameters to the device based on samples uploaded from the device to the cloud.
The MetaStabilizer aims to reduce the oscillation of the MetaGenerator, accelerate the convergence and improve the model performance during both training and inference.
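A minimal sketch of the samples-to-parameters idea: a small generator network pools a batch of device samples and emits the weights of one adaptive layer, which the device then applies without local training. Shapes and module names are assumptions, not MetaNetwork's actual design.

```python
# Toy "samples -> model parameters" generator (assumed design, not MetaNetwork's).
import torch
import torch.nn as nn

FEAT, HIDDEN = 8, 16  # feature size of device samples and of the adaptive layer

class ToyMetaGenerator(nn.Module):
    """Maps a batch of uploaded samples to weights of a single linear layer."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(FEAT, 32), nn.ReLU())
        self.weight_head = nn.Linear(32, HIDDEN * FEAT)
        self.bias_head = nn.Linear(32, HIDDEN)

    def forward(self, samples):
        summary = self.encoder(samples).mean(dim=0)   # pool over the sample batch
        weight = self.weight_head(summary).view(HIDDEN, FEAT)
        bias = self.bias_head(summary)
        return weight, bias

generator = ToyMetaGenerator()
device_samples = torch.randn(20, FEAT)   # samples uploaded from a device
w, b = generator(device_samples)         # parameters generated on the cloud side

# On-device use: apply the generated parameters without any local training.
new_input = torch.randn(5, FEAT)
adapted_features = torch.relu(new_input @ w.t() + b)
print(adapted_features.shape)  # torch.Size([5, 16])
```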
arXiv Detail & Related papers (2022-09-12T13:26:26Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Common approaches to automating ML experiments, however, do not supply the procedures and pipelines needed for the actual deployment of machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor frameworks and script-language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z) - A Deep Learning Technique using Low Sampling rate for residential Non
Intrusive Load Monitoring [0.19662978733004596]
Non-intrusive load monitoring (NILM), or energy disaggregation, is a blind source separation problem.
We propose a novel deep neural network-based approach for performing load disaggregation on low frequency power data.
Our neural network can generate detailed demand-side feedback, providing vital insights to the end-user.
arXiv Detail & Related papers (2021-11-07T23:01:36Z) - Power Modeling for Effective Datacenter Planning and Compute Management [53.41102502425513]
We discuss two classes of statistical power models designed and validated to be accurate, simple, interpretable and applicable to all hardware configurations and workloads.
We demonstrate that the proposed statistical modeling techniques, while simple and scalable, predict power with less than 5% Mean Absolute Percent Error (MAPE) for more than 95% of diverse Power Distribution Units (more than 2,000 PDUs), using only 4 features.
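A minimal sketch of fitting and scoring such a simple power model on a handful of features, with MAPE as the error metric; the feature names and data below are placeholders.

```python
# Sketch: simple 4-feature power model evaluated with MAPE (placeholder data).
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical features per interval: CPU util, memory util, fan speed, temperature.
X = np.array([[0.2, 0.4, 0.3, 0.50],
              [0.6, 0.5, 0.5, 0.60],
              [0.8, 0.7, 0.7, 0.70],
              [0.4, 0.3, 0.4, 0.55]])
y = np.array([210.0, 320.0, 400.0, 260.0])  # measured PDU power in watts

model = LinearRegression().fit(X, y)
pred = model.predict(X)
mape = np.mean(np.abs((y - pred) / y)) * 100.0
print(f"MAPE: {mape:.2f}%")
```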
arXiv Detail & Related papers (2021-03-22T21:22:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.