The Case for Learning Application Behavior to Improve Hardware Energy
Efficiency
- URL: http://arxiv.org/abs/2004.13074v2
- Date: Mon, 23 Nov 2020 20:12:39 GMT
- Title: The Case for Learning Application Behavior to Improve Hardware Energy
Efficiency
- Authors: Kevin Weston, Vahid Jafanza, Arnav Kansal, Abhishek Taur, Mohamed
Zahran, Abdullah Muzahid
- Abstract summary: We propose to use the harvested knowledge to tune hardware configurations.
Our proposed approach, called FORECASTER, uses a deep learning model to learn what configuration of hardware resources provides the optimal energy efficiency for a certain behavior of an application.
Our results show that FORECASTER can save as much as 18.4% system power over the baseline setup with all resources.
- Score: 2.4425948078034847
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computer applications are continuously evolving. However, significant
knowledge can be harvested from a set of applications and applied in the
context of unknown applications. In this paper, we propose to use the harvested
knowledge to tune hardware configurations. The goal of such tuning is to
maximize hardware efficiency (i.e., maximize an application's performance while
minimizing the energy consumption). Our proposed approach, called FORECASTER,
uses a deep learning model to learn what configuration of hardware resources
provides the optimal energy efficiency for a certain behavior of an
application. During the execution of an unseen application, the model uses the
learned knowledge to reconfigure hardware resources in order to maximize energy
efficiency. We have provided a detailed design and implementation of FORECASTER
and compared its performance against a prior state-of-the-art hardware
reconfiguration approach. Our results show that FORECASTER can save as much as
18.4% system power over the baseline setup with all resources. On average,
FORECASTER saves 16% system power over the baseline setup while sacrificing
less than 0.01% of overall performance. Compared to the prior scheme,
FORECASTER increases power savings by 7%.
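The abstract describes the approach only at a high level. As a rough illustration (not the paper's implementation), a FORECASTER-style controller could be a small network that maps hardware-counter readings for the current execution phase to one of a fixed set of resource configurations; the feature names, knob values, and model shape below are all assumptions.
```python
# Hypothetical sketch of a FORECASTER-style controller: a small neural network
# maps recent hardware-counter readings to one of a fixed set of resource
# configurations. The configurations, counters, and model shape are
# illustrative assumptions, not the paper's actual design.
import torch
import torch.nn as nn

# Assumed reconfigurable knobs: (active cache ways, issue width, prefetcher on/off)
CONFIGS = [(16, 4, 1), (8, 4, 1), (8, 2, 1), (4, 2, 0)]

class ConfigPredictor(nn.Module):
    def __init__(self, n_counters: int = 8, n_configs: int = len(CONFIGS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_counters, 32), nn.ReLU(),
            nn.Linear(32, n_configs),
        )

    def forward(self, counters: torch.Tensor) -> torch.Tensor:
        return self.net(counters)          # logits over configurations

def control_loop(model, read_counters, apply_config, intervals):
    """Each interval: read counters, predict, and apply the configuration
    expected to give the best energy efficiency for the current phase."""
    model.eval()
    with torch.no_grad():
        for _ in range(intervals):
            x = torch.tensor(read_counters(), dtype=torch.float32)
            best = int(model(x).argmax())
            apply_config(CONFIGS[best])    # reconfigure hardware for the next interval
```
Training such a predictor would require offline traces pairing counter readings with the configuration measured to be most energy-efficient; the paper's actual feature set and hardware knobs may differ.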
Related papers
- Secure Resource Allocation via Constrained Deep Reinforcement Learning [49.15061461220109]
We present SARMTO, a framework that balances resource allocation, task offloading, security, and performance.
SARMTO consistently outperforms five baseline approaches, achieving up to a 40% reduction in system costs.
These enhancements highlight SARMTO's potential to revolutionize resource management in intricate distributed computing environments.
arXiv Detail & Related papers (2025-01-20T15:52:43Z)
- Energy consumption of code small language models serving with runtime engines and execution providers [11.998900897003997]
Small Language Models (SLMs) offer a promising solution to reduce resource demands.
Our goal is to analyze the impact of deep learning engines and execution providers on energy consumption, execution time, and computing-resource utilization.
arXiv Detail & Related papers (2024-12-19T22:44:02Z)
- A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation.
However, deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency.
This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z)
- A Reinforcement Learning Approach for Performance-aware Reduction in Power Consumption of Data Center Compute Nodes [0.46040036610482665]
We use Reinforcement Learning to design a power capping policy on cloud compute nodes.
We show how a trained agent running on actual hardware can take actions by balancing power consumption and application performance.
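As a hedged illustration of the kind of policy described here (not the paper's formulation), a tabular Q-learning agent could select among a few discrete power caps using a reward that trades application progress against power draw; the cap values, reward weights, and state encoding are all assumptions.
```python
# Illustrative sketch only (not the paper's algorithm): tabular Q-learning over
# a few discrete power caps, with a reward balancing application progress
# against power consumption. All constants and signal names are assumptions.
from collections import defaultdict
import random

CAPS_WATTS = [100, 130, 160, 190]              # hypothetical per-node power caps
Q = defaultdict(float)                         # Q[(state, cap_index)] -> value

def reward(progress: float, power: float, beta: float = 0.01) -> float:
    # Favor application progress, penalize power draw.
    return progress - beta * power

def choose_cap(state, eps: float = 0.1) -> int:
    if random.random() < eps:                  # occasional exploration
        return random.randrange(len(CAPS_WATTS))
    return max(range(len(CAPS_WATTS)), key=lambda a: Q[(state, a)])

def update(state, action, r, next_state, lr: float = 0.1, gamma: float = 0.9):
    best_next = max(Q[(next_state, a)] for a in range(len(CAPS_WATTS)))
    Q[(state, action)] += lr * (r + gamma * best_next - Q[(state, action)])
```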
arXiv Detail & Related papers (2023-08-15T23:25:52Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization that maximizes data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search [50.33956216274694]
Optimizing resource utilization in target platforms is key to achieving high performance during DNN inference.
We propose a novel hardware-aware NAS framework that optimizes not only for task accuracy and inference latency but also for resource utilization.
We achieve 2.8 - 4x speedup for DNN inference compared to prior hardware-aware NAS methods.
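One plausible reading of an objective that also rewards utilization is a weighted combination of task loss, latency, and under-utilization of the target hardware; the sketch below is an assumption, not the paper's loss formulation.
```python
# Hypothetical sketch of a combined NAS objective that also rewards hardware
# utilization. The weights and the way latency/utilization are estimated are
# assumptions, not the paper's formulation.
def nas_objective(task_loss: float, latency_ms: float, utilization: float,
                  lam_latency: float = 0.05, lam_util: float = 0.5) -> float:
    # Lower is better: penalize latency and under-utilization of the target HW.
    return task_loss + lam_latency * latency_ms + lam_util * (1.0 - utilization)
```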
arXiv Detail & Related papers (2022-03-23T13:44:15Z)
- Deep Reinforcement Learning Based Multidimensional Resource Management for Energy Harvesting Cognitive NOMA Communications [64.1076645382049]
The combination of energy harvesting (EH), cognitive radio (CR), and non-orthogonal multiple access (NOMA) is a promising solution to improve energy efficiency.
In this paper, we study the spectrum, energy, and time resource management for deterministic-CR-NOMA IoT systems.
arXiv Detail & Related papers (2021-09-17T08:55:48Z)
- Intelligent colocation of HPC workloads [0.0]
Many HPC applications suffer from a bottleneck in the shared caches, instruction execution units, I/O or memory bandwidth, even though the remaining resources may be underutilized.
It is hard for developers and runtime systems to ensure that all critical resources are fully exploited by a single application, so an attractive technique is to colocate multiple applications on the same server.
We show that server efficiency can be improved by first modeling the expected performance degradation of colocated applications based on measured hardware performance counters.
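A minimal sketch of this modeling step, assuming standalone hardware-counter profiles as features and measured slowdown as the regression target (the regressor choice and feature names are illustrative, not the paper's), might look like the following.
```python
# Illustrative sketch: predict the slowdown of two colocated applications from
# their standalone hardware-counter profiles, then colocate pairs with low
# predicted interference. Model choice and features are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def make_features(counters_a: np.ndarray, counters_b: np.ndarray) -> np.ndarray:
    # e.g. cache misses, memory bandwidth, IPC of each candidate application
    return np.concatenate([counters_a, counters_b])

model = RandomForestRegressor(n_estimators=100)

def fit(pairs, slowdowns):
    X = np.stack([make_features(a, b) for a, b in pairs])
    model.fit(X, slowdowns)                # measured degradation when colocated

def predicted_slowdown(counters_a, counters_b) -> float:
    return float(model.predict(make_features(counters_a, counters_b)[None, :])[0])
```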
arXiv Detail & Related papers (2021-03-16T12:35:35Z)
- SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, requiring external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z)
- Intelligent Resource Allocation in Dense LoRa Networks using Deep Reinforcement Learning [5.035252201462008]
We propose a multi-channel scheme for LoRaDRL.
Results demonstrate that the proposed algorithm significantly improves the long-range wide area network (LoRaWAN) packet delivery ratio (PDR).
We show that LoRaDRL's output improves the performance of state-of-the-art techniques, in some cases by more than 500% in terms of PDR.
arXiv Detail & Related papers (2020-12-22T07:41:47Z)
- AVAC: A Machine Learning based Adaptive RRAM Variability-Aware Controller for Edge Devices [3.7346292069282643]
We propose an Adaptive RRAM Variability-Aware Controller, AVAC, which periodically updates Wait Buffer and batch sizes.
AVAC allows Edge devices to adapt to different applications and their stages, to improve performance and reduce energy consumption.
arXiv Detail & Related papers (2020-05-06T19:06:51Z)
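The AVAC entry above describes a controller that periodically retunes two knobs (Wait Buffer size and batch size) as the workload changes. As a hedged sketch of such a loop (all names, knob values, and the scoring heuristic are assumptions, not the paper's design):
```python
# Hypothetical sketch of an AVAC-style periodic controller: every epoch, pick
# the (wait_buffer, batch_size) pair that a learned scorer predicts will
# minimize an energy-delay product for the current workload phase.
import itertools

WAIT_BUFFER_SIZES = [16, 32, 64]
BATCH_SIZES = [4, 8, 16]

def best_setting(scorer, phase_features):
    """scorer(features, wb, bs) -> predicted energy-delay product (lower is better)."""
    return min(itertools.product(WAIT_BUFFER_SIZES, BATCH_SIZES),
               key=lambda cfg: scorer(phase_features, *cfg))

def control(scorer, observe_phase, apply_setting, epochs):
    for _ in range(epochs):
        wb, bs = best_setting(scorer, observe_phase())
        apply_setting(wait_buffer=wb, batch_size=bs)   # reconfigure for next epoch
```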
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.