SCOPE: Safe Exploration for Dynamic Computer Systems Optimization
- URL: http://arxiv.org/abs/2204.10451v1
- Date: Fri, 22 Apr 2022 00:58:52 GMT
- Title: SCOPE: Safe Exploration for Dynamic Computer Systems Optimization
- Authors: Hyunji Kim, Ahsan Pervaiz, Henry Hoffmann, Michael Carbin, Yi Ding
- Abstract summary: We present SCOPE, a resource manager that dynamically allocates hardware resources from the execution space.
We evaluate SCOPE's ability to deliver improved latency while minimizing power constraint violations.
- Score: 18.498208917123414
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern computer systems need to execute under strict safety constraints
(e.g., a power limit), but doing so often conflicts with their ability to
deliver high performance (i.e., minimal latency). Prior work uses machine
learning to automatically tune hardware resources such that the system
execution meets safety constraints optimally. Such solutions monitor past
system executions to learn the system's behavior under different hardware
resource allocations before dynamically tuning resources to optimize the
application execution. However, system behavior can change significantly
between different applications and even different inputs of the same
applications. Hence, the models learned using data collected a priori are often
suboptimal and violate safety constraints when used with new applications and
inputs. To address this limitation, we introduce the concept of an execution
space, which is the cross product of hardware resources, input features, and
applications. To dynamically and safely allocate hardware resources from the
execution space, we present SCOPE, a resource manager that leverages a novel
safe exploration framework. We evaluate SCOPE's ability to deliver improved
latency while minimizing power constraint violations by dynamically configuring
hardware while running a variety of Apache Spark applications. Compared to
prior approaches that minimize power constraint violations, SCOPE consumes
comparable power while improving latency by up to 9.5X. Compared to prior
approaches that minimize latency, SCOPE achieves similar latency but reduces
power constraint violation rates by up to 45.88X, achieving almost zero safety
constraint violations across all applications.
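The abstract defines the execution space as a cross product and reports results as power constraint violation rates. The following is a minimal sketch of those two ideas only, not SCOPE's safe exploration framework, which the abstract does not describe in detail. All knob names, values, the toy latency and power models, and the helper names (`violation_rate`, `pick_safe_config`) are assumptions made up for illustration.

```python
from itertools import product

# Hypothetical knobs and values, chosen only for illustration.
cores = [2, 4, 8]
freq_ghz = [1.2, 2.0]
input_sizes_gb = [1, 10]
applications = ["als", "kmeans"]

# Execution space: cross product of hardware resources,
# input features, and applications.
execution_space = list(product(cores, freq_ghz, input_sizes_gb, applications))


def violation_rate(power_samples, power_limit):
    """Fraction of measured power samples that exceed the limit."""
    if not power_samples:
        return 0.0
    return sum(p > power_limit for p in power_samples) / len(power_samples)


def pick_safe_config(candidates, predict_latency, predict_power,
                     power_limit, margin=0.0):
    """Return the lowest-latency candidate whose predicted power, plus a
    safety margin, stays within the limit; None if no candidate qualifies."""
    safe = [c for c in candidates if predict_power(c) + margin <= power_limit]
    if not safe:
        return None
    return min(safe, key=predict_latency)


# Toy stand-ins for learned models (assumptions, not measured behavior).
def toy_power(cfg):
    n_cores, freq, _, _ = cfg
    return 5.0 * n_cores * freq


def toy_latency(cfg):
    n_cores, freq, size_gb, _ = cfg
    return size_gb / (n_cores * freq)


# Hardware choices for one application and one input size.
candidates = [c for c in execution_space if c[2] == 10 and c[3] == "kmeans"]
best = pick_safe_config(candidates, toy_latency, toy_power, power_limit=45)
cautious = pick_safe_config(candidates, toy_latency, toy_power,
                            power_limit=45, margin=10)
```

In this sketch, a larger `margin` trades latency for a lower chance of exceeding the power limit, which mirrors the latency-versus-safety tension the abstract describes.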
Related papers
- CUAOA: A Novel CUDA-Accelerated Simulation Framework for the QAOA [3.757262277494307]
Quantum Approximate Optimization Algorithm (QAOA) is a prominent quantum algorithm designed to find approximate solutions to optimization problems.
Existing state-of-the-art simulation frameworks suffer from long execution times or lack comprehensive functionality.
We develop a GPU-accelerated QAOA simulation framework utilizing the CUDA toolkit.
arXiv Detail & Related papers (2024-07-17T21:06:18Z)
- Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices [2.8851756275902476]
Deep neural network (DNN) inference is increasingly being executed on mobile and embedded platforms.
We co-designed novel Dynamic Super-Networks to maximise system-level performance and energy efficiency.
Compared with SOTA, our experimental results using ImageNet on the GPU of Jetson Xavier NX show our model is 2.4x faster for similar ImageNet Top-1 accuracy, or 5.1% higher accuracy at similar latency.
arXiv Detail & Related papers (2024-01-17T04:40:30Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- DynaMIX: Resource Optimization for DNN-Based Real-Time Applications on a Multi-Tasking System [20.882393722208608]
More and more deep neural networks (DNNs) have been developed and deployed on autonomous vehicles (AVs).
To meet their growing expectations and requirements, AVs should "optimize" use of their limited onboard computing resources for multiple concurrent in-vehicle apps.
We propose DynaMIX, which optimizes the resource requirements of concurrent apps and aims to maximize execution accuracy.
arXiv Detail & Related papers (2023-02-03T06:33:28Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach in minimizing violations in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- FELARE: Fair Scheduling of Machine Learning Applications on Heterogeneous Edge Systems [5.165692107696155]
Edge computing enables smart IoT-based systems via concurrent and continuous execution of latency-sensitive machine learning (ML) applications.
We study and analyze resource allocation solutions that can increase the on-time task completion rate while considering the energy constraint.
We observed 8.9% improvement in on-time task completion rate and 12.6% in energy-saving without imposing any significant overhead on the edge system.
arXiv Detail & Related papers (2022-05-31T19:19:40Z)
- Real-Time GPU-Accelerated Machine Learning Based Multiuser Detection for 5G and Beyond [70.81551587109833]
Nonlinear beamforming filters can significantly outperform linear approaches in stationary scenarios with massive connectivity.
One of the main challenges comes from the real-time implementation of these algorithms.
This paper explores the acceleration of APSM-based algorithms through massive parallelization.
arXiv Detail & Related papers (2022-01-13T15:20:45Z)
- Intelligent colocation of HPC workloads [0.0]
Many HPC applications suffer from a bottleneck in the shared caches, instruction execution units, I/O or memory bandwidth, even though the remaining resources may be underutilized.
It is hard for developers and runtime systems to ensure that all critical resources are fully exploited by a single application, so an attractive technique is to colocate multiple applications on the same server.
We show that server efficiency can be improved by first modeling the expected performance degradation of colocated applications based on measured hardware performance counters.
arXiv Detail & Related papers (2021-03-16T12:35:35Z)
- EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference [82.1584439276834]
Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks.
We present EdgeBERT, an in-depth algorithm-hardware co-design for latency-aware energy optimization for multi-task NLP.
arXiv Detail & Related papers (2020-11-28T19:21:47Z)
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum, resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.