COIN: Chance-Constrained Imitation Learning for Uncertainty-aware
Adaptive Resource Oversubscription Policy
- URL: http://arxiv.org/abs/2401.07051v1
- Date: Sat, 13 Jan 2024 11:43:25 GMT
- Authors: Lu Wang, Mayukh Das, Fangkai Yang, Chao Du, Bo Qiao, Hang Dong, Si
Qin, Chetan Bansal, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang
- Abstract summary: We address the challenge of learning safe and robust decision policies in the presence of uncertainty.
Traditional supervised prediction or forecasting models are ineffective in learning adaptive policies.
Online optimization or reinforcement learning is difficult to deploy on real systems.
- Score: 37.034543365623286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the challenge of learning safe and robust decision policies in
the presence of uncertainty, in the context of the real scientific problem of
adaptive resource oversubscription: enhancing resource efficiency while
ensuring safety against resource congestion risk.
Traditional supervised prediction or forecasting models are ineffective at
learning adaptive policies, whereas standard online optimization or
reinforcement learning is difficult to deploy on real systems. Offline methods
such as imitation learning (IL) are ideal, since we can directly leverage
historical resource usage telemetry. However, the underlying aleatoric
uncertainty in such telemetry is a critical bottleneck.
We solve this with a novel chance-constrained imitation learning
framework, which ensures implicit safety against uncertainty in a principled
manner via a combination of stochastic (chance) constraints on resource
congestion risk and ensemble value functions. This leads to substantial
($\approx 3-4\times$) improvement in resource efficiency and safety in many
oversubscription scenarios, including resource management in cloud services.
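As a concrete illustration of the chance-constraint idea, consider the
following minimal sketch (an assumption-laden illustration, not the paper's
implementation: the predictor interface, the grid search over oversubscription
levels, and all names are illustrative). An ensemble of usage forecasters
induces an empirical distribution over future demand, and an oversubscription
level is admitted only if the estimated congestion probability stays below a
risk budget delta:

```python
import numpy as np

def congestion_risk(ensemble, telemetry, oversub, capacity):
    """Empirical Pr[usage + oversub > capacity] under an ensemble of
    usage forecasters (illustrative interface: each member has .predict)."""
    usage = np.array([m.predict(telemetry) for m in ensemble])
    return float(np.mean(usage + oversub > capacity))

def max_safe_oversubscription(ensemble, telemetry, capacity, delta=0.05):
    """Largest oversubscription level whose chance constraint
    Pr[congestion] <= delta still holds (coarse grid search)."""
    best = 0.0
    for oversub in np.linspace(0.0, capacity, 101):
        if congestion_risk(ensemble, telemetry, oversub, capacity) <= delta:
            best = oversub
    return best
```

In this sketch the ensemble plays a role analogous to the paper's ensemble
value functions: disagreement between members inflates the estimated risk,
which in turn shrinks the admissible oversubscription.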
Related papers
- Optimal Transport-Assisted Risk-Sensitive Q-Learning [4.14360329494344]
This paper presents a risk-sensitive Q-learning algorithm that leverages optimal transport theory to enhance agent safety.
We validate the proposed algorithm in a Gridworld environment.
arXiv Detail & Related papers (2024-06-17T17:32:25Z)
- Uncertainty-aware Distributional Offline Reinforcement Learning [26.34178581703107]
Offline reinforcement learning (RL) presents distinct challenges as it relies solely on observational data.
We propose an uncertainty-aware distributional offline RL method to simultaneously address both uncertainty and environmental stochasticity.
Our method is rigorously evaluated through comprehensive experiments in both risk-sensitive and risk-neutral benchmarks, demonstrating its superior performance.
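The summary above gives no implementation detail; one standard way to make a
distributional critic risk-sensitive, shown here purely as a generic sketch
(the quantile critic and CVaR readout are assumptions, not necessarily this
paper's method), is to rank actions by the CVaR of their predicted return
quantiles rather than by the mean:

```python
import numpy as np

def cvar(quantiles, alpha=0.1):
    """CVaR_alpha: mean of the worst alpha-fraction of return quantiles."""
    q = np.sort(np.asarray(quantiles))
    k = max(1, int(np.ceil(alpha * len(q))))
    return float(q[:k].mean())

def risk_sensitive_action(quantiles_per_action, alpha=0.1):
    """Pick the action whose return distribution has the best CVaR."""
    return int(np.argmax([cvar(q, alpha) for q in quantiles_per_action]))
```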
arXiv Detail & Related papers (2024-03-26T12:28:04Z)
- Resilient Constrained Reinforcement Learning [87.4374430686956]
We study a class of constrained reinforcement learning (RL) problems in which multiple constraint specifications are not identified before training.
It is challenging to identify appropriate constraint specifications due to the undefined trade-off between the reward training objective and constraint satisfaction.
We propose a new constrained RL approach that searches for policy and constraint specifications together.
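A minimal sketch of what searching for policy and constraint specifications
together can look like, assuming a generic primal-dual scheme with a quadratic
relaxation cost (the update rule and all callables are illustrative, not the
paper's algorithm): the constraint level u is relaxed only where the dual
pressure lam justifies paying for it:

```python
def resilient_step(theta, u, lam, grad_reward, grad_cost, cost,
                   beta=1.0, lr=1e-2):
    """One primal-dual step on policy parameters theta, constraint
    relaxation u (charged a quadratic cost beta/2 * u^2), and
    multiplier lam; grad_reward/grad_cost/cost are black-box callables."""
    theta = theta + lr * (grad_reward(theta) - lam * grad_cost(theta))
    u = max(0.0, u + lr * (lam - beta * u))        # relax while dual pressure exceeds beta*u
    lam = max(0.0, lam + lr * (cost(theta) - u))   # dual ascent on constraint violation
    return theta, u, lam
```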
arXiv Detail & Related papers (2023-12-28T18:28:23Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
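The general shape of such a filter, as a hedged sketch (the certification test
and backup controller are placeholders, not this paper's API): execute the
nominal policy's action only when it can be certified safe with the required
confidence, and otherwise hand control to a known-safe backup:

```python
def filtered_action(state, nominal_policy, backup_policy,
                    certified_safe, confidence=0.95):
    """Confidence-based safety filter: keep the learned action only if
    the certification test passes; fall back to a safe controller."""
    action = nominal_policy(state)
    if certified_safe(state, action, confidence):
        return action
    return backup_policy(state)  # control-theoretically safe fallback
```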
arXiv Detail & Related papers (2022-07-04T11:43:23Z)
- Model-Free Learning of Optimal Deterministic Resource Allocations in Wireless Systems via Action-Space Exploration [4.721069729610892]
We propose a technically grounded and scalable deterministic-dual gradient policy method for efficiently learning optimal parameterized resource allocation policies.
Our method not only efficiently exploits gradient availability of popular universal representations such as deep networks, but is also truly model-free, as it relies on consistent zeroth-order gradient approximations of associated random network services constructed via low-dimensional perturbations in action space.
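The zeroth-order trick mentioned above can be sketched in a few lines (a
generic two-point estimator under Gaussian action-space perturbations;
`service_value` is a hypothetical black-box evaluation of the random network
service, not this paper's API):

```python
import numpy as np

def zeroth_order_grad(action, service_value, sigma=0.1, samples=32):
    """Two-point zeroth-order estimate of grad service_value(action),
    built only from perturbed black-box evaluations."""
    grad = np.zeros_like(action)
    for _ in range(samples):
        z = np.random.randn(*action.shape)
        delta = service_value(action + sigma * z) - service_value(action - sigma * z)
        grad += (delta / (2.0 * sigma)) * z
    return grad / samples
```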
arXiv Detail & Related papers (2021-08-23T18:26:16Z)
- Risk-Aware Transfer in Reinforcement Learning using Successor Features [16.328601804662657]
We show that risk-aware successor features (RaSF) integrate seamlessly within the practical reinforcement learning framework.
RaSFs outperform alternative methods, including SFs, when taking the risk of the learned policies into account.
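As a sketch of how successor features support risk-aware transfer (the
variance-style penalty here is an illustrative stand-in for the paper's risk
formulation; shapes and names are assumptions): each source policy's features
are combined with the new task's reward weights, and generalized policy
improvement picks the best risk-adjusted option:

```python
import numpy as np

def risk_aware_gpi_action(psis, w, risk, lam=1.0):
    """psis: [n_policies, n_actions, d] successor features,
    w: [d] reward weights of the new task,
    risk: [n_policies, n_actions] per-option risk estimates."""
    q = psis @ w                 # Q_i(s, a) = psi_i(s, a) . w
    scores = q - lam * risk      # penalize risky options
    return int(scores.max(axis=0).argmax())  # GPI over source policies
```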
arXiv Detail & Related papers (2021-05-28T22:22:03Z)
- Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
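The training loop behind this kind of approach can be sketched DAgger-style (a
generic pattern with hypothetical `env`/`expert`/`learner` interfaces, not the
paper's exact algorithm): the learner drives the system so the data matches
the closed loop, while the safe optimization-based expert labels every visited
state:

```python
def imitate_safe_expert(env, expert, learner, iters=10, horizon=200):
    """Aggregate expert labels on learner-visited states, then retrain."""
    dataset = []
    for _ in range(iters):
        state = env.reset()
        for _ in range(horizon):
            dataset.append((state, expert.act(state)))   # safe expert labels
            state, done = env.step(learner.act(state))   # learner drives
            if done:
                break
        learner.fit(dataset)  # retrain on the aggregated dataset
    return learner
```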
arXiv Detail & Related papers (2021-02-18T05:11:41Z)
- Coordinated Online Learning for Multi-Agent Systems with Coupled Constraints and Perturbed Utility Observations [91.02019381927236]
We introduce a novel method to steer the agents toward a stable population state, fulfilling the given resource constraints.
The proposed method is a decentralized resource pricing scheme based on the resource loads resulting from augmenting the game's Lagrangian.
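A minimal sketch of the pricing mechanism this describes, assuming a standard
dual-decomposition reading (agent internals and `best_response` are
illustrative placeholders): each resource's price rises with excess load, and
agents independently best-respond to utility minus priced consumption:

```python
import numpy as np

def pricing_round(agents, prices, capacity, lr=0.05):
    """One decentralized pricing round: agents respond to prices,
    prices move with the resulting aggregate resource loads."""
    demands = np.array([agent.best_response(prices) for agent in agents])
    load = demands.sum(axis=0)                                  # per-resource load
    prices = np.maximum(0.0, prices + lr * (load - capacity))   # price up congested resources
    return prices, demands
```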
arXiv Detail & Related papers (2020-10-21T10:11:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.