Path Structured Multimarginal Schrödinger Bridge for Probabilistic
Learning of Hardware Resource Usage by Control Software
- URL: http://arxiv.org/abs/2310.00604v2
- Date: Tue, 3 Oct 2023 12:13:34 GMT
- Title: Path Structured Multimarginal Schrödinger Bridge for Probabilistic
Learning of Hardware Resource Usage by Control Software
- Authors: Georgiy A. Bondar, Robert Gifford, Linh Thi Xuan Phan, Abhishek Halder
- Abstract summary: The solution of the path structured multimarginal Schrödinger bridge problem (MSBP) is the most-likely measure-valued trajectory consistent with a sequence of observed distributional snapshots.
We leverage recent algorithmic advances in solving such MSBPs for learning hardware resource usage by control software.
- Score: 1.7601096935307592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The solution of the path structured multimarginal Schrödinger bridge
problem (MSBP) is the most-likely measure-valued trajectory consistent with a
sequence of observed probability measures or distributional snapshots. We
leverage recent algorithmic advances in solving such structured MSBPs for
learning stochastic hardware resource usage by control software. The solution
enables predicting the time-varying distribution of hardware resource
availability at a desired time with guaranteed linear convergence. We
demonstrate the efficacy of our probabilistic learning approach in a model
predictive control software execution case study. The method exhibits rapid
convergence to an accurate prediction of hardware resource utilization of the
controller. The method can be broadly applied to any software to predict
cyber-physical context-dependent performance at arbitrary times.
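The guaranteed linear convergence comes from Sinkhorn-type fixed-point iterations that exploit the path (sequential) structure of the marginals. Below is a minimal sketch of such a recursion on a common discretization grid; the function name, the message recomputation per sweep, and the toy data are illustrative simplifications, not the authors' implementation.

```python
import numpy as np

def path_msbp_sinkhorn(marginals, costs, eps=0.05, sweeps=500, tol=1e-9):
    """Sinkhorn-style scaling iterations for a path-structured multimarginal
    Schroedinger bridge over snapshots mu_0, ..., mu_T on a shared grid.

    marginals: list of T+1 strictly positive probability vectors (length n)
    costs:     list of T cost matrices (n x n) between consecutive snapshots
    Returns the scaling vectors u_0, ..., u_T that define the coupling."""
    T = len(costs)
    Ks = [np.exp(-C / eps) for C in costs]        # Gibbs kernels
    us = [np.ones_like(m) for m in marginals]     # Sinkhorn scalings
    for _ in range(sweeps):
        err = 0.0
        for k in range(T + 1):
            # forward message: mass propagated from snapshot 0 up to k
            a = np.ones_like(marginals[0])
            for j in range(k):
                a = Ks[j].T @ (us[j] * a)
            # backward message: mass propagated from snapshot T down to k
            b = np.ones_like(marginals[-1])
            for j in range(T - 1, k - 1, -1):
                b = Ks[j] @ (us[j + 1] * b)
            err = max(err, np.abs(us[k] * a * b - marginals[k]).max())
            us[k] = marginals[k] / (a * b)        # enforce the k-th marginal
        if err < tol:
            break
    return us

# toy usage: three distributional snapshots drifting along a 1-D grid
grid = np.linspace(0.0, 1.0, 50)
C = (grid[:, None] - grid[None, :]) ** 2          # squared-distance cost
mus = [np.exp(-(grid - c) ** 2 / 0.01) for c in (0.2, 0.5, 0.8)]
mus = [m / m.sum() for m in mus]
scalings = path_msbp_sinkhorn(mus, [C, C])
```

The scalings and kernels together determine the coupling, from which the time-varying distribution at intermediate times can be read off.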
Related papers
- Stochastic Learning of Computational Resource Usage as Graph Structured Multimarginal Schrödinger Bridge [1.6111903346958474]
We propose to learn the time-varying computational resource usage of software as a graph structured Schrödinger bridge problem.
We provide detailed algorithms for learning in both single and multi-core cases, discuss the convergence guarantees, computational complexities, and demonstrate their practical use.
arXiv Detail & Related papers (2024-05-21T02:39:45Z)
- Practical Probabilistic Model-based Deep Reinforcement Learning by Integrating Dropout Uncertainty and Trajectory Sampling [7.179313063022576]
This paper addresses the prediction stability, prediction accuracy, and control capability of current probabilistic model-based reinforcement learning (MBRL) methods built on neural networks.
A novel approach, dropout-based probabilistic ensembles with trajectory sampling (DPETS), is proposed.
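As a rough illustration of that idea (not the authors' DPETS code; the network size, state/action dimensions, and zero policy below are placeholders), dropout kept active at prediction time acts as an implicit ensemble, and repeated stochastic forward passes propagate a particle set of trajectories:

```python
import torch
import torch.nn as nn

# One-step dynamics model with dropout; keeping dropout stochastic at
# inference time yields an implicit ensemble (MC-dropout).
model = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Dropout(p=0.1),
                      nn.Linear(64, 2))

def sample_trajectories(model, x0, policy, horizon=10, n_particles=20):
    model.train()                        # keep dropout masks random
    x = x0.repeat(n_particles, 1)        # propagate particles in parallel
    traj = [x]
    for _ in range(horizon):
        u = policy(x)
        with torch.no_grad():
            x = model(torch.cat([x, u], dim=1))  # fresh dropout mask per pass
        traj.append(x)
    return torch.stack(traj)             # (horizon+1, n_particles, state_dim)

x0 = torch.zeros(1, 2)                   # 2-D state, 1-D action (placeholders)
policy = lambda x: torch.zeros(x.shape[0], 1)
print(sample_trajectories(model, x0, policy).shape)
```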
arXiv Detail & Related papers (2023-09-20T06:39:19Z)
- Robust Control for Dynamical Systems With Non-Gaussian Noise via Formal Abstractions [59.605246463200736]
We present a novel controller synthesis method that does not rely on any explicit representation of the noise distributions.
First, we abstract the continuous control system into a finite-state model that captures noise by probabilistic transitions between discrete states.
We use state-of-the-art verification techniques to provide guarantees on the interval Markov decision process and compute a controller for which these guarantees carry over to the original control system.
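A minimal sketch of the pessimistic value iteration that such interval abstractions admit (standard iMDP machinery with illustrative shapes and names, not the paper's toolchain):

```python
import numpy as np

def robust_value_iteration(P_low, P_high, reward, gamma=0.95, iters=500):
    """Lower-bound value iteration on an interval MDP (iMDP).
    P_low, P_high: (A, S, S) element-wise bounds on the unknown transition
    probabilities; each row must satisfy sum(P_low) <= 1 <= sum(P_high)."""
    A, S, _ = P_low.shape
    V = np.zeros(S)
    for _ in range(iters):
        order = np.argsort(V)            # lowest-value successors first
        Q = np.empty((A, S))
        for a in range(A):
            for s in range(S):
                # adversarial choice within the interval: push as much
                # probability mass as allowed toward low-value successors
                p = P_low[a, s].copy()
                slack = 1.0 - p.sum()
                for nxt in order:
                    add = min(P_high[a, s, nxt] - p[nxt], slack)
                    p[nxt] += add
                    slack -= add
                    if slack <= 0.0:
                        break
                Q[a, s] = reward[s] + gamma * p @ V
        V = Q.max(axis=0)                # best action against worst nature
    return V, Q.argmax(axis=0)           # value lower bound and policy
```

The returned bound holds for every transition kernel inside the intervals, which is why guarantees of this kind carry over to the original continuous system.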
arXiv Detail & Related papers (2023-01-04T10:40:30Z)
- Probabilistic Time Series Forecasting for Adaptive Monitoring in Edge Computing Environments [0.06999740786886537]
In this paper, we propose a sampling-based and cloud-located approach for monitoring critical infrastructures.
We evaluate our prototype implementation for the monitoring pipeline on a publicly available streaming dataset.
arXiv Detail & Related papers (2022-11-24T17:35:14Z)
- Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning [39.07307690074323]
We consider the problem of predicting the distribution of returns obtained by an agent interacting in a continuous-time environment.
Accurate return predictions have proven useful for determining optimal policies for risk-sensitive control, state representations, multiagent coordination, and more.
We propose a tractable algorithm for approximately solving the distributional HJB based on a JKO scheme, which can be implemented in an online control algorithm.
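For context, the JKO scheme referenced here is the standard variational time-stepping of a Wasserstein gradient flow: each step solves

\[ \rho_{k+1} \;=\; \arg\min_{\rho}\; F(\rho) \;+\; \frac{1}{2\tau}\, W_2^2(\rho, \rho_k), \]

where \( \tau \) is the step size, \( W_2 \) the 2-Wasserstein distance, and \( F \) the driving functional (in this setting, derived from the distributional HJB operator).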
arXiv Detail & Related papers (2022-05-24T16:33:54Z)
- Sampling-Based Robust Control of Autonomous Systems with Non-Gaussian Noise [59.47042225257565]
We present a novel planning method that does not rely on any explicit representation of the noise distributions.
First, we abstract the continuous system into a discrete-state model that captures noise by probabilistic transitions between states.
We capture these bounds in the transition probability intervals of a so-called interval Markov decision process (iMDP).
arXiv Detail & Related papers (2021-10-25T06:18:55Z)
- Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
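A minimal end-to-end sketch of the learn-linearize-synthesize pipeline (nominal LQR only; the probabilistic stability margin from the paper is omitted, and all data, dimensions, and names below are synthetic placeholders):

```python
import numpy as np
from scipy.linalg import solve_discrete_are
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Fit a GP to one-step dynamics data (state, input) -> next state (1-D state)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))             # columns: state, input
y = 0.9 * X[:, 0] + 0.4 * X[:, 1] + 0.01 * rng.standard_normal(200)
gp = GaussianProcessRegressor(kernel=RBF(), alpha=1e-4).fit(X, y)

# Linearize the GP posterior mean around the origin by finite differences
h = 1e-4
f0 = gp.predict(np.array([[0.0, 0.0]]))[0]
A = np.array([[(gp.predict(np.array([[h, 0.0]]))[0] - f0) / h]])
B = np.array([[(gp.predict(np.array([[0.0, h]]))[0] - f0) / h]])

# Nominal discrete-time LQR on the linearized model
Q, R = np.eye(1), np.eye(1)
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
print("feedback gain:", K)
```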
arXiv Detail & Related papers (2021-05-17T08:36:18Z)
- Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
arXiv Detail & Related papers (2021-02-18T05:11:41Z)
- Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism [107.48538091418412]
We study exploration in multi-armed bandits when we have access to a divisible resource that can be allocated in varying amounts to arm pulls.
We focus in particular on the allocation of distributed computing resources, where we may obtain results faster by allocating more resources per pull.
arXiv Detail & Related papers (2020-10-31T18:19:29Z)
- Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to a machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
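A minimal sketch of the event-driven idea (illustrative names and a scalar model, not the paper's dHDP equations): the update fires only when the prediction error is significant, so noise-level events leave the weights untouched.

```python
import numpy as np

def event_driven_update(w, x, target, predict, grad, lr=0.01, threshold=0.05):
    """Apply a gradient-style weight update only on significant events."""
    error = target - predict(w, x)
    if abs(error) <= threshold:     # insignificant event (e.g. noise): skip
        return w, False
    return w + lr * error * grad(w, x), True
```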
arXiv Detail & Related papers (2020-06-16T05:51:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.