Related papers: Risk-Sensitive and Robust Model-Based Reinforcement Learning and Planning

URL: http://arxiv.org/abs/2304.00573v1
Date: Sun, 2 Apr 2023 16:44:14 GMT
Title: Risk-Sensitive and Robust Model-Based Reinforcement Learning and Planning
Authors: Marc Rigter
Abstract summary: We will address both planning and reinforcement learning approaches to sequential decision-making. In many real-world domains, it is impossible to construct a perfectly accurate model or simulator. We make a number of contributions towards this goal, with a focus on model-based algorithms.
Score: 2.627046865670577
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Many sequential decision-making problems that are currently automated, such as those in manufacturing or recommender systems, operate in an environment where there is either little uncertainty or zero risk of catastrophe. As companies and researchers attempt to deploy autonomous systems in less constrained environments, it is increasingly important that we endow sequential decision-making algorithms with the ability to reason about uncertainty and risk. In this thesis, we will address both planning and reinforcement learning (RL) approaches to sequential decision-making. In the planning setting, it is assumed that a model of the environment is provided, and a policy is optimised within that model. Reinforcement learning relies upon extensive random exploration, and therefore usually requires a simulator in which to perform training. In many real-world domains, it is impossible to construct a perfectly accurate model or simulator. Therefore, the performance of any policy is inevitably uncertain due to the incomplete knowledge about the environment. Furthermore, in stochastic domains, the outcome of any given run is also uncertain due to the inherent randomness of the environment. These two sources of uncertainty are usually classified as epistemic and aleatoric uncertainty, respectively. The over-arching goal of this thesis is to contribute to developing algorithms that mitigate both sources of uncertainty in sequential decision-making problems. We make a number of contributions towards this goal, with a focus on model-based algorithms...
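The distinction between epistemic and aleatoric uncertainty drawn in the abstract can be illustrated with a small numerical sketch. This is not the thesis's method: it assumes a hypothetical ensemble of candidate environment models, each predicting a Gaussian return with its own mean (`model_means`) and a shared noise level (`noise_std`), and it splits the total return variance using the law of total variance. The function name and all parameters are illustrative assumptions.

```python
import random
import statistics


def decompose_uncertainty(model_means, noise_std, samples_per_model=2000, seed=0):
    """Split return variance into (aleatoric, epistemic) parts.

    Assumption for illustration only: each candidate model predicts
    returns drawn from Normal(mean, noise_std). Disagreement between
    models stands in for incomplete knowledge of the environment
    (epistemic); the noise within one model stands in for the
    environment's inherent randomness (aleatoric).
    """
    rng = random.Random(seed)
    per_model_means = []
    per_model_vars = []
    for mu in model_means:
        # Simulate returns under one fixed candidate model.
        returns = [rng.gauss(mu, noise_std) for _ in range(samples_per_model)]
        per_model_means.append(statistics.fmean(returns))
        per_model_vars.append(statistics.pvariance(returns))
    # Law of total variance:
    #   Var[R] = E_model[Var[R | model]] + Var_model[E[R | model]]
    aleatoric = statistics.fmean(per_model_vars)      # average within-model variance
    epistemic = statistics.pvariance(per_model_means)  # variance of model means
    return aleatoric, epistemic


if __name__ == "__main__":
    aleatoric, epistemic = decompose_uncertainty([0.0, 1.0, 2.0], noise_std=0.5)
    print(f"aleatoric={aleatoric:.3f} epistemic={epistemic:.3f}")
```

In this sketch, collecting more data about the environment would shrink the spread of `model_means` and hence the epistemic term, while the aleatoric term would remain because it is a property of the environment itself.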

Related papers

Conformalized Decision Risk Assessment [5.391713612899277]
We introduce CREDO, a novel framework that quantifies, for any candidate decision, a distribution-free upper bound on the probability that the decision is suboptimal. By combining inverse optimization geometry with conformal prediction and generative modeling, CREDO produces risk certificates that are both statistically rigorous and practically interpretable.
arXiv Detail & Related papers (2025-05-19T15:24:38Z)
Causally Aligned Curriculum Learning [69.11672390876763]
This paper studies the problem of curriculum RL through causal lenses. We derive a sufficient graphical condition characterizing causally aligned source tasks. We develop an efficient algorithm to generate a causally aligned curriculum.
arXiv Detail & Related papers (2025-03-21T02:20:38Z)
Know Where You're Uncertain When Planning with Multimodal Foundation Models: A Formal Framework [54.40508478482667]
We present a comprehensive framework to disentangle, quantify, and mitigate uncertainty in perception and plan generation. We propose methods tailored to the unique properties of perception and decision-making. We show that our uncertainty disentanglement framework reduces variability by up to 40% and enhances task success rates by 5% compared to baselines.
arXiv Detail & Related papers (2024-11-03T17:32:00Z)
Conformal Decision Theory: Safe Autonomous Decisions from Imperfect Predictions [80.34972679938483]
We introduce Conformal Decision Theory, a framework for producing safe autonomous decisions despite imperfect machine learning predictions. Decisions produced by our algorithms are safe in the sense that they come with provable statistical guarantees of having low risk. Experiments demonstrate the utility of our approach in robot motion planning around humans, automated stock trading, and robot manufacturing.
arXiv Detail & Related papers (2023-10-09T17:59:30Z)
The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning [71.14237199051276]
We consider the classical distribution-agnostic framework and algorithms minimising empirical risks. We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging.
arXiv Detail & Related papers (2023-09-13T16:33:27Z)
Risk-reducing design and operations toolkit: 90 strategies for managing risk and uncertainty in decision problems [65.268245109828]
This paper develops a catalog of such strategies and develops a framework for them. It argues that they provide an efficient response to decision problems that are seemingly intractable due to high uncertainty. It then proposes a framework to incorporate them into decision theory using multi-objective optimization.
arXiv Detail & Related papers (2023-09-06T16:14:32Z)
When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent. Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning [25.218430053391884]
We propose risk-sensitivity as a mechanism to jointly address both of these issues. Risk-aversion to aleatoric uncertainty discourages actions that may result in poor outcomes due to environment stochasticity. Our experiments show that our algorithm achieves competitive performance on deterministic benchmarks.
arXiv Detail & Related papers (2022-11-30T21:24:11Z)
Adaptive Risk Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning [17.940958199767234]
We present a distributional reinforcement learning framework to learn adaptive risk tendency policies. We show our algorithm can adjust its risk-sensitivity on the fly both in simulation and real-world experiments.
arXiv Detail & Related papers (2022-03-28T13:39:58Z)
Policy Learning for Robust Markov Decision Process with a Mismatched Generative Model [42.28001762749647]
In high-stakes scenarios like medical treatment and auto-piloting, it's risky or even infeasible to collect online experimental data to train the agent. We consider policy learning for Robust Markov Decision Processes (RMDP), where the agent tries to seek a robust policy with respect to unexpected perturbations on the environments. Our goal is to identify a near-optimal robust policy for the perturbed testing environment, which introduces additional technical difficulties.
arXiv Detail & Related papers (2022-03-13T06:37:25Z)
Enabling risk-aware Reinforcement Learning for medical interventions through uncertainty decomposition [9.208828373290487]
Reinforcement Learning (RL) is emerging as a tool for tackling complex control and decision-making problems. It is often challenging to bridge the gap between an apparently optimal policy learnt by an agent and its real-world deployment. Here we propose how a distributional approach (UA-DQN) can be recast to render uncertainties by decomposing the net effects of each uncertainty.
arXiv Detail & Related papers (2021-09-16T09:36:53Z)
Multi Agent System for Machine Learning Under Uncertainty in Cyber Physical Manufacturing System [78.60415450507706]
Recent advancements in predictive machine learning have led to its application in various use cases in manufacturing. Most research has focused on maximising predictive accuracy without addressing the uncertainty associated with it. In this paper, we determine the sources of uncertainty in machine learning and establish the success criteria of a machine learning system to function well under uncertainty.
arXiv Detail & Related papers (2021-07-28T10:28:05Z)
An Offline Risk-aware Policy Selection Method for Bayesian Markov Decision Processes [0.0]
Exploitation vs Caution (EvC) is a paradigm that elegantly incorporates model uncertainty abiding by the Bayesian formalism. We validate EvC with state-of-the-art approaches in different discrete, yet simple, environments offering a fair variety of MDP classes. In the tested scenarios EvC manages to select robust policies and hence stands out as a useful tool for practitioners.
arXiv Detail & Related papers (2021-05-27T20:12:20Z)
Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity. Our method leverages latent variable models to learn a representation of the environment from current and past experiences. We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
arXiv Detail & Related papers (2020-06-18T17:34:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.