Accountability in Offline Reinforcement Learning: Explaining Decisions
with a Corpus of Examples
- URL: http://arxiv.org/abs/2310.07747v2
- Date: Fri, 27 Oct 2023 16:23:43 GMT
- Title: Accountability in Offline Reinforcement Learning: Explaining Decisions
with a Corpus of Examples
- Authors: Hao Sun, Alihan Hüyük, Daniel Jarrett, Mihaela van der Schaar
- Abstract summary: This paper introduces the Accountable Offline Controller (AOC) that employs the offline dataset as the Decision Corpus.
AOC operates effectively in low-data scenarios, can be extended to the strictly offline imitation setting, and displays qualities of both conservation and adaptability.
We assess AOC's performance in both simulated and real-world healthcare scenarios, emphasizing its capability to manage offline control tasks with high levels of performance while maintaining accountability.
- Score: 70.84093873437425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning controllers with offline data in decision-making systems is an
essential area of research due to its potential to reduce the risk of
applications in real-world systems. However, in responsibility-sensitive
settings such as healthcare, decision accountability is of paramount
importance, yet has not been adequately addressed by the literature. This paper
introduces the Accountable Offline Controller (AOC) that employs the offline
dataset as the Decision Corpus and performs accountable control based on a
tailored selection of examples, referred to as the Corpus Subset. AOC operates
effectively in low-data scenarios, can be extended to the strictly offline
imitation setting, and displays qualities of both conservation and
adaptability. We assess AOC's performance in both simulated and real-world
healthcare scenarios, emphasizing its capability to manage offline control
tasks with high levels of performance while maintaining accountability.
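The abstract does not spell out the mechanism, but as rough intuition for example-based accountable control, the following minimal sketch (all names and design choices are hypothetical, not the authors' implementation) retrieves a small subset of similar offline transitions for the current state, derives an action from them, and returns the indices of the supporting examples as the decision's attribution.

```python
import numpy as np

class CorpusController:
    """Illustrative example-based controller: each decision is attributed to
    a retrieved subset of offline transitions (a "corpus subset")."""

    def __init__(self, states, actions, returns, k=5):
        # Offline dataset acting as the decision corpus.
        self.states = np.asarray(states)    # shape (N, state_dim)
        self.actions = np.asarray(actions)  # shape (N, action_dim)
        self.returns = np.asarray(returns)  # shape (N,) observed outcomes
        self.k = k

    def act(self, state):
        # Select the k most similar corpus states (Euclidean distance here).
        dists = np.linalg.norm(self.states - np.asarray(state), axis=1)
        subset = np.argsort(dists)[: self.k]
        # Weight the corpus actions by similarity and observed return.
        weights = np.exp(-dists[subset]) * np.clip(self.returns[subset], 0, None)
        weights = weights / (weights.sum() + 1e-8)
        action = (weights[:, None] * self.actions[subset]).sum(axis=0)
        # The subset indices are the "explanation": which examples drove the decision.
        return action, subset
```

Calling `controller.act(state)` thus yields both a control and the corpus examples that back it, which is the kind of attribution the abstract refers to as accountability.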
Related papers
- Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare [8.920060884688395]
Reinforcement Learning (RL) applied in healthcare can lead to unsafe medical decisions and treatments, such as excessive dosages or abrupt changes, often because agents overlook common-sense constraints.
Recent Inverse Constrained Reinforcement Learning (ICRL) is a promising approach that infers constraints from expert demonstrations, but it typically assumes online interaction with the environment.
Such settings do not align with the practical requirements of a decision-making system in healthcare, where decisions rely on historical treatments recorded in an offline dataset.
Specifically, the proposed method utilizes a causal attention mechanism to incorporate historical decisions and observations into the constraint modeling, while employing a Non-Markovian layer for weighted constraints.
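As an illustration only (not the architecture from the paper), a causally masked self-attention layer over the (observation, decision) history can produce per-step constraint weights; every name, dimension, and layer choice below is an assumption.

```python
import torch
import torch.nn as nn

class HistoryConstraintModel(nn.Module):
    """Toy causal-attention constraint model over an (observation, decision) history."""

    def __init__(self, obs_dim, act_dim, hidden=64, heads=4):
        super().__init__()
        self.embed = nn.Linear(obs_dim + act_dim, hidden)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.head = nn.Linear(hidden, 1)  # per-step constraint weight

    def forward(self, obs_seq, act_seq):
        # obs_seq: (B, T, obs_dim), act_seq: (B, T, act_dim)
        x = self.embed(torch.cat([obs_seq, act_seq], dim=-1))
        T = x.size(1)
        # Causal mask: step t may only attend to steps <= t.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h, _ = self.attn(x, x, x, attn_mask=mask)
        # Non-Markovian in the sense that the weight at t depends on the whole prefix.
        return torch.sigmoid(self.head(h)).squeeze(-1)  # (B, T) weights in [0, 1]
```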
arXiv Detail & Related papers (2024-10-10T01:36:27Z)
- Blackout Mitigation via Physics-guided RL [17.807967857394406]
This paper considers the sequential design of remedial control actions in response to system anomalies for the ultimate objective of preventing blackouts.
A physics-guided reinforcement learning framework is designed to identify effective sequences of real-time remedial look-ahead decisions.
arXiv Detail & Related papers (2024-01-17T23:27:36Z)
- Investigating Robustness in Cyber-Physical Systems: Specification-Centric Analysis in the face of System Deviations [8.8690305802668]
A critical attribute of cyber-physical systems (CPS) is robustness, denoting their capacity to operate safely.
This paper proposes a novel specification-based notion of robustness, which characterizes the effectiveness of a controller in meeting a specified system requirement.
We present an innovative two-layer simulation-based analysis framework designed to identify subtle robustness violations.
arXiv Detail & Related papers (2023-11-13T16:44:43Z)
- Adaptive Online Non-stochastic Control [10.25772015681554]
We tackle the problem of Non-stochastic Control (NSC) with the aim of obtaining algorithms whose policy regret is proportional to the difficulty of the controlled environment.
We tailor the Follow The Regularized Leader (FTRL) framework to dynamical systems by using regularizers that are proportional to the actual witnessed costs.
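For intuition, a generic FTRL step with a quadratic regularizer whose strength grows with the costs witnessed so far might look like the following sketch (not the paper's algorithm; the decision set, scaling, and names are assumptions).

```python
import numpy as np

def ftrl_step(grad_history, cost_history, radius=1.0):
    """One Follow-The-Regularized-Leader step on a Euclidean ball.

    The regularizer's strength scales with the costs observed so far, so
    "easy" environments (small witnessed costs) are regularized only lightly.
    Generic sketch, not the algorithm from the paper.
    """
    g_sum = np.sum(grad_history, axis=0)         # accumulated loss gradients
    sigma = np.sqrt(1.0 + np.sum(cost_history))  # data-dependent regularization
    x = -g_sum / sigma                           # minimizer of <g_sum, x> + (sigma/2)||x||^2
    norm = np.linalg.norm(x)
    if norm > radius:                            # project back onto the feasible ball
        x = x * (radius / norm)
    return x
```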
arXiv Detail & Related papers (2023-10-02T12:32:24Z)
- Age of Semantics in Cooperative Communications: To Expedite Simulation Towards Real via Offline Reinforcement Learning [53.18060442931179]
We propose the age of semantics (AoS) for measuring semantics freshness of status updates in a cooperative relay communication system.
We derive an online deep actor-critic (DAC) learning scheme under the on-policy temporal difference learning framework.
We then put forward a novel offline DAC scheme, which estimates the optimal control policy from a previously collected dataset.
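As a generic illustration of estimating a policy from a previously collected dataset (not the paper's offline DAC scheme), one actor-critic update computed purely from an offline batch could look like the sketch below; the network sizes and names are assumptions, and actions are assumed discrete.

```python
import torch
import torch.nn as nn

obs_dim, act_dim, gamma = 8, 4, 0.99
actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))
critic = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, 1))
opt_a = torch.optim.Adam(actor.parameters(), lr=3e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

def offline_update(batch):
    # batch sampled from the fixed dataset; a is a LongTensor of action indices.
    s, a, r, s_next, done = batch
    with torch.no_grad():
        target = r + gamma * (1 - done) * critic(s_next).squeeze(-1)
    value = critic(s).squeeze(-1)
    critic_loss = (value - target).pow(2).mean()          # TD(0) regression
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    logp = torch.log_softmax(actor(s), dim=-1).gather(1, a.unsqueeze(1)).squeeze(1)
    advantage = (target - critic(s).squeeze(-1)).detach()
    actor_loss = -(logp * advantage).mean()               # policy-gradient step
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()
    return critic_loss.item(), actor_loss.item()
```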
arXiv Detail & Related papers (2022-09-19T11:55:28Z)
- COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation [73.17078343706909]
We consider the offline constrained reinforcement learning (RL) problem, in which the agent aims to compute a policy that maximizes expected return while satisfying given cost constraints, learning only from a pre-collected dataset.
We present an offline constrained RL algorithm that optimizes the policy in the space of stationary distributions.
Our algorithm, COptiDICE, directly estimates the stationary distribution corrections of the optimal policy with respect to returns, while constraining the cost upper bound, with the goal of yielding a cost-conservative policy for actual constraint satisfaction.
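A drastically simplified sketch of the idea, keeping only the normalization and cost-budget constraints and dropping the Bellman flow constraints that COptiDICE actually enforces, optimizes per-sample correction ratios with a Lagrangian; the names and update rule below are assumptions.

```python
import numpy as np

def constrained_dice_sketch(rewards, costs, budget, steps=2000, lr=0.05):
    """Toy Lagrangian for stationary-distribution correction ratios w >= 0.

    Maximizes E_D[w * r] subject to E_D[w * c] <= budget and E_D[w] = 1.
    (A drastic simplification: the Bellman flow constraints that tie w
    to an actual policy are omitted here.)
    """
    n = len(rewards)
    theta = np.zeros(n)   # w = exp(theta) keeps the ratios positive
    lam, nu = 0.0, 0.0    # multipliers for the cost budget and normalization
    for _ in range(steps):
        w = np.exp(theta)
        theta += lr * w * (rewards - lam * costs - nu) / n  # ascent on the Lagrangian
        lam = max(0.0, lam + lr * (np.mean(np.exp(theta) * costs) - budget))
        nu += lr * (np.mean(np.exp(theta)) - 1.0)
    return np.exp(theta), lam
```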
arXiv Detail & Related papers (2022-04-19T15:55:47Z)
- Controllable Summarization with Constrained Markov Decision Process [50.04321779376415]
We study controllable text summarization, which allows users to gain control over a particular attribute of the summary.
We propose a novel training framework based on the Constrained Markov Decision Process (CMDP).
Our framework can be applied to control important attributes of summarization, including length, covered entities, and abstractiveness.
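As a toy illustration of the CMDP view (not the paper's training objective), a shaped reward can trade summary quality against a length constraint via a Lagrange multiplier that grows while the constraint is violated; the tolerance, names, and update rule below are assumptions.

```python
def shaped_reward(quality, summary_len, target_len, lam):
    """Toy CMDP-style shaped reward for controllable summarization.

    quality : task reward for the generated summary (e.g., a ROUGE score)
    lam     : Lagrange multiplier trading quality against the length constraint
    Illustrative only; not the objective used in the paper.
    """
    violation = max(0.0, abs(summary_len - target_len) - 5)  # 5-token tolerance
    return quality - lam * violation, violation

def update_multiplier(lam, avg_violation, budget=0.0, lr=0.01):
    # Increase lam while the constraint is violated on average; never below 0.
    return max(0.0, lam + lr * (avg_violation - budget))
```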
arXiv Detail & Related papers (2021-08-07T09:12:53Z)
- Benchmarks for Deep Off-Policy Evaluation [152.28569758144022]
We present a collection of policies that can be used for benchmarking off-policy evaluation.
The goal of our benchmark is to provide a standardized measure of progress that is motivated from a set of principles.
We provide open-source access to our data and code to foster future research in this area.
arXiv Detail & Related papers (2021-03-30T18:09:33Z)
- The Impact of Data on the Stability of Learning-Based Control - Extended Version [63.97366815968177]
We propose a Lyapunov-based measure for quantifying the impact of data on the certifiable control performance.
By modeling unknown system dynamics through Gaussian processes, we can determine the interrelation between model uncertainty and satisfaction of stability conditions.
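As rough intuition only (the paper's Lyapunov-based measure is more refined), one way model uncertainty can interact with a stability condition is to fit a scikit-learn Gaussian process to observed Lyapunov decrements and require a pessimistic prediction to remain negative; the function name and setup below are assumptions for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def certify_decrease(gp: GaussianProcessRegressor, x, u, beta=2.0):
    """Toy check of a Lyapunov decrease condition under GP model uncertainty.

    The GP is assumed to have been fitted on (state, input) features with the
    observed Lyapunov decrement V(x_next) - V(x) as scalar target. The decrease
    is certified only if even a pessimistic (mean + beta * std) prediction is
    negative, i.e. more data (lower std) makes certification easier.
    """
    z = np.concatenate([x, u]).reshape(1, -1)
    mean, std = gp.predict(z, return_std=True)
    return float(mean[0] + beta * std[0]) < 0.0  # True => decrease certified
```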
arXiv Detail & Related papers (2020-11-20T19:10:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.