Bridging POMDPs and Bayesian decision making for robust maintenance
planning under model uncertainty: An application to railway systems
- URL: http://arxiv.org/abs/2212.07933v1
- Date: Thu, 15 Dec 2022 16:09:47 GMT
- Title: Bridging POMDPs and Bayesian decision making for robust maintenance
planning under model uncertainty: An application to railway systems
- Authors: Giacomo Arcieri, Cyprien Hoelzl, Oliver Schwery, Daniel Straub,
Konstantinos G. Papakonstantinou, Eleni Chatzi
- Abstract summary: We present a framework to estimate POMDP transition and observation model parameters directly from available data.
We then form and solve the POMDP problem by exploiting the inferred distributions.
We successfully apply our approach to maintenance planning for railway track assets.
- Score: 0.7046417074932257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structural Health Monitoring (SHM) describes a process for inferring
quantifiable metrics of structural condition, which can serve as input to
support decisions on the operation and maintenance of infrastructure assets.
Given the long lifespan of critical structures, this problem can be cast as a
sequential decision making problem over prescribed horizons. Partially
Observable Markov Decision Processes (POMDPs) offer a formal framework to solve
the underlying optimal planning task. However, two issues can undermine POMDP
solutions: first, the need for a model that adequately describes the evolution
of the structural condition under deterioration or corrective actions; and
second, the non-trivial task of recovering the observation process parameters
from available monitoring data. Despite these challenges, adopted POMDP models
typically do not account for uncertainty in the model parameters, leading to
solutions that can be unrealistically confident. In
this work, we address both key issues. We present a framework to estimate POMDP
transition and observation model parameters directly from available data, via
Markov Chain Monte Carlo (MCMC) sampling of a Hidden Markov Model (HMM)
conditioned on actions. The MCMC inference estimates distributions of the
involved model parameters. We then form and solve the POMDP problem by
exploiting the inferred distributions, to derive solutions that are robust to
model uncertainty. We successfully apply our approach to maintenance planning
for railway track assets on the basis of a "fractal value" indicator, which is
computed from actual railway monitoring data.
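As a hedged illustration of the inference step (a toy sketch, not the authors' implementation), the following example infers a single deterioration probability of an action-conditioned HMM via Metropolis-Hastings, using the forward algorithm for the likelihood. The states, matrices, and repair rule below are all illustrative assumptions.

```python
# Toy sketch: MCMC inference of one uncertain transition parameter p of an
# action-conditioned HMM. Everything here is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)
S, A = 3, 2  # condition states (good/fair/poor); actions (do-nothing/repair)

def transition(p):
    """Action-conditioned transition matrices T[a, s, s']."""
    T = np.zeros((A, S, S))
    for s in range(S):                     # do-nothing: deteriorate w.p. p
        T[0, s, min(s + 1, S - 1)] += p
        T[0, s, s] += 1.0 - p
    T[1, :, 0] = 1.0                       # repair: reset to the best state
    return T

O = np.array([[0.8, 0.2, 0.0],            # O[s, o]: noisy condition indicator
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])

def log_lik(p, actions, obs):
    """Forward algorithm for the HMM, conditioned on the action sequence."""
    T = transition(p)
    alpha = np.array([1.0, 0.0, 0.0]) * O[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for t in range(1, len(obs)):
        alpha = (alpha @ T[actions[t - 1]]) * O[:, obs[t]]
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll

def mh_posterior(actions, obs, n=3000, step=0.05):
    """Random-walk Metropolis over p, flat prior on (0, 1)."""
    p = 0.5
    lp = log_lik(p, actions, obs)
    samples = []
    for _ in range(n):
        q = p + step * rng.normal()
        if 0.0 < q < 1.0:
            lq = log_lik(q, actions, obs)
            if np.log(rng.uniform()) < lq - lp:
                p, lp = q, lq
        samples.append(p)
    return np.array(samples)

# Simulate a trajectory under a "true" p and check the posterior recovers it.
true_T, s = transition(0.3), 0
actions, obs = [], []
for _ in range(300):
    o = rng.choice(S, p=O[s])
    a = 1 if o == S - 1 else 0             # naive rule: repair when "poor" observed
    obs.append(o)
    actions.append(a)
    s = rng.choice(S, p=true_T[a, s])
post = mh_posterior(actions, obs)
print("posterior mean of p:", round(post[1000:].mean(), 3))
```

The resulting samples approximate a posterior distribution over the transition parameter, which is exactly the kind of model uncertainty the subsequent POMDP solution is made robust against.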
Related papers
- POMDP inference and robust solution via deep reinforcement learning: An
application to railway optimal maintenance [0.7046417074932257]
We propose a combined framework for inference and robust solution of POMDPs via deep RL.
First, all transition and observation model parameters are jointly inferred via Markov Chain Monte Carlo sampling of a hidden Markov model.
The POMDP with uncertain parameters is then solved via deep RL techniques with the parameter distributions incorporated into the solution via domain randomization.
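As a minimal sketch of the domain-randomization step (hypothetical names, not the paper's code), a training loop can draw one posterior parameter sample per episode, so the learned policy must perform well across all plausible models:

```python
# Hedged sketch of domain randomization over inferred POMDP parameters.
# `posterior_samples` (MCMC draws) and `run_episode` (one rollout plus a
# learner update against an environment built from parameter p) are
# hypothetical stand-ins for the paper's components.
import numpy as np

def train_domain_randomized(posterior_samples, run_episode,
                            n_episodes=10_000, seed=0):
    rng = np.random.default_rng(seed)
    returns = []
    for _ in range(n_episodes):
        p = rng.choice(posterior_samples)  # a plausible model per episode
        returns.append(run_episode(p))     # policy trained across model draws
    return float(np.mean(returns))
```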
arXiv Detail & Related papers (2023-07-16T15:44:58Z)
- Learning non-Markovian Decision-Making from State-only Sequences [57.20193609153983]
We develop model-based imitation of state-only sequences with a non-Markov Decision Process (nMDP).
We demonstrate the efficacy of the proposed method in a path planning task with non-Markovian constraints.
arXiv Detail & Related papers (2023-06-27T02:26:01Z)
- CAR-DESPOT: Causally-Informed Online POMDP Planning for Robots in Confounded Environments [5.979296454783688]
A major challenge for making accurate and robust action predictions is the problem of confounding.
The partially observable Markov decision process (POMDP) is a widely used framework for modeling such partially observable decision-making problems.
This paper presents a novel causally-informed extension of "anytime regularized determinized sparse partially observable tree" (AR-DESPOT) to eliminate errors caused by unmeasured confounders.
arXiv Detail & Related papers (2023-04-13T22:32:21Z)
- Predictable MDP Abstraction for Unsupervised Model-Based RL [93.91375268580806]
We propose predictable MDP abstraction (PMA).
Instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space.
We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches.
arXiv Detail & Related papers (2023-02-08T07:37:51Z)
- Reinforcement Learning with a Terminator [80.34572413850186]
We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide state-wise confidence bounds.
We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret.
arXiv Detail & Related papers (2022-05-30T18:40:28Z)
- Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes [65.91730154730905]
In applications of offline reinforcement learning to observational data, such as in healthcare or education, a general concern is that observed actions might be affected by unobserved factors.
Here we tackle this by considering off-policy evaluation in a partially observed Markov decision process (POMDP).
We extend the framework of proximal causal inference to our POMDP setting, providing a variety of settings where identification is made possible.
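For orientation, here is a minimal sketch of the static proximal identification result this line of work builds on (standard form with hypothetical variable names; the paper's sequential POMDP extension is more involved):

```latex
% With treatment A, outcome Y, an unmeasured confounder U, and proxies Z
% (treatment-side) and W (outcome-side), an outcome bridge function h solves
\mathbb{E}[\,Y \mid Z, A\,] \;=\; \mathbb{E}[\,h(W, A) \mid Z, A\,],
% after which the counterfactual mean is identified as
\mathbb{E}[\,Y(a)\,] \;=\; \mathbb{E}[\,h(W, a)\,].
```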
arXiv Detail & Related papers (2021-10-28T17:46:14Z)
- Identification of Unexpected Decisions in Partially Observable Monte-Carlo Planning: a Rule-Based Approach [78.05638156687343]
We propose a methodology for analyzing POMCP policies by inspecting their traces.
The proposed method explores local properties of policy behavior to identify unexpected decisions.
We evaluate our approach on Tiger, a standard benchmark for POMDPs, and a real-world problem related to mobile robot navigation.
arXiv Detail & Related papers (2020-12-23T15:09:28Z)
- Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, nonconvex optimal control problems.
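For reference, a sketch of the standard Stein variational gradient descent (SVGD) particle update underlying this approach (generic form, not quoted from the paper):

```latex
% Particles U_1..U_m over control sequences, kernel k, and a posterior
% p(U | o) induced by the control cost:
U_i \;\leftarrow\; U_i + \epsilon\,\hat{\phi}^{*}(U_i), \qquad
\hat{\phi}^{*}(U) \;=\; \frac{1}{m}\sum_{j=1}^{m}
\Big[ k(U_j, U)\,\nabla_{U_j}\log p(U_j \mid o) \;+\; \nabla_{U_j} k(U_j, U) \Big].
```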
arXiv Detail & Related papers (2020-11-15T22:36:59Z)
- Structural Estimation of Partially Observable Markov Decision Processes [3.1614382994158956]
We consider the structural estimation of the primitives of a POMDP model based upon the observable history of the process.
We illustrate the estimation methodology with an application to optimal equipment replacement.
arXiv Detail & Related papers (2020-08-02T15:04:27Z)
- Value of structural health information in partially observable stochastic environments [0.0]
We introduce and study the theoretical and computational foundations of the Value of Information (VoI) and the Value of Structural Health Monitoring (VoSHM).
It is shown that a POMDP policy inherently leverages the notion of VoI to guide observational actions in an optimal way at every decision step.
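As an illustration, a common one-step preposterior formalization of VoI at belief b (not necessarily the paper's exact definition): the expected gain from observing o before acting, with b_o the Bayes-updated belief,

```latex
\mathrm{VoI}(b) \;=\; \sum_{o} p(o \mid b)\,\max_{a} Q(b_o, a)
\;-\; \max_{a} Q(b, a) \;\ge\; 0,
```

where the non-negativity reflects the standard result that free information cannot hurt an optimally behaving POMDP agent.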
arXiv Detail & Related papers (2019-12-28T22:18:48Z)