Value of structural health information in partially observable
stochastic environments
- URL: http://arxiv.org/abs/1912.12534v2
- Date: Mon, 20 Jul 2020 16:49:06 GMT
- Title: Value of structural health information in partially observable
stochastic environments
- Authors: C.P. Andriotis, K.G. Papakonstantinou, E.N. Chatzi
- Abstract summary: We introduce and study the theoretical and computational foundations of the Value of Information (VoI) and the Value of Structural Health Monitoring (VoSHM).
It is shown that a POMDP policy inherently leverages the notion of VoI to guide observational actions in an optimal way at every decision step.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient integration of uncertain observations with decision-making
optimization is key for prescribing informed intervention actions that can
preserve the structural safety of deteriorating engineering systems. To this
end, inspection and monitoring strategies should be scheduled objectively, on
the basis of their expected gains, which are quantified, among other measures,
by metrics such as the Value of Information (VoI) and the Value of Structural
Health Monitoring (VoSHM). In this work, we
introduce and study the theoretical and computational foundations of the above
metrics within the context of Partially Observable Markov Decision Processes
(POMDPs), thereby addressing the broad class of decision-making problems in
partially observable, stochastic, deteriorating environments that can be
modeled as POMDPs. Step-wise and life-cycle VoI and VoSHM definitions are
devised and
their bounds are analyzed as per the properties stemming from the Bellman
equation and the resulting optimal value function. It is shown that a POMDP
policy inherently leverages the notion of VoI to guide observational actions in
an optimal way at every decision step, and that the permanent or intermittent
information provided by SHM or inspection visits, respectively, can only reduce
the long-term cost of this policy, a guarantee that does not necessarily hold
under the locally optimal policies typically adopted in decision-making for
structures and infrastructure. POMDP solutions are derived
based on point-based value iteration methods, and the various definitions are
quantified in stationary and non-stationary deteriorating environments, with
both infinite and finite planning horizons, featuring single- or
multi-component engineering systems.
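As an illustrative sketch, in notation of my own choosing rather than the
authors' exact formulation, the quantities described above can be written for a
cost-minimizing POMDP over beliefs b, actions a, and observations o as:

  V*(b)      = min_a [ c(b, a) + γ Σ_o P(o | b, a) V*(b_ao) ]   (Bellman equation over beliefs; b_ao is the updated belief)
  VoI(b)     = V*_no-obs(b) − V*_obs(b)                         (step-wise: withholding vs. taking one observational action)
  VoSHM(b_0) = V*_no-SHM(b_0) − V*_SHM(b_0)                     (life-cycle: expected cost without vs. with permanent monitoring)

Under this convention, the non-negativity of VoI and VoSHM with respect to the
optimal value function corresponds to the statement above that added
information can only reduce long-term costs, whereas no such guarantee exists
for locally optimal policies; point-based value iteration approximates V* by
performing Bellman backups over a finite, sampled set of beliefs.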
Related papers
- Bridging POMDPs and Bayesian decision making for robust maintenance
planning under model uncertainty: An application to railway systems [0.7046417074932257]
We present a framework to estimate POMDP transition and observation model parameters directly from available data.
We then form and solve the POMDP problem by exploiting the inferred distributions.
We successfully apply our approach on maintenance planning for railway track assets.
arXiv Detail & Related papers (2022-12-15T16:09:47Z)
- Reinforcement Learning with Heterogeneous Data: Estimation and Inference [84.72174994749305]
We introduce the K-Heterogeneous Markov Decision Process (K-Hetero MDP) to address sequential decision problems with population heterogeneity.
We propose the Auto-Clustered Policy Evaluation (ACPE) for estimating the value of a given policy, and the Auto-Clustered Policy Iteration (ACPI) for estimating the optimal policy in a given policy class.
We present simulations to support our theoretical findings, and we conduct an empirical study on the standard MIMIC-III dataset.
arXiv Detail & Related papers (2022-01-31T20:58:47Z)
- Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes [65.91730154730905]
In applications of offline reinforcement learning to observational data, such as in healthcare or education, a general concern is that observed actions might be affected by unobserved factors.
Here we tackle this concern by considering off-policy evaluation in a partially observed Markov decision process (POMDP).
We extend the framework of proximal causal inference to our POMDP setting, providing a variety of settings where identification is made possible.
arXiv Detail & Related papers (2021-10-28T17:46:14Z)
- Identification of Unexpected Decisions in Partially Observable Monte-Carlo Planning: a Rule-Based Approach [78.05638156687343]
We propose a methodology for analyzing POMCP policies by inspecting their traces.
The proposed method explores local properties of policy behavior to identify unexpected decisions.
We evaluate our approach on Tiger, a standard benchmark for POMDPs, and a real-world problem related to mobile robot navigation.
arXiv Detail & Related papers (2020-12-23T15:09:28Z)
- Reliable Off-policy Evaluation for Reinforcement Learning [53.486680020852724]
In a sequential decision-making problem, off-policy evaluation estimates the expected cumulative reward of a target policy.
We propose a novel framework that provides robust and optimistic cumulative reward estimates using one or multiple logged data.
arXiv Detail & Related papers (2020-11-08T23:16:19Z)
- Optimal Inspection and Maintenance Planning for Deteriorating Structural Components through Dynamic Bayesian Networks and Markov Decision Processes [0.0]
Partially Observable Markov Decision Processes (POMDPs) provide a mathematical methodology for optimal control under uncertain action outcomes and observations.
We provide the formulation for developing both infinite and finite horizon POMDPs in a structural reliability context.
Results show that POMDPs achieve substantially lower costs as compared to their counterparts, even for traditional problem settings.
arXiv Detail & Related papers (2020-09-09T20:03:42Z)
- Structural Estimation of Partially Observable Markov Decision Processes [3.1614382994158956]
We consider the structural estimation of the primitives of a POMDP model based upon the observable history of the process.
We illustrate the estimation methodology with an application to optimal equipment replacement.
arXiv Detail & Related papers (2020-08-02T15:04:27Z)
- Deep reinforcement learning driven inspection and maintenance planning under incomplete information and constraints [0.0]
Determination of inspection and maintenance policies constitutes a complex optimization problem.
In this work, these challenges are addressed within a joint framework of constrained Partially Observable Markov Decision Processes (POMDPs) and multi-agent Deep Reinforcement Learning (DRL).
The proposed framework is found to outperform well-established policy baselines and facilitate adept prescription of inspection and intervention actions.
arXiv Detail & Related papers (2020-07-02T20:44:07Z) - GenDICE: Generalized Offline Estimation of Stationary Values [108.17309783125398]
We show that effective estimation can still be achieved in important applications.
Our approach is based on estimating a ratio that corrects for the discrepancy between the stationary and empirical distributions.
The resulting algorithm, GenDICE, is straightforward and effective.
arXiv Detail & Related papers (2020-02-21T00:27:52Z) - Kalman meets Bellman: Improving Policy Evaluation through Value Tracking [59.691919635037216]
Policy evaluation is a key process in Reinforcement Learning (RL).
We devise an optimization method called Kalman Optimization for Value Approximation (KOVA).
KOVA minimizes a regularized objective function that concerns both parameter and noisy return uncertainties.
arXiv Detail & Related papers (2020-02-17T13:30:43Z)