Inference and dynamic decision-making for deteriorating systems with
probabilistic dependencies through Bayesian networks and deep reinforcement
learning
- URL: http://arxiv.org/abs/2209.01092v1
- Date: Fri, 2 Sep 2022 14:45:40 GMT
- Title: Inference and dynamic decision-making for deteriorating systems with
probabilistic dependencies through Bayesian networks and deep reinforcement
learning
- Authors: Pablo G. Morato, Charalampos P. Andriotis, Konstantinos G.
Papakonstantinou, Philippe Rigo
- Abstract summary: We propose an efficient algorithmic framework for inference and decision-making under uncertainty for engineering systems exposed to deteriorating environments.
In terms of policy optimization, we adopt a deep decentralized multi-agent actor-critic (DDMAC) reinforcement learning approach.
Results demonstrate that DDMAC policies offer substantial benefits when compared to state-of-the-art approaches.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the context of modern environmental and societal concerns, there is an
increasing demand for methods able to identify management strategies for civil
engineering systems, minimizing structural failure risks while optimally
planning inspection and maintenance (I&M) processes. Most available methods
simplify the I&M decision problem to the component level due to the
computational complexity associated with global optimization methodologies
under joint system-level state descriptions. In this paper, we propose an
efficient algorithmic framework for inference and decision-making under
uncertainty for engineering systems exposed to deteriorating environments,
providing optimal management strategies directly at the system level. In our
approach, the decision problem is formulated as a factored partially observable
Markov decision process, whose dynamics are encoded in Bayesian network
conditional structures. The methodology can handle environments with either
equal or more general, unequal deterioration correlations among components,
through Gaussian hierarchical structures and dynamic Bayesian networks,
respectively. In terms of policy
optimization, we adopt a deep decentralized multi-agent actor-critic (DDMAC)
reinforcement learning approach, in which the policies are approximated by
actor neural networks guided by a critic network. By including deterioration
dependence in the simulated environment, and by formulating the cost model at
the system level, DDMAC policies intrinsically consider the underlying
system-effects. This is demonstrated through numerical experiments conducted
for both a 9-out-of-10 system and a steel frame under fatigue deterioration.
Results demonstrate that DDMAC policies offer substantial benefits when
compared to state-of-the-art heuristic approaches. The inherent consideration
of system-effects by DDMAC strategies is also interpreted based on the learned
policies.
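To make the environment model concrete, the following is a minimal Python sketch of a factored deterioration environment with equal inter-component correlation, in the spirit of the Gaussian hierarchical structure and the 9-out-of-10 system described above. It is not the authors' implementation: the class name, all parameter values, and the direct observation of damage (the paper instead maintains belief states through Bayesian network inference) are illustrative assumptions.

```python
import numpy as np

class CorrelatedDeteriorationEnv:
    """Toy factored deterioration environment (illustrative, not the paper's).

    Damage increments share a common system-level Gaussian factor, so any two
    components have correlation `rho`, mimicking an equal-correlation Gaussian
    hierarchical deterioration structure. The paper observes the system only
    indirectly and tracks beliefs with dynamic Bayesian networks; here damage
    is returned directly to keep the sketch short.
    """

    def __init__(self, n_components=10, k_required=9, rho=0.5, mu=0.1,
                 sigma=0.05, damage_limit=1.0, c_repair=10.0,
                 c_system_failure=1000.0, seed=0):
        self.n, self.k, self.rho = n_components, k_required, rho
        self.mu, self.sigma, self.limit = mu, sigma, damage_limit
        self.c_repair, self.c_fail = c_repair, c_system_failure
        self.rng = np.random.default_rng(seed)
        self.damage = np.zeros(self.n)

    def step(self, actions):
        """actions[i] == 1 repairs component i (resets its damage)."""
        actions = np.asarray(actions)
        self.damage[actions == 1] = 0.0
        # Hierarchical Gaussian increment: shared factor z plus local noise,
        # giving Corr(incr_i, incr_j) = rho for i != j.
        z = self.rng.standard_normal()
        eps = self.rng.standard_normal(self.n)
        incr = self.mu + self.sigma * (np.sqrt(self.rho) * z
                                       + np.sqrt(1.0 - self.rho) * eps)
        self.damage += np.maximum(incr, 0.0)
        # System-level cost: repair costs plus a penalty if fewer than
        # k = 9 of the 10 components remain below the damage limit.
        cost = self.c_repair * actions.sum()
        if np.sum(self.damage < self.limit) < self.k:
            cost += self.c_fail
        return self.damage.copy(), -cost  # observation, reward
```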
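On the optimization side, the DDMAC idea of per-component actor networks guided by a single system-level critic can be sketched as follows. This PyTorch fragment is an illustrative reconstruction under the standard centralized-training, decentralized-execution pattern; the network sizes, the belief dimension, and the three-action maintenance set are assumptions rather than the paper's specification.

```python
import torch
import torch.nn as nn

N_COMPONENTS = 10   # e.g. the 9-out-of-10 system
BELIEF_DIM = 8      # assumed size of each component's belief vector
N_ACTIONS = 3       # assumed maintenance set, e.g. do-nothing/inspect/repair

class Actor(nn.Module):
    """Decentralized actor: one per component, acting on its own belief."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(BELIEF_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, N_ACTIONS))
    def forward(self, belief):
        return torch.distributions.Categorical(logits=self.net(belief))

class Critic(nn.Module):
    """Centralized critic: values the joint, system-level belief state."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(N_COMPONENTS * BELIEF_DIM, 128),
                                 nn.ReLU(), nn.Linear(128, 1))
    def forward(self, joint_belief):
        return self.net(joint_belief).squeeze(-1)

actors = [Actor() for _ in range(N_COMPONENTS)]
critic = Critic()

def act(joint_belief):
    """joint_belief: tensor of shape (N_COMPONENTS, BELIEF_DIM)."""
    dists = [actor(joint_belief[i]) for i, actor in enumerate(actors)]
    actions = [d.sample() for d in dists]
    log_prob = torch.stack([d.log_prob(a)
                            for d, a in zip(dists, actions)]).sum()
    value = critic(joint_belief.flatten())
    return actions, log_prob, value

# Schematic one-step actor-critic update with a system-level reward r:
#   advantage   = r + gamma * V(next_belief) - V(belief)
#   actor loss  = -log_prob * advantage.detach()
#   critic loss = advantage ** 2
```

Factoring the policy this way keeps the actor outputs linear in the number of components; this is what allows optimization directly at the system level without enumerating the exponentially many joint inspection-and-maintenance actions.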
Related papers
- Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically identical agents.
Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z) - Multi-agent deep reinforcement learning with centralized training and
decentralized execution for transportation infrastructure management [0.0]
We present a multi-agent Deep Reinforcement Learning (DRL) framework for managing large transportation infrastructure systems over their life-cycle.
Life-cycle management of such engineering systems is a computationally intensive task, requiring appropriate sequential inspection and maintenance decisions.
arXiv Detail & Related papers (2024-01-23T02:52:36Z) - Distributionally Robust Model-based Reinforcement Learning with Large
State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition processes, and the deviation of real-world dynamics at deployment from the training environment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z) - UAV Path Planning Employing MPC- Reinforcement Learning Method for
search and rescue mission [0.0]
We tackle the problem of Unmanned Aerial (UA V) path planning in complex and uncertain environments.
We design a Model Predictive Control (MPC) scheme based on a Long Short-Term Memory (LSTM) network integrated into the Deep Deterministic Policy Gradient algorithm.
arXiv Detail & Related papers (2023-02-21T13:39:40Z) - A Deep Reinforcement Learning Approach to Marginalized Importance
Sampling with the Successor Representation [61.740187363451746]
Marginalized importance sampling (MIS) measures the density ratio between the state-action occupancy of a target policy and that of a sampling distribution.
We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy.
We evaluate the empirical performance of our approach on a variety of challenging Atari and MuJoCo environments.
arXiv Detail & Related papers (2021-06-12T20:21:38Z) - Reinforcement Learning for Adaptive Mesh Refinement [63.7867809197671]
We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning to train refinement policies directly from simulation.
The model sizes of these policy architectures are independent of the mesh size and hence scale to arbitrarily large and complex simulations.
arXiv Detail & Related papers (2021-03-01T22:55:48Z) - Optimal Inspection and Maintenance Planning for Deteriorating Structural
Components through Dynamic Bayesian Networks and Markov Decision Processes [0.0]
Partially Observable Markov Decision Processes (POMDPs) provide a mathematical methodology for optimal control under uncertain action outcomes and observations.
We provide the formulation for developing both infinite and finite horizon POMDPs in a structural reliability context.
Results show that POMDPs achieve substantially lower costs as compared to their counterparts, even for traditional problem settings.
arXiv Detail & Related papers (2020-09-09T20:03:42Z) - Learning High-Level Policies for Model Predictive Control [54.00297896763184]
Model Predictive Control (MPC) provides robust solutions to robot control tasks.
We propose a self-supervised learning algorithm for learning a neural network high-level policy.
We show that our approach can handle situations that are difficult for standard MPC.
arXiv Detail & Related papers (2020-07-20T17:12:34Z) - Parameterized MDPs and Reinforcement Learning Problems -- A Maximum
Entropy Principle Based Framework [2.741266294612776]
We present a framework to address a class of sequential decision making problems.
Our framework features learning the optimal control policy with robustness to noisy data.
arXiv Detail & Related papers (2020-06-17T04:08:35Z) - Jointly Learning Environments and Control Policies with Projected
Stochastic Gradient Ascent [3.118384520557952]
We introduce a deep reinforcement learning algorithm combining policy gradient methods with model-based optimization techniques to solve this problem.
In essence, our algorithm iteratively approximates the gradient of the expected return via Monte-Carlo sampling and automatic differentiation.
We show that DEPS performs at least as well as, or better than, competing approaches in all three environments, consistently yielding solutions with higher returns in fewer iterations.
arXiv Detail & Related papers (2020-06-02T16:08:07Z) - Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.