Deep Belief Markov Models for POMDP Inference
- URL: http://arxiv.org/abs/2503.13438v1
- Date: Mon, 17 Mar 2025 17:58:45 GMT
- Title: Deep Belief Markov Models for POMDP Inference
- Authors: Giacomo Arcieri, Konstantinos G. Papakonstantinou, Daniel Straub, Eleni Chatzi,
- Abstract summary: This work introduces a novel deep learning-based architecture, termed the Deep Belief Markov Model (DBMM)<n>DBMM provides efficient, model-formulation inference in Partially Observable Markov Decision Process (POMDP) problems.<n>We evaluate the efficacy of the proposed methodology by evaluating the capability of model-formulation agnostic inference of DBMMs in benchmark problems.
- Score: 0.40498500266986387
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work introduces a novel deep learning-based architecture, termed the Deep Belief Markov Model (DBMM), which provides efficient, model-formulation agnostic inference in Partially Observable Markov Decision Process (POMDP) problems. The POMDP framework allows for modeling and solving sequential decision-making problems under observation uncertainty. In complex, high-dimensional, partially observable environments, existing methods for inference based on exact computations (e.g., via Bayes' theorem) or sampling algorithms do not scale well. Furthermore, ground truth states may not be available for learning the exact transition dynamics. DBMMs extend deep Markov models into the partially observable decision-making framework and allow efficient belief inference entirely based on available observation data via variational inference methods. By leveraging the potency of neural networks, DBMMs can infer and simulate non-linear relationships in the system dynamics and naturally scale to problems with high dimensionality and discrete or continuous variables. In addition, neural network parameters can be dynamically updated efficiently based on data availability. DBMMs can thus be used to infer a belief variable, thus enabling the derivation of POMDP solutions over the belief space. We evaluate the efficacy of the proposed methodology by evaluating the capability of model-formulation agnostic inference of DBMMs in benchmark problems that include discrete and continuous variables.
Related papers
- Robust Counterfactual Inference in Markov Decision Processes [1.5197843979051473]
Current approaches assume a specific causal model to make counterfactuals identifiable.
We propose a novel non-parametric approach that computes tight bounds on counterfactual transition probabilities.
arXiv Detail & Related papers (2025-02-19T13:56:20Z) - Variational Inference of Parameters in Opinion Dynamics Models [9.51311391391997]
This work uses variational inference to estimate the parameters of an opinion dynamics ABM.
We transform the inference process into an optimization problem suitable for automatic differentiation.
Our approach estimates both macroscopic (bounded confidence intervals and backfire thresholds) and microscopic ($200$ categorical, agent-level roles) more accurately than simulation-based and MCMC methods.
arXiv Detail & Related papers (2024-03-08T14:45:18Z) - Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z) - Ensemble Kalman Filtering Meets Gaussian Process SSM for Non-Mean-Field and Online Inference [47.460898983429374]
We introduce an ensemble Kalman filter (EnKF) into the non-mean-field (NMF) variational inference framework to approximate the posterior distribution of the latent states.
This novel marriage between EnKF and GPSSM not only eliminates the need for extensive parameterization in learning variational distributions, but also enables an interpretable, closed-form approximation of the evidence lower bound (ELBO)
We demonstrate that the resulting EnKF-aided online algorithm embodies a principled objective function by ensuring data-fitting accuracy while incorporating model regularizations to mitigate overfitting.
arXiv Detail & Related papers (2023-12-10T15:22:30Z) - Distributionally Robust Model-based Reinforcement Learning with Large
State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are the complex dynamical systems with large state spaces, the costly data acquisition processes, and the deviation of real-world dynamics from the training environment deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z) - POMDP inference and robust solution via deep reinforcement learning: An
application to railway optimal maintenance [0.7046417074932257]
We propose a combined framework for inference and robust solution of POMDPs via deep RL.
First, all transition and observation model parameters are jointly inferred via Markov Chain Monte Carlo sampling of a hidden Markov model.
The POMDP with uncertain parameters is then solved via deep RL techniques with the parameter distributions incorporated into the solution via domain randomization.
arXiv Detail & Related papers (2023-07-16T15:44:58Z) - Free-Form Variational Inference for Gaussian Process State-Space Models [21.644570034208506]
We propose a new method for inference in Bayesian GPSSMs.
Our method is based on freeform variational inference via inducing Hamiltonian Monte Carlo.
We show that our approach can learn transition dynamics and latent states more accurately than competing methods.
arXiv Detail & Related papers (2023-02-20T11:34:16Z) - Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both tractable variational learning algorithm and effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z) - Bridging POMDPs and Bayesian decision making for robust maintenance
planning under model uncertainty: An application to railway systems [0.7046417074932257]
We present a framework to estimate POMDP transition and observation model parameters directly from available data.
We then form and solve the POMDP problem by exploiting the inferred distributions.
We successfully apply our approach on maintenance planning for railway track assets.
arXiv Detail & Related papers (2022-12-15T16:09:47Z) - MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood
Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z) - Influence Estimation and Maximization via Neural Mean-Field Dynamics [60.91291234832546]
We propose a novel learning framework using neural mean-field (NMF) dynamics for inference and estimation problems.
Our framework can simultaneously learn the structure of the diffusion network and the evolution of node infection probabilities.
arXiv Detail & Related papers (2021-06-03T00:02:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.