Active Learning of Markov Decision Processes using Baum-Welch algorithm
(Extended)
- URL: http://arxiv.org/abs/2110.03014v1
- Date: Wed, 6 Oct 2021 18:54:19 GMT
- Title: Active Learning of Markov Decision Processes using Baum-Welch algorithm
(Extended)
- Authors: Giovanni Bacci, Anna Ingólfsdóttir, Kim Larsen, Raphaël Reynouard
- Abstract summary: This paper revisits and adapts the classic Baum-Welch algorithm for learning Markov decision processes and Markov chains.
We empirically compare our approach with state-of-the-art tools and demonstrate that the proposed active learning procedure can significantly reduce the number of observations required to obtain accurate models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cyber-physical systems (CPSs) are naturally modelled as reactive systems with
nondeterministic and probabilistic dynamics. Model-based verification
techniques have proved effective in the deployment of safety-critical CPSs.
Central for a successful application of such techniques is the construction of
an accurate formal model for the system. Manual construction can be a
resource-demanding and error-prone process, thus motivating the design of
automata learning algorithms to synthesise a system model from observed system
behaviours.
This paper revisits and adapts the classic Baum-Welch algorithm for learning
Markov decision processes and Markov chains. For the case of MDPs, which
typically demand more observations, we present a model-based active learning
sampling strategy that chooses examples which are most informative w.r.t. the
current model hypothesis. We empirically compare our approach with
state-of-the-art tools and demonstrate that the proposed active learning
procedure can significantly reduce the number of observations required to
obtain accurate models.
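The Baum-Welch algorithm that the paper builds on is an expectation-maximisation procedure, classically stated for hidden Markov models. As a rough self-contained illustration of the forward-backward recursion and re-estimation step it rests on (a textbook HMM version, not the paper's MC/MDP adaptation; the toy parameters and names below are ours):

```python
def forward(obs, pi, A, B):
    """Forward pass: alpha[t][i] = P(o_1..o_t, s_t = i)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for t in range(1, len(obs)):
        alpha.append([B[j][obs[t]] * sum(alpha[t - 1][i] * A[i][j]
                                         for i in range(n))
                      for j in range(n)])
    return alpha

def backward(obs, A, B):
    """Backward pass: beta[t][i] = P(o_{t+1}..o_T | s_t = i)."""
    n, T = len(A), len(obs)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n))
    return beta

def baum_welch_step(obs, pi, A, B):
    """One EM iteration: re-estimate (pi, A, B) from expected counts.

    Returns the new parameters and the likelihood Z of `obs` under the
    *input* parameters; EM guarantees Z never decreases across iterations.
    """
    n, T = len(pi), len(obs)
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    Z = sum(alpha[T - 1][i] for i in range(n))  # sequence likelihood
    # gamma[t][i]: posterior probability of being in state i at time t
    gamma = [[alpha[t][i] * beta[t][i] / Z for i in range(n)]
             for t in range(T)]
    # xi[t][i][j]: posterior probability of the transition i -> j at time t
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / Z
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    m = len(B[0])
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == o) /
              sum(gamma[t][i] for t in range(T))
              for o in range(m)] for i in range(n)]
    return new_pi, new_A, new_B, Z
```

Iterating the step never decreases the sequence likelihood `Z` (the standard EM monotonicity guarantee), which is what makes the algorithm attractive as a basis for learning probabilistic models from observed behaviours.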
Related papers
- Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching.
We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.
Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z) - Model-based Policy Optimization using Symbolic World Model [46.42871544295734]
The application of learning-based control methods in robotics presents significant challenges.
One is the low sample efficiency with which model-free reinforcement learning algorithms use observation data.
We suggest approximating transition dynamics with symbolic expressions, which are generated via symbolic regression.
arXiv Detail & Related papers (2024-07-18T13:49:21Z) - Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both tractable variational learning algorithm and effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Partitioned Active Learning for Heterogeneous Systems [5.331649110169476]
We propose the partitioned active learning strategy established upon partitioned GP (PGP) modeling.
Global searching scheme accelerates the exploration aspect of active learning.
Local searching exploits the active learning criterion induced by the local GP model.
arXiv Detail & Related papers (2021-05-14T02:05:31Z) - Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z) - Prediction-Centric Learning of Independent Cascade Dynamics from Partial
Observations [13.680949377743392]
We address the problem of learning a spreading model such that the predictions generated from this model are accurate.
We introduce a computationally efficient algorithm, based on a scalable dynamic message-passing approach.
We show that tractable inference from the learned model generates a better prediction of marginal probabilities compared to the original model.
arXiv Detail & Related papers (2020-07-13T17:58:21Z) - Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z) - Online learning of both state and dynamics using ensemble Kalman filters [0.0]
This paper investigates the possibility of learning both the dynamics and the state online, i.e. updating their estimates at any time.
We consider the implications of learning dynamics online through (i) a global EnKF, (ii) a local EnKF, and (iii) an iterative EnKF.
We then demonstrate numerically the efficiency and assess the accuracy of these methods using one-dimensional, one-scale and two-scale chaotic Lorenz models.
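A common way to learn dynamics and state jointly with an EnKF is to augment the state with the unknown dynamics parameters and let the ensemble cross-covariance update both. As a hedged sketch of that idea on a scalar linear system with one unknown parameter (a toy setup of ours, not the paper's Lorenz experiments or its global/local/iterative variants):

```python
import random

def enkf_step(ensemble, y_obs, obs_noise_std, dyn_noise_std):
    """One EnKF cycle on an ensemble of augmented members (x, a):
    forecast each member through x' = a * x + noise, then update both
    x and a with ensemble Kalman gains for a direct observation of x."""
    # Forecast step: propagate each member with its own parameter a
    fc = [(a * x + random.gauss(0.0, dyn_noise_std), a) for x, a in ensemble]
    N = len(fc)
    mx = sum(x for x, _ in fc) / N
    ma = sum(a for _, a in fc) / N
    # Ensemble (cross-)covariances with the observed component x
    pxx = sum((x - mx) ** 2 for x, _ in fc) / (N - 1)
    pax = sum((a - ma) * (x - mx) for x, a in fc) / (N - 1)
    r = obs_noise_std ** 2
    kx = pxx / (pxx + r)  # gain for the state
    ka = pax / (pxx + r)  # gain for the dynamics parameter
    # Analysis step: perturbed observations, joint update of x and a
    out = []
    for x, a in fc:
        innov = y_obs + random.gauss(0.0, obs_noise_std) - x
        out.append((x + kx * innov, a + ka * innov))
    return out
```

Because the parameter is updated only through its correlation with the observed state, its ensemble spread shrinks over time; practical schemes add inflation or artificial parameter noise to keep the estimate responsive.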
arXiv Detail & Related papers (2020-06-06T13:19:26Z) - Model-based Multi-Agent Reinforcement Learning with Cooperative
Prioritized Sweeping [4.5497948012757865]
We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping.
The algorithm allows for sample-efficient learning on large problems by exploiting a factorization to approximate the value function.
Our method outperforms the state-of-the-art sparse cooperative Q-learning algorithm, both on the well-known SysAdmin benchmark and on randomized environments.
arXiv Detail & Related papers (2020-01-15T19:13:44Z) - Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
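Entropy-regularized RL, one side of the connection above, replaces the hard max in the Q-learning backup with a soft (log-sum-exp) value. A minimal tabular sketch under our own toy assumptions, not the paper's MPC-based algorithm:

```python
import math

def soft_q_update(Q, s, a, r, s2, alpha, gamma, tau):
    """Entropy-regularized (soft) Q-learning backup: the target uses
    V(s') = tau * log sum_a' exp(Q(s',a') / tau) instead of max_a'."""
    soft_v = tau * math.log(sum(math.exp(Q[s2][b] / tau)
                                for b in range(len(Q[s2]))))
    Q[s][a] += alpha * (r + gamma * soft_v - Q[s][a])
```

As the temperature `tau` goes to zero the soft value approaches the hard max and the update reduces to standard Q-learning; larger `tau` keeps the induced policy stochastic.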
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.