Combining Bayesian Inference and Reinforcement Learning for Agent Decision Making: A Review
- URL: http://arxiv.org/abs/2505.07911v1
- Date: Mon, 12 May 2025 13:34:50 GMT
- Title: Combining Bayesian Inference and Reinforcement Learning for Agent Decision Making: A Review
- Authors: Chengmin Zhou, Ville Kyrki, Pasi Fränti, Laura Ruotsalainen
- Abstract summary: This paper focuses on combining Bayesian inference with reinforcement learning. Bayesian inference has many advantages in the decision making of agents over a regular data-driven black-box neural network.
- Score: 7.905957228045954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian inference has many advantages in the decision making of agents (e.g. robotic/simulative agents) over a regular data-driven black-box neural network: data-efficiency, generalization, interpretability, and safety, all of which benefit directly or indirectly from the uncertainty quantification of Bayesian inference. However, there are few comprehensive reviews that summarize the progress of Bayesian inference on reinforcement learning (RL) for decision making and give researchers a systematic understanding. This paper focuses on combining Bayesian inference with RL, which is nowadays an important approach to agent decision making. Specifically, this paper discusses the following five topics: 1) Bayesian methods that have potential for agent decision making. First, basic Bayesian methods and models (the Bayes rule, Bayesian learning, and Bayesian conjugate models) are discussed, followed by variational inference, Bayesian optimization, Bayesian deep learning, Bayesian active learning, Bayesian generative models, Bayesian meta-learning, and lifelong Bayesian learning. 2) Classical combinations of Bayesian methods with model-based RL (with approximation methods), model-free RL, and inverse RL. 3) The latest combinations of potential Bayesian methods with RL. 4) Analytical comparisons of methods that combine Bayesian methods with RL with respect to data-efficiency, generalization, interpretability, and safety. 5) In-depth discussions of six complex problem variants of RL (unknown-reward, partially observable, multi-agent, multi-task, non-linear non-Gaussian, and hierarchical RL problems) and a summary of how Bayesian methods work in the data collection, data processing, and policy learning stages of RL, paving the way for better agent decision-making strategies.
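As a minimal illustration of the conjugate models the review lists among basic Bayesian methods, the Beta-Bernoulli update below shows the closed-form posterior that underlies the data-efficiency argument. The function name and the uniform-prior example are illustrative choices, not taken from the paper.

```python
# Beta-Bernoulli conjugate update: a minimal sketch of Bayesian learning.
# A Beta(alpha, beta) prior over a success probability is conjugate to
# Bernoulli observations, so the posterior stays in closed form.

def beta_bernoulli_update(alpha, beta, observations):
    """Return posterior (alpha, beta) after observing 0/1 outcomes."""
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures

# Uniform prior Beta(1, 1); observe 7 successes and 3 failures.
a, b = beta_bernoulli_update(1.0, 1.0, [1] * 7 + [0] * 3)
posterior_mean = a / (a + b)  # (1 + 7) / (1 + 7 + 1 + 3) = 8/12
posterior_var = a * b / ((a + b) ** 2 * (a + b + 1))  # shrinks as data grows
print(a, b, posterior_mean)
```

Because the posterior is exact and cheap to maintain, an agent can update its belief after every single observation, which is one concrete sense in which conjugate Bayesian models are data-efficient.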
Related papers
- Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning [55.36978389831446]
We recast reflective exploration within the Bayes-Adaptive RL framework.
Our resulting algorithm, BARL, instructs the LLM to stitch and switch strategies based on observed outcomes.
arXiv Detail & Related papers (2025-05-26T22:51:00Z)
- What Matters for Batch Online Reinforcement Learning in Robotics? [65.06558240091758]
The ability to learn from large batches of autonomously collected data for policy improvement holds the promise of enabling truly scalable robot learning.
Previous works have applied imitation learning and filtered imitation learning methods to the batch online RL problem.
We analyze how these axes affect performance and scaling with the amount of autonomous data.
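A rough sketch of what a filtered imitation learning baseline for batch online RL might look like: keep only the autonomously collected trajectories whose return clears a threshold, then treat the surviving state-action pairs as supervised data. The quantile filter, data layout, and function name here are assumptions for illustration, not the paper's method.

```python
import numpy as np

def filtered_behavior_cloning_batch(trajectories, return_quantile=0.5):
    """Filtered imitation learning (sketch): keep trajectories whose return
    reaches a batch quantile, and pool their state-action pairs as
    behavior-cloning data. Each trajectory is a dict with keys
    'states', 'actions', 'return'."""
    returns = np.array([t["return"] for t in trajectories])
    threshold = np.quantile(returns, return_quantile)
    kept = [t for t in trajectories if t["return"] >= threshold]
    states = np.concatenate([t["states"] for t in kept])
    actions = np.concatenate([t["actions"] for t in kept])
    return states, actions

# Four toy trajectories of 5 steps each; the median filter keeps two of them.
trajs = [
    {"states": np.zeros((5, 3)), "actions": np.zeros((5, 1)), "return": r}
    for r in [1.0, 2.0, 3.0, 4.0]
]
states, actions = filtered_behavior_cloning_batch(trajs, return_quantile=0.5)
print(states.shape, actions.shape)
```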
arXiv Detail & Related papers (2025-05-12T21:24:22Z)
- Bayesian inference for data-efficient, explainable, and safe robotic motion planning: A review [2.8660829482416346]
The application of Bayesian inference in robotic motion planning is lagging behind the comprehensive theory of Bayesian inference.
This paper first provides the probabilistic theories of Bayesian inference, which are the preliminaries for Bayesian inference in complex cases.
The analysis of Bayesian inference in inverse RL is given to infer the reward functions in a data-efficient manner.
arXiv Detail & Related papers (2023-07-16T12:29:27Z)
- ContraBAR: Contrastive Bayes-Adaptive Deep RL [22.649531458557206]
In meta reinforcement learning (meta RL), an agent seeks a Bayes-optimal policy -- the optimal policy when facing an unknown task.
We investigate whether contrastive methods can be used for learning Bayes-optimal behavior.
We propose a simple meta RL algorithm that uses contrastive predictive coding (CPC) in lieu of variational belief inference.
arXiv Detail & Related papers (2023-06-04T17:50:20Z)
- Rethinking Bayesian Learning for Data Analysis: The Art of Prior and Inference in Sparsity-Aware Modeling [20.296566563098057]
Sparse modeling for signal processing and machine learning has been a focus of scientific research for over two decades.
This article reviews some recent advances in incorporating sparsity-promoting priors into three popular data modeling tools.
arXiv Detail & Related papers (2022-05-28T00:43:52Z)
- BADDr: Bayes-Adaptive Deep Dropout RL for POMDPs [22.78390558602203]
We present a representation-agnostic formulation of BRL under partial observability, unifying the previous models under one theoretical umbrella.
We also propose a novel derivation, Bayes-Adaptive Deep Dropout RL (BADDr), based on dropout networks.
arXiv Detail & Related papers (2022-02-17T19:48:35Z)
- Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning with Shapley Values [68.8204255655161]
This study proposes a novel approach to explain cooperative strategies in multiagent RL using Shapley values.
Results could have implications for non-discriminatory decision making, ethical and responsible AI-derived decisions or policy making under fairness constraints.
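As a toy illustration of the Shapley-value machinery this approach builds on, the sketch below computes exact Shapley values for a two-agent cooperative game by averaging each agent's marginal contribution over all join orders. The payoff numbers are invented, and brute-force enumeration over permutations is only feasible for small games.

```python
from itertools import permutations

def shapley_values(players, coalition_value):
    """Exact Shapley values: average each player's marginal contribution
    over every order in which the coalition could form."""
    values = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for p in order:
            before = coalition_value(frozenset(coalition))
            coalition.add(p)
            values[p] += coalition_value(frozenset(coalition)) - before
    return {p: v / len(orders) for p, v in values.items()}

# Toy team game: agents A and B earn 10 together, 4 or 2 alone.
payoff = {frozenset(): 0, frozenset("A"): 4, frozenset("B"): 2,
          frozenset("AB"): 10}
print(shapley_values(["A", "B"], lambda c: payoff[c]))
```

The resulting values credit each agent with its average marginal contribution, which is the property that makes Shapley values attractive for attributing team performance to individual agents in multiagent RL.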
arXiv Detail & Related papers (2021-10-04T10:28:57Z)
- Bayesian Bellman Operators [55.959376449737405]
We introduce a novel perspective on Bayesian reinforcement learning (RL).
Our framework is motivated by the insight that when bootstrapping is introduced, model-free approaches actually infer a posterior over Bellman operators, not value functions.
arXiv Detail & Related papers (2021-06-09T12:20:46Z)
- Exploring Bayesian Deep Learning for Urgent Instructor Intervention Need in MOOC Forums [58.221459787471254]
Massive Open Online Courses (MOOCs) have become a popular choice for e-learning thanks to their great flexibility.
Due to large numbers of learners and their diverse backgrounds, it is taxing to offer real-time support.
With the large volume of posts and high workloads for MOOC instructors, it is unlikely that the instructors can identify all learners requiring intervention.
This paper explores for the first time Bayesian deep learning on learner-based text posts with two methods: Monte Carlo Dropout and Variational Inference.
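A minimal sketch of the Monte Carlo Dropout idea used here: keep dropout active at prediction time and average many stochastic forward passes, using the spread of the samples as a cheap uncertainty estimate. The toy NumPy network and its made-up weights are illustrative stand-ins, not the paper's models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer network with fixed, pretend-trained positive weights.
W1 = rng.uniform(0.1, 1.0, size=(8, 4))
W2 = rng.uniform(0.1, 1.0, size=(4, 1))

def mc_dropout_predict(x, n_samples=200, p_drop=0.5):
    """Monte Carlo Dropout (sketch): sample random dropout masks at
    prediction time and summarize the stochastic forward passes."""
    preds = []
    for _ in range(n_samples):
        mask = rng.random(4) > p_drop            # randomly drop hidden units
        h = np.maximum(x @ W1, 0.0) * mask / (1 - p_drop)
        preds.append((h @ W2).item())
    preds = np.array(preds)
    return preds.mean(), preds.std()             # predictive mean, uncertainty

mean, std = mc_dropout_predict(np.ones(8))
print(f"predictive mean {mean:.3f} +/- {std:.3f}")
```

Posts whose predictions come with large `std` are exactly the uncertain cases one might route to a human instructor rather than act on automatically.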
arXiv Detail & Related papers (2021-04-26T15:12:13Z)
- Continual Learning using a Bayesian Nonparametric Dictionary of Weight Factors [75.58555462743585]
Naively trained neural networks tend to experience catastrophic forgetting in sequential task settings.
We propose a principled nonparametric approach based on the Indian Buffet Process (IBP) prior, letting the data determine how much to expand the model complexity.
We demonstrate the effectiveness of our method on a number of continual learning benchmarks and analyze how weight factors are allocated and reused throughout the training.
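The Indian Buffet Process prior at the heart of this method can be sampled with a short generative sketch: each "customer" (data batch) takes existing "dishes" (features) in proportion to their popularity and then tries a Poisson number of new ones, which is how the prior lets the data grow model complexity. Variable names and the toy parameters below are illustrative.

```python
import numpy as np

def sample_ibp(n_customers, alpha, rng):
    """Draw one binary feature-allocation matrix from the Indian Buffet
    Process prior: customer i takes existing dish k with probability
    m_k / i, then samples Poisson(alpha / i) brand-new dishes."""
    dishes = []   # running counts m_k of how often each dish was taken
    rows = []
    for i in range(1, n_customers + 1):
        row = [rng.random() < m / i for m in dishes]
        for k, taken in enumerate(row):
            if taken:
                dishes[k] += 1
        n_new = rng.poisson(alpha / i)
        dishes.extend([1] * n_new)
        rows.append(row + [True] * n_new)
    Z = np.zeros((n_customers, len(dishes)), dtype=bool)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

Z = sample_ibp(10, alpha=2.0, rng=np.random.default_rng(1))
print(Z.shape)  # number of columns (features) is determined by the draw
```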
arXiv Detail & Related papers (2020-04-21T15:20:19Z)
- A Tutorial on Learning With Bayesian Networks [8.98526174345299]
A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest.
A Bayesian network can be used to learn causal relationships.
It can also be used to gain understanding about a problem domain and to predict the consequences of intervention.
arXiv Detail & Related papers (2020-02-01T20:03:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.