RLBayes: a Bayesian Network Structure Learning Algorithm via Reinforcement Learning-Based Search Strategy
- URL: http://arxiv.org/abs/2504.05167v1
- Date: Mon, 07 Apr 2025 15:11:51 GMT
- Title: RLBayes: a Bayesian Network Structure Learning Algorithm via Reinforcement Learning-Based Search Strategy
- Authors: Mingcan Wang, Junchang Xin, Luxuan Qu, Qi Chen, Zhiqiong Wang,
- Abstract summary: Results: RLBayes can converge to the global optimal BN structure, but also it is experimentally proved that RLBayes has a better effect than almost all other search algorithms.
- Score: 6.048200126394996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The score-based structure learning of Bayesian network (BN) is an effective way to learn BN models, which are regarded as some of the most compelling probabilistic graphical models in the field of representation and reasoning under uncertainty. However, the search space of structure learning grows super-exponentially as the number of variables increases, which makes BN structure learning an NP-hard problem, as well as a combination optimization problem (COP). Despite the successes of many heuristic methods on it, the results of the structure learning of BN are usually unsatisfactory. Inspired by Q-learning, in this paper, a Bayesian network structure learning algorithm via reinforcement learning-based (RL-based) search strategy is proposed, namely RLBayes. The method borrows the idea of RL and tends to record and guide the learning process by a dynamically maintained Q-table. By creating and maintaining the dynamic Q-table, RLBayes achieve storing the unlimited search space within limited space, thereby achieving the structure learning of BN via Q-learning. Not only is it theoretically proved that RLBayes can converge to the global optimal BN structure, but also it is experimentally proved that RLBayes has a better effect than almost all other heuristic search algorithms.
Related papers
- Enhancing Bayesian Network Structural Learning with Monte Carlo Tree Search [0.6574756524825567]
This article presents an adaptation of the Monte Carlo Tree Search (MCTS) algorithm for the structural learning of Bayesian Networks (BNs)<n>Initially designed for game tree exploration, MCTS has been repurposed to address the challenge of learning BN structures.<n>We adopt a semi-randomized approach to address this challenge by incorporating variable orders obtained from other search algorithms such as Greedy Equivalent Search (GES), PC, or HC itself.
arXiv Detail & Related papers (2025-02-03T17:08:21Z) - Upside-Down Reinforcement Learning for More Interpretable Optimal Control [2.06242362470764]
We investigate whether function approximation algorithms other than Neural Networks (NNs) can also be used within a Upside-Down Reinforcement Learning framework.
Our experiments, performed over several popular optimal control benchmarks, show that tree-based methods like Random Forests and Extremely Randomized Trees can perform just as well as NNs.
arXiv Detail & Related papers (2024-11-18T10:44:20Z) - A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time"
It is based on differential equations that: (i) can be integrated without the need of external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open to novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z) - Pointer Networks with Q-Learning for Combinatorial Optimization [55.2480439325792]
We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets)
Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.
arXiv Detail & Related papers (2023-11-05T12:03:58Z) - Online Network Source Optimization with Graph-Kernel MAB [62.6067511147939]
We propose Grab-UCB, a graph- kernel multi-arms bandit algorithm to learn online the optimal source placement in large scale networks.
We describe the network processes with an adaptive graph dictionary model, which typically leads to sparse spectral representations.
We derive the performance guarantees that depend on network parameters, which further influence the learning curve of the sequential decision strategy.
arXiv Detail & Related papers (2023-07-07T15:03:42Z) - Offline Reinforcement Learning with Differentiable Function
Approximation is Provably Efficient [65.08966446962845]
offline reinforcement learning, which aims at optimizing decision-making strategies with historical data, has been extensively applied in real-life applications.
We take a step by considering offline reinforcement learning with differentiable function class approximation (DFA)
Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z) - Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning [92.18524491615548]
Contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL)
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions.
Under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
arXiv Detail & Related papers (2022-07-29T17:29:08Z) - Learning Bayesian Networks in the Presence of Structural Side
Information [22.734574764075226]
We study the problem of learning a Bayesian network (BN) of a set of variables when structural side information about the system is available.
We develop an algorithm that efficiently incorporates such knowledge into the learning process.
As a consequence of our work, we show that bounded treewidth BNs can be learned with complexity.
arXiv Detail & Related papers (2021-12-20T22:14:19Z) - Feature Selection for Efficient Local-to-Global Bayesian Network
Structure Learning [18.736822756439437]
We propose an efficient F2SL (feature selection-based structure learning) approach to local-to-global BN structure learning.
The F2SL approach first employs the MRMR approach to learn a DAG skeleton, then orients edges in the skeleton.
Compared to the state-of-the-art local-to-global BN learning algorithms, the experiments validated that the proposed algorithms are more efficient and provide competitive structure learning quality.
arXiv Detail & Related papers (2021-12-20T07:44:38Z) - Provable Hierarchy-Based Meta-Reinforcement Learning [50.17896588738377]
We analyze HRL in the meta-RL setting, where learner learns latent hierarchical structure during meta-training for use in a downstream task.
We provide "diversity conditions" which, together with a tractable optimism-based algorithm, guarantee sample-efficient recovery of this natural hierarchy.
Our bounds incorporate common notions in HRL literature such as temporal and state/action abstractions, suggesting that our setting and analysis capture important features of HRL in practice.
arXiv Detail & Related papers (2021-10-18T17:56:02Z) - Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience.
We show that sparse coding can effectively maximize the entropy of the output signals.
Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z) - Any Part of Bayesian Network Structure Learning [17.46459748913491]
We study an interesting and challenging problem, learning any part of a Bayesian network (BN) structure.
We first present a new concept of Expand-Backtracking to explain why local BN structure learning methods have the false edge orientation problem.
We then propose APSL, an efficient and accurate Any Part of BN Structure Learning algorithm.
arXiv Detail & Related papers (2021-03-23T10:03:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.