Top-Down Bayesian Posterior Sampling for Sum-Product Networks
- URL: http://arxiv.org/abs/2406.12353v1
- Date: Tue, 18 Jun 2024 07:36:45 GMT
- Title: Top-Down Bayesian Posterior Sampling for Sum-Product Networks
- Authors: Soma Yokoi, Issei Sato
- Abstract summary: Sum-product networks (SPNs) are probabilistic models characterized by exact and fast evaluation of fundamental probabilistic operations.
This study aimed to develop a Bayesian learning approach that can be efficiently implemented on large-scale SPNs.
Our method improved learning-time complexity and, in numerical experiments on more than 20 datasets, demonstrated computational speeds tens to more than one hundred times faster and superior predictive performance.
- Score: 32.01426831450348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sum-product networks (SPNs) are probabilistic models characterized by exact and fast evaluation of fundamental probabilistic operations. Their superior computational tractability has led to applications in many fields, such as machine learning under time or accuracy constraints and real-time systems. The structural constraints of SPNs that support fast inference, however, lead to increased learning-time complexity and can be an obstacle to building highly expressive SPNs. This study aimed to develop a Bayesian learning approach that can be efficiently implemented on large-scale SPNs. We derived a new full conditional probability for Gibbs sampling by marginalizing multiple random variables to expeditiously obtain the posterior distribution. The complexity analysis revealed that our sampling algorithm works efficiently even for the largest possible SPN. Furthermore, we proposed a hyperparameter tuning method that balances the diversity of the prior distribution and optimization efficiency in large-scale SPNs. Our method improved learning-time complexity and, in numerical experiments on more than 20 datasets, demonstrated computational speeds tens to more than one hundred times faster and superior predictive performance.
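The abstract describes drawing posterior samples by sampling latent branch choices top-down through the network and exploiting conjugacy at sum nodes. The sketch below is a minimal, hypothetical illustration of that general idea on a toy SPN with Bernoulli leaves, not the authors' algorithm: the class names, toy structure, and the uncollapsed Gibbs sweep (sample branch assignments per data point, then resample sum-node weights from a Dirichlet posterior) are simplifying assumptions made purely for illustration.

```python
# Toy top-down Gibbs-style posterior sampling for a tiny SPN.
# Illustrative sketch only: structure, names, and update scheme are assumptions.
import numpy as np

rng = np.random.default_rng(0)

class Leaf:
    """Bernoulli leaf over one binary variable."""
    def __init__(self, var, p):
        self.var, self.p = var, p
    def likelihood(self, x):
        return self.p if x[self.var] == 1 else 1.0 - self.p

class Product:
    def __init__(self, children):
        self.children = children
    def likelihood(self, x):
        return float(np.prod([c.likelihood(x) for c in self.children]))

class Sum:
    """Sum node with a Dirichlet prior over its mixture weights."""
    def __init__(self, children, alpha=1.0):
        self.children = children
        self.alpha = np.full(len(children), alpha)
        self.weights = rng.dirichlet(self.alpha)
        self.counts = np.zeros(len(children))
    def likelihood(self, x):
        return float(np.dot(self.weights,
                            [c.likelihood(x) for c in self.children]))

def top_down_sample(node, x):
    """Sample the latent branch taken at each sum node for one data point,
    descending from the root (top-down), and record branch counts."""
    if isinstance(node, Leaf):
        return
    if isinstance(node, Product):
        for c in node.children:
            top_down_sample(c, x)
        return
    # Sum node: posterior responsibility of each child given x
    child_lik = np.array([c.likelihood(x) for c in node.children])
    probs = node.weights * child_lik
    probs /= probs.sum()
    k = rng.choice(len(node.children), p=probs)
    node.counts[k] += 1
    top_down_sample(node.children[k], x)

def gibbs_step(root, data, sum_nodes):
    """One sweep: sample branch assignments for all data points, then
    resample sum-node weights from the conjugate Dirichlet posterior."""
    for s in sum_nodes:
        s.counts[:] = 0
    for x in data:
        top_down_sample(root, x)
    for s in sum_nodes:
        s.weights = rng.dirichlet(s.alpha + s.counts)

# Toy SPN over two binary variables and a tiny dataset
leaves = [Leaf(0, 0.9), Leaf(0, 0.1), Leaf(1, 0.8), Leaf(1, 0.2)]
root = Sum([Product([leaves[0], leaves[2]]),
            Product([leaves[1], leaves[3]])])
data = np.array([[1, 1], [1, 1], [0, 0], [1, 0]])
for _ in range(100):
    gibbs_step(root, data, [root])
print("posterior-sampled root weights:", root.weights)
```

In this sketch only the sum-node weights are resampled; leaf parameters are held fixed, and the marginalization and hyperparameter tuning described in the paper are not reproduced.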
Related papers
- Adaptive Sampled Softmax with Inverted Multi-Index: Methods, Theory and Applications [79.53938312089308]
The MIDX-Sampler is a novel adaptive sampling strategy based on an inverted multi-index approach.
Our method is backed by rigorous theoretical analysis, addressing key concerns such as sampling bias, gradient bias, convergence rates, and generalization error bounds.
arXiv Detail & Related papers (2025-01-15T04:09:21Z)
- Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z)
- A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport [92.96250725599958]
Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.
We show that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$ and a local quadratic convergence rate under standard regularity conditions.
arXiv Detail & Related papers (2023-10-21T18:48:45Z)
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes algorithms for conditional stochastic optimization in the federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
- Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization [50.10952609321302]
Compositional minimax optimization is a pivotal challenge across various machine learning domains.
Current methods of compositional minimax optimization are plagued by sub-optimal complexities or heavy reliance on sizable batch sizes.
This paper introduces a novel method, called Nested STOchastic Recursive Momentum (NSTORM), which can achieve the optimal sample complexity of $O(\kappa^3/\epsilon^3)$.
arXiv Detail & Related papers (2023-08-18T14:57:21Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling [34.187387951367526]
Federated learning (FL) algorithms usually sample a fraction of clients in each round (partial participation) when the number of participants is large.
Recent works have focused on the convergence analysis of FL.
We obtain a new convergence bound for FL algorithms with arbitrary client sampling probabilities.
arXiv Detail & Related papers (2021-12-21T14:28:40Z)
- Linear Constraints Learning for Spiking Neurons [0.0]
We propose a supervised multi-spike learning algorithm which reduces the required number of training iterations.
Experimental results show this method offers better efficiency compared to existing algorithms on the MNIST dataset.
arXiv Detail & Related papers (2021-03-10T13:54:05Z)
- Improving Neural Network Training in Low Dimensional Random Bases [5.156484100374058]
We show that keeping the random projection fixed throughout training is detrimental to optimization.
We propose re-drawing the random subspace at each step, which yields significantly better performance.
We realize further improvements by applying independent projections to different parts of the network, making the approximation more efficient as network dimensionality grows.
arXiv Detail & Related papers (2020-11-09T19:50:19Z)
- Quantum-enhanced analysis of discrete stochastic processes [0.8057006406834467]
We propose a quantum algorithm for calculating the characteristic function of a discrete stochastic process (DSP), which completely defines its probability distribution, using a number of quantum circuit elements that grows only linearly with the number of time steps.
The algorithm takes all trajectories into account and hence eliminates the need of importance sampling.
arXiv Detail & Related papers (2020-08-14T16:07:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.