Combining Local Symmetry Exploitation and Reinforcement Learning for Optimised Probabilistic Inference -- A Work In Progress
- URL: http://arxiv.org/abs/2503.08786v1
- Date: Tue, 11 Mar 2025 18:00:23 GMT
- Title: Combining Local Symmetry Exploitation and Reinforcement Learning for Optimised Probabilistic Inference -- A Work In Progress
- Authors: Sagad Hamid, Tanya Braun,
- Abstract summary: Efficient probabilistic inference by variable elimination in graphical models requires an optimal elimination order.<n>We adapt a reinforcement learning approach to find efficient contraction orders in tensor networks.<n>We show that leveraging specific structures during inference allows for introducing compact encodings of intermediate results.
- Score: 2.2164989053903805
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient probabilistic inference by variable elimination in graphical models requires an optimal elimination order. However, finding an optimal order is a challenging combinatorial optimisation problem for models with a large number of random variables. Most recently, a reinforcement learning approach has been proposed to find efficient contraction orders in tensor networks. Due to the duality between graphical models and tensor networks, we adapt this approach to probabilistic inference in graphical models. Furthermore, we incorporate structure exploitation into the process of finding an optimal order. Currently, the agent's cost function is formulated in terms of intermediate result sizes which are exponential in the number of indices (i.e., random variables). We show that leveraging specific structures during inference allows for introducing compact encodings of intermediate results which can be significantly smaller. By considering the compact encoding sizes for the cost function instead, we enable the agent to explore more efficient contraction orders. The structure we consider in this work is the presence of local symmetries (i.e., symmetries within a model's factors).
Related papers
- AUGUR, A flexible and efficient optimization algorithm for identification of optimal adsorption sites [0.4188114563181615]
Our model combines graph neural networks and Gaussian processes to create a flexible, efficient, symmetry-aware, translation, and rotation-invariant predictor.
It determines the optimal position of large and complicated clusters with far fewer iterations than current state-of-the-art approaches.
It does not rely on hand-crafted features and can be seamlessly employed on any molecule without any alterations.
arXiv Detail & Related papers (2024-09-24T16:03:01Z) - SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization [22.888876901031043]
Neural network pruning is a key technique towards engineering large yet scalable, interpretable, generalizable models.<n>We show how many existing differentiable pruning techniques can be understood as non regularization for group sparse optimization.<n>We propose SequentialAttention++, which advances state the art in large-scale neural network block-wise pruning tasks on the ImageNet and Criteo datasets.
arXiv Detail & Related papers (2024-02-27T21:42:18Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation(NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study it a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - Tree ensemble kernels for Bayesian optimization with known constraints
over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
arXiv Detail & Related papers (2022-07-02T16:59:37Z) - Distributed stochastic proximal algorithm with random reshuffling for
non-smooth finite-sum optimization [28.862321453597918]
Non-smooth finite-sum minimization is a fundamental problem in machine learning.
This paper develops a distributed proximal-gradient algorithm with random reshuffling to solve the problem.
arXiv Detail & Related papers (2021-11-06T07:29:55Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores)
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - Probabilistic Circuits for Variational Inference in Discrete Graphical
Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating Evidence Lower Bound (ELBO)
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPN)
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is aweighted the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z) - Supervised Learning for Non-Sequential Data: A Canonical Polyadic
Decomposition Approach [85.12934750565971]
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks.
To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor.
For enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors.
arXiv Detail & Related papers (2020-01-27T22:38:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.