Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits
- URL: http://arxiv.org/abs/2004.06231v1
- Date: Mon, 13 Apr 2020 23:09:15 GMT
- Title: Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits
- Authors: Robert Peharz, Steven Lang, Antonio Vergari, Karl Stelzner, Alejandro Molina, Martin Trapp, Guy Van den Broeck, Kristian Kersting, Zoubin Ghahramani
- Abstract summary: We propose Einsum Networks (EiNets), a novel implementation design for PCs.
At their core, EiNets combine a large number of arithmetic operations in a single monolithic einsum operation.
We show that the implementation of Expectation-Maximization (EM) can be simplified for PCs by leveraging automatic differentiation.
- Score: 99.59941892183454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Probabilistic circuits (PCs) are a promising avenue for probabilistic modeling, as they permit a wide range of exact and efficient inference routines. Recent "deep-learning-style" implementations of PCs strive for better scalability, but are still difficult to train on real-world data due to their sparsely connected computational graphs. In this paper, we propose Einsum Networks (EiNets), a novel implementation design for PCs, improving prior art in several regards. At their core, EiNets combine a large number of arithmetic operations in a single monolithic einsum operation, leading to speedups and memory savings of up to two orders of magnitude in comparison to previous implementations. As an algorithmic contribution, we show that the implementation of Expectation-Maximization (EM) can be simplified for PCs by leveraging automatic differentiation. Furthermore, we demonstrate that EiNets scale well to datasets that were previously out of reach, such as SVHN and CelebA, and that they can be used as faithful generative image models.
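The two core ideas above, fusing many product and sum operations into one monolithic einsum evaluated in log-space and recovering the EM update from automatic differentiation, can be illustrated with a short PyTorch sketch. This is a minimal illustration under assumed shapes and names (B, K_in, K_out, P, einsum_layer, the random stand-in child log-densities), not the authors' EiNet code:

```python
import torch

torch.manual_seed(0)

# Illustrative sizes: batch, child components, output components, parallel partitions.
B, K_in, K_out, P = 64, 8, 8, 10

# Stand-ins for the child log-densities of the left/right sub-regions of each partition.
logp_left = torch.randn(B, P, K_in)
logp_right = torch.randn(B, P, K_in)

# One (K_out x K_in x K_in) block of sum weights per partition, normalized per output node.
weights = torch.rand(P, K_out, K_in, K_in)
weights = weights / weights.sum(dim=(2, 3), keepdim=True)
weights.requires_grad_(True)

def einsum_layer(logp_l, logp_r, w):
    # Product nodes: outer sums of child log-densities -> (B, P, K_in, K_in).
    log_prod = logp_l.unsqueeze(-1) + logp_r.unsqueeze(-2)
    # Log-sum-exp trick so the exp/einsum/log round trip stays numerically stable.
    m = log_prod.amax(dim=(-2, -1), keepdim=True)
    prod = (log_prod - m).exp()
    # A single monolithic einsum evaluates all sum nodes of all partitions at once.
    out = torch.einsum('bpij,pkij->bpk', prod, w)
    return out.clamp_min(1e-30).log() + m.squeeze(-1)

log_out = einsum_layer(logp_left, logp_right, weights)  # (B, P, K_out)
log_like = log_out.sum()  # stand-in for the root log-likelihood of the batch

# EM via automatic differentiation: a PC's output is linear in each sum weight,
# so w * d(log-likelihood)/dw equals the expected sufficient statistics
# (responsibilities); renormalizing them gives the EM update for the weights.
log_like.backward()
with torch.no_grad():
    stats = weights * weights.grad
    new_weights = stats / stats.sum(dim=(2, 3), keepdim=True)
```

In a full circuit the layer output would feed further einsum layers up to the root; here `log_out.sum()` only stands in for that root log-likelihood so the gradient-based EM step can be shown end to end.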
Related papers
- Learning From Simplicial Data Based on Random Walks and 1D Convolutions [6.629765271909503]
We propose SCRaWl, a simplicial complex neural network learning architecture based on random walks and fast 1D convolutions.
We empirically evaluate SCRaWl on real-world datasets and show that it outperforms other simplicial neural networks.
arXiv Detail & Related papers (2024-04-04T13:27:22Z)
- Neural Network Approximators for Marginal MAP in Probabilistic Circuits [11.917134619219079]
We propose an approach that uses neural networks to approximate (M)MAP inference in PCs.
The two main benefits of our method are that it is self-supervised and that, once the neural network is learned, it requires only linear time to output a solution.
We evaluate our new approach on several benchmark datasets and show that it outperforms three competing linear time approximations.
arXiv Detail & Related papers (2024-02-06T01:15:06Z)
- Probabilistic Integral Circuits [11.112802758446344]
We introduce probabilistic integral circuits (PICs), a new language of computational graphs that extends PCs with integral units representing continuous latent variables (LVs).
In practice, we parameterise PICs with light-weight neural nets, which yields an intractable hierarchical continuous mixture that can be approximated by a tractable PC.
We show that such PIC-approximating PCs systematically outperform PCs commonly learned via expectation-maximization or SGD.
arXiv Detail & Related papers (2023-10-25T20:38:18Z)
- Sparse Probabilistic Circuits via Pruning and Growing [30.777764474107663]
Probabilistic circuits (PCs) are a tractable representation of probability distributions allowing for exact and efficient computation of likelihoods and marginals.
We propose two operations, pruning and growing, that exploit the sparsity of PC structures.
By alternately applying pruning and growing, we increase the capacity that is meaningfully used, allowing us to significantly scale up PC learning.
arXiv Detail & Related papers (2022-11-22T19:54:52Z)
- Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling and Design [68.1682448368636]
We present a supervised pretraining approach to learn circuit representations that can be adapted to new unseen topologies or unseen prediction tasks.
To cope with the variable topological structure of different circuits, we describe each circuit as a graph and use graph neural networks (GNNs) to learn node embeddings.
We show that pretraining GNNs on prediction of output node voltages can encourage learning representations that can be adapted to new unseen topologies or prediction of new circuit level properties.
arXiv Detail & Related papers (2022-03-29T21:18:47Z)
- HyperSPNs: Compact and Expressive Probabilistic Circuits [89.897635970366]
HyperSPNs are a new paradigm for generating the mixture weights of large PCs with a small-scale neural network, which acts as a regularizer.
We show the merits of this regularization strategy on two state-of-the-art PC families introduced in recent literature.
arXiv Detail & Related papers (2021-12-02T01:24:43Z)
- CREPO: An Open Repository to Benchmark Credal Network Algorithms [78.79752265884109]
Credal networks are imprecise probabilistic graphical models based on so-called credal sets of probability mass functions.
A Java library called CREMA has been recently released to model, process and query credal networks.
We present CREPO, an open repository of synthetic credal networks, provided together with the exact results of inference tasks on these models.
arXiv Detail & Related papers (2021-05-10T07:31:59Z)
- One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High-speed, low-energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is demonstrated in simulations of predicting Boston housing prices and of training a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)