Exchangeability-Aware Sum-Product Networks
- URL: http://arxiv.org/abs/2110.05165v1
- Date: Mon, 11 Oct 2021 11:25:31 GMT
- Title: Exchangeability-Aware Sum-Product Networks
- Authors: Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt
- Abstract summary: Sum-Product Networks (SPNs) are expressive probabilistic models that provide exact, tractable inference.
The contribution of this paper is a novel probabilistic model which we call Exchangeability-Aware Sum-Product Networks (XSPNs).
- Score: 10.506336354512145
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sum-Product Networks (SPNs) are expressive probabilistic models that provide
exact, tractable inference. They achieve this efficiency by making use of
local independence. On the other hand, mixtures of exchangeable variable models
(MEVMs) are a class of tractable probabilistic models that make use of
exchangeability of random variables to render inference tractable.
Exchangeability, which arises naturally in systems consisting of multiple,
interrelated entities, has not been considered for efficient representation and
inference in SPNs yet. The contribution of this paper is a novel probabilistic
model which we call Exchangeability-Aware Sum-Product Networks (XSPNs). It
contains both SPNs and MEVMs as special cases, and combines the ability of SPNs
to efficiently learn deep probabilistic models with the ability of MEVMs to
efficiently handle exchangeable random variables. We also introduce a structure
learning algorithm for XSPNs and empirically show that they can be more
accurate and efficient than conventional SPNs when the data contains repeated,
interchangeable parts.
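The paper does not include code, but the core construction can be sketched. Below is a minimal, hypothetical illustration (class names and parameters are invented here, not the authors' implementation): a block of exchangeable binary variables becomes a single SPN leaf whose likelihood depends only on the number of ones, so an n-variable block needs only n+1 parameters instead of 2^n.

```python
from math import comb

class Leaf:
    """Ordinary Bernoulli leaf over a single binary variable."""
    def __init__(self, var, p):
        self.var, self.p = var, p
    def prob(self, x):
        return self.p if x[self.var] == 1 else 1.0 - self.p

class ExchangeableLeaf:
    """Exchangeable variable model (EVM) over a block of binary variables:
    the joint probability depends only on the number of ones, so the block
    is parameterized by a distribution over counts."""
    def __init__(self, vars, count_weights):
        assert len(count_weights) == len(vars) + 1
        self.vars, self.w = vars, count_weights
    def prob(self, x):
        k = sum(x[v] for v in self.vars)
        # All assignments with k ones are equally likely under exchangeability.
        return self.w[k] / comb(len(self.vars), k)

class Sum:
    def __init__(self, weights, children):
        self.weights, self.children = weights, children
    def prob(self, x):
        return sum(w * c.prob(x) for w, c in zip(self.weights, self.children))

class Product:
    def __init__(self, children):
        self.children = children
    def prob(self, x):
        p = 1.0
        for c in self.children:
            p *= c.prob(x)
        return p

# A toy XSPN: variable 0 behaves individually, variables 1-3 are exchangeable.
xspn = Sum([0.6, 0.4], [
    Product([Leaf(0, 0.9), ExchangeableLeaf([1, 2, 3], [0.1, 0.2, 0.3, 0.4])]),
    Product([Leaf(0, 0.2), ExchangeableLeaf([1, 2, 3], [0.4, 0.3, 0.2, 0.1])]),
])
print(xspn.prob({0: 1, 1: 1, 2: 0, 3: 1}))  # exact likelihood in one bottom-up pass
```

Restricting every block to a single variable recovers an ordinary SPN, while a single sum node over exchangeable leaves recovers an MEVM, matching the two special cases named in the abstract.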
Related papers
- $χ$SPN: Characteristic Interventional Sum-Product Networks for Causal Inference in Hybrid Domains [19.439265962277716]
We propose a Characteristic Interventional Sum-Product Network ($χ$SPN) that is capable of estimating interventional distributions in the presence of mixed discrete and continuous random variables.
$χ$SPN uses characteristic functions in the leaves of an interventional SPN (iSPN), thereby providing a unified view for discrete and continuous random variables.
A neural network is used to estimate the parameters of the learned iSPN using the intervened data.
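To see why characteristic functions (CFs) unify discrete and continuous leaves, here is a hedged sketch (the variable names and two-component mixture are invented; this is not the authors' code): CFs mix linearly at a sum node and, over a product node's independent scopes, factorize.

```python
import cmath

def gaussian_cf(var, mu, sigma):
    """CF of a continuous Gaussian leaf: phi(t) = exp(i*mu*t - sigma^2*t^2/2)."""
    return lambda t: cmath.exp(1j * mu * t[var] - 0.5 * (sigma * t[var]) ** 2)

def poisson_cf(var, lam):
    """CF of a discrete Poisson leaf: phi(t) = exp(lam*(e^{it} - 1)) --
    handled through exactly the same interface as the continuous leaf."""
    return lambda t: cmath.exp(lam * (cmath.exp(1j * t[var]) - 1))

def sum_node(weights, children):
    # CFs are linear in the distribution: a mixture's CF is the
    # weighted sum of its children's CFs.
    return lambda t: sum(w * c(t) for w, c in zip(weights, children))

def product_node(children):
    # A product node asserts independence over disjoint scopes, and the
    # joint CF of independent variables factorizes.
    def cf(t):
        out = 1 + 0j
        for c in children:
            out *= c(t)
        return out
    return cf

# Joint CF of a continuous variable "x" and a discrete count "k".
root = sum_node([0.5, 0.5], [
    product_node([gaussian_cf("x", 0.0, 1.0), poisson_cf("k", 2.0)]),
    product_node([gaussian_cf("x", 3.0, 0.5), poisson_cf("k", 5.0)]),
])
print(root({"x": 0.3, "k": -0.1}))  # a complex number, as CFs generally are
```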
arXiv Detail & Related papers (2024-08-14T13:31:32Z)
- Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation [59.41178047749177]
We focus on multi-domain Neural Machine Translation, with the goal of developing efficient models which can handle data from various domains seen during training and are robust to domains unseen during training.
We hypothesize that Sparse Mixture-of-Experts (SMoE) models are a good fit for this task, as they enable efficient model scaling.
We conduct a series of experiments aimed at validating the utility of SMoE for the multi-domain scenario, and find that straightforward width scaling of the Transformer is a simpler and, in practice, surprisingly more efficient approach that reaches the same performance level as SMoE.
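For readers unfamiliar with SMoE, a toy top-k routing layer under assumed shapes (a sketch of the mechanism, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def smoe_layer(x, gate_w, experts, k=1):
    """Sparse MoE: route each token to its top-k experts.

    x: (tokens, d); gate_w: (d, n_experts); experts: list of (W, b) ReLU FFNs.
    All shapes here are hypothetical, chosen only for illustration.
    """
    logits = x @ gate_w                           # (tokens, n_experts)
    top = np.argsort(logits, axis=1)[:, -k:]      # indices of the top-k experts
    sel = np.take_along_axis(logits, top, axis=1)
    gates = np.exp(sel - sel.max(axis=1, keepdims=True))
    gates /= gates.sum(axis=1, keepdims=True)     # softmax over selected experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                   # only k experts run per token
        for j, e in enumerate(top[t]):
            W, b = experts[e]
            out[t] += gates[t, j] * np.maximum(x[t] @ W + b, 0.0)
    return out

d, n_experts = 8, 4
experts = [(rng.normal(size=(d, d)) * 0.1, np.zeros(d)) for _ in range(n_experts)]
tokens = rng.normal(size=(5, d))
print(smoe_layer(tokens, rng.normal(size=(d, n_experts)) * 0.1, experts, k=2).shape)
```

The parameter count grows with the number of experts while per-token compute stays roughly constant, which is the efficiency argument the paper tests against plain width scaling.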
arXiv Detail & Related papers (2024-07-01T09:45:22Z)
- Top-Down Bayesian Posterior Sampling for Sum-Product Networks [32.01426831450348]
Sum-product networks (SPNs) are probabilistic models characterized by exact and fast evaluation of fundamental probabilistic operations.
This study aimed to develop a Bayesian learning approach that can be efficiently implemented on large-scale SPNs.
Our method has improved learning-time complexity; in numerical experiments on more than 20 datasets, it was tens to over one hundred times faster and showed superior predictive performance.
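The summary leaves the sampler abstract; as background, here is a single-sum-node sketch of the standard Dirichlet-categorical conjugacy that Bayesian SPN learners build on (all sizes and the child likelihoods are placeholders, not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_sum_weights(counts, alpha=1.0):
    """With a Dirichlet(alpha) prior on a sum node's child weights and
    counts[c] data points currently routed to child c, the conditional
    posterior is Dirichlet(alpha + counts)."""
    return rng.dirichlet(alpha + np.asarray(counts, dtype=float))

def sample_assignments(weights, child_likelihoods):
    """Re-route each data point to a child with probability proportional
    to (mixture weight) * (child likelihood)."""
    post = weights * child_likelihoods            # (points, children) by broadcast
    post /= post.sum(axis=1, keepdims=True)
    return np.array([rng.choice(len(weights), p=row) for row in post])

# A few Gibbs sweeps over one sum node with 3 children and 6 data points.
child_lik = rng.uniform(0.1, 1.0, size=(6, 3))   # placeholder likelihoods
weights = rng.dirichlet(np.ones(3))
for _ in range(5):
    z = sample_assignments(weights, child_lik)
    counts = np.bincount(z, minlength=3)
    weights = sample_sum_weights(counts)
print(weights)
```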
arXiv Detail & Related papers (2024-06-18T07:36:45Z)
- Making Pre-trained Language Models Great on Tabular Prediction [50.70574370855663]
The transferability of deep neural networks (DNNs) has driven significant progress in image and language processing.
We present TP-BERTa, a specifically pre-trained LM for tabular data prediction.
A novel relative magnitude tokenization converts scalar numerical feature values to finely discrete, high-dimensional tokens, and an intra-feature attention approach integrates feature values with the corresponding feature names.
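One plausible reading of that tokenization step, purely an assumption for illustration and not TP-BERTa's published scheme, is quantile binning of each numeric column into discrete token ids:

```python
import numpy as np

def magnitude_tokenize(value, bin_edges):
    """Map a scalar feature value to a discrete magnitude token id.
    A guessed sketch: values are bucketed into quantile bins fit on the
    training column, and each bin id becomes a token the LM can attend
    over alongside the corresponding feature name."""
    return int(np.searchsorted(bin_edges, value, side="right"))

train_col = np.random.default_rng(2).lognormal(size=1000)
edges = np.quantile(train_col, np.linspace(0, 1, 33)[1:-1])  # 32 bins
print(magnitude_tokenize(1.7, edges))  # token id in [0, 32)
```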
arXiv Detail & Related papers (2024-03-04T08:38:56Z)
- Deep Stochastic Processes via Functional Markov Transition Operators [59.55961312230447]
We introduce a new class of Stochastic Processes (SPs) constructed by stacking sequences of neurally parameterised Markov transition operators in function space.
We prove that these Markov transition operators can preserve the exchangeability and consistency of SPs.
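A toy illustration of the exchangeability claim (not the paper's construction): if a transition updates every function value with the same network, conditioned only on a permutation-invariant summary, then the operator is permutation-equivariant, which is the property that lets stacking preserve exchangeability.

```python
import numpy as np

rng = np.random.default_rng(3)
W1, W2 = rng.normal(size=(2, 4)), rng.normal(size=(4, 1))  # toy shared network

def transition(f_vals):
    """One Markov transition on function values f(x_1..x_n): each value is
    updated by the same map, conditioned on the mean of all values (a
    permutation-invariant summary), so permuting inputs permutes outputs."""
    summary = f_vals.mean()
    inp = np.stack([f_vals, np.full_like(f_vals, summary)], axis=1)  # (n, 2)
    return (np.tanh(inp @ W1) @ W2).ravel()

f = rng.normal(size=5)
perm = rng.permutation(5)
assert np.allclose(transition(f)[perm], transition(f[perm]))  # equivariance holds
```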
arXiv Detail & Related papers (2023-05-24T21:15:23Z)
- Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance, and in particular the sample complexity, of deep neural networks.
Group-equivariant convolutions are a popular approach to obtain equivariant representations.
We propose a multi-stream architecture, where each stream is invariant to a different transformation.
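The classic way to make a single stream invariant to a finite transformation group is to average its features over the input's orbit; a minimal sketch of the multi-stream idea built on that trick (the feature map and groups below are made up for illustration):

```python
import numpy as np

def invariant_stream(x, transforms, features):
    """Make a feature map invariant to a finite transformation group by
    averaging it over the orbit of the input."""
    return np.mean([features(t(x)) for t in transforms], axis=0)

def multi_stream(x, streams):
    """Each stream is invariant to a different group; concatenating them
    keeps every per-stream guarantee at once."""
    return np.concatenate([invariant_stream(x, ts, f) for ts, f in streams])

rng = np.random.default_rng(4)
W = rng.normal(size=(16, 8))
feat = lambda img: np.tanh(img.ravel() @ W)      # toy feature extractor
rotations = [lambda m, k=k: np.rot90(m, k) for k in range(4)]
flips = [lambda m: m, np.fliplr]

img = rng.normal(size=(4, 4))
print(multi_stream(img, [(rotations, feat), (flips, feat)]).shape)  # (16,)
# The rotation stream really is rotation-invariant:
assert np.allclose(invariant_stream(img, rotations, feat),
                   invariant_stream(np.rot90(img), rotations, feat))
```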
arXiv Detail & Related papers (2023-03-02T20:44:45Z)
- PECAN: A Product-Quantized Content Addressable Memory Network [6.530758154165138]
The filtering and linear transform are realized solely with product quantization (PQ).
This results in a natural implementation via content addressable memory (CAM).
Experiments confirm the feasibility of such a Product-Quantized Content Addressable Memory Network (PECAN).
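The standard PQ mechanics behind that claim, with made-up sizes (PECAN's actual CAM realization differs): a weight vector is stored as a handful of codeword indices, and a dot product reduces to table lookups.

```python
import numpy as np

rng = np.random.default_rng(5)
d, m, K = 16, 4, 8                     # dim, subspaces, codewords per subspace
sub = d // m
codebooks = rng.normal(size=(m, K, sub))   # in PQ these are learned offline

def pq_encode(w):
    """Store a weight vector as m codeword indices (nearest per subspace)."""
    parts = w.reshape(m, sub)
    return np.array([np.argmin(((codebooks[j] - parts[j]) ** 2).sum(1))
                     for j in range(m)])

def pq_dot(codes, x):
    """Approximate w . x with m table lookups: precompute the dot product of
    x's j-th chunk with every codeword, then just index -- the lookup step is
    what maps naturally onto content addressable memory."""
    xs = x.reshape(m, sub)
    luts = np.einsum("jks,js->jk", codebooks, xs)   # (m, K) lookup tables
    return luts[np.arange(m), codes].sum()

w, x = rng.normal(size=d), rng.normal(size=d)
codes = pq_encode(w)
print(pq_dot(codes, x), "~", w @ x)   # accuracy depends on the codebooks
```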
arXiv Detail & Related papers (2022-08-13T08:33:56Z)
- Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper, we propose an uncertainty quantification approach that models the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset and the FashionMNIST vs MNIST dataset.
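The rank-1 batch-ensemble trick the paper incorporates can be sketched in a few lines (shapes here are illustrative, not the BE-SNN architecture):

```python
import numpy as np

rng = np.random.default_rng(6)
d_in, d_out, members = 8, 4, 3
W = rng.normal(size=(d_in, d_out)) * 0.1   # shared slow weights
S = rng.normal(size=(members, d_in))       # per-member rank-1 fast weights
R = rng.normal(size=(members, d_out))

def batch_ensemble_forward(x, i):
    """Member i uses W_i = W * outer(s_i, r_i) without materializing it:
    scale the input by s_i elementwise, apply the shared W, scale by r_i.
    The ensemble costs ~one weight matrix plus two vectors per member."""
    return ((x * S[i]) @ W) * R[i]

x = rng.normal(size=d_in)
preds = np.stack([batch_ensemble_forward(x, i) for i in range(members)])
print(preds.mean(axis=0))   # ensemble average; member spread ~ uncertainty
```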
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
- Sum-Product-Transform Networks: Exploiting Symmetries using Invertible Transformations [1.539942973115038]
Sum-Product-Transform Networks (SPTNs) are an extension of sum-product networks that use invertible transformations as additional internal nodes.
G-SPTNs achieve state-of-the-art results on the density estimation task and are competitive with state-of-the-art methods for anomaly detection.
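A single transform node under a standard-normal leaf shows the mechanism (a sketch only; the paper composes such nodes inside full sum-product structures):

```python
import numpy as np

rng = np.random.default_rng(7)

class GaussianLeaf:
    """Standard multivariate normal leaf."""
    def logpdf(self, x):
        d = x.shape[0]
        return -0.5 * (x @ x + d * np.log(2 * np.pi))

class TransformNode:
    """Internal node applying an invertible affine map f(x) = A x + b.
    By the change-of-variables formula,
        log p(x) = log p_child(f(x)) + log |det A|,
    so the node reshapes its child's density while keeping it normalized."""
    def __init__(self, A, b, child):
        self.A, self.b, self.child = A, b, child
        self.logdet = np.linalg.slogdet(A)[1]
    def logpdf(self, x):
        return self.child.logpdf(self.A @ x + self.b) + self.logdet

A = rng.normal(size=(2, 2)) + 2 * np.eye(2)   # almost surely invertible
node = TransformNode(A, rng.normal(size=2), GaussianLeaf())
print(node.logpdf(rng.normal(size=2)))
```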
arXiv Detail & Related papers (2020-05-04T07:05:51Z)
- Sum-product networks: A survey [0.0]
A sum-product network (SPN) is a probabilistic model, based on a rooted acyclic directed graph.
This paper offers a survey of SPNs, including their definition, the main algorithms for inference and learning from data, the main applications, a brief review of software libraries, and a comparison with related models.
arXiv Detail & Related papers (2020-04-02T17:46:29Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
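A hard-assignment simplification of the idea, using one-to-one matching (a special case of optimal transport with uniform marginals; not the paper's soft OT with activation-based costs):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_layers(Wa, Wb):
    """Fuse one layer of two models: align model B's neurons to model A's by
    solving an assignment problem on pairwise weight distances, then average
    the aligned weight matrices. (A full fusion must also propagate the
    matching to the next layer's input dimension.)"""
    cost = ((Wa[:, None, :] - Wb[None, :, :]) ** 2).sum(-1)  # (na, nb)
    rows, cols = linear_sum_assignment(cost)                 # optimal matching
    return 0.5 * (Wa + Wb[cols])

rng = np.random.default_rng(8)
Wa = rng.normal(size=(4, 6))                                  # 4 neurons, 6 inputs
Wb = Wa[rng.permutation(4)] + 0.01 * rng.normal(size=(4, 6))  # permuted twin
print(np.abs(fuse_layers(Wa, Wb) - Wa).max())  # small: matching undoes the permutation
```

The example illustrates why naive weight averaging fails across independently trained networks: neurons are only defined up to permutation, and alignment restores the correspondence before averaging.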
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.