Masking schemes for universal marginalisers
- URL: http://arxiv.org/abs/2001.05895v1
- Date: Thu, 16 Jan 2020 15:35:06 GMT
- Title: Masking schemes for universal marginalisers
- Authors: Divya Gautam, Maria Lomeli, Kostis Gourgoulias, Daniel H. Thompson,
Saurabh Johri
- Abstract summary: We consider the effect of structure-agnostic and structure-dependent masking schemes when training a universal marginaliser.
We compare networks trained with different masking schemes in terms of their predictive performance and generalisation properties.
- Score: 1.0412114420493723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the effect of structure-agnostic and structure-dependent masking
schemes when training a universal marginaliser (arXiv:1711.00695) in order to
learn conditional distributions of the form $P(x_i |\mathbf x_{\mathbf b})$,
where $x_i$ is a given random variable and $\mathbf x_{\mathbf b}$ is some
arbitrary subset of all random variables of the generative model of interest.
In other words, we mimic the self-supervised training of a denoising
autoencoder, where a dataset of unlabelled data is used as partially observed
input and the neural approximator is optimised to minimise reconstruction loss.
We focus on studying the underlying process of the partially observed
data---how good is the neural approximator at learning all conditional
distributions when the observation process at prediction time differs from the
masking process during training? We compare networks trained with different
masking schemes in terms of their predictive performance and generalisation
properties.
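
The training loop described above (sample a mask, hide those variables, ask the network to reconstruct them from the observed remainder) can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: the binary toy data, the small MLP, the mask rates, and the hypothetical parent structure used for the structure-dependent scheme are all assumptions made for the example.

```python
# Minimal sketch of masking-based training for a universal-marginaliser-style
# neural approximator. Network size, binary data, and mask schemes are illustrative.
import torch
import torch.nn as nn

D = 8  # number of random variables in the (hypothetical) generative model

def structure_agnostic_mask(batch_size, d, p_missing=0.5):
    """Mask each variable independently with probability p_missing (1 = observed, 0 = masked)."""
    return (torch.rand(batch_size, d) > p_missing).float()

def structure_dependent_mask(batch_size, d, parents):
    """Toy structure-dependent scheme: whenever a node is masked, also mask its
    parents in an assumed DAG, so masks follow the model structure."""
    mask = (torch.rand(batch_size, d) > 0.5).float()
    for child, pa in parents.items():
        for p in pa:
            mask[:, p] = torch.minimum(mask[:, p], mask[:, child])
    return mask

# Neural approximator: takes (masked values, mask indicator) and outputs logits
# for P(x_i = 1 | x_b) for every variable i at once.
net = nn.Sequential(nn.Linear(2 * D, 64), nn.ReLU(), nn.Linear(64, D))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss(reduction="none")

# Toy binary samples standing in for data from the generative model of interest.
data = (torch.rand(1024, D) > 0.5).float()
parents = {3: [0, 1], 5: [2]}  # hypothetical DAG used only by the structure-dependent mask

for step in range(200):
    x = data[torch.randint(0, len(data), (64,))]
    mask = structure_agnostic_mask(64, D)        # or structure_dependent_mask(64, D, parents)
    inp = torch.cat([x * mask, mask], dim=-1)    # observed values plus mask indicator
    logits = net(inp)
    # Reconstruction loss only on the masked (unobserved) variables.
    loss = (bce(logits, x) * (1 - mask)).sum() / (1 - mask).sum().clamp(min=1.0)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

At prediction time any observation pattern $\mathbf x_{\mathbf b}$ can be supplied as the mask, which is exactly where the question studied in the paper arises: the observation process at prediction time need not match the masking process used during training.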
Related papers
- Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $\mathbf{x} \sim p^{\rm post}(\mathbf{x}) \propto p(\mathbf{x})\, r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or function $r(\mathbf{x})$.
We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior.
arXiv Detail & Related papers (2024-05-31T16:18:46Z) - Prediction with Incomplete Data under Agnostic Mask Distribution Shift [35.86200694774949]
We consider prediction with incomplete data in the presence of distribution shift.
We leverage the observation that for each mask, there is an invariant optimal predictor.
We propose a novel prediction method called StableMiss.
arXiv Detail & Related papers (2023-05-18T14:06:06Z) - Deep learning for $\psi$-weakly dependent processes [0.0]
We apply deep neural networks to learning $\psi$-weakly dependent processes.
The consistency of the empirical risk minimization algorithm in the class of deep neural networks predictors is established.
Some simulation results are provided, as well as an application to the US recession data.
arXiv Detail & Related papers (2023-02-01T09:31:15Z) - DenseHybrid: Hybrid Anomaly Detection for Dense Open-set Recognition [1.278093617645299]
Anomaly detection can be conceived either through generative modelling of regular training data or by discriminating with respect to negative training data.
This paper presents a novel hybrid anomaly score which allows dense open-set recognition on large natural images.
Experiments evaluate our contributions on standard dense anomaly detection benchmarks as well as in terms of open-mIoU - a novel metric for dense open-set performance.
arXiv Detail & Related papers (2022-07-06T11:48:50Z) - CARD: Classification and Regression Diffusion Models [51.0421331214229]
We introduce classification and regression diffusion (CARD) models, which combine a conditional generative model and a pre-trained conditional mean estimator.
We demonstrate the outstanding ability of CARD in conditional distribution prediction with both toy examples and real-world datasets.
arXiv Detail & Related papers (2022-06-15T03:30:38Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Learning from Incomplete Features by Simultaneous Training of Neural
Networks and Sparse Coding [24.3769047873156]
This paper addresses the problem of training a classifier on a dataset with incomplete features.
We assume that different subsets of features (random or structured) are available at each data instance.
A new supervised learning method is developed to train a general classifier, using only a subset of features per sample.
arXiv Detail & Related papers (2020-11-28T02:20:39Z) - Network Classifiers Based on Social Learning [71.86764107527812]
We propose a new way of combining independently trained classifiers over space and time.
The proposed architecture is able to improve prediction performance over time with unlabeled data.
We show that this strategy results in consistent learning with high probability, and it yields a robust structure against poorly trained classifiers.
arXiv Detail & Related papers (2020-10-23T11:18:20Z) - Real-Time Regression with Dividing Local Gaussian Processes [62.01822866877782]
Local Gaussian processes are a novel, computationally efficient modeling approach based on Gaussian process regression.
Due to an iterative, data-driven division of the input space, they achieve a sublinear computational complexity in the total number of training points in practice.
A numerical evaluation on real-world data sets shows their advantages over other state-of-the-art methods in terms of accuracy as well as prediction and update speed.
arXiv Detail & Related papers (2020-06-16T18:43:31Z) - On the Preservation of Spatio-temporal Information in Machine Learning
Applications [0.0]
In machine learning applications, each data attribute is assumed to be independent of others.
Shift vectors in $k$-means are proposed in a novel framework with the help of sparse representations.
Experiments suggest that feature extraction as a simulation of shallow neural networks provides slightly better performance than Gabor-like dictionary learning.
arXiv Detail & Related papers (2020-06-15T12:22:36Z) - Neural Bayes: A Generic Parameterization Method for Unsupervised
Representation Learning [175.34232468746245]
We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
arXiv Detail & Related papers (2020-02-20T22:28:53Z)