Sparsity-Inducing Categorical Prior Improves Robustness of the
Information Bottleneck
- URL: http://arxiv.org/abs/2203.02592v1
- Date: Fri, 4 Mar 2022 22:22:51 GMT
- Title: Sparsity-Inducing Categorical Prior Improves Robustness of the
Information Bottleneck
- Authors: Anirban Samaddar, Sandeep Madireddy, Prasanna Balaprakash
- Abstract summary: We present a novel sparsity-inducing spike-and-slab prior that uses sparsity as a mechanism to provide flexibility.
We show that the proposed approach improves accuracy and robustness compared with traditional fixed-dimensional priors.
- Score: 4.2903672492917755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The information bottleneck framework provides a systematic approach to learn
representations that compress nuisance information in inputs and extract
semantically meaningful information about the predictions. However, the choice
of a prior distribution that fixes the dimensionality across all the data can
restrict the flexibility of this approach to learn robust representations. We
present a novel sparsity-inducing spike-and-slab prior that uses sparsity as a
mechanism to provide flexibility, allowing each data point to learn its own
dimension distribution. In addition, it provides a mechanism to learn a joint
distribution of the latent variable and the sparsity. Thus, unlike other
approaches, it can account for the full uncertainty in the latent space.
Through a series of experiments using in-distribution and out-of-distribution
learning scenarios on the MNIST and Fashion-MNIST datasets, we show that the
proposed approach improves accuracy and robustness compared with
traditional fixed-dimensional priors as well as other sparsity-inducing
mechanisms proposed in the literature.
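As a rough illustration of the mechanism described above (not the authors' code), the sketch below gates a Gaussian latent with a relaxed Bernoulli "spike" per data point inside a variational IB objective; the module names, the Gumbel-Softmax relaxation, and all hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpikeSlabIB(nn.Module):
    """Illustrative variational IB encoder with a spike-and-slab latent:
    each latent dimension is a Gaussian 'slab' multiplied by a Bernoulli
    'spike' gate, relaxed with a Gumbel-Softmax trick to stay differentiable."""

    def __init__(self, in_dim, latent_dim, n_classes):
        super().__init__()
        self.enc = nn.Linear(in_dim, 3 * latent_dim)  # mu, logvar, gate logit
        self.dec = nn.Linear(latent_dim, n_classes)

    def forward(self, x, tau=0.5):
        mu, logvar, gate_logit = self.enc(x).chunk(3, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparam.
        # Relaxed Bernoulli gate, learned per data point and per dimension.
        u = torch.rand_like(gate_logit).clamp(1e-6, 1 - 1e-6)
        gate = torch.sigmoid((gate_logit + torch.log(u) - torch.log(1 - u)) / tau)
        return self.dec(gate * z), mu, logvar, torch.sigmoid(gate_logit)

def ib_loss(logits, y, mu, logvar, pi, prior_pi=0.5, beta=1e-3):
    """IB-style objective: prediction term plus KL of both the Gaussian slab
    and the Bernoulli spike against their priors (prior_pi is assumed)."""
    ce = F.cross_entropy(logits, y)
    kl_gauss = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    kl_bern = (pi * torch.log(pi / prior_pi + 1e-8)
               + (1 - pi) * torch.log((1 - pi) / (1 - prior_pi) + 1e-8)).sum(-1).mean()
    return ce + beta * (kl_gauss + kl_bern)
```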
Related papers
- The Common Stability Mechanism behind most Self-Supervised Learning Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanism of contrastive techniques like SimCLR, non-contrastive techniques like BYOL, SWAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the Imagenet100 dataset.
arXiv Detail & Related papers (2024-02-22T20:36:24Z)
- Uncertainty Quantification via Stable Distribution Propagation [60.065272548502]
We propose a new approach for propagating stable probability distributions through neural networks.
Our method is based on local linearization, which we show to be an optimal approximation in terms of total variation distance for the ReLU non-linearity.
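A minimal sketch of the local-linearization idea for the Gaussian special case (the paper treats general stable distributions); the function names and the diagonal-covariance assumption are mine.

```python
import torch

def linear_propagate(mu, sigma, W, b):
    """Push a mean and (diagonal) std through an affine layer exactly."""
    mu_out = mu @ W.T + b
    sigma_out = torch.sqrt((sigma ** 2) @ (W ** 2).T)  # assumes independent dims
    return mu_out, sigma_out

def relu_propagate_linearized(mu, sigma):
    """Local linearization of ReLU around the mean: the layer acts as the
    identity where mu > 0 and as zero elsewhere, so the std is masked."""
    mask = (mu > 0).to(mu.dtype)
    return torch.relu(mu), sigma * mask
```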
arXiv Detail & Related papers (2024-02-13T09:40:19Z)
- Unveiling the Potential of Probabilistic Embeddings in Self-Supervised Learning [4.124934010794795]
Self-supervised learning has played a pivotal role in advancing machine learning by allowing models to acquire meaningful representations from unlabeled data.
We investigate the impact of probabilistic modeling on the information bottleneck, shedding light on a trade-off between compression and preservation of information in both representation and loss space.
Our findings suggest that introducing an additional bottleneck in the loss space can significantly enhance the ability to detect out-of-distribution examples.
arXiv Detail & Related papers (2023-10-27T12:01:16Z)
- Integrating Large Pre-trained Models into Multimodal Named Entity Recognition with Evidential Fusion [31.234455370113075]
We propose incorporating uncertainty estimation into the MNER task, producing trustworthy predictions.
Our proposed algorithm models the distribution of each modality as a Normal-inverse Gamma distribution, and fuses them into a unified distribution.
Experiments on two datasets demonstrate that our proposed method outperforms the baselines and achieves new state-of-the-art performance.
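For context, a minimal sketch of the standard Normal-inverse-Gamma uncertainty decomposition used in evidential regression; the paper's modality-fusion rule itself is not reproduced here, and the function name is illustrative.

```python
def nig_uncertainty(gamma, nu, alpha, beta):
    """Decompose a Normal-inverse-Gamma (NIG) output into its two
    uncertainty components (works elementwise on floats or tensors).
    gamma: predicted mean; nu, alpha, beta: NIG evidence parameters."""
    aleatoric = beta / (alpha - 1)           # expected data noise
    epistemic = beta / (nu * (alpha - 1))    # model uncertainty
    return gamma, aleatoric, epistemic
```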
arXiv Detail & Related papers (2023-06-29T14:50:23Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce the Fisher Information Matrix (FIM) to measure the informativeness of the evidence carried by each sample, and use it to dynamically reweight the objective's loss terms so that the network focuses on the representation learning of uncertain classes.
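A hedged sketch of what Fisher-information-weighted evidential learning might look like; the weighting below (diagonal of the Dirichlet FIM via trigamma functions) is an illustrative guess at the mechanism, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def fim_weighted_edl_loss(logits, y_onehot):
    """Evidential MSE loss (Dirichlet over class probabilities) with a
    per-class weight from the diagonal of the Dirichlet Fisher information.
    y_onehot must be a float one-hot tensor. All of this is a sketch."""
    evidence = F.softplus(logits)
    alpha = evidence + 1.0                       # Dirichlet concentration
    s = alpha.sum(-1, keepdim=True)              # Dirichlet strength
    mse = (y_onehot - alpha / s) ** 2
    var = alpha * (s - alpha) / (s ** 2 * (s + 1))
    # Diagonal of the Dirichlet FIM: psi'(alpha_k) - psi'(alpha_0) > 0.
    fim_diag = torch.polygamma(1, alpha) - torch.polygamma(1, s)
    weight = fim_diag.detach()                   # emphasize informative evidence
    return ((mse + var) * weight).sum(-1).mean()
```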
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
The Variational Autoencoder (VAE) approximates the posterior over latent variables via amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
- InteL-VAEs: Adding Inductive Biases to Variational Auto-Encoders via Intermediary Latents [60.785317191131284]
We introduce a simple and effective method for learning VAEs with controllable biases by using an intermediary set of latent variables.
In particular, it allows us to impose desired properties like sparsity or clustering on learned representations.
We show that this, in turn, allows InteL-VAEs to learn both better generative models and representations.
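An illustrative toy version of the intermediary-latent idea, using soft-thresholding as one possible sparsity-inducing intermediary map (my choice of map, not necessarily the paper's).

```python
import torch
import torch.nn as nn

class SparseIntermediaryVAE(nn.Module):
    """Toy InteL-VAE-style model: a standard Gaussian latent z is passed
    through a deterministic intermediary layer that induces sparsity
    before decoding, leaving the Gaussian prior untouched."""

    def __init__(self, in_dim, latent_dim, thresh=0.5):
        super().__init__()
        self.enc = nn.Linear(in_dim, 2 * latent_dim)
        self.dec = nn.Linear(latent_dim, in_dim)
        self.thresh = thresh

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        z_int = torch.sign(z) * torch.relu(z.abs() - self.thresh)  # sparse map
        return self.dec(z_int), mu, logvar
```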
arXiv Detail & Related papers (2021-06-25T16:34:05Z)
- Incorporating Causal Graphical Prior Knowledge into Predictive Modeling via Simple Data Augmentation [92.96204497841032]
Causal graphs (CGs) compactly represent knowledge about the data-generating processes behind observed data distributions.
We propose a model-agnostic data augmentation method that allows us to exploit the prior knowledge of the conditional independence (CI) relations.
We experimentally show that the proposed method is effective in improving the prediction accuracy, especially in the small-data regime.
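A minimal sketch of one way CI prior knowledge can drive augmentation, assuming a relation of the form X independent of the rest given Z; the permutation-within-strata scheme and the column names are illustrative.

```python
import numpy as np
import pandas as pd

def ci_augment(df, x_col, z_col, rng=None):
    """If prior knowledge says X is conditionally independent of everything
    else given Z, permuting X among rows that share the same Z leaves the
    assumed joint distribution unchanged, yielding new plausible samples.
    Assumes Z is discrete (or already binned)."""
    rng = rng or np.random.default_rng(0)
    aug = df.copy()
    for _, idx in df.groupby(z_col).groups.items():
        idx = np.asarray(list(idx))
        aug.loc[idx, x_col] = df.loc[rng.permutation(idx), x_col].to_numpy()
    return pd.concat([df, aug], ignore_index=True)
```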
arXiv Detail & Related papers (2021-02-27T06:13:59Z)
- Robust model training and generalisation with Studentising flows [22.757298187704745]
We discuss how flow-based models can be further improved using insights from robust (in particular, resistant) statistics.
We propose to endow flow-based models with fat-tailed latent distributions as a simple drop-in replacement for the Gaussian distribution.
Experiments on several different datasets confirm the efficacy of the proposed approach.
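The drop-in replacement is straightforward to sketch: score the flow's latents under a fat-tailed Student's t base distribution instead of a Gaussian (the function name is illustrative).

```python
import torch
from torch.distributions import Independent, StudentT

def flow_log_prob(z, log_det_jacobian, df=4.0):
    """Log-likelihood of a normalizing flow with an i.i.d. Student's t base
    distribution; heavy tails mean outlying training points exert far less
    influence on the gradient than under a Gaussian base."""
    dim = z.shape[-1]
    base = Independent(StudentT(df, loc=torch.zeros(dim), scale=torch.ones(dim)), 1)
    return base.log_prob(z) + log_det_jacobian
```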
arXiv Detail & Related papers (2020-06-11T16:47:01Z)
- Secure and Differentially Private Bayesian Learning on Distributed Data [17.098036331529784]
We present a distributed Bayesian learning approach via Preconditioned Langevin Dynamics with RMSprop, which combines differential privacy and homomorphic encryption to protect private information.
We apply the proposed secure and privacy-preserving distributed Bayesian learning approach to logistic regression and survival analysis on distributed data, and demonstrate its feasibility in terms of prediction accuracy and time complexity relative to the centralized approach.
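A minimal sketch of one RMSprop-preconditioned Langevin step in the style of pSGLD; the differential-privacy noise calibration and homomorphic-encryption layers of the paper are omitted, and all names are illustrative.

```python
import torch

def psgld_step(param, grad, v, lr=1e-3, alpha=0.99, eps=1e-5):
    """One preconditioned stochastic gradient Langevin step: an RMSprop
    second-moment estimate scales both the gradient and the injected
    Gaussian noise, so flat and steep directions are sampled comparably."""
    v.mul_(alpha).addcmul_(grad, grad, value=1 - alpha)  # RMSprop 2nd moment
    precond = 1.0 / (v.sqrt() + eps)
    noise = torch.randn_like(param) * torch.sqrt(2 * lr * precond)
    param.add_(-lr * precond * grad + noise)
    return param, v
```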
arXiv Detail & Related papers (2020-05-22T05:13:43Z)
- Distributionally Robust Chance Constrained Programming with Generative Adversarial Networks (GANs) [0.0]
A novel data-driven, distributionally robust chance-constrained programming framework based on generative adversarial networks (GANs) is proposed.
The GAN extracts distributional information from historical data in a nonparametric, unsupervised way.
The proposed framework is then applied to supply chain optimization under demand uncertainty.
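A hedged sketch of how a trained generator could be used to check a chance constraint empirically; `generator`, `decision`, `g`, and `latent_dim` are placeholders for the user's models, not the paper's API.

```python
import torch

def chance_constraint_satisfied(generator, decision, g, eps=0.05, n=10_000):
    """Monte-Carlo check of P[g(decision, xi) <= 0] >= 1 - eps, with
    uncertainty scenarios xi (e.g., demands) drawn from a trained GAN
    generator rather than an assumed parametric distribution."""
    with torch.no_grad():
        xi = generator(torch.randn(n, generator.latent_dim))  # assumed attribute
        violation_rate = (g(decision, xi) > 0).float().mean()
    return violation_rate.item() <= eps
```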
arXiv Detail & Related papers (2020-02-28T00:05:22Z)