Learnable Bernoulli Dropout for Bayesian Deep Learning
- URL: http://arxiv.org/abs/2002.05155v1
- Date: Wed, 12 Feb 2020 18:57:14 GMT
- Title: Learnable Bernoulli Dropout for Bayesian Deep Learning
- Authors: Shahin Boluki, Randy Ardywibowo, Siamak Zamani Dadaneh, Mingyuan Zhou,
Xiaoning Qian
- Abstract summary: Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters.
LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
- Score: 53.79615543862426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we propose learnable Bernoulli dropout (LBD), a new
model-agnostic dropout scheme that considers the dropout rates as parameters
jointly optimized with other model parameters. By probabilistic modeling of
Bernoulli dropout, our method enables more robust prediction and uncertainty
quantification in deep models. Especially, when combined with variational
auto-encoders (VAEs), LBD enables flexible semi-implicit posterior
representations, leading to new semi-implicit VAE~(SIVAE) models. We solve the
optimization for training with respect to the dropout parameters using
Augment-REINFORCE-Merge (ARM), an unbiased and low-variance gradient estimator.
Our experiments on a range of tasks show the superior performance of our
approach compared with other commonly used dropout schemes. Overall, LBD leads
to improved accuracy and uncertainty estimates in image classification and
semantic segmentation. Moreover, using SIVAE, we can achieve state-of-the-art
performance on collaborative filtering for implicit feedback on several public
datasets.
Related papers
- A Bayesian Approach to Data Point Selection [24.98069363998565]
Data point selection (DPS) is becoming a critical topic in deep learning.
Existing approaches to DPS are predominantly based on a bi-level optimisation (BLO) formulation.
We propose a novel Bayesian approach to DPS.
arXiv Detail & Related papers (2024-11-06T09:04:13Z) - On the Impact of Sampling on Deep Sequential State Estimation [17.92198582435315]
State inference and parameter learning in sequential models can be successfully performed with approximation techniques.
Tighter Monte Carlo objectives have been proposed in the literature to enhance generative modeling performance.
arXiv Detail & Related papers (2023-11-28T17:59:49Z) - Model Merging by Uncertainty-Based Gradient Matching [70.54580972266096]
We propose a new uncertainty-based scheme to improve the performance by reducing the mismatch.
Our new method gives consistent improvements for large language models and vision transformers.
arXiv Detail & Related papers (2023-10-19T15:02:45Z) - FedDIP: Federated Learning with Extreme Dynamic Pruning and Incremental
Regularization [5.182014186927254]
Federated Learning (FL) has been successfully adopted for distributed training and inference of large-scale Deep Neural Networks (DNNs)
We contribute with a novel FL framework (coined FedDIP) which combines (i) dynamic model pruning with error feedback to eliminate redundant information exchange.
We provide convergence analysis of FedDIP and report on a comprehensive performance and comparative assessment against state-of-the-art methods.
arXiv Detail & Related papers (2023-09-13T08:51:19Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - Model Selection for Bayesian Autoencoders [25.619565817793422]
We propose to optimize the distributional sliced-Wasserstein distance between the output of the autoencoder and the empirical data distribution.
We turn our BAE into a generative model by fitting a flexible Dirichlet mixture model in the latent space.
We evaluate our approach qualitatively and quantitatively using a vast experimental campaign on a number of unsupervised learning tasks and show that, in small-data regimes where priors matter, our approach provides state-of-the-art results.
arXiv Detail & Related papers (2021-06-11T08:55:00Z) - Contextual Dropout: An Efficient Sample-Dependent Dropout Module [60.63525456640462]
Dropout has been demonstrated as a simple and effective module to regularize the training process of deep neural networks.
We propose contextual dropout with an efficient structural design as a simple and scalable sample-dependent dropout module.
Our experimental results show that the proposed method outperforms baseline methods in terms of both accuracy and quality of uncertainty estimation.
arXiv Detail & Related papers (2021-03-06T19:30:32Z) - Advanced Dropout: A Model-free Methodology for Bayesian Dropout
Optimization [62.8384110757689]
Overfitting ubiquitously exists in real-world applications of deep neural networks (DNNs)
The advanced dropout technique applies a model-free and easily implemented distribution with parametric prior, and adaptively adjusts dropout rate.
We evaluate the effectiveness of the advanced dropout against nine dropout techniques on seven computer vision datasets.
arXiv Detail & Related papers (2020-10-11T13:19:58Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.