Advanced Dropout: A Model-free Methodology for Bayesian Dropout
Optimization
- URL: http://arxiv.org/abs/2010.05244v2
- Date: Tue, 10 Aug 2021 08:04:11 GMT
- Title: Advanced Dropout: A Model-free Methodology for Bayesian Dropout
Optimization
- Authors: Jiyang Xie and Zhanyu Ma and Jianjun Lei and Guoqiang Zhang and
Jing-Hao Xue and Zheng-Hua Tan and Jun Guo
- Abstract summary: Overfitting ubiquitously exists in real-world applications of deep neural networks (DNNs).
The advanced dropout technique applies a model-free and easily implemented distribution with a parametric prior, and adaptively adjusts the dropout rate.
We evaluate the effectiveness of the advanced dropout against nine dropout techniques on seven computer vision datasets.
- Score: 62.8384110757689
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to a lack of data, overfitting ubiquitously exists in real-world
applications of deep neural networks (DNNs). We propose advanced dropout, a
model-free methodology, to mitigate overfitting and improve the performance of
DNNs. The advanced dropout technique applies a model-free and easily
implemented distribution with a parametric prior, and adaptively adjusts the
dropout rate. Specifically, the distribution parameters are optimized by stochastic
gradient variational Bayes in order to carry out end-to-end training. We
evaluate the effectiveness of the advanced dropout against nine dropout
techniques on seven computer vision datasets (five small-scale datasets and two
large-scale datasets) with various base models. The advanced dropout
outperforms all the compared techniques on all the datasets. We further compare
the effectiveness ratios and find that advanced dropout achieves the highest
ratio in most cases. Next, we conduct a set of analyses of dropout rate
characteristics, including convergence of the adaptive dropout rate, the
learned distributions of dropout masks, and a comparison with dropout rate
generation without an explicit distribution. In addition, the ability to
prevent overfitting is evaluated and confirmed. Finally, we extend the
application of the advanced dropout to uncertainty inference, network pruning,
text classification, and regression. The proposed advanced dropout is also
superior to the corresponding reference methods. Code is available at
https://github.com/PRIS-CV/AdvancedDropout.
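As a rough illustration of the adaptive, end-to-end-trainable dropout rate described in the abstract, here is a minimal PyTorch sketch. It assumes a sigmoid-of-Gaussian mask distribution trained with the reparameterization trick, in the spirit of stochastic gradient variational Bayes; the layer name `AdaptiveDropout`, the temperature `tau`, and the initial values are illustrative and not taken from the paper (the GitHub link above contains the authors' actual implementation).

```python
import torch
import torch.nn as nn

class AdaptiveDropout(nn.Module):
    """Sketch of a dropout layer whose rate is learned end-to-end.

    Assumption (not the paper's exact formulation): each unit's keep mask is
    drawn as sigmoid((mu + exp(log_sigma) * eps) / tau) with eps ~ N(0, 1), so
    the mask distribution's parameters (mu, log_sigma) can be trained with the
    reparameterization trick. The paper additionally places a parametric prior
    on the distribution; the corresponding KL/regularization term is omitted here.
    """

    def __init__(self, num_features: int, tau: float = 1.0):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(num_features))                 # mask-logit location
        self.log_sigma = nn.Parameter(torch.full((num_features,), -2.0))  # mask-logit scale (log)
        self.tau = tau

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:
            # Test time: scale by an approximation of the mask's mean instead of sampling.
            return x * torch.sigmoid(self.mu / self.tau)
        eps = torch.randn_like(self.mu)                                   # reparameterization noise
        mask = torch.sigmoid((self.mu + self.log_sigma.exp() * eps) / self.tau)
        return x * mask

# Illustrative usage: drop-in replacement for nn.Dropout in a small classifier.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), AdaptiveDropout(256), nn.Linear(256, 10))
```

Because the mask parameters receive gradients from the task loss, the effective dropout rate can adapt per unit during training rather than being fixed by hand.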
Related papers
- FlexiDrop: Theoretical Insights and Practical Advances in Random Dropout Method on GNNs [4.52430575477004]
We propose a novel random dropout method for Graph Neural Networks (GNNs) called FlexiDrop.
We show that our method enables adaptive adjustment of the dropout rate and theoretically balances the trade-off between model complexity and generalization ability.
arXiv Detail & Related papers (2024-05-30T12:48:44Z) - Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Bi-Drop: Enhancing Fine-tuning Generalization via Synchronous sub-net
Estimation and Optimization [58.90989478049686]
Bi-Drop is a fine-tuning strategy that selectively updates model parameters using gradients from various sub-nets.
Experiments on the GLUE benchmark demonstrate that Bi-Drop consistently outperforms previous fine-tuning methods.
arXiv Detail & Related papers (2023-05-24T06:09:26Z) - Dropout Reduces Underfitting [85.61466286688385]
In this study, we demonstrate that dropout can also mitigate underfitting when used at the start of training.
We find dropout reduces the directional variance of gradients across mini-batches and helps align the mini-batch gradients with the entire dataset's gradient.
Our findings lead to a solution for improving performance in underfitting models, early dropout: dropout is applied only during the initial phase of training and turned off afterwards (a minimal sketch of this schedule appears after this list).
arXiv Detail & Related papers (2023-03-02T18:59:15Z) - Training Deep Normalizing Flow Models in Highly Incomplete Data
Scenarios with Prior Regularization [13.985534521589257]
We propose a novel framework to facilitate the learning of data distributions in scenarios with high data paucity.
The proposed framework naturally stems from posing the process of learning from incomplete data as a joint optimization task.
arXiv Detail & Related papers (2021-04-03T20:57:57Z) - Contextual Dropout: An Efficient Sample-Dependent Dropout Module [60.63525456640462]
Dropout has been demonstrated as a simple and effective module to regularize the training process of deep neural networks.
We propose contextual dropout with an efficient structural design as a simple and scalable sample-dependent dropout module.
Our experimental results show that the proposed method outperforms baseline methods in terms of both accuracy and quality of uncertainty estimation.
arXiv Detail & Related papers (2021-03-06T19:30:32Z) - Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters.
LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
arXiv Detail & Related papers (2020-02-12T18:57:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.