MaxDropout: Deep Neural Network Regularization Based on Maximum Output
Values
- URL: http://arxiv.org/abs/2007.13723v1
- Date: Mon, 27 Jul 2020 17:55:54 GMT
- Title: MaxDropout: Deep Neural Network Regularization Based on Maximum Output
Values
- Authors: Claudio Filipi Goncalves do Santos, Danilo Colombo, Mateus Roder,
João Paulo Papa
- Abstract summary: MaxDropout is a regularizer for deep neural network models that works in a supervised fashion by removing prominent neurons.
We show that existing neural networks can be improved and achieve better results when Dropout is replaced by MaxDropout.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Different techniques have emerged in the deep learning scenario, such as
Convolutional Neural Networks, Deep Belief Networks, and Long Short-Term Memory
Networks, to name a few. In lockstep, regularization methods, which aim to prevent overfitting by penalizing the weight connections or turning off some units, have also been widely studied. In this paper, we present a novel
approach called MaxDropout, a regularizer for deep neural network models that
works in a supervised fashion by removing (shutting off) the prominent neurons
(i.e., most active) in each hidden layer. The model forces fewer activated
units to learn more representative information, thus providing sparsity.
Regarding the experiments, we show that existing neural networks can be improved and achieve better results when Dropout is replaced by MaxDropout. The proposed method was evaluated on image classification, achieving results comparable to existing regularizers such as Cutout and RandomErasing, and also improving the accuracy of neural networks that use Dropout by replacing the existing layer with MaxDropout.
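To make the idea concrete, the following is a minimal PyTorch sketch of a MaxDropout-style layer. It is an illustration based only on the description in the abstract, not the authors' reference implementation: the class name, the drop_rate parameter, and the min-max normalization used to decide which units count as "most active" are assumptions.

    import torch
    import torch.nn as nn

    class MaxDropout(nn.Module):
        """Illustrative MaxDropout-style layer: instead of zeroing units at
        random (standard Dropout), it zeroes the most active units during
        training. Hypothetical sketch; the drop_rate semantics are an assumption."""

        def __init__(self, drop_rate: float = 0.3):
            super().__init__()
            if not 0.0 <= drop_rate < 1.0:
                raise ValueError("drop_rate must be in [0, 1)")
            self.drop_rate = drop_rate

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Like standard Dropout, act as the identity at inference time.
            if not self.training or self.drop_rate == 0.0:
                return x
            # Normalize activations to [0, 1] so a single threshold applies.
            x_min, x_max = x.min(), x.max()
            normalized = (x - x_min) / (x_max - x_min + 1e-12)
            # Keep units below the (1 - drop_rate) level; zero the most active ones.
            keep_mask = normalized < (1.0 - self.drop_rate)
            return x * keep_mask

    # Usage: swap an existing nn.Dropout(p) for MaxDropout(p) inside a model.
    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), MaxDropout(0.3))
    out = model(torch.randn(8, 128))

As with standard Dropout, such a layer is intended as a drop-in replacement: the abstract reports accuracy gains simply by substituting existing Dropout layers with MaxDropout.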
Related papers
- Survey on Leveraging Uncertainty Estimation Towards Trustworthy Deep
Neural Networks: The Case of Reject Option and Post-training Processing
We present a systematic review of the prediction with the reject option in the context of various neural networks.
We address the application of the rejection option in reducing the prediction time for the real-time problems.
arXiv Detail & Related papers (2023-04-11T00:35:10Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Multi-Grade Deep Learning
Current deep learning models are single-grade neural networks.
We propose a multi-grade learning model that enables us to learn deep neural networks much more effectively and efficiently.
arXiv Detail & Related papers (2023-02-01T00:09:56Z)
- Spiking neural network for nonlinear regression
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Stochastic Neural Networks with Infinite Width are Deterministic
We study stochastic neural networks, a main type of neural network in use.
We prove that as the width of an optimized neural network tends to infinity, its predictive variance on the training set decreases to zero.
arXiv Detail & Related papers (2022-01-30T04:52:31Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- Searching for Minimal Optimal Neural Networks
Large neural network models have high predictive power but may suffer from overfitting if the training set is not large enough.
The destructive approach, which starts with a large architecture and then reduces the size using a Lasso-type penalty, has been used extensively for this task.
We prove that Adaptive group Lasso is consistent and can reconstruct the correct number of hidden nodes of one-hidden-layer feedforward networks with high probability.
arXiv Detail & Related papers (2021-09-27T14:08:07Z)
- Incremental Deep Neural Network Learning using Classification Confidence Thresholding
Most modern neural networks for classification fail to take into account the concept of the unknown.
This paper proposes the Classification Confidence Threshold approach to prime neural networks for incremental learning.
arXiv Detail & Related papers (2021-06-21T22:46:28Z)
- Non-Gradient Manifold Neural Network
Deep neural network (DNN) generally takes thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z)
- Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks
In this paper, we investigate the empirical Rademacher complexity related to intermediate layers of deep neural networks.
We propose a feature distortion method (Disout) for addressing the aforementioned problem.
The superiority of the proposed feature map distortion for producing deep neural network with higher testing performance is analyzed and demonstrated.
arXiv Detail & Related papers (2020-02-23T13:59:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.