Adaptive Low-Rank Factorization to regularize shallow and deep neural
networks
- URL: http://arxiv.org/abs/2005.01995v1
- Date: Tue, 5 May 2020 08:13:30 GMT
- Title: Adaptive Low-Rank Factorization to regularize shallow and deep neural
networks
- Authors: Mohammad Mahdi Bejani, Mehdi Ghatee
- Abstract summary: We use Low-Rank matrix Factorization (LRF) to drop some parameters of the learning model during training.
The best results of AdaptiveLRF on the SVHN and CIFAR-10 datasets are 98% and 94.1% F-measure, and 97.9% and 94% accuracy, respectively.
- Score: 9.607123078804959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Overfitting is one of the persistent challenges in deep learning. Many
approaches have been proposed to regularize learning models. They add
hyper-parameters to the model to improve generalization; however, these
hyper-parameters are hard to tune, and a poor setting can make training
diverge. In addition, most regularization schemes slow down learning.
Recently, Tai et al. [1] proposed low-rank tensor decomposition as a
constrained filter for removing redundancy in the convolution kernels of CNNs.
From a different viewpoint, we use Low-Rank matrix Factorization (LRF) to drop
some parameters of the learning model during training. However, like [1], this
scheme can reduce training accuracy when it tries to reduce the number of
operations. Instead, we apply this regularization adaptively, only when the
complexity of a layer is high. The complexity of a layer is evaluated by the
nonlinear condition numbers of its learning system. The resulting method,
named "AdaptiveLRF", neither slows down training nor degrades the accuracy of
the layer. The behavior of AdaptiveLRF is first visualized on a noisy dataset;
improvements are then presented on several small-size and large-scale
datasets. AdaptiveLRF is shown to be preferable to well-known dropout
regularizers on shallow networks, and it competes with dropout and adaptive
dropout on various deep networks including MobileNet V2, ResNet V2, DenseNet,
and Xception. The best results of AdaptiveLRF on the SVHN and CIFAR-10
datasets are 98% and 94.1% F-measure, and 97.9% and 94% accuracy,
respectively. Finally, we describe the use of an LRF-based loss function to
improve the quality of the learning model.
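As a rough illustration of the idea in the abstract (not the authors' implementation), the sketch below monitors the condition number of each weight matrix and, when it exceeds a threshold, replaces the weights with a truncated low-rank (SVD) reconstruction. The threshold kappa_max, the energy-based rank selection, and the function names are assumptions introduced only for illustration.

import torch

def low_rank_factorize(W: torch.Tensor, rank: int) -> torch.Tensor:
    # Best rank-`rank` approximation of W via truncated SVD.
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    return U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]

def adaptive_lrf_step(model: torch.nn.Module,
                      kappa_max: float = 1e3,   # assumed threshold, not from the paper
                      energy: float = 0.95) -> None:
    # Apply LRF only to layers whose weight matrices are ill-conditioned,
    # used here as a stand-in for the paper's layer-complexity criterion.
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, torch.nn.Linear):
                S = torch.linalg.svdvals(module.weight)
                kappa = S[0] / S[-1].clamp_min(1e-12)   # condition number
                if kappa > kappa_max:
                    # Keep the smallest rank preserving `energy` of the spectral
                    # energy, then overwrite the weights with the low-rank version.
                    cum = torch.cumsum(S ** 2, dim=0) / (S ** 2).sum()
                    rank = int(torch.searchsorted(cum, energy).item()) + 1
                    module.weight.copy_(low_rank_factorize(module.weight, rank))

In a training loop, such a routine would typically be invoked every few epochs after the optimizer step, so that only ill-conditioned (high-complexity) layers are pushed toward a low-rank form.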
Related papers
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find the solutions reachable by our training procedure, including the gradient-based optimizer and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources.
The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z)
- LegoNet: A Fast and Exact Unlearning Architecture [59.49058450583149]
Machine unlearning aims to erase the impact of specific training samples from a trained model upon deletion requests.
We present a novel network, namely LegoNet, which adopts the framework of "fixed encoder + multiple adapters".
We show that LegoNet accomplishes fast and exact unlearning while maintaining acceptable performance, synthetically outperforming unlearning baselines.
arXiv Detail & Related papers (2022-10-28T09:53:05Z)
- Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
arXiv Detail & Related papers (2022-07-08T18:17:06Z)
- Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks [86.42889611784855]
Normalization methods increase the vulnerability with respect to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z)
- Adaptive Low-Rank Regularization with Damping Sequences to Restrict Lazy Weights in Deep Networks [13.122543280692641]
This paper detects a subset of the weighting layers that cause overfitting; overfitting is recognized via matrix and tensor condition numbers.
An adaptive regularization scheme, entitled Adaptive Low-Rank (ALR), is proposed that drives a subset of the weighting layers toward their Low-Rank Factorization (LRF).
The experimental results show that ALR regularizes the deep networks well with high training speed and low resource usage.
arXiv Detail & Related papers (2021-06-17T17:28:14Z)
- RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr [60.07531696857743]
Fine-tuning a deep convolutional neural network (CNN) using a pre-trained model helps transfer knowledge learned from larger datasets to the target task.
We propose RIFLE - a strategy that deepens backpropagation in transfer learning settings.
RIFLE brings meaningful updates to the weights of deep CNN layers and improves low-level feature learning.
arXiv Detail & Related papers (2020-07-07T11:27:43Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)