Adaptive Low-Rank Factorization to regularize shallow and deep neural
networks
- URL: http://arxiv.org/abs/2005.01995v1
- Date: Tue, 5 May 2020 08:13:30 GMT
- Title: Adaptive Low-Rank Factorization to regularize shallow and deep neural
networks
- Authors: Mohammad Mahdi Bejani, Mehdi Ghatee
- Abstract summary: We use Low-Rank matrix Factorization (LRF) to drop some parameters of the learning model during training.
The best results of AdaptiveLRF on the SVHN and CIFAR-10 datasets are 98% and 94.1% F-measure, and 97.9% and 94% accuracy, respectively.
- Score: 9.607123078804959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Overfitting is one of the persistent challenges in deep learning. Many
approaches have been proposed to regularize learning models. They add
hyper-parameters to the model to improve generalization; however, these
hyper-parameters are hard to tune, and a poor setting can make training
diverge. In addition, most regularization schemes slow down learning.
Recently, Tai et al. [1] proposed low-rank tensor decomposition as a
constrained filter for removing redundancy in the convolution kernels of CNNs.
From a different viewpoint, we use Low-Rank matrix Factorization (LRF) to drop
some parameters of the learning model during training. However, like [1], this
scheme can reduce training accuracy when it tries to reduce the number of
operations. Instead, we apply this regularization adaptively, only when the
complexity of a layer is high. The complexity of a layer is evaluated by the
nonlinear condition numbers of its learning system. The resulting method,
named "AdaptiveLRF", neither slows down training nor degrades the accuracy of
the layer. The behavior of AdaptiveLRF is first visualized on a noisy dataset;
improvements are then presented on several small-size and large-scale
datasets. AdaptiveLRF is shown to be preferable to well-known dropout
regularizers on shallow networks, and it competes with dropout and adaptive
dropout on various deep networks including MobileNet V2, ResNet V2, DenseNet,
and Xception. The best results of AdaptiveLRF on the SVHN and CIFAR-10
datasets are 98% and 94.1% F-measure, and 97.9% and 94% accuracy,
respectively. Finally, we describe the use of an LRF-based loss function to
improve the quality of the learning model.
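As a rough illustration of the idea in the abstract (not the authors' implementation), the sketch below monitors the condition number of each weight matrix and, when it exceeds a threshold, replaces the weights with a truncated low-rank (SVD) reconstruction. The threshold kappa_max, the energy-based rank selection, and the function names are assumptions introduced only for illustration.

import torch

def low_rank_factorize(W: torch.Tensor, rank: int) -> torch.Tensor:
    # Best rank-`rank` approximation of W via truncated SVD.
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    return U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]

def adaptive_lrf_step(model: torch.nn.Module,
                      kappa_max: float = 1e3,   # assumed threshold, not from the paper
                      energy: float = 0.95) -> None:
    # Apply LRF only to layers whose weight matrices are ill-conditioned,
    # used here as a stand-in for the paper's layer-complexity criterion.
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, torch.nn.Linear):
                S = torch.linalg.svdvals(module.weight)
                kappa = S[0] / S[-1].clamp_min(1e-12)   # condition number
                if kappa > kappa_max:
                    # Keep the smallest rank preserving `energy` of the spectral
                    # energy, then overwrite the weights with the low-rank version.
                    cum = torch.cumsum(S ** 2, dim=0) / (S ** 2).sum()
                    rank = int(torch.searchsorted(cum, energy).item()) + 1
                    module.weight.copy_(low_rank_factorize(module.weight, rank))

In a training loop, such a routine would typically be invoked every few epochs after the optimizer step, so that only ill-conditioned (high-complexity) layers are pushed toward a low-rank form.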
Related papers
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find the solutions reachable by our training procedure, including the gradient-based optimizer and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources.
The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z)
- LegoNet: A Fast and Exact Unlearning Architecture [59.49058450583149]
Machine unlearning aims to erase the impact of specific training samples from a trained model upon deletion requests.
We present a novel network, namely LegoNet, which adopts the framework of "fixed encoder + multiple adapters".
We show that LegoNet accomplishes fast and exact unlearning while maintaining acceptable performance, synthetically outperforming unlearning baselines.
arXiv Detail & Related papers (2022-10-28T09:53:05Z)
- Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
arXiv Detail & Related papers (2022-07-08T18:17:06Z)
- Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks [86.42889611784855]
Normalization methods increase the vulnerability with respect to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z)
- Adaptive Low-Rank Regularization with Damping Sequences to Restrict Lazy Weights in Deep Networks [13.122543280692641]
This paper detects a subset of the weighting layers that cause overfitting; overfitting is recognized via matrix and tensor condition numbers.
An adaptive regularization scheme, entitled Adaptive Low-Rank (ALR), is proposed that drives a subset of the weighting layers toward their Low-Rank Factorization (LRF).
The experimental results show that ALR regularizes the deep networks well with high training speed and low resource usage.
arXiv Detail & Related papers (2021-06-17T17:28:14Z)
- RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr [60.07531696857743]
Fine-tuning a deep convolutional neural network (CNN) using a pre-trained model helps transfer knowledge learned from larger datasets to the target task.
We propose RIFLE - a strategy that deepens backpropagation in transfer learning settings.
RIFLE brings meaningful updates to the weights of deep CNN layers and improves low-level feature learning.
arXiv Detail & Related papers (2020-07-07T11:27:43Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)