Comparison between layer-to-layer network training and conventional
network training using Deep Convolutional Neural Networks
- URL: http://arxiv.org/abs/2303.15245v2
- Date: Thu, 11 May 2023 03:38:10 GMT
- Title: Comparison between layer-to-layer network training and conventional
network training using Deep Convolutional Neural Networks
- Authors: Kiran Kumar Ashish Bhyravabhottla and WonSook Lee
- Abstract summary: Convolutional neural networks (CNNs) are widely used in various applications due to their effectiveness in extracting features from data.
We propose a layer-to-layer training method and compare its performance with the conventional training method.
Our experiments show that the layer-to-layer training method outperforms the conventional training method for all evaluated models.
- Score: 0.6853165736531939
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional neural networks (CNNs) are widely used in various
applications due to their effectiveness in extracting features from data.
However, the performance of a CNN heavily depends on its architecture and
training process. In this study, we propose a layer-to-layer training method
and compare its performance with the conventional training method.
In the layer-to-layer training approach, we treat a portion of the early
layers as a student network and the later layers as a teacher network. During
each training step, we incrementally train the student network to learn from
the output of the teacher network, and vice versa. We evaluate this approach on
VGG16, ResNeXt, and DenseNet networks (without pre-trained ImageNet weights) as
well as on a regular CNN model.
Our experiments show that the layer-to-layer training method outperforms the
conventional training method for all evaluated models. Specifically, layer-to-layer
training yields higher test-set accuracy than conventional training for the VGG16,
ResNeXt, and DenseNet networks as well as for the regular CNN model.
Overall, our study highlights the importance of layer-wise training in CNNs
and suggests that layer-to-layer training can be a promising approach for
improving the accuracy of CNNs.
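As a concrete illustration of the alternating student/teacher split described in the abstract, here is a minimal PyTorch sketch of one way such a layer-to-layer update could be organized. The architecture, split point, loss, and update schedule are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the paper's code): a small CNN is split into an
# early "student" block and a later "teacher" block, and the two blocks are updated
# in alternation on the same task loss.
import torch
import torch.nn as nn

student = nn.Sequential(                      # early layers ("student")
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
)
teacher = nn.Sequential(                      # later layers + classifier ("teacher")
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(128, 10),
)

opt_student = torch.optim.SGD(student.parameters(), lr=0.01, momentum=0.9)
opt_teacher = torch.optim.SGD(teacher.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()


def layer_to_layer_step(x, y):
    """One alternating update: first the student block, then the teacher block."""
    # (1) Update the student (early layers) while the teacher parameters are frozen.
    teacher.requires_grad_(False)
    loss_s = criterion(teacher(student(x)), y)
    opt_student.zero_grad()
    loss_s.backward()
    opt_student.step()

    # (2) Update the teacher (later layers) on the student's detached features.
    teacher.requires_grad_(True)
    with torch.no_grad():
        feats = student(x)                    # student is frozen for this half-step
    loss_t = criterion(teacher(feats), y)
    opt_teacher.zero_grad()
    loss_t.backward()
    opt_teacher.step()
    return loss_s.item(), loss_t.item()


# Example usage on a random CIFAR-10-sized batch.
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
print(layer_to_layer_step(x, y))
```

In this sketch both blocks are driven by the task loss; a distillation-style objective between intermediate outputs would be an equally plausible reading of the abstract.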
Related papers
- A Gradient Boosting Approach for Training Convolutional and Deep Neural Networks [0.0]
We introduce two procedures for training Convolutional Neural Networks (CNNs) and Deep Neural Networks based on Gradient Boosting (GB).
The presented models achieve superior classification accuracy compared to standard CNNs and deep NNs with the same architectures.
arXiv Detail & Related papers (2023-02-22T12:17:32Z)
- Efficient DNN Training with Knowledge-Guided Layer Freezing [9.934418641613105]
Training deep neural networks (DNNs) is time-consuming.
This paper goes one step further by skipping computation and communication through DNN layer freezing.
KGT achieves a 19%-43% training speedup relative to the state of the art without sacrificing accuracy.
arXiv Detail & Related papers (2022-01-17T06:08:49Z)
- Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics [85.31710759801705]
Current practice requires costly model training to predict performance.
We propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training.
Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections.
arXiv Detail & Related papers (2022-01-11T20:53:15Z)
- Simultaneous Training of Partially Masked Neural Networks [67.19481956584465]
We show that it is possible to train neural networks in such a way that a predefined 'core' subnetwork can be split off from the trained full network with remarkably good performance.
We show that training a Transformer with a low-rank core gives a low-rank model with performance superior to training the low-rank model alone.
arXiv Detail & Related papers (2021-06-16T15:57:51Z)
- Local Critic Training for Model-Parallel Learning of Deep Neural Networks [94.69202357137452]
We propose a novel model-parallel learning method, called local critic training.
We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
We also show that networks trained by the proposed method can be used for structural optimization.
arXiv Detail & Related papers (2021-02-03T09:30:45Z)
- Accelerated MRI with Un-trained Neural Networks [29.346778609548995]
We address the reconstruction problem arising in accelerated MRI with un-trained neural networks.
We propose a highly optimized un-trained recovery approach based on a variation of the Deep Decoder.
We find that our un-trained algorithm achieves similar performance to a baseline trained neural network, but a state-of-the-art trained network outperforms the un-trained one.
arXiv Detail & Related papers (2020-07-06T00:01:25Z)
- Go Wide, Then Narrow: Efficient Training of Deep Thin Networks [62.26044348366186]
We propose an efficient method to train a deep thin network with a theoretical guarantee.
By training with our method, ResNet50 can outperform ResNet101, and BERT Base can be comparable with BERT Large.
arXiv Detail & Related papers (2020-07-01T23:34:35Z)
- A Hybrid Method for Training Convolutional Neural Networks [3.172761915061083]
We propose a hybrid method that uses both backpropagation and evolutionary strategies to train Convolutional Neural Networks.
We show that the proposed hybrid method is capable of improving upon regular training in the task of image classification.
arXiv Detail & Related papers (2020-04-15T17:52:48Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embeddings of a CNN using anti-aliasing (low-pass) filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
- Taylorized Training: Towards Better Approximation of Neural Network Training at Finite Width [116.69845849754186]
Taylorized training involves training the $k$-th order Taylor expansion of the neural network.
We show that Taylorized training agrees with full neural network training increasingly well as $k$ increases.
We complement our experiments with theoretical results showing that the approximation error of $k$-th order Taylorized models decays exponentially in $k$ for wide neural networks.
arXiv Detail & Related papers (2020-02-10T18:37:04Z)
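For concreteness, the $k$-th order Taylorized model referred to in the entry above can be written as a Taylor expansion of the network output in its parameters around the initialization; the notation ($f_\theta$ for the network, $\theta_0$ for the initialization, $\theta$ for the trained parameters) is assumed here rather than taken from the summary.

```latex
% k-th order Taylorized model: expand f_theta(x) in theta around theta_0.
f^{(k)}_{\theta}(x) = f_{\theta_0}(x)
  + \sum_{j=1}^{k} \frac{1}{j!}\,
    \nabla^{j}_{\theta} f_{\theta_0}(x)\big[(\theta - \theta_0)^{\otimes j}\big]
```

Training then optimizes $\theta$ through $f^{(k)}_{\theta}$ rather than through the full network; $k = 1$ corresponds to linearized training, and the entry's theoretical claim is that the gap to full training shrinks exponentially as $k$ grows in wide networks.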