AutoTune: Automatically Tuning Convolutional Neural Networks for
Improved Transfer Learning
- URL: http://arxiv.org/abs/2005.02165v2
- Date: Thu, 3 Dec 2020 05:35:23 GMT
- Title: AutoTune: Automatically Tuning Convolutional Neural Networks for
Improved Transfer Learning
- Authors: S.H.Shabbeer Basha, Sravan Kumar Vinakota, Viswanath Pulabaigari,
Snehasis Mukherjee, Shiv Ram Dubey
- Abstract summary: We introduce a mechanism for automatically tuning Convolutional Neural Networks (CNNs) for improved transfer learning.
The pre-trained CNN layers are tuned with knowledge from the target data using Bayesian Optimization.
Experiments are conducted on three benchmark datasets, namely CalTech-101, CalTech-256, and Stanford Dogs.
- Score: 13.909484906513102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning enables a task with limited data to be solved by
using deep networks pre-trained on large-scale datasets. Typically, while
transferring the learned knowledge from the source task to the target task,
the last few layers are fine-tuned (re-trained) on the target dataset.
However, these layers were originally designed for the source task and might
not be suitable for the target task. In this paper, we introduce a mechanism
for automatically tuning Convolutional Neural Networks (CNNs) for improved
transfer learning. The pre-trained CNN layers are tuned with knowledge from
the target data using Bayesian Optimization. First, we replace the softmax
layer of the base CNN model so that its number of neurons matches the number
of classes in the target task, and train this final layer. Next, the
pre-trained CNN is tuned automatically by observing the classification
performance on the validation data (a greedy criterion). To evaluate the
proposed method, experiments are conducted on three benchmark datasets,
namely CalTech-101, CalTech-256, and Stanford Dogs. The proposed AutoTune
method outperforms the standard baseline transfer learning methods on all
three datasets, achieving $95.92\%$, $86.54\%$, and $84.67\%$ accuracy on
CalTech-101, CalTech-256, and Stanford Dogs, respectively. The experimental
results show that tuning the pre-trained CNN layers with knowledge from the
target dataset yields better transfer learning ability. The source code is
available at
https://github.com/JekyllAndHyde8999/AutoTune_CNN_TransferLearning.
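As a rough illustration of the recipe described in the abstract (and not the authors' released implementation, which lives at the GitHub link above), the sketch below replaces the softmax head of an ImageNet pre-trained CNN with the target number of classes, trains that head, and then lets Bayesian optimization (scikit-optimize's gp_minimize as a stand-in for the optimizer) pick a fine-tuning configuration scored by validation accuracy. The choice of VGG-16, the search space (how many trailing layers to re-train and at what learning rate), and the train_loader/val_loader data loaders are illustrative assumptions; the paper's AutoTune additionally searches over the structure of the tuned layers themselves.

```python
import torch
import torch.nn as nn
import torchvision.models as models
from skopt import gp_minimize
from skopt.space import Integer, Real

NUM_CLASSES = 101                      # e.g. CalTech-101 (illustrative)
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

def build_model():
    # Pre-trained source-task CNN with its softmax head replaced so that the
    # number of output neurons matches the number of target classes.
    model = models.vgg16(weights="IMAGENET1K_V1")
    model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, NUM_CLASSES)
    return model.to(DEVICE)

def train(model, params, lr, epochs=1):
    # Plain SGD training of the given parameter subset on the target data.
    opt = torch.optim.SGD(params, lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in train_loader:      # assumed target-task DataLoader
            opt.zero_grad()
            loss_fn(model(x.to(DEVICE)), y.to(DEVICE)).backward()
            opt.step()

def val_accuracy(model):
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in val_loader:        # assumed validation DataLoader
            correct += (model(x.to(DEVICE)).argmax(1).cpu() == y).sum().item()
            total += y.numel()
    return correct / total

def objective(config):
    # One Bayesian-optimization trial: fit the new head, then re-train the
    # chosen number of trailing layers and score the result on validation data.
    n_tuned, lr = config
    model = build_model()
    train(model, model.classifier[-1].parameters(), lr=1e-3)   # step 1: new head only
    blocks = list(model.features) + list(model.classifier)
    tuned = [p for b in blocks[-n_tuned:] for p in b.parameters()]
    train(model, tuned, lr=lr)                                 # step 2: tune trailing layers
    return -val_accuracy(model)        # greedy criterion: maximise validation accuracy

space = [Integer(1, 6, name="n_tuned"),
         Real(1e-5, 1e-2, prior="log-uniform", name="lr")]
result = gp_minimize(objective, space, n_calls=15, random_state=0)
print("best configuration:", result.x, "best validation accuracy:", -result.fun)
```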
Related papers
- Boosting Low-Data Instance Segmentation by Unsupervised Pre-training
with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of prompting techniques, we introduce a new pre-training method that boosts QEIS (query-based end-to-end instance segmentation) models.
Experimental results show that our method significantly improves several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z) - CoV-TI-Net: Transferred Initialization with Modified End Layer for
COVID-19 Diagnosis [5.546855806629448]
Transfer learning is a relatively new learning method that has been employed in many sectors to achieve good performance with less computation.
In this research, PyTorch pre-trained models (VGG19_bn and WideResNet-101) are applied to the MNIST dataset.
The proposed model is developed and verified in a Kaggle notebook, and it reaches an accuracy of 99.77% without requiring excessive computation time.
arXiv Detail & Related papers (2022-09-20T08:52:52Z) - Continual Prune-and-Select: Class-incremental learning with specialized
subnetworks [66.4795381419701]
Continual-Prune-and-Select (CP&S) is capable of sequentially learning 10 tasks from ImageNet-1000 while keeping accuracy around 94% with negligible forgetting.
This is a first-of-its-kind result in class-incremental learning.
arXiv Detail & Related papers (2022-08-09T10:49:40Z) - TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent
Kernels [141.29156234353133]
State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions.
We show this disparity can largely be attributed to challenges presented by non-convexity.
We propose a Train-Convexify-Train (TCT) procedure to sidestep this issue.
arXiv Detail & Related papers (2022-07-13T16:58:22Z) - Target Aware Network Architecture Search and Compression for Efficient
Knowledge Transfer [9.434523476406424]
We propose a two-stage framework called TASCNet which enables efficient knowledge transfer.
TASCNet reduces the computational complexity of pre-trained CNNs on the target task by reducing both the trainable parameters and the FLOPs.
In addition to computer vision tasks, we also conduct experiments on a Movie Review Sentiment Analysis task.
arXiv Detail & Related papers (2022-05-12T09:11:00Z) - A Novel Sleep Stage Classification Using CNN Generated by an Efficient
Neural Architecture Search with a New Data Processing Trick [4.365107026636095]
We propose an efficient five-sleep-stage classification method using convolutional neural networks (CNNs) with a novel data processing trick.
We make full use of a genetic algorithm (GA), NASG, to search for the best CNN architecture.
We verify the convergence of our data processing trick and compare the performance of traditional CNNs before and after using our trick.
arXiv Detail & Related papers (2021-10-27T10:36:52Z) - Train your classifier first: Cascade Neural Networks Training from upper
layers to lower layers [54.47911829539919]
We develop a novel top-down training method which can be viewed as an algorithm for searching for high-quality classifiers.
We test this method on automatic speech recognition (ASR) and language modelling tasks.
The proposed method consistently improves recurrent neural network ASR models on Wall Street Journal, self-attention ASR models on Switchboard, and AWD-LSTM language models on WikiText-2.
arXiv Detail & Related papers (2021-02-09T08:19:49Z) - Self-Competitive Neural Networks [0.0]
Deep Neural Networks (DNNs) have improved the accuracy of classification problems in many applications.
One of the challenges in training a DNN is its need for an enriched dataset to increase its accuracy and avoid overfitting.
Recently, researchers have worked extensively to propose methods for data augmentation.
In this paper, we generate adversarial samples to refine the Domains of Attraction (DoAs) of each class. In this approach, at each stage, we use the model learned from the primary and generated adversarial data (up to that stage) to manipulate the primary data in a way that looks complicated to the DNN.
arXiv Detail & Related papers (2020-08-22T12:28:35Z) - RIFLE: Backpropagation in Depth for Deep Transfer Learning through
Re-Initializing the Fully-connected LayEr [60.07531696857743]
Fine-tuning a deep convolutional neural network (CNN) from a pre-trained model helps transfer knowledge learned from larger datasets to the target task.
We propose RIFLE, a strategy that deepens backpropagation in transfer learning settings by periodically re-initializing the fully-connected layer (a minimal sketch of this idea appears after this list).
RIFLE brings meaningful updates to the weights of deep CNN layers and improves low-level feature learning.
arXiv Detail & Related papers (2020-07-07T11:27:43Z) - Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embeddings of a CNN using anti-aliasing or low-pass filters (see the smoothing sketch after this list).
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z) - AutoFCL: Automatically Tuning Fully Connected Layers for Handling Small
Dataset [13.909484906513102]
The proposed AutoFCL model attempts to learn the structure of the FC layers of a CNN automatically using Bayesian optimization.
Fine-tuning the newly learned (target-dependent) FC layers leads to state-of-the-art performance.
The proposed AutoFCL method outperforms existing methods on the CalTech-101 and Oxford-102 Flowers datasets.
arXiv Detail & Related papers (2020-01-22T08:39:00Z)
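As referenced in the RIFLE entry above, the following is a minimal sketch of re-initializing the fully-connected layer periodically during fine-tuning. It is not the authors' implementation; the backbone (ResNet-50), the re-initialization schedule, the target class count, and the train_loader data loader are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

NUM_CLASSES = 101                                  # illustrative target task
model = models.resnet50(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)   # new head for the target task

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

EPOCHS, REINIT_EVERY = 12, 4                       # assumed schedule, not the paper's
for epoch in range(EPOCHS):
    if epoch > 0 and epoch % REINIT_EVERY == 0:
        model.fc.reset_parameters()                # RIFLE-style re-initialization of the FC layer
    model.train()
    for x, y in train_loader:                      # assumed target-task DataLoader
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
```

Re-initializing only the head while the whole network keeps training forces the earlier convolutional layers to keep receiving informative gradient updates, which is the effect the RIFLE summary describes.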
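The Curriculum By Smoothing entry is summarized only briefly above; the sketch below illustrates the general idea of convolving a layer's feature maps with a low-pass (Gaussian) filter whose strength is annealed as training proceeds. The wrapper module, kernel size, and annealing factor are assumptions made for illustration, not the paper's exact scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gaussian_kernel(sigma, size=5):
    # 2-D Gaussian low-pass filter built from a separable 1-D profile.
    coords = torch.arange(size) - size // 2
    g = torch.exp(-coords.float() ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    return torch.outer(g, g)

class SmoothedConv(nn.Module):
    # Convolution whose output feature maps are blurred by a Gaussian kernel;
    # the training loop gradually reduces `sigma` (the curriculum).
    def __init__(self, in_ch, out_ch, **conv_kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, **conv_kwargs)
        self.sigma = 1.0

    def forward(self, x):
        x = self.conv(x)
        if self.sigma > 0.05:                      # skip the blur once it is negligible
            k = gaussian_kernel(self.sigma).to(x.device)
            k = k.repeat(x.shape[1], 1, 1, 1)      # one depthwise filter per channel
            x = F.conv2d(x, k, padding=k.shape[-1] // 2, groups=x.shape[1])
        return x

layer = SmoothedConv(3, 16, kernel_size=3, padding=1)
out = layer(torch.randn(1, 3, 32, 32))             # blurred 16-channel feature maps
for epoch in range(10):
    layer.sigma *= 0.7                             # anneal the smoothing away over training
```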