Non-binary deep transfer learning for image classification
- URL: http://arxiv.org/abs/2107.08585v1
- Date: Mon, 19 Jul 2021 02:34:38 GMT
- Title: Non-binary deep transfer learning for image classification
- Authors: Jo Plested, Xuyang Shen, and Tom Gedeon
- Abstract summary: The current standard for computer vision tasks is to fine-tune weights pre-trained on a large image classification dataset such as ImageNet.
The application of transfer learning methods tends to be rigidly binary.
We present methods for non-binary transfer learning, including combining L2-SP and L2 regularisation.
- Score: 1.858151490268935
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The current standard for a variety of computer vision tasks using smaller
numbers of labelled training examples is to fine-tune from weights pre-trained
on a large image classification dataset such as ImageNet. The application of
transfer learning methods tends to be rigidly binary. A
model is either pre-trained or not pre-trained. Pre-training a model either
increases performance or decreases it, the latter being defined as negative
transfer. Application of L2-SP regularisation that decays the weights towards
their pre-trained values is either applied or all weights are decayed towards
0. This paper re-examines these assumptions. Our recommendations are based on
extensive empirical evaluation that demonstrate the application of a non-binary
approach to achieve optimal results. (1) Achieving best performance on each
individual dataset requires careful adjustment of various transfer learning
hyperparameters not usually considered, including number of layers to transfer,
different learning rates for different layers and different combinations of
L2-SP and L2 regularisation. (2) Best practice can be achieved using a number of
measures of how well the pre-trained weights fit the target dataset to guide
optimal hyperparameters. We present methods for non-binary transfer learning
including combining L2-SP and L2 regularisation and performing non-traditional
fine-tuning hyperparameter searches. Finally, we suggest heuristics for
determining the optimal transfer learning hyperparameters. The benefits of
using a non-binary approach are supported by final results that come close to
or exceed state-of-the-art performance on a variety of tasks that have
traditionally been more difficult for transfer learning.
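The combined regulariser and per-layer learning rates described in the abstract can be illustrated with a short PyTorch sketch. This is a minimal illustration based only on the abstract, not the authors' released code; the ResNet-50 backbone, the head/backbone split, and the coefficients alpha and beta are assumptions chosen for the example.

```python
# Minimal sketch of the non-binary regularisation idea from the abstract:
# decay transferred weights towards their pre-trained values (L2-SP) while
# decaying newly initialised weights towards 0 (L2), with different learning
# rates per layer. ResNet-50, the head/backbone split, and the coefficients
# alpha/beta are illustrative assumptions, not the authors' settings.
import torch
import torchvision

model = torchvision.models.resnet50(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, 10)  # new head for the target task

# Snapshot of the pre-trained weights, used as the L2-SP anchor.
pretrained = {name: p.detach().clone() for name, p in model.named_parameters()}


def combined_regulariser(model, pretrained, alpha=1e-4, beta=1e-4):
    """L2-SP on transferred layers, plain L2 on the newly initialised head."""
    sp_term, l2_term = 0.0, 0.0
    for name, param in model.named_parameters():
        if name.startswith("fc."):  # new layers: decay towards 0
            l2_term = l2_term + param.pow(2).sum()
        else:                       # transferred layers: decay towards pre-trained values
            sp_term = sp_term + (param - pretrained[name]).pow(2).sum()
    return alpha * sp_term + beta * l2_term


# Different learning rates for different layers (only a few groups shown).
optimizer = torch.optim.SGD(
    [
        {"params": model.layer1.parameters(), "lr": 1e-4},
        {"params": model.layer4.parameters(), "lr": 1e-3},
        {"params": model.fc.parameters(), "lr": 1e-2},
    ],
    momentum=0.9,
)

# In the training loop the regulariser is simply added to the task loss:
#   loss = criterion(model(x), y) + combined_regulariser(model, pretrained)
```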
Related papers
- Learning to Transform Dynamically for Better Adversarial Transferability [32.267484632957576]
Adversarial examples, crafted by adding perturbations imperceptible to humans, can deceive neural networks.
We introduce a novel approach named Learning to Transform (L2T).
L2T increases the diversity of transformed images by selecting the optimal combination of operations from a pool of candidates.
arXiv Detail & Related papers (2024-05-23T00:46:53Z)
- Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets [0.0]
Tune without validation (Twin) is a pipeline for tuning learning rate and weight decay.
We run extensive experiments on 20 image classification datasets and train several families of deep networks.
We demonstrate proper HP selection when training from scratch and fine-tuning, emphasizing small-sample scenarios.
arXiv Detail & Related papers (2024-03-08T18:57:00Z)
- Class Incremental Learning with Pre-trained Vision-Language Models [59.15538370859431]
We propose an approach to exploiting pre-trained vision-language models (e.g. CLIP) that enables further adaptation.
Experiments on several conventional benchmarks consistently show a significant margin of improvement over the current state-of-the-art.
arXiv Detail & Related papers (2023-10-31T10:45:03Z)
- Class Adaptive Network Calibration [19.80805957502909]
We propose Class Adaptive Label Smoothing (CALS) for calibrating deep networks.
Our method builds on a general Augmented Lagrangian approach, a well-established technique in constrained optimization.
arXiv Detail & Related papers (2022-11-28T06:05:31Z)
- Learning to Re-weight Examples with Optimal Transport for Imbalanced Classification [74.62203971625173]
Imbalanced data pose challenges for deep learning based classification models.
One of the most widely-used approaches for tackling imbalanced data is re-weighting.
We propose a novel re-weighting method based on optimal transport (OT) from a distributional point of view.
arXiv Detail & Related papers (2022-08-05T01:23:54Z)
- Task-Customized Self-Supervised Pre-training with Scalable Dynamic Routing [76.78772372631623]
A common practice for self-supervised pre-training is to use as much data as possible.
For a specific downstream task, however, involving irrelevant data in pre-training may degenerate the downstream performance.
It is burdensome and infeasible to use different downstream-task-customized datasets in pre-training for different tasks.
arXiv Detail & Related papers (2022-05-26T10:49:43Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are widely used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Partial transfusion: on the expressive influence of trainable batch norm parameters for transfer learning [0.0]
Transfer learning from ImageNet is the go-to approach when applying deep learning to medical images.
Most modern architectures contain batch normalisation layers, and fine-tuning a model with such layers requires taking a few precautions.
We find that fine-tuning only the trainable weights of the batch normalisation layers leads to similar performance to fine-tuning all of the weights.
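A minimal PyTorch sketch of this batch-norm-only fine-tuning setup (the ResNet-50 backbone and two-class head are illustrative assumptions, not details from the paper):

```python
# Minimal sketch (an assumption, not the paper's code) of batch-norm-only
# fine-tuning: freeze all weights, then re-enable gradients only for the
# affine (trainable) parameters of the BatchNorm layers and the new head.
import torch
import torchvision

model = torchvision.models.resnet50(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, 2)  # illustrative 2-class task

for param in model.parameters():
    param.requires_grad = False

for module in model.modules():
    if isinstance(module, torch.nn.BatchNorm2d):
        module.weight.requires_grad = True   # gamma (scale)
        module.bias.requires_grad = True     # beta (shift)

for param in model.fc.parameters():
    param.requires_grad = True

# Only the trainable subset of parameters is passed to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```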
arXiv Detail & Related papers (2021-02-10T16:29:03Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer during fine-tuning of the target model.
Experiments on various real-world datasets show that our method consistently improves on standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Rethinking the Hyperparameters for Fine-tuning [78.15505286781293]
Fine-tuning from pre-trained ImageNet models has become the de-facto standard for various computer vision tasks.
Current practices for fine-tuning typically involve selecting an ad-hoc set of hyperparameters.
This paper re-examines several common practices for setting hyperparameters for fine-tuning.
arXiv Detail & Related papers (2020-02-19T18:59:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.