A Flexible Selection Scheme for Minimum-Effort Transfer Learning
- URL: http://arxiv.org/abs/2008.11995v1
- Date: Thu, 27 Aug 2020 08:57:30 GMT
- Title: A Flexible Selection Scheme for Minimum-Effort Transfer Learning
- Authors: Amelie Royer and Christoph H. Lampert
- Abstract summary: Fine-tuning is a popular way of exploiting knowledge contained in a pre-trained convolutional network for a new visual recognition task.
We introduce a new form of fine-tuning, called flex-tuning, in which any individual unit of a network can be tuned.
We show that fine-tuning individual units, despite its simplicity, yields very good results as an adaptation technique.
- Score: 27.920304852537534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-tuning is a popular way of exploiting knowledge contained in a
pre-trained convolutional network for a new visual recognition task. However,
the orthogonal setting of transferring knowledge from a pretrained network to a
visually different yet semantically close source is rarely considered: This
commonly happens with real-life data, which is not necessarily as clean as the
training source (noise, geometric transformations, different modalities, etc.).
To tackle such scenarios, we introduce a new, generalized form of fine-tuning,
called flex-tuning, in which any individual unit (e.g. layer) of a network can
be tuned, and the most promising one is chosen automatically. In order to make
the method appealing for practical use, we propose two lightweight and faster
selection procedures that prove to be good approximations in practice. We study
these selection criteria empirically across a variety of domain shifts and data
scarcity scenarios, and show that fine-tuning individual units, despite its
simplicity, yields very good results as an adaptation technique. As it turns
out, in contrast to common practice, rather than the last fully-connected unit
it is best to tune an intermediate or early one in many domain-shift scenarios,
which is accurately detected by flex-tuning.
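The selection scheme described in the abstract can be pictured as an exhaustive loop: fine-tune one unit at a time with the rest of the network frozen, then keep whichever unit gives the best validation performance. Below is a minimal PyTorch sketch of that loop, assuming a generic classifier whose candidate units are named submodules and user-supplied train/validation loaders; it is an illustration only, not the authors' implementation or their two faster selection procedures.

```python
# Minimal sketch of exhaustive unit selection (flex-tuning style), assuming a
# PyTorch classifier whose candidate "units" are named submodules.
# Hypothetical data loaders; not the authors' implementation.
import copy
import torch

@torch.no_grad()
def evaluate(model, loader):
    model.eval()
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

def flex_tune(model, unit_names, train_loader, val_loader, epochs=5, lr=1e-3):
    """Fine-tune each candidate unit in isolation and keep the best one."""
    best = (-1.0, None, None)                   # (val accuracy, unit name, adapted model)
    loss_fn = torch.nn.CrossEntropyLoss()
    for name in unit_names:                     # e.g. ["conv1", "layer2", "fc"]
        candidate = copy.deepcopy(model)
        for p in candidate.parameters():        # freeze everything ...
            p.requires_grad = False
        unit = dict(candidate.named_modules())[name]
        for p in unit.parameters():             # ... except the chosen unit
            p.requires_grad = True
        opt = torch.optim.SGD(unit.parameters(), lr=lr, momentum=0.9)
        for _ in range(epochs):
            candidate.train()
            for x, y in train_loader:
                opt.zero_grad()
                loss_fn(candidate(x), y).backward()
                opt.step()
        acc = evaluate(candidate, val_loader)
        if acc > best[0]:
            best = (acc, name, candidate)
    return best  # the automatically selected unit and the adapted model
```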
Related papers
- Surgical Fine-Tuning Improves Adaptation to Distribution Shifts [114.17184775397067]
A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model.
This paper shows that in such settings, selectively fine-tuning a subset of layers matches or outperforms commonly used fine-tuning approaches.
arXiv Detail & Related papers (2022-10-20T17:59:15Z)
- An Effective Baseline for Robustness to Distributional Shift [5.627346969563955]
Refraining from confidently predicting when faced with categories of inputs different from those seen during training is an important requirement for the safe deployment of deep learning systems.
We present a simple, but highly effective approach to deal with out-of-distribution detection that uses the principle of abstention.
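In its simplest form, abstention can be implemented by declining to predict whenever the maximum softmax probability falls below a threshold. The sketch below shows that generic confidence-threshold baseline, not necessarily the specific mechanism proposed in the paper.

```python
# Generic confidence-threshold abstention (maximum softmax probability).
# A simple baseline sketch, not necessarily the paper's specific mechanism.
import torch
import torch.nn.functional as F

ABSTAIN = -1  # sentinel label returned when the model declines to predict

def predict_with_abstention(model, x, threshold=0.9):
    """Return class predictions, or ABSTAIN where confidence is too low."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)
        conf, preds = probs.max(dim=1)
    preds[conf < threshold] = ABSTAIN
    return preds
```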
arXiv Detail & Related papers (2021-05-15T00:46:11Z)
- Self-supervised Augmentation Consistency for Adapting Semantic Segmentation [56.91850268635183]
We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate.
We employ standard data augmentation techniques (photometric noise, flipping and scaling) and ensure consistency of the semantic predictions.
We achieve significant improvements of the state-of-the-art segmentation accuracy after adaptation, consistent both across different choices of the backbone architecture and adaptation scenarios.
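One common way to realize such a consistency objective is to penalize disagreement between the segmentation predictions for an image and for a photometrically perturbed copy of it. The sketch below shows that generic form (photometric noise only, so no spatial re-alignment is needed); it is not the paper's exact training procedure.

```python
# Generic augmentation-consistency loss for a segmentation network: the
# prediction on a photometrically perturbed image should match the (detached)
# prediction on the clean image. A sketch, not the paper's exact procedure.
import torch
import torch.nn.functional as F

def photometric_noise(x, sigma=0.05):
    """Hypothetical photometric augmentation: additive Gaussian noise."""
    return (x + sigma * torch.randn_like(x)).clamp(0.0, 1.0)

def consistency_loss(seg_model, images):
    with torch.no_grad():                       # clean prediction as target
        target = F.softmax(seg_model(images), dim=1)
    log_p_aug = F.log_softmax(seg_model(photometric_noise(images)), dim=1)
    # KL(target || p_aug), averaged over the batch
    return F.kl_div(log_p_aug, target, reduction="batchmean")
```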
arXiv Detail & Related papers (2021-04-30T21:32:40Z)
- All at Once Network Quantization via Collaborative Knowledge Transfer [56.95849086170461]
We develop a novel collaborative knowledge transfer approach for efficiently training the all-at-once quantization network.
Specifically, we propose an adaptive selection strategy to choose a high-precision "teacher" for transferring knowledge to the low-precision student.
To effectively transfer knowledge, we develop a dynamic block swapping method by randomly replacing the blocks in the lower-precision student network with the corresponding blocks in the higher-precision teacher network.
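A rough sketch of the random block-swapping idea: at each training step, every block position of the low-precision student is replaced, with some probability, by the corresponding (frozen) block of the higher-precision teacher. Names below are illustrative, not from the paper's code.

```python
# Rough sketch of random block swapping between a (frozen) higher-precision
# teacher and a lower-precision student with the same block structure.
# Names are illustrative; this is not the paper's implementation.
import random

def swapped_forward(student_blocks, teacher_blocks, x, swap_prob=0.5):
    """Run x through the student, randomly substituting teacher blocks."""
    assert len(student_blocks) == len(teacher_blocks)
    for s_block, t_block in zip(student_blocks, teacher_blocks):
        if random.random() < swap_prob:
            x = t_block(x)          # teacher block (kept frozen elsewhere)
        else:
            x = s_block(x)          # student block (being trained)
    return x

# Example usage with matching block containers (e.g. nn.ModuleList):
# out = swapped_forward(student.blocks, teacher.blocks, images)
```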
arXiv Detail & Related papers (2021-03-02T03:09:03Z)
- Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
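A common way to combine the two objectives is to pair an invariance term (embeddings of a transformed image should stay close to those of the original) with an equivariance term (an auxiliary head should be able to recover which transformation was applied). The sketch below uses 90-degree rotations and hypothetical encoder / rotation_head modules; it illustrates the general recipe rather than the paper's exact losses.

```python
# Invariance + equivariance sketch using 90-degree rotations.
# `encoder` and `rotation_head` are hypothetical modules, not the paper's code.
import torch
import torch.nn.functional as F

def inv_equiv_losses(encoder, rotation_head, images):
    k = torch.randint(0, 4, (1,)).item()          # rotation by k * 90 degrees
    rotated = torch.rot90(images, k, dims=(2, 3))
    z = encoder(images)                           # (N, D) embeddings
    z_rot = encoder(rotated)
    # Invariance: embeddings should not change under the transformation.
    loss_inv = F.mse_loss(z_rot, z.detach())
    # Equivariance proxy: the applied transformation should be recoverable.
    labels = torch.full((images.size(0),), k, dtype=torch.long,
                        device=images.device)
    loss_equiv = F.cross_entropy(rotation_head(z_rot), labels)
    return loss_inv, loss_equiv
```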
arXiv Detail & Related papers (2021-03-01T21:14:33Z)
- Uniform Priors for Data-Efficient Transfer [65.086680950871]
We show that features that are most transferable have high uniformity in the embedding space.
We evaluate the regularization on its ability to facilitate adaptation to unseen tasks and data.
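One standard way to encourage uniformly spread features is to penalize pairs of normalized embeddings that sit close together on the unit hypersphere. The sketch below shows such a pairwise uniformity penalty as an illustration, without claiming it is the exact regularizer studied in the paper.

```python
# Pairwise uniformity penalty on L2-normalized embeddings: low when features
# spread out over the unit hypersphere, high when they cluster. Illustrative
# only; not claimed to be the exact regularizer studied in the paper.
import torch
import torch.nn.functional as F

def uniformity_loss(embeddings, t=2.0):
    z = F.normalize(embeddings, dim=1)            # project onto unit sphere
    sq_dists = torch.cdist(z, z).pow(2)           # pairwise squared distances
    off_diag = ~torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    return torch.log(torch.exp(-t * sq_dists[off_diag]).mean())
```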
arXiv Detail & Related papers (2020-06-30T04:39:36Z)
- Multi-Stage Transfer Learning with an Application to Selection Process [5.933303832684138]
In multi-stage processes, decisions happen in an ordered sequence of stages.
In this work, we proposed a Multi-StaGe Transfer Learning (MSGTL) approach that uses knowledge from simple classifiers trained in early stages.
We show that it is possible to control the trade-off between conserving knowledge and fine-tuning using a simple probabilistic map.
arXiv Detail & Related papers (2020-06-01T21:27:04Z)
- Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks [95.51368472949308]
Adaptation can be useful in cases when training data is scarce, or when one wishes to encode priors in the network.
In this paper, we propose a straightforward alternative: side-tuning.
arXiv Detail & Related papers (2019-12-31T18:52:32Z)
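The side-tuning entry above describes an additive scheme: a frozen base network blended with a small trainable side network. The example below sketches that additive form with a learnable blending weight, as a rough illustration rather than the paper's exact formulation.

```python
# Additive side network sketch: a frozen base network blended with a small
# trainable side network through a learnable weight alpha. A rough
# illustration of the additive scheme, not the paper's exact formulation.
import torch
import torch.nn as nn

class SideTuned(nn.Module):
    def __init__(self, base: nn.Module, side: nn.Module):
        super().__init__()
        self.base = base
        self.side = side
        self.alpha = nn.Parameter(torch.tensor(0.0))  # blending logit
        for p in self.base.parameters():              # base stays frozen
            p.requires_grad = False

    def forward(self, x):
        a = torch.sigmoid(self.alpha)                 # keep blend in (0, 1)
        with torch.no_grad():
            base_out = self.base(x)
        return a * base_out + (1 - a) * self.side(x)
```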
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.