Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning
- URL: http://arxiv.org/abs/2201.03529v1
- Date: Mon, 10 Jan 2022 18:40:07 GMT
- Title: Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning
- Authors: Utku Evci, Vincent Dumoulin, Hugo Larochelle, Michael C. Mozer
- Abstract summary: Transfer-learning methods aim to improve performance in a data-scarce target domain using a model pretrained on a data-rich source domain.
We propose a method, Head-to-Toe probing (Head2Toe), that selects features from all layers of the source model to train a classification head for the target domain.
- Score: 31.171051511744636
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer-learning methods aim to improve performance in a data-scarce target
domain using a model pretrained on a data-rich source domain. A cost-efficient
strategy, linear probing, involves freezing the source model and training a new
classification head for the target domain. This strategy is outperformed by a
more costly but state-of-the-art method -- fine-tuning all parameters of the
source model to the target domain -- possibly because fine-tuning allows the
model to leverage useful information from intermediate layers which is
otherwise discarded by the later pretrained layers. We explore the hypothesis
that these intermediate layers might be directly exploited. We propose a
method, Head-to-Toe probing (Head2Toe), that selects features from all layers
of the source model to train a classification head for the target domain. In
evaluations on VTAB-1k, Head2Toe matches the performance obtained with
fine-tuning on average while reducing training and storage costs a hundredfold
or more; critically, for out-of-distribution transfer, Head2Toe outperforms
fine-tuning.
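For readers who want the mechanics, below is a minimal PyTorch sketch of Head2Toe-style probing: intermediate activations are captured with forward hooks, pooled, concatenated, and fed to a trainable linear head while the backbone stays frozen. The ResNet-50 backbone, the choice of hooked layers, the 2x2 pooling grid, and the omission of the paper's feature-selection step are illustrative assumptions, not the authors' exact configuration.

```python
# A minimal sketch of Head2Toe-style probing, assuming a torchvision
# ResNet-50; layer choice and pooling size are simplifications.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False  # the source model stays frozen, as in linear probing

# Capture intermediate activations with forward hooks.
activations = {}
def save(name):
    def hook(module, inputs, output):
        activations[name] = output
    return hook

layer_names = ["layer1", "layer2", "layer3", "layer4"]
for name in layer_names:
    getattr(backbone, name).register_forward_hook(save(name))

def head2toe_features(x):
    """Concatenate pooled activations from all hooked layers."""
    with torch.no_grad():
        backbone(x)
    feats = []
    for name in layer_names:
        a = activations[name]
        # Average-pool each feature map to a small fixed grid, then flatten.
        feats.append(torch.flatten(nn.functional.adaptive_avg_pool2d(a, 2), 1))
    # Note: the paper additionally selects a subset of these features
    # (via a group-lasso-style relevance score) to control cost; that
    # selection step is omitted in this sketch.
    return torch.cat(feats, dim=1)

# Train only a linear head on the concatenated features.
num_classes = 10  # hypothetical target task
x = torch.randn(8, 3, 224, 224)   # stand-in for a target-domain batch
y = torch.randint(0, num_classes, (8,))
feats = head2toe_features(x)

head = nn.Linear(feats.shape[1], num_classes)
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
opt.zero_grad()
loss = nn.functional.cross_entropy(head(feats), y)
loss.backward()
opt.step()
```

Because only the linear head is trained, the cost profile matches linear probing, while the head sees features from every depth of the network rather than only the final layer.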
Related papers
- Open Domain Generalization with a Single Network by Regularization Exploiting Pre-trained Features [37.518025833882334]
Open Domain Generalization (ODG) is a challenging task as it deals with distribution shifts and category shifts.
Previous work has used multiple source-specific networks, which involve a high cost.
This paper proposes a method that can handle ODG using only a single network.
arXiv Detail & Related papers (2023-12-08T16:22:10Z) - Divide and Contrast: Source-free Domain Adaptation via Adaptive
Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to combine the best of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch; a minimal sketch of an MMD loss appears after this list.
arXiv Detail & Related papers (2022-11-12T09:21:49Z) - Source-Free Domain Adaptation via Distribution Estimation [106.48277721860036]
Domain Adaptation aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain whose data distribution is different.
Recently, Source-Free Domain Adaptation (SFDA) has drawn much attention; it tackles the domain adaptation problem without using the source data.
In this work, we propose a novel framework called SFDA-DE to address SFDA task via source Distribution Estimation.
arXiv Detail & Related papers (2022-04-24T12:22:19Z) - Source-Free Open Compound Domain Adaptation in Semantic Segmentation [99.82890571842603]
In SF-OCDA, only the source pre-trained model and the target data are available to learn the target model.
We propose the Cross-Patch Style Swap (CPSS) to diversify samples with various patch styles at the feature level.
Our method produces state-of-the-art results on the C-Driving dataset.
arXiv Detail & Related papers (2021-06-07T08:38:41Z) - Source Data-absent Unsupervised Domain Adaptation through Hypothesis
Transfer and Labeling Transfer [137.36099660616975]
Unsupervised domain adaptation (UDA) aims to transfer knowledge from a related but different well-labeled source domain to a new unlabeled target domain.
Most existing UDA methods require access to the source data, and thus are not applicable when the data are confidential and not shareable due to privacy concerns.
This paper aims to tackle a realistic setting in which only a classification model trained on the source data is available, rather than the source data itself.
arXiv Detail & Related papers (2020-12-14T07:28:50Z) - Domain Adaptation Using Class Similarity for Robust Speech Recognition [24.951852740214413]
This paper proposes a novel adaptation method for deep neural network (DNN) acoustic models using class similarity.
Experiments showed that our approach outperforms fine-tuning with one-hot labels on both accent and noise adaptation tasks.
arXiv Detail & Related papers (2020-11-05T12:26:43Z) - Towards Accurate Knowledge Transfer via Target-awareness Representation
Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves over standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z) - Generalized Zero and Few-Shot Transfer for Facial Forgery Detection [3.8073142980733]
We propose a new transfer learning approach to address the problem of zero and few-shot transfer in the context of forgery detection.
We find this learning strategy to be surprisingly effective at domain transfer compared to traditional classification and even state-of-the-art domain adaptation and few-shot learning methods.
arXiv Detail & Related papers (2020-06-21T18:10:52Z) - Towards Fair Cross-Domain Adaptation via Generative Learning [50.76694500782927]
Domain Adaptation (DA) aims to adapt a model trained on a well-labeled source domain to an unlabeled target domain with a different distribution.
We develop a novel Generative Few-shot Cross-domain Adaptation (GFCA) algorithm for fair cross-domain classification.
arXiv Detail & Related papers (2020-03-04T23:25:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.