The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift
- URL: http://arxiv.org/abs/2208.01857v1
- Date: Wed, 3 Aug 2022 05:59:49 GMT
- Title: The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift
- Authors: Jingfeng Wu and Difan Zou and Vladimir Braverman and Quanquan Gu and Sham M. Kakade
- Abstract summary: We investigate a transfer learning approach with pretraining on the source data and finetuning based on the target data.
For a large class of linear regression instances, transfer learning with $O(N^2)$ source data is as effective as supervised learning with $N$ target data.
- Score: 127.21287240963859
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study linear regression under covariate shift, where the marginal
distribution over the input covariates differs in the source and the target
domains, while the conditional distribution of the output given the input
covariates is similar across the two domains. We investigate a transfer
learning approach with pretraining on the source data and finetuning based on
the target data (both conducted by online SGD) for this problem. We establish
sharp instance-dependent excess risk upper and lower bounds for this approach.
Our bounds suggest that for a large class of linear regression instances,
transfer learning with $O(N^2)$ source data (and scarce or no target data) is
as effective as supervised learning with $N$ target data. In addition, we show
that finetuning, even with only a small amount of target data, could
drastically reduce the amount of source data required by pretraining. Our
theory sheds light on the effectiveness and limitation of pretraining as well
as the benefits of finetuning for tackling covariate shift problems.
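
To make the pipeline concrete, below is a minimal numpy simulation of the setup the abstract describes: source and target share the regression vector (the conditional distribution of the output) but differ in covariate covariance, and a model is pretrained by one-pass online SGD on source data and then finetuned on a small target sample. The dimensions, covariances, step sizes, and sample sizes are illustrative assumptions, not the instance classes or tuned rates analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 20                                        # input dimension
w_star = rng.normal(size=d) / np.sqrt(d)      # shared regression vector: same P(y|x) in both domains

# Covariate shift: source and target differ only in the covariate covariance.
src_scale = np.linspace(2.0, 0.1, d)          # source emphasizes some directions
tgt_scale = np.linspace(0.1, 2.0, d)          # target emphasizes the others

def sample(n, scale):
    X = rng.normal(size=(n, d)) * scale           # anisotropic Gaussian covariates
    y = X @ w_star + 0.1 * rng.normal(size=n)     # identical noisy linear conditional
    return X, y

def online_sgd(w, X, y, lr=0.01):
    """One pass of online SGD on the squared loss, one sample per step.
    (The paper's analysis uses refinements such as iterate averaging.)"""
    for xi, yi in zip(X, y):
        w = w - lr * (xi @ w - yi) * xi
    return w

def target_risk(w, n_test=20000):
    X, y = sample(n_test, tgt_scale)
    return np.mean((X @ w - y) ** 2)

N = 500                                       # target sample budget
Xs, ys = sample(N * N // 10, src_scale)       # plentiful source data (an O(N^2)-style regime, scaled down)
Xt, yt = sample(N // 10, tgt_scale)           # scarce target data for finetuning

w0 = np.zeros(d)
w_pre = online_sgd(w0, Xs, ys)                # pretraining on source
w_ft = online_sgd(w_pre, Xt, yt)              # finetuning on target

Xt_full, yt_full = sample(N, tgt_scale)
w_sup = online_sgd(w0, Xt_full, yt_full)      # supervised baseline on N target samples

print(f"pretrain only     : target risk {target_risk(w_pre):.4f}")
print(f"pretrain+finetune : target risk {target_risk(w_ft):.4f}")
print(f"target-only (N)   : target risk {target_risk(w_sup):.4f}")
```

On a toy instance like this, one would expect pretraining alone to lag in the directions the source covariance downweights, with even a short finetuning pass closing much of that gap, in the spirit of the abstract's claims.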
Related papers
- Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift [12.770658031721435]
We propose a method for adapting the weights of the last layer of a pre-trained neural regression model to perform better on input data originating from a different distribution.
We demonstrate how this lightweight spectral adaptation procedure can improve out-of-distribution performance for synthetic and real-world datasets.
arXiv Detail & Related papers (2023-12-29T04:15:58Z)
- Memory Consistent Unsupervised Off-the-Shelf Model Adaptation for Source-Relaxed Medical Image Segmentation [13.260109561599904]
Unsupervised domain adaptation (UDA) has been a vital protocol for transferring knowledge learned from a labeled source domain to an unlabeled, heterogeneous target domain.
We propose "off-the-shelf (OS)" UDA (OSUDA), aimed at image segmentation, which adapts an OS segmentor trained in a source domain to a target domain without access to source-domain data during adaptation.
arXiv Detail & Related papers (2022-09-16T13:13:50Z)
- Source-Free Domain Adaptation via Distribution Estimation [106.48277721860036]
Domain adaptation aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain whose data distribution differs.
Recently, Source-Free Domain Adaptation (SFDA), which tackles domain adaptation without using any source data, has drawn much attention.
In this work, we propose a novel framework called SFDA-DE to address the SFDA task via source Distribution Estimation.
arXiv Detail & Related papers (2022-04-24T12:22:19Z)
- Learning Invariant Representation with Consistency and Diversity for Semi-supervised Source Hypothesis Transfer [46.68586555288172]
We propose a novel task named Semi-supervised Source Hypothesis Transfer (SSHT), which performs domain adaptation based on a source-trained model so as to generalize well in the target domain with only a few labeled target samples.
We propose Consistency and Diversity Learning (CDL), a simple but effective framework for SSHT that facilitates prediction consistency between two randomly augmented views of unlabeled data.
Experimental results show that our method outperforms existing SSDA methods and unsupervised model adaptation methods on DomainNet, Office-Home and Office-31 datasets.
arXiv Detail & Related papers (2021-07-07T04:14:24Z)
- Near-Optimal Linear Regression under Distribution Shift [63.87137348308034]
We show that linear minimax estimators are within an absolute constant of the minimax risk even among nonlinear estimators for various source/target distributions.
arXiv Detail & Related papers (2021-06-23T00:52:50Z)
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distribution over the source and the target domains.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples (a sketch of one such estimator appears after this list).
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
- Distill and Fine-tune: Effective Adaptation from a Black-box Source Model [138.12678159620248]
Unsupervised domain adaptation (UDA) aims to transfer knowledge from related labeled datasets (source) to a new unlabeled dataset (target).
We propose a novel two-step adaptation framework called Distill and Fine-tune (Dis-tune); a generic sketch of a distill-then-finetune scheme appears after this list.
arXiv Detail & Related papers (2021-04-04T05:29:05Z)
- Regressive Domain Adaptation for Unsupervised Keypoint Detection [67.2950306888855]
Domain adaptation (DA) aims at transferring knowledge from a labeled source domain to an unlabeled target domain.
We present a method of regressive domain adaptation (RegDA) for unsupervised keypoint detection.
Our method brings large improvements of 8% to 11% in terms of PCK on different datasets.
arXiv Detail & Related papers (2021-03-10T16:45:22Z)
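
The "KL Guided Domain Adaptation" entry above claims that, with a probabilistic representation network, the KL term between source and target representation distributions can be estimated from minibatch samples. Below is a minimal PyTorch sketch of one such estimator, assuming a diagonal-Gaussian encoder and approximating each marginal over representations by the mixture a minibatch induces; the encoder architecture, the mixture approximation, and all names here are assumptions, not the paper's exact construction.

```python
import math
import torch
import torch.nn as nn

class ProbEncoder(nn.Module):
    """Maps an input x to a diagonal-Gaussian distribution over representations z."""
    def __init__(self, d_in, d_z):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU())
        self.mu = nn.Linear(64, d_z)
        self.log_var = nn.Linear(64, d_z)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.log_var(h)

def log_mixture(z, mu, log_var):
    """Log density of a uniform mixture of diagonal Gaussians, evaluated at each z."""
    diff = z.unsqueeze(1) - mu.unsqueeze(0)    # (B, M, d) pairwise differences
    log_comp = -0.5 * (log_var + diff.pow(2) / log_var.exp() + math.log(2 * math.pi))
    return torch.logsumexp(log_comp.sum(-1), dim=1) - math.log(mu.shape[0])

def kl_minibatch(enc, x_src, x_tgt):
    """One-sample Monte Carlo estimate of KL(p_src(z) || p_tgt(z)) from minibatches.
    z is drawn from the source side via the reparameterization trick."""
    mu_s, lv_s = enc(x_src)
    mu_t, lv_t = enc(x_tgt)
    z = mu_s + torch.randn_like(mu_s) * (0.5 * lv_s).exp()   # z ~ q(z | x_src)
    return (log_mixture(z, mu_s, lv_s) - log_mixture(z, mu_t, lv_t)).mean()

# Usage: add the estimate to the task loss as an alignment penalty during training.
enc = ProbEncoder(d_in=10, d_z=4)
x_s = torch.randn(32, 10)
x_t = torch.randn(32, 10) + 1.0     # a crudely shifted target batch
align_loss = kl_minibatch(enc, x_s, x_t)
```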
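
The Dis-tune entry gives only the two-step outline: distill from a black-box source model, then fine-tune on the target. As a loud assumption, here is one generic way such a scheme can look in PyTorch, using temperature-scaled distillation followed by confident self-training; the actual Dis-tune objectives are not specified in the summary above, and the threshold, temperature, and optimizer choices are illustrative.

```python
import torch
import torch.nn.functional as F

def distill_then_finetune(black_box, student, x_unlabeled, epochs=10, lr=1e-3, T=2.0):
    """Generic two-step sketch: (1) distill the black-box source model's soft
    predictions into a student on unlabeled target data; (2) fine-tune the
    student on its own confident pseudo-labels."""
    opt = torch.optim.Adam(student.parameters(), lr=lr)

    # Step 1: distillation -- only the black box's outputs are queried, never its weights.
    with torch.no_grad():
        teacher = F.softmax(black_box(x_unlabeled) / T, dim=1)
    for _ in range(epochs):
        log_q = F.log_softmax(student(x_unlabeled) / T, dim=1)
        loss = F.kl_div(log_q, teacher, reduction="batchmean") * T * T
        opt.zero_grad(); loss.backward(); opt.step()

    # Step 2: fine-tune via confident self-training on the target domain.
    with torch.no_grad():
        probs = F.softmax(student(x_unlabeled), dim=1)
        conf, pseudo = probs.max(dim=1)
        keep = conf > 0.9                     # assumed confidence threshold
    if keep.any():
        for _ in range(epochs):
            loss = F.cross_entropy(student(x_unlabeled[keep]), pseudo[keep])
            opt.zero_grad(); loss.backward(); opt.step()
    return student
```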
This list is automatically generated from the titles and abstracts of the papers on this site.