Transfer Learning for High-dimensional Linear Regression: Prediction,
Estimation, and Minimax Optimality
- URL: http://arxiv.org/abs/2006.10593v1
- Date: Thu, 18 Jun 2020 14:55:29 GMT
- Title: Transfer Learning for High-dimensional Linear Regression: Prediction,
Estimation, and Minimax Optimality
- Authors: Sai Li and T. Tony Cai and Hongzhe Li
- Abstract summary: It is shown that Trans-Lasso leads to improved performance in gene expression prediction in a target tissue by incorporating the data from multiple different tissues as auxiliary samples.
- Score: 6.230751621285322
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper considers the estimation and prediction of a high-dimensional
linear regression in the setting of transfer learning, using samples from the
target model as well as auxiliary samples from different but possibly related
regression models. When the set of "informative" auxiliary samples is known, an
estimator and a predictor are proposed and their optimality is established. The
optimal rates of convergence for prediction and estimation are faster than the
corresponding rates without using the auxiliary samples. This implies that
knowledge from the informative auxiliary samples can be transferred to improve
the learning performance of the target problem. In the case that the set of
informative auxiliary samples is unknown, we propose a data-driven procedure
for transfer learning, called Trans-Lasso, and reveal its robustness to
non-informative auxiliary samples and its efficiency in knowledge transfer. The
proposed procedures are demonstrated in numerical studies and are applied to a
dataset concerning the associations among gene expressions. It is shown that
Trans-Lasso leads to improved performance in gene expression prediction in a
target tissue by incorporating the data from multiple different tissues as
auxiliary samples.
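To make the two-step transfer idea concrete, here is a minimal sketch of the oracle procedure for a known informative set: pool the target and informative auxiliary samples for a first Lasso fit, then correct the bias with a second Lasso on the target data alone. The penalty levels and the function name `oracle_trans_lasso` are illustrative; the paper's theory prescribes specific rates for the tuning parameters that this sketch does not implement.

```python
import numpy as np
from sklearn.linear_model import Lasso

def oracle_trans_lasso(X_target, y_target, X_aux, y_aux, lam1=0.1, lam2=0.1):
    """Two-step transfer estimator in the spirit of the oracle procedure:
    (1) run the Lasso on the pooled target + informative auxiliary samples,
    (2) correct the bias with a Lasso fit on the target residuals.
    lam1/lam2 are illustrative; the paper tunes them at (n, p)-dependent rates."""
    # Step 1: pooled Lasso over target and informative auxiliary samples.
    X_pool = np.vstack([X_target, X_aux])
    y_pool = np.concatenate([y_target, y_aux])
    w_hat = Lasso(alpha=lam1).fit(X_pool, y_pool).coef_

    # Step 2: Lasso on the target residuals to estimate the contrast
    # delta = beta_target - w, which is assumed to be sparse.
    resid = y_target - X_target @ w_hat
    delta_hat = Lasso(alpha=lam2).fit(X_target, resid).coef_
    return w_hat + delta_hat
```

The data-driven Trans-Lasso additionally detects which auxiliary samples are informative and aggregates over candidate fits; the sketch above omits that selection step.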
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis [10.79615566320291]
We explore transfer learning with the goal of optimizing downstream performance.
We introduce a simple linear model that takes arbitrary pretrained features as input.
We identify the optimal pretrained representation by minimizing the downstream risk averaged over an ensemble of downstream tasks; a self-contained toy of this setup follows this entry.
arXiv Detail & Related papers (2024-04-18T19:33:55Z)
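Purely as an illustration of "linear model on fixed pretrained features, with risk averaged over a task ensemble", here is a self-contained toy: the Gaussian data, random linear tasks, ridge penalty, and both candidate representations below are our assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k, n_tasks = 200, 50, 10, 20  # samples, input dim, feature dim, tasks

def downstream_risk(W, lam=1e-2):
    """Average test risk of ridge regression fit on frozen features X @ W,
    over an ensemble of random linear downstream tasks."""
    risks = []
    for _ in range(n_tasks):
        beta = rng.normal(size=d)                  # a random downstream task
        X = rng.normal(size=(n, d))
        y = X @ beta + 0.1 * rng.normal(size=n)
        F = X @ W                                  # frozen pretrained features
        theta = np.linalg.solve(F.T @ F + lam * np.eye(k), F.T @ y)
        X_te = rng.normal(size=(n, d))
        risks.append(np.mean((X_te @ beta - (X_te @ W) @ theta) ** 2))
    return float(np.mean(risks))

# Compare two candidate representations: a random projection vs. keeping
# the first k coordinates. Lower average risk = better feature transfer.
W_rand = rng.normal(size=(d, k))
W_topk = np.eye(d)[:, :k]
print(downstream_risk(W_rand), downstream_risk(W_topk))
```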
- DALSA: Domain Adaptation for Supervised Learning From Sparsely Annotated MR Images [2.352695945685781]
We propose a new method that employs transfer learning techniques to correct sampling selection errors introduced by sparse annotations during supervised learning for automated tumor segmentation.
The proposed method derives high-quality classifiers for the different tissue classes from sparse and unambiguous annotations.
Compared to training on fully labeled data, we reduce the labeling and training time by factors greater than 70 and 180, respectively, without sacrificing accuracy.
arXiv Detail & Related papers (2024-03-12T09:17:21Z)
- Iterative self-transfer learning: A general methodology for response time-history prediction based on small dataset [0.0]
An iterative self-transfer learning method for training neural networks on small datasets is proposed in this study.
The results show that the proposed method can improve model performance by nearly an order of magnitude on small datasets.
arXiv Detail & Related papers (2023-06-14T18:48:04Z)
- Estimation and inference for transfer learning with high-dimensional quantile regression [3.4510296013600374]
We propose a transfer learning procedure in the framework of high-dimensional quantile regression models.
We establish error bounds for the transfer learning estimator based on carefully selected transferable source domains.
By adopting a data-splitting technique, we propose a transferability detection approach that is guaranteed to circumvent negative transfer; a schematic of such a detection step follows this entry.
arXiv Detail & Related papers (2022-11-26T14:40:19Z)
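A schematic of data-splitting transferability detection, assuming scikit-learn's QuantileRegressor (v1.0+) and the pinball loss; the keep-if-no-worse rule and the tolerance `tol` are illustrative stand-ins for the paper's detection procedure.

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

def pinball_loss(y, pred, tau=0.5):
    """Standard quantile (pinball) loss at level tau."""
    u = y - pred
    return np.mean(np.maximum(tau * u, (tau - 1) * u))

def detect_transferable(target, sources, tau=0.5, tol=0.0):
    """Split the target sample; keep a source only if pooling it with the
    first half does not increase held-out quantile loss on the second half."""
    X, y = target
    half = len(y) // 2
    X_fit, y_fit, X_val, y_val = X[:half], y[:half], X[half:], y[half:]

    base = QuantileRegressor(quantile=tau, alpha=1e-3).fit(X_fit, y_fit)
    base_loss = pinball_loss(y_val, base.predict(X_val), tau)

    keep = []
    for k, (Xs, ys) in enumerate(sources):
        pooled = QuantileRegressor(quantile=tau, alpha=1e-3).fit(
            np.vstack([X_fit, Xs]), np.concatenate([y_fit, ys]))
        if pinball_loss(y_val, pooled.predict(X_val), tau) <= base_loss + tol:
            keep.append(k)  # this source passed the negative-transfer check
    return keep
```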
- SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning [168.89470249446023]
We present SURF, a semi-supervised reward learning framework that utilizes a large amount of unlabeled samples with data augmentation.
In order to leverage unlabeled samples for reward learning, we infer pseudo-labels of the unlabeled samples based on the confidence of the preference predictor.
Our experiments demonstrate that our approach significantly improves the feedback-efficiency of the preference-based method on a variety of locomotion and robotic manipulation tasks; a toy version of the pseudo-labeling step follows this entry.
arXiv Detail & Related papers (2022-03-18T16:50:38Z)
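A toy rendering of confidence-based pseudo-labeling for preference pairs, assuming a trained preference predictor that outputs the probability that segment A is preferred over segment B; the 0.95 threshold and the function name are our assumptions, not SURF's exact recipe.

```python
import numpy as np

def pseudo_label_preferences(p_a_over_b, threshold=0.95):
    """Assign pseudo-labels to unlabeled segment pairs when the preference
    predictor is confident enough; discard ambiguous pairs.

    p_a_over_b : predicted probabilities that segment A is preferred.
    Returns (indices, labels), label 1 = 'A preferred', 0 = 'B preferred'.
    """
    p = np.asarray(p_a_over_b)
    confident = (p >= threshold) | (p <= 1.0 - threshold)
    idx = np.flatnonzero(confident)
    labels = (p[idx] >= threshold).astype(int)
    return idx, labels

# Example: of five unlabeled pairs, only the confident ones get labels.
idx, labels = pseudo_label_preferences([0.99, 0.60, 0.02, 0.55, 0.97])
print(idx, labels)  # -> [0 2 4] [1 0 1]
```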
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities; a schematic discriminator objective with NDA samples follows this entry.
arXiv Detail & Related papers (2021-02-09T20:28:35Z)
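A schematic PyTorch discriminator loss where negative-data-augmentation samples are scored as an extra kind of fake. The mixing weight `lam`, the linear stand-in discriminator, and the tensor roll standing in for a real negative augmentation (e.g. jigsaw shuffling) are all our assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def d_loss_with_nda(disc, real, fake, nda, lam=0.5):
    """Discriminator loss where NDA samples (e.g. jigsaw-shuffled real
    images) are treated as an extra fake source, pushing them off the
    support of the modeled data distribution."""
    bce = F.binary_cross_entropy_with_logits
    r, f, n = disc(real), disc(fake), disc(nda)
    loss = bce(r, torch.ones_like(r))                 # real -> label 1
    loss += lam * bce(f, torch.zeros_like(f))         # generator fakes -> 0
    loss += (1 - lam) * bce(n, torch.zeros_like(n))   # NDA samples -> 0
    return loss

# Tiny smoke test with a linear stand-in discriminator; rolling the
# feature axis crudely mimics a "shuffled" negative augmentation.
disc = torch.nn.Linear(64, 1)
x = torch.randn(8, 64)
print(d_loss_with_nda(disc, x, torch.randn(8, 64), torch.roll(x, 1, dims=1)))
```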
- Self-paced Data Augmentation for Training Neural Networks [11.554821454921536]
We propose self-paced augmentation (SPA) to automatically select suitable samples for data augmentation when training a neural network.
The proposed method mitigates the deterioration of generalization performance caused by ineffective data augmentation.
Experimental results demonstrate that the proposed SPA can improve generalization performance, particularly when the number of training samples is small; a toy selection rule follows this entry.
arXiv Detail & Related papers (2020-10-29T09:13:18Z)
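A toy selection rule in the self-paced spirit, under our assumption (not necessarily the paper's exact criterion) that the samples worth augmenting are those the model currently finds hard, i.e. those with high training loss.

```python
import numpy as np

def select_for_augmentation(losses, quantile=0.7):
    """Return indices of samples whose current training loss is above the
    given quantile -- a self-paced rule: augment what the model finds hard."""
    losses = np.asarray(losses)
    return np.flatnonzero(losses >= np.quantile(losses, quantile))

# Example: per-sample losses from the current epoch.
losses = [0.05, 1.2, 0.3, 2.4, 0.1, 0.9]
print(select_for_augmentation(losses))  # -> [1 3], the hardest samples
```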
- Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation [51.091890311312085]
We propose a new training scheme for auto-regressive sequence generative models, which is effective and stable when operating in the large sample spaces encountered in text generation.
Our method stably outperforms Maximum Likelihood Estimation and other state-of-the-art sequence generative models in terms of both quality and diversity.
arXiv Detail & Related papers (2020-07-12T15:31:24Z)
- Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUGC is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting.
G-DAUGC consistently outperforms existing data augmentation methods based on back-translation.
Our analysis demonstrates that G-DAUGC produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
arXiv Detail & Related papers (2020-04-24T06:12:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.