Minimax Optimal Transfer Learning for Kernel-based Nonparametric
Regression
- URL: http://arxiv.org/abs/2310.13966v1
- Date: Sat, 21 Oct 2023 10:55:31 GMT
- Title: Minimax Optimal Transfer Learning for Kernel-based Nonparametric
Regression
- Authors: Chao Wang, Caixing Wang, Xin He, and Xingdong Feng
- Abstract summary: This paper focuses on investigating the transfer learning problem within the context of nonparametric regression.
The aim is to bridge the gap between practical effectiveness and theoretical guarantees.
- Score: 18.240776405802205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, transfer learning has garnered significant attention in the
machine learning community. Its ability to leverage knowledge from related
studies to improve generalization performance in a target study has made it
highly appealing. This paper focuses on investigating the transfer learning
problem within the context of nonparametric regression over a reproducing
kernel Hilbert space. The aim is to bridge the gap between practical
effectiveness and theoretical guarantees. We specifically consider two
scenarios: one where the transferable sources are known and another where they
are unknown. For the known transferable source case, we propose a two-step
kernel-based estimator by solely using kernel ridge regression. For the unknown
case, we develop a novel method based on an efficient aggregation algorithm,
which can automatically detect and alleviate the effects of negative sources.
This paper provides the statistical properties of the desired estimators and
establishes the minimax optimal rate. Through extensive numerical experiments
on synthetic data and real examples, we validate our theoretical findings and
demonstrate the effectiveness of our proposed method.
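A minimal sketch of the two-step idea for a known transferable source, assuming a Gaussian kernel, synthetic one-dimensional data, and scikit-learn's KernelRidge (the bandwidths, penalties, and data-generating functions below are illustrative choices, not the authors' exact procedure): fit kernel ridge regression on the abundant source sample, then fit a second kernel ridge regression on the target residuals to estimate the offset between the source and target regression functions.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

# Synthetic source and target: the target regression function equals the source
# function plus a smooth, smaller offset (i.e., a "transferable" source).
def f_source(x):
    return np.sin(2 * np.pi * x)

def offset(x):
    return 0.3 * np.cos(np.pi * x)

n_source, n_target = 500, 50
Xs = rng.uniform(0, 1, (n_source, 1))
ys = f_source(Xs[:, 0]) + 0.1 * rng.standard_normal(n_source)
Xt = rng.uniform(0, 1, (n_target, 1))
yt = f_source(Xt[:, 0]) + offset(Xt[:, 0]) + 0.1 * rng.standard_normal(n_target)

# Step 1: kernel ridge regression on the abundant source sample.
step1 = KernelRidge(kernel="rbf", gamma=10.0, alpha=1e-2).fit(Xs, ys)

# Step 2: kernel ridge regression on the target residuals (the offset).
step2 = KernelRidge(kernel="rbf", gamma=10.0, alpha=1e-1).fit(Xt, yt - step1.predict(Xt))

# Transfer estimator = source fit + estimated offset; compare to target-only KRR.
Xg = np.linspace(0, 1, 200).reshape(-1, 1)
f_transfer = step1.predict(Xg) + step2.predict(Xg)
f_target_only = KernelRidge(kernel="rbf", gamma=10.0, alpha=1e-1).fit(Xt, yt).predict(Xg)
truth = f_source(Xg[:, 0]) + offset(Xg[:, 0])
print("MSE transfer   :", np.mean((f_transfer - truth) ** 2))
print("MSE target-only:", np.mean((f_target_only - truth) ** 2))
```

For the unknown-source case, one could fit one such two-step estimator per candidate source and combine them with weights learned on held-out target data, down-weighting candidates that validate poorly; the paper's aggregation algorithm is more refined than this sketch.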
Related papers
- A Kernel Perspective on Distillation-based Collaborative Learning [8.971234046933349]
We propose a nonparametric collaborative learning algorithm that does not directly share local data or models in statistically heterogeneous environments.
Inspired by our theoretical results, we also propose a practical distillation-based collaborative learning algorithm based on neural network architecture.
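As a rough illustration of the distillation-style collaboration this summary describes (a generic sketch, not the paper's algorithm; the kernel ridge regressors, the shared unlabeled pool, and the plain averaging rule are all assumptions): each party fits a model on its own data, only predictions on a shared unlabeled set are exchanged, and each party refits using the consensus predictions as pseudo-labels.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(1)
truth = lambda x: np.sin(3 * x)

# Statistically heterogeneous parties: each one observes a different region.
parties = []
for lo, hi in [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]:
    X = rng.uniform(lo, hi, (40, 1))
    parties.append((X, truth(X[:, 0]) + 0.1 * rng.standard_normal(40)))

X_pool = np.linspace(0, 3, 100).reshape(-1, 1)  # shared unlabeled pool

# Round 1: each party fits locally and shares only predictions on the pool.
local = [KernelRidge(kernel="rbf", gamma=2.0, alpha=1e-2).fit(X, y) for X, y in parties]
consensus = np.mean([m.predict(X_pool) for m in local], axis=0)

# Round 2: each party refits on its own data plus the distilled pseudo-labels.
distilled = [
    KernelRidge(kernel="rbf", gamma=2.0, alpha=1e-2).fit(
        np.vstack([X, X_pool]), np.concatenate([y, consensus]))
    for X, y in parties
]

X_test = np.linspace(0, 3, 300).reshape(-1, 1)
for i, (before, after) in enumerate(zip(local, distilled)):
    mse = lambda m: np.mean((m.predict(X_test) - truth(X_test[:, 0])) ** 2)
    print(f"party {i}: local-only MSE {mse(before):.3f} -> distilled MSE {mse(after):.3f}")
```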
arXiv Detail & Related papers (2024-10-23T06:40:13Z)
- Center-Sensitive Kernel Optimization for Efficient On-Device Incremental Learning [88.78080749909665]
Current on-device training methods focus only on efficient training without considering catastrophic forgetting.
This paper proposes a simple but effective edge-friendly incremental learning framework.
Our method achieves an average accuracy boost of 38.08% while requiring even less memory and approximate computation.
arXiv Detail & Related papers (2024-06-13T05:49:29Z)
- Knowledge Transfer across Multiple Principal Component Analysis Studies [8.602833477729899]
We propose a two-step transfer learning algorithm to extract useful information from multiple source principal component analysis (PCA) studies.
In the first step, we integrate the shared subspace information across multiple studies using a proposed method called the Grassmannian barycenter.
The resulting estimator for the shared subspace from the first step is further utilized to estimate the target private subspace.
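A hedged numpy sketch of those two steps, where a chordal (extrinsic) average of projection matrices stands in for the paper's Grassmannian barycenter and all dimensions are made up for illustration: average the source subspace projectors, take the leading eigenvectors as the shared subspace, then estimate the target private subspace from the target covariance with the shared directions projected out.

```python
import numpy as np

rng = np.random.default_rng(2)
p, r_shared, r_private = 20, 3, 2

# One subspace shared by every study, plus study-specific directions.
U_shared = np.linalg.qr(rng.standard_normal((p, r_shared)))[0]

def study_cov(extra_dirs):
    U = np.hstack([U_shared, extra_dirs])
    return U @ np.diag(np.arange(U.shape[1], 0, -1.0)) @ U.T + 0.05 * np.eye(p)

source_covs = [study_cov(np.linalg.qr(rng.standard_normal((p, 2)))[0]) for _ in range(4)]
target_cov = study_cov(np.linalg.qr(rng.standard_normal((p, r_private)))[0])

# Step 1: shared subspace from the average of the source projection matrices
# (an extrinsic stand-in for the Grassmannian barycenter).
projectors = []
for S in source_covs:
    V = np.linalg.eigh(S)[1][:, -r_shared:]      # top-r_shared eigenvectors
    projectors.append(V @ V.T)
V_shared_hat = np.linalg.eigh(np.mean(projectors, axis=0))[1][:, -r_shared:]

# Step 2: target private subspace = PCA of the target covariance after
# projecting out the estimated shared directions.
P_perp = np.eye(p) - V_shared_hat @ V_shared_hat.T
V_private_hat = np.linalg.eigh(P_perp @ target_cov @ P_perp)[1][:, -r_private:]

# How well the shared subspace is recovered (cosines of principal angles).
print(np.round(np.linalg.svd(V_shared_hat.T @ U_shared, compute_uv=False), 3))
```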
arXiv Detail & Related papers (2024-03-12T09:15:12Z)
- Transfer Learning for Nonparametric Regression: Non-asymptotic Minimax Analysis and Adaptive Procedure [5.303044915173525]
We develop a novel estimator called the confidence thresholding estimator, which is shown to achieve the minimax optimal risk up to a logarithmic factor.
We then propose a data-driven algorithm that adaptively achieves the minimax risk up to a logarithmic factor across a wide range of parameter spaces.
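The following sketch captures only the high-level thresholding idea, not the paper's estimator or its adaptive procedure; the band width, the kernel ridge fits, and the synthetic shift are all assumptions made up for illustration: keep the source-transferred prediction wherever it stays within a confidence band around the target-only fit, and fall back to the target-only fit elsewhere.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(3)
f_target = lambda x: np.sin(2 * np.pi * x)
f_source = lambda x: np.sin(2 * np.pi * x) + 0.8 * (x > 0.5)   # differs on half the domain

Xs = rng.uniform(0, 1, (400, 1)); ys = f_source(Xs[:, 0]) + 0.1 * rng.standard_normal(400)
Xt = rng.uniform(0, 1, (60, 1));  yt = f_target(Xt[:, 0]) + 0.1 * rng.standard_normal(60)

source_fit = KernelRidge(kernel="rbf", gamma=15.0, alpha=1e-2).fit(Xs, ys)
target_fit = KernelRidge(kernel="rbf", gamma=15.0, alpha=1e-1).fit(Xt, yt)

# Keep the (lower-variance) source-based prediction only where it stays within
# an ad-hoc confidence band around the target-only fit; otherwise fall back.
Xg = np.linspace(0, 1, 200).reshape(-1, 1)
pred_s, pred_t = source_fit.predict(Xg), target_fit.predict(Xg)
band = 2.0 * np.std(yt - target_fit.predict(Xt))
combined = np.where(np.abs(pred_s - pred_t) <= band, pred_s, pred_t)

truth = f_target(Xg[:, 0])
for name, pred in [("source-only", pred_s), ("target-only", pred_t), ("thresholded", combined)]:
    print(f"{name:11s} MSE: {np.mean((pred - truth) ** 2):.4f}")
```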
arXiv Detail & Related papers (2024-01-22T16:24:04Z)
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
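A stripped-down sketch of the "smooth, differentiate, then select sparse terms" pipeline, with two loud substitutions: the spike-and-slab prior and the expectation-propagation expectation-maximization algorithm are replaced here by a plain Gaussian-kernel ridge smoother and sequentially thresholded least squares, so this is not KBASS itself, only the general idea on a toy logistic ODE.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate a logistic ODE  du/dt = 1.5*u - 0.5*u^2  and observe it with noise.
t = np.linspace(0, 4, 80)
dt = t[1] - t[0]
u = np.empty_like(t); u[0] = 0.2
for i in range(len(t) - 1):                       # simple Euler integration
    u[i + 1] = u[i] + dt * (1.5 * u[i] - 0.5 * u[i] ** 2)
y = u + 0.01 * rng.standard_normal(len(t))

# Gaussian-kernel ridge smoother with an analytic derivative.
ell, lam = 0.3, 1e-4
K = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * ell ** 2))
alpha = np.linalg.solve(K + lam * np.eye(len(t)), y)
u_hat = K @ alpha
du_hat = (-(t[:, None] - t[None, :]) / ell ** 2 * K) @ alpha   # d/dt of the smoother

# Candidate library and sequentially thresholded least squares (a crude
# stand-in for spike-and-slab selection); boundary points are dropped because
# the derivative estimate is least reliable there.
keep = slice(6, -6)
library = np.column_stack([np.ones_like(u_hat), u_hat, u_hat ** 2, u_hat ** 3])[keep]
target = du_hat[keep]
names = ["1", "u", "u^2", "u^3"]
coef, *_ = np.linalg.lstsq(library, target, rcond=None)
for _ in range(5):
    coef[np.abs(coef) < 0.1] = 0.0
    active = coef != 0.0
    if active.any():
        coef[active], *_ = np.linalg.lstsq(library[:, active], target, rcond=None)
print("recovered du/dt =", " + ".join(f"{c:.2f}*{n}" for c, n in zip(coef, names) if c != 0.0))
```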
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- Estimation and inference for transfer learning with high-dimensional quantile regression [3.4510296013600374]
We propose a transfer learning procedure in the framework of high-dimensional quantile regression models.
We establish error bounds for the transfer learning estimator based on carefully selected transferable source domains.
By adopting a data-splitting technique, we advocate a transferability detection approach that is guaranteed to circumvent negative transfer.
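A hedged sketch of data-splitting transferability detection in that spirit, using scikit-learn's QuantileRegressor (available in scikit-learn 1.0+) and a synthetic helpful/negative source pair; the loss-comparison rule and all constants are illustrative assumptions, not the paper's procedure: fit on one half of the target sample, and keep a source only if pooling it with the target reduces the held-out pinball loss.

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(5)
tau, p = 0.5, 10
beta_target = np.zeros(p); beta_target[:3] = [1.0, -1.0, 0.5]

def make_data(n, beta):
    X = rng.standard_normal((n, p))
    return X, X @ beta + rng.standard_normal(n)

Xt, yt = make_data(80, beta_target)                      # small target sample
sources = {
    "helpful":  make_data(500, beta_target + 0.05 * rng.standard_normal(p)),
    "negative": make_data(500, -beta_target),            # very different model
}

def pinball(y, pred):
    r = y - pred
    return np.mean(np.maximum(tau * r, (tau - 1) * r))

# Data splitting: fit on one half of the target, test transferability on the other.
half = len(yt) // 2
base = QuantileRegressor(quantile=tau, alpha=1e-3, solver="highs").fit(Xt[:half], yt[:half])
base_loss = pinball(yt[half:], base.predict(Xt[half:]))

for name, (Xs, ys) in sources.items():
    pooled = QuantileRegressor(quantile=tau, alpha=1e-3, solver="highs").fit(
        np.vstack([Xs, Xt[:half]]), np.concatenate([ys, yt[:half]]))
    loss = pinball(yt[half:], pooled.predict(Xt[half:]))
    verdict = "transferable" if loss <= base_loss else "excluded (negative transfer)"
    print(f"{name}: held-out pinball loss {loss:.3f} vs target-only {base_loss:.3f} -> {verdict}")
```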
arXiv Detail & Related papers (2022-11-26T14:40:19Z)
- On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that this phenomenon can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
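A small numpy sketch of that effect on a finite-dimensional quadratic, which stands in for the separable-Hilbert-space setting (the eigenvalue spread, step sizes, and stopping time are arbitrary illustrative choices): at the same early-stopping time, a larger learning rate fits the small-eigenvalue directions much further, changing the spectral profile of the solution.

```python
import numpy as np

# Quadratic objective 0.5 * (w - w_star)^T H (w - w_star) with a diagonal H,
# a finite-dimensional stand-in for the separable-Hilbert-space setting.
d = 10
eigvals = np.logspace(0, -3, d)          # eigenvalues from 1 down to 1e-3
H = np.diag(eigvals)
w_star = np.ones(d)

def gradient_descent(lr, n_steps):
    w = np.zeros(d)
    for _ in range(n_steps):
        w -= lr * H @ (w - w_star)       # gradient of the quadratic
    return w

# Same early-stopping time, two learning rates (both below 2 / lambda_max = 2).
for lr in (0.1, 1.5):
    w = gradient_descent(lr, n_steps=50)
    # Closed form for the diagonal case: fraction of each spectral component fit.
    recovered = 1 - (1 - lr * eigvals) ** 50
    assert np.allclose(w, recovered * w_star)
    print(f"lr={lr}: recovered fraction per eigendirection (large -> small eigenvalue):")
    print(np.round(recovered, 3))
```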
arXiv Detail & Related papers (2022-02-28T13:01:04Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory to Learning Algorithms [91.3755431537592]
We analyze four broad meta-learning strategies which rely on plug-in estimation and pseudo-outcome regression.
We highlight how this theoretical reasoning can be used to guide principled algorithm design and translate our analyses into practice.
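As a hedged illustration of two such strategies, a plug-in T-learner and a doubly-robust pseudo-outcome regression, using generic random forests and no cross-fitting (the data-generating process and the learners are assumptions for illustration, not the paper's setup):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(7)

# Synthetic observational data with a known heterogeneous treatment effect.
n, p = 2000, 5
X = rng.standard_normal((n, p))
tau = 1.0 + X[:, 0]                                   # true CATE
e = 1 / (1 + np.exp(-X[:, 1]))                        # confounded propensity
A = rng.binomial(1, e)
y = X[:, 0] + X[:, 2] + A * tau + rng.standard_normal(n)

# Plug-in (T-learner): fit an outcome model per arm, take the difference.
mu1 = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[A == 1], y[A == 1])
mu0 = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[A == 0], y[A == 0])
cate_plugin = mu1.predict(X) - mu0.predict(X)

# Pseudo-outcome (DR-learner): regress doubly-robust pseudo-outcomes on X.
# (Cross-fitting of the nuisance models is omitted here for brevity.)
e_hat = np.clip(RandomForestClassifier(n_estimators=100, random_state=0)
                .fit(X, A).predict_proba(X)[:, 1], 0.05, 0.95)
m1, m0 = mu1.predict(X), mu0.predict(X)
pseudo = m1 - m0 + A * (y - m1) / e_hat - (1 - A) * (y - m0) / (1 - e_hat)
cate_dr = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, pseudo).predict(X)

for name, est in [("plug-in (T-learner)", cate_plugin), ("pseudo-outcome (DR)", cate_dr)]:
    print(f"{name}: RMSE vs true CATE = {np.sqrt(np.mean((est - tau) ** 2)):.3f}")
```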
arXiv Detail & Related papers (2021-01-26T17:11:40Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer when fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves over standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)