Transfer learning with affine model transformation
- URL: http://arxiv.org/abs/2210.09745v2
- Date: Fri, 19 Jan 2024 19:00:03 GMT
- Title: Transfer learning with affine model transformation
- Authors: Shunya Minami, Kenji Fukumizu, Yoshihiro Hayashi, Ryo Yoshida
- Abstract summary: This paper presents a general class of transfer learning regression called affine model transfer.
It is shown that the affine model transfer broadly encompasses various existing methods, including the most common procedure based on neural feature extractors.
- Score: 18.13383101189326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Supervised transfer learning has received considerable attention due to its
potential to boost the predictive power of machine learning in scenarios where
data are scarce. Generally, a given set of source models and a dataset from the
target domain are used to adapt the pre-trained models to that domain by
statistically learning the domain shift and domain-specific factors. While such
procedurally and intuitively plausible methods have achieved great success in a
wide range of real-world applications, the lack of a theoretical basis hinders
further methodological development. This paper presents a general class of
transfer learning regression called affine model transfer, following the
principle of expected-square loss minimization. It is shown that the affine
model transfer broadly encompasses various existing methods, including the most
common procedure based on neural feature extractors. Furthermore, the paper
clarifies theoretical properties of the affine model transfer, such as its
generalization error and excess risk. Through several case studies, we
demonstrate the practical benefits of modeling and estimating inter-domain
commonality and domain-specific factors separately with the affine-type
transfer models.
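As a concrete illustration of this separation, here is a minimal sketch in Python. It assumes a simplified affine form ŷ(x) = g1(fs(x)) + g3(x), where fs is a frozen source model, g1 transforms its output (inter-domain commonality), and g3 absorbs domain-specific factors; the paper's general formulation is richer than this two-term simplification, and all models and data below are illustrative stand-ins, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def f_source(x):
    # Stand-in for a frozen, pre-trained source model.
    return np.sin(x[:, 0])

# Small target-domain dataset: related to the source model but shifted.
X = rng.uniform(-2.0, 2.0, size=(40, 2))
y = 1.5 * f_source(X) + 0.3 * X[:, 1] + 0.05 * rng.standard_normal(40)

# g1: transform the source output (models inter-domain commonality).
s = f_source(X).reshape(-1, 1)
g1 = Ridge(alpha=1e-3).fit(s, y)

# g3: fit the domain-specific factors left in the residuals.
g3 = Ridge(alpha=1e-3).fit(X, y - g1.predict(s))

def predict(x_new):
    s_new = f_source(x_new).reshape(-1, 1)
    return g1.predict(s_new) + g3.predict(x_new)

print(predict(X[:3]), y[:3])  # predictions should track the targets
```

Because g1 and g3 are estimated as separate components, the transferred commonality and the domain-specific correction can be inspected independently, which is the practical benefit the case studies highlight.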
Related papers
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem of learning with interdependent data.
We derive a new learning objective through causal inference, which guides the model to learn generalizable patterns of interdependence that are insensitive to shifts across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Transfer Learning for Diffusion Models [43.10840361752551]
Diffusion models consistently produce high-quality synthetic samples, but the abundant data they require can be impractical to collect in real-world applications due to high collection costs or associated risks.
This paper introduces the Transfer Guided Diffusion Process (TGDP), a novel approach distinct from conventional finetuning and regularization methods.
arXiv Detail & Related papers (2024-05-27T06:48:58Z)
- Reflected Schrödinger Bridge for Constrained Generative Modeling [16.72888494254555]
Diffusion models have become the go-to method for large-scale generative modeling in real-world applications, where data distributions are often confined within bounded domains.
We introduce the Reflected Schrödinger Bridge algorithm: an entropy-regularized optimal transport approach tailored to generating data within diverse bounded domains.
Our algorithm yields robust generative modeling in diverse domains, and its scalability is demonstrated in real-world constrained generative modeling through standard image benchmarks.
arXiv Detail & Related papers (2024-01-06T14:39:58Z)
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error of overparameterized models that achieve effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Distribution-free Deviation Bounds and The Role of Domain Knowledge in Learning via Model Selection with Cross-validation Risk Estimation [0.0]
Cross-validation techniques for risk estimation and model selection are widely used in statistics and machine learning.
This paper presents learning via model selection with cross-validation risk estimation as a general systematic learning framework.
arXiv Detail & Related papers (2023-03-15T17:18:31Z)
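The core loop of such a framework can be sketched in a few lines of Python: estimate each candidate model class's risk with K-fold cross-validation and select the risk minimizer. The candidates and data below are illustrative, not the paper's setting.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.0, 0.5, 0.0]) + 0.1 * rng.standard_normal(200)

candidates = {
    "ridge": Ridge(alpha=1.0),
    "tree": DecisionTreeRegressor(max_depth=3, random_state=0),
}

# Estimate each candidate's risk by 5-fold cross-validated MSE,
# then select the minimizer over the candidate classes.
risks = {
    name: -cross_val_score(est, X, y, cv=5,
                           scoring="neg_mean_squared_error").mean()
    for name, est in candidates.items()
}
print(risks, "->", min(risks, key=risks.get))
```

Domain knowledge enters through the choice of candidate classes; the paper's distribution-free deviation bounds quantify how reliable such cross-validated risk estimates are.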
- Transfer Learning with Uncertainty Quantification: Random Effect Calibration of Source to Target (RECaST) [1.8047694351309207]
We develop a statistical framework, called RECaST, for model predictions based on transfer learning.
We mathematically and empirically demonstrate the validity of our RECaST approach for transfer learning between linear models.
We examine our method's performance in a simulation study and in an application to real hospital data.
arXiv Detail & Related papers (2022-11-29T19:39:47Z)
- Learning from few examples with nonlinear feature maps [68.8204255655161]
We explore this phenomenon and reveal key relationships between the dimensionality of an AI model's feature space, the non-degeneracy of data distributions, and the model's generalisation capabilities.
The main thrust of our analysis is the influence of nonlinear feature transformations, which map the original data into higher- and possibly infinite-dimensional spaces, on the resulting model's generalisation capabilities.
arXiv Detail & Related papers (2022-03-31T10:36:50Z)
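A toy Python illustration of why a nonlinear lift can help with few examples (the classic XOR construction, not the paper's analysis): four points that no linear model can separate in the raw 2-D space become separable after adding a single product feature.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])  # XOR labels: not linearly separable in 2-D

raw = LogisticRegression().fit(X, y)
print("raw accuracy:", raw.score(X, y))          # stuck at chance level

# Lift into 3-D with a nonlinear (product) feature: now separable.
Phi = np.column_stack([X, X[:, 0] * X[:, 1]])
lifted = LogisticRegression().fit(Phi, y)
print("lifted accuracy:", lifted.score(Phi, y))  # all four correct
```

The paper's point is subtler, relating dimensionality and the non-degeneracy of the data distribution to generalisation, but this lifting effect is the mechanism under study.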
- A Class of Geometric Structures in Transfer Learning: Minimax Bounds and Optimality [5.2172436904905535]
We exploit the geometric structure of the source and target domains for transfer learning.
Our proposed estimator outperforms state-of-the-art transfer learning methods in both moderate- and high-dimensional settings.
arXiv Detail & Related papers (2022-02-23T18:47:53Z)
- Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning [65.268245109828]
In data-rich domains such as vision, language, and speech, deep learning delivers high-performance task-specific models.
Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning.
Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
arXiv Detail & Related papers (2022-02-22T02:33:54Z)
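A minimal PyTorch sketch of the reprogramming idea, assuming the common recipe of a trainable additive input "program" plus a trainable output label mapping wrapped around a frozen source model; the network and shapes are illustrative stand-ins.

```python
import torch
import torch.nn as nn

# Frozen stand-in for a well-developed pre-trained source model.
source = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
for p in source.parameters():
    p.requires_grad_(False)

class Reprogrammed(nn.Module):
    """Trainable input perturbation + label mapping; the source stays frozen."""
    def __init__(self, n_target_classes: int = 2):
        super().__init__()
        self.source = source
        self.delta = nn.Parameter(torch.zeros(1, 1, 28, 28))  # input "program"
        self.label_map = nn.Linear(10, n_target_classes)      # output mapping

    def forward(self, x):
        return self.label_map(self.source(x + self.delta))

model = Reprogrammed()
opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad],
                       lr=1e-3)

x = torch.randn(8, 1, 28, 28)  # toy target-domain batch
t = torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(model(x), t)
opt.zero_grad()
loss.backward()
opt.step()  # updates only delta and label_map; the source is untouched
```

Only the small input program and the label mapping are trained, which is what makes the approach resource-efficient relative to finetuning the source model.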
- Self-balanced Learning For Domain Generalization [64.99791119112503]
Domain generalization aims to learn a prediction model on multi-domain source data such that the model can generalize to a target domain with unknown statistics.
Most existing approaches have been developed under the assumption that the source data is well-balanced in terms of both domain and class.
We propose a self-balanced domain generalization framework that adaptively learns the weights of losses to alleviate the bias caused by different distributions of the multi-domain source data.
arXiv Detail & Related papers (2021-08-31T03:17:54Z)
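One generic way to realize adaptive loss weighting in PyTorch (an illustrative sketch, not the paper's exact self-balancing scheme): re-weight the per-domain losses by their current magnitudes so that domains the model fits worst are not drowned out by dominant ones.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
n_domains, temperature = 3, 1.0

for step in range(5):
    per_domain = []
    for d in range(n_domains):
        x = torch.randn(16, 10) + d  # toy batch; each domain is shifted
        t = torch.randint(0, 2, (16,))
        per_domain.append(nn.functional.cross_entropy(model(x), t))
    losses = torch.stack(per_domain)
    # Detached softmax over the losses: larger current loss -> larger
    # weight, so no single easy domain dominates the update.
    weights = torch.softmax(losses.detach() / temperature, dim=0)
    opt.zero_grad()
    (weights * losses).sum().backward()
    opt.step()
```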
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.