An Improved Transfer Model: Randomized Transferable Machine
- URL: http://arxiv.org/abs/2011.13629v2
- Date: Thu, 21 Apr 2022 05:13:06 GMT
- Title: An Improved Transfer Model: Randomized Transferable Machine
- Authors: Pengfei Wei, Xinghua Qu, Yew Soon Ong, Zejun Ma
- Abstract summary: We propose a new transfer model called Randomized Transferable Machine (RTM) to handle small divergence of domains.
Specifically, we work on the new source and target data learned from existing feature-based transfer methods.
In principle, the more corruptions are made, the higher the probability that the new target data are covered by the constructed source data populations.
- Score: 32.50263074872975
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Feature-based transfer is one of the most effective methodologies for
transfer learning. Existing studies usually assume that the learned new feature
representation is \emph{domain-invariant}, and thus train a transfer model
$\mathcal{M}$ on the source domain. In this paper, we consider a more realistic
scenario where the new feature representation is suboptimal and small
divergence still exists across domains. We propose a new transfer model called
Randomized Transferable Machine (RTM) to handle such small divergence of
domains. Specifically, we work on the new source and target data learned from
existing feature-based transfer methods. The key idea is to enlarge source
training data populations by randomly corrupting the new source data with
noise, and then to train a transfer model $\widetilde{\mathcal{M}}$ that
performs well on all the corrupted source data populations. In principle, the
more corruptions are made, the higher the probability that the new target data
are covered by the constructed source data populations, and thus the better
the transfer performance achieved by $\widetilde{\mathcal{M}}$. The ideal case
uses infinitely many corruptions, which, however, is infeasible in practice.
We develop a marginalized solution that trains $\widetilde{\mathcal{M}}$
without performing any explicit corruption, yet is equivalent to training on
infinitely many noisy source data populations. We further propose two
instantiations of $\widetilde{\mathcal{M}}$ and theoretically show their
transfer superiority over the conventional transfer model $\mathcal{M}$. More
importantly, both
instantiations have closed-form solutions, leading to a fast and efficient
training process. Experiments on various real-world transfer tasks show that
RTM is a promising transfer model.
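The abstract does not spell out the two instantiations of $\widetilde{\mathcal{M}}$, but the marginalization idea itself can be illustrated with a small sketch. Assuming additive Gaussian corruption and a squared loss (illustrative assumptions, not necessarily the paper's choices), the expected loss over infinitely many corrupted copies of the source data has a closed form, so the "infinite corruption" model can be fit without generating a single noisy sample and compared against training on explicitly corrupted copies:

```python
import numpy as np

def explicit_rtm(X_src, y_src, noise_std, n_corruptions, rng):
    """Finite-corruption baseline: least squares on the union of several
    noisy copies of the (already feature-transferred) source data."""
    Xs = np.vstack([X_src + noise_std * rng.standard_normal(X_src.shape)
                    for _ in range(n_corruptions)])
    ys = np.tile(y_src, n_corruptions)
    return np.linalg.lstsq(Xs, ys, rcond=None)[0]

def marginalized_rtm(X_src, y_src, noise_std):
    """Closed form equivalent to infinitely many Gaussian corruptions.

    For x_tilde = x + e with e ~ N(0, s^2 I),
        E ||y - (X + E) w||^2 = ||y - X w||^2 + n * s^2 * ||w||^2,
    so the marginalized minimizer is a ridge-style solution obtained without
    ever corrupting the data explicitly.
    """
    n, d = X_src.shape
    A = X_src.T @ X_src + n * noise_std ** 2 * np.eye(d)
    return np.linalg.solve(A, X_src.T @ y_src)

# Toy check: the explicit solution approaches the marginalized one as the
# number of corrupted source populations grows.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))      # "new" source features after alignment
y = X @ rng.standard_normal(16) + 0.1 * rng.standard_normal(200)
w_inf = marginalized_rtm(X, y, noise_std=0.3)
w_fin = explicit_rtm(X, y, noise_std=0.3, n_corruptions=500, rng=rng)
print(np.linalg.norm(w_fin - w_inf))    # shrinks as n_corruptions increases
```

The ridge-style penalty is where the infinite noisy source populations show up: no corrupted sample is ever generated, yet the solution matches the limit of the explicit procedure, and the closed form is what makes training fast.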
Related papers
- Deep Transfer Learning: Model Framework and Error Analysis [4.898032902660655]
This paper presents a framework for deep transfer learning from upstream data with a large number of samples $n$ to a single-domain downstream task.
We show that the transfer under our framework can significantly improve the convergence rate for learning Lipschitz functions in downstream supervised tasks.
arXiv Detail & Related papers (2024-10-12T06:24:35Z)
- Transfer Learning Beyond Bounded Density Ratios [21.522183597134234]
We study the fundamental problem of transfer learning where a learning algorithm collects data from some source distribution $P$ but needs to perform well with respect to a different target distribution $Q$.
Our main result is a general transfer inequality over the domain $\mathbb{R}^n$, proving that non-trivial transfer learning for low-degree polynomials is possible under very mild assumptions.
arXiv Detail & Related papers (2024-03-18T17:02:41Z)
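For context, the classical baseline in this setting is importance weighting of the source loss by an estimate of the density ratio $Q(x)/P(x)$, which must stay bounded. A minimal sketch of importance-weighted least squares under that assumption (the estimator and the supplied ratio function are illustrative placeholders, not this paper's method):

```python
import numpy as np

def importance_weighted_fit(X_src, y_src, density_ratio, clip=10.0):
    """Weighted least squares with per-sample weights w_i ~ Q(x_i) / P(x_i).

    `density_ratio` is a caller-supplied callable; estimating it well, and the
    requirement that it stay bounded, is exactly the restriction this line of
    work tries to move beyond.
    """
    w = np.clip(density_ratio(X_src), 0.0, clip)          # bounded weights
    sw = np.sqrt(w)
    return np.linalg.lstsq(sw[:, None] * X_src, sw * y_src, rcond=None)[0]
```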
- Informative Data Mining for One-Shot Cross-Domain Semantic Segmentation [84.82153655786183]
We propose a novel framework called Informative Data Mining (IDM) to enable efficient one-shot domain adaptation for semantic segmentation.
IDM provides an uncertainty-based selection criterion to identify the most informative samples, which facilitates quick adaptation and reduces redundant training.
Our approach outperforms existing methods and achieves a new state-of-the-art one-shot performance of 56.7%/55.4% on the GTA5/SYNTHIA to Cityscapes adaptation tasks.
arXiv Detail & Related papers (2023-09-25T15:56:01Z)
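The summary only names an uncertainty-based criterion; one common way to instantiate such a criterion is predictive entropy, so a hedged sketch of picking the most informative target samples could look like the following (entropy ranking is an assumption for illustration, not necessarily IDM's exact rule):

```python
import numpy as np

def select_informative(probs, k):
    """Rank unlabeled target samples by predictive entropy and keep the top k.

    probs: softmax outputs of shape (num_samples, num_classes).
    Higher entropy = less confident prediction = assumed more informative
    for one-shot adaptation.
    """
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return np.argsort(entropy)[::-1][:k]  # indices of the k most uncertain samples
```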
- Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness [86.61582747039053]
Language model training in distributed settings is limited by the communication cost of exchanging gradients.
We extend recent work using shared randomness to perform distributed fine-tuning with low bandwidth.
arXiv Detail & Related papers (2023-06-16T17:59:51Z)
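A rough sketch of the shared-randomness idea, written here as a zeroth-order variant for brevity (the paper's exact protocol may differ): every worker regenerates the same random direction from a shared seed, so only one scalar per step has to be communicated instead of a full gradient.

```python
import numpy as np

def local_scalar(params, loss_fn, seed, eps=1e-3):
    """Each worker estimates a directional derivative along a direction that
    all workers can regenerate from the shared seed; one scalar is returned."""
    direction = np.random.default_rng(seed).standard_normal(params.shape)
    return (loss_fn(params + eps * direction)
            - loss_fn(params - eps * direction)) / (2 * eps)

def shared_update(params, seed, g_mean, lr=1e-2):
    """Every worker applies the identical update after the scalars from all
    workers have been averaged (the only value that crosses the network)."""
    direction = np.random.default_rng(seed).standard_normal(params.shape)
    return params - lr * g_mean * direction
```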
- $\Delta$-Patching: A Framework for Rapid Adaptation of Pre-trained Convolutional Networks without Base Performance Loss [71.46601663956521]
Models pre-trained on large-scale datasets are often fine-tuned to support newer tasks and datasets that arrive over time.
We propose $\Delta$-Patching for fine-tuning neural network models in an efficient manner, without the need to store model copies.
Our experiments show that $\Delta$-Networks outperform earlier model patching work while only requiring a fraction of parameters to be trained.
arXiv Detail & Related papers (2023-03-26T16:39:44Z)
- The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift [127.21287240963859]
We investigate a transfer learning approach with pretraining on the source data and finetuning based on the target data.
For a large class of linear regression instances, transfer learning with $O(N^2)$ source data is as effective as supervised learning with $N$ target data.
arXiv Detail & Related papers (2022-08-03T05:59:49Z)
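A minimal sketch of the pretrain-then-finetune recipe for linear regression under covariate shift (the shrinkage-to-pretrained-weights finetuning step below is a generic choice for illustration, not necessarily the estimator analyzed in the paper):

```python
import numpy as np

def pretrain(X_src, y_src):
    """Ordinary least squares on the large source sample (roughly N^2 points)."""
    return np.linalg.lstsq(X_src, y_src, rcond=None)[0]

def finetune(X_tgt, y_tgt, w_pre, lam=1.0):
    """Refit on the small target sample (roughly N points), shrinking toward
    the pretrained weights:
        argmin_w ||y_tgt - X_tgt w||^2 + lam * ||w - w_pre||^2."""
    d = X_tgt.shape[1]
    A = X_tgt.T @ X_tgt + lam * np.eye(d)
    return np.linalg.solve(A, X_tgt.T @ y_tgt + lam * w_pre)
```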
- When does Bias Transfer in Transfer Learning? [89.22641454588278]
Using transfer learning to adapt a pre-trained "source model" to a downstream "target task" can dramatically increase performance with seemingly no downside.
We demonstrate that there can exist a downside after all: bias transfer, or the tendency for biases of the source model to persist even after adapting the model to the target class.
arXiv Detail & Related papers (2022-07-06T17:58:07Z)
- MixACM: Mixup-Based Robustness Transfer via Distillation of Activated Channel Maps [24.22149102286949]
Deep neural networks are susceptible to adversarially crafted, small and imperceptible changes in the natural inputs.
Adversarial training constructs adversarial examples during training by iteratively maximizing the loss.
This min-max optimization requires more data, larger-capacity models, and additional computing resources.
We show the transferability of robustness from an adversarially trained teacher model to a student model with the help of mixup augmentation.
arXiv Detail & Related papers (2021-11-09T12:03:20Z)
- On Generating Transferable Targeted Perturbations [102.3506210331038]
We propose a new generative approach for highly transferable targeted perturbations.
Our approach matches the perturbed image distribution with that of the target class, leading to high targeted transferability rates.
arXiv Detail & Related papers (2021-03-26T17:55:28Z)
- Generalized Zero and Few-Shot Transfer for Facial Forgery Detection [3.8073142980733]
We propose a new transfer learning approach to address the problem of zero and few-shot transfer in the context of forgery detection.
We find this learning strategy to be surprisingly effective at domain transfer compared to traditional classification or even state-of-the-art domain adaptation and few-shot learning methods.
arXiv Detail & Related papers (2020-06-21T18:10:52Z)
- Automatic Cross-Domain Transfer Learning for Linear Regression [0.0]
This paper helps to extend the capability of transfer learning for linear regression problems.
For normal datasets, we assume that some latent domain information is available for transfer learning.
arXiv Detail & Related papers (2020-05-08T15:05:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.