MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and
Architectures
- URL: http://arxiv.org/abs/2006.07540v3
- Date: Tue, 15 Feb 2022 13:56:01 GMT
- Title: MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and
Architectures
- Authors: Jeongun Ryu and Jaewoong Shin and Hae Beom Lee and Sung Ju Hwang
- Abstract summary: We propose a transferable perturbation, MetaPerturb, which is meta-learned to improve generalization performance on unseen data.
As MetaPerturb is a set-function trained over diverse distributions across layers and tasks, it can generalize to heterogeneous tasks and architectures.
- Score: 61.73533544385352
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regularization and transfer learning are two popular techniques to enhance
generalization on unseen data, which is a fundamental problem of machine
learning. Regularization techniques are versatile, as they are task- and
architecture-agnostic, but they do not exploit the large amount of available
data. Transfer learning methods learn to transfer knowledge from one domain to
another, but may not generalize across tasks and architectures, and may
introduce additional training cost for adapting to the target task. To bridge the
gap between the two, we propose a transferable perturbation, MetaPerturb, which
is meta-learned to improve generalization performance on unseen data.
MetaPerturb is implemented as a lightweight set-based network that is agnostic
to the size and order of its input and is shared across layers. We then propose
a meta-learning framework to jointly train the perturbation function over
heterogeneous tasks in parallel. As MetaPerturb is a set-function
trained over diverse distributions across layers and tasks, it can generalize
to heterogeneous tasks and architectures. We validate the efficacy and
generality of MetaPerturb trained on a specific source domain and architecture
by applying it to the training of diverse neural architectures on heterogeneous
target datasets against various regularizers and fine-tuning. The results show
that the networks trained with MetaPerturb significantly outperform the
baselines on most of the tasks and architectures, with a negligible increase in
the parameter size and no hyperparameters to tune.
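The abstract fixes the key design constraints of the perturbation function without giving its exact form: a lightweight set-based network, agnostic to the size and order of its input, shared across layers, and meta-trained jointly over heterogeneous tasks. The PyTorch-style snippet below is a minimal sketch of one way to satisfy those constraints; the module name, the pooled channel statistics, and the multiplicative form of the perturbation are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn


class SetPerturb(nn.Module):
    """Illustrative set-based perturbation module (not the paper's exact design).

    It pools permutation-invariant statistics over the channel dimension, so it
    is agnostic to how many channels a feature map has and to their order, and a
    single instance can therefore be shared across layers and architectures.
    """

    def __init__(self, hidden: int = 8):
        super().__init__()
        # Per-channel encoder, applied identically to every channel (set element).
        self.phi = nn.Sequential(nn.Linear(2, hidden), nn.ReLU())
        # Decoder mapping (per-channel code, pooled set code) -> a scalar scale.
        self.rho = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) feature map from any conv layer, of any width.
        c = feat.shape[1]
        # Two order-free statistics per channel: mean and std over batch and space.
        stats = torch.stack([feat.mean(dim=(0, 2, 3)),
                             feat.std(dim=(0, 2, 3))], dim=-1)           # (C, 2)
        h = self.phi(stats)                                              # (C, hidden)
        pooled = h.mean(dim=0, keepdim=True).expand(c, -1)               # set pooling
        scale = torch.sigmoid(self.rho(torch.cat([h, pooled], dim=-1)))  # (C, 1)
        # Multiplicative, channel-wise perturbation of the feature map.
        return feat * scale.view(1, c, 1, 1)


# Usage sketch: one shared instance handles feature maps of different widths,
# e.g. from different layers or from networks of different architectures.
perturb = SetPerturb()
out_a = perturb(torch.randn(4, 16, 32, 32))
out_b = perturb(torch.randn(4, 64, 8, 8))
```

In the joint meta-training stage described above, a single instance of such a module would be shared across the layers of several task-specific networks trained in parallel, with its parameters updated from the combined losses of those tasks; the exact perturbation form, objective, and update schedule are specified in the paper.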
Related papers
- Meta-Learning with Heterogeneous Tasks [42.695853959923625]
Heterogeneous Tasks Robust Meta-learning (HeTRoM) is optimized with an efficient iterative algorithm based on bi-level optimization.
Results demonstrate that our method provides flexibility, enabling users to adapt to diverse task settings.
arXiv Detail & Related papers (2024-10-24T16:32:23Z) - Fast Inference and Transfer of Compositional Task Structures for
Few-shot Task Generalization [101.72755769194677]
We formulate task generalization as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experimental results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z) - Set-based Meta-Interpolation for Few-Task Meta-Learning [79.4236527774689]
We propose a novel domain-agnostic task augmentation method, Meta-Interpolation, to densify the meta-training task distribution.
We empirically validate the efficacy of Meta-Interpolation on eight datasets spanning various domains.
arXiv Detail & Related papers (2022-05-20T06:53:03Z) - Improving the Generalization of Meta-learning on Unseen Domains via
Adversarial Shift [3.1219977244201056]
We propose a model-agnostic shift layer to learn how to simulate the domain shift and generate pseudo tasks.
Based on the pseudo tasks, the meta-learning model can learn cross-domain meta-knowledge, which can generalize well on unseen domains.
arXiv Detail & Related papers (2021-07-23T07:29:30Z) - Meta-Learning with Fewer Tasks through Task Interpolation [67.03769747726666]
Current meta-learning algorithms require a large number of meta-training tasks, which may not be accessible in real-world scenarios.
By meta-learning with task interpolation (MLTI), our approach effectively generates additional tasks by randomly sampling a pair of tasks and interpolating the corresponding features and labels (a minimal sketch of this interpolation step appears after this list).
Empirically, in our experiments on eight datasets from diverse domains, we find that the proposed general MLTI framework is compatible with representative meta-learning algorithms and consistently outperforms other state-of-the-art strategies.
arXiv Detail & Related papers (2021-06-04T20:15:34Z) - Large-Scale Meta-Learning with Continual Trajectory Shifting [76.29017270864308]
We show that allowing the meta-learners to take a larger number of inner gradient steps better captures the structure of heterogeneous and large-scale tasks.
In order to increase the frequency of meta-updates, we propose to estimate the required shift of the task-specific parameters.
We show that the algorithm largely outperforms the previous first-order meta-learning methods in terms of both generalization performance and convergence.
arXiv Detail & Related papers (2021-02-14T18:36:33Z) - Improving Generalization in Meta-learning via Task Augmentation [69.83677015207527]
We propose two task augmentation methods, MetaMix and Channel Shuffle.
Both MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets.
arXiv Detail & Related papers (2020-07-26T01:50:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.