Information-Theoretic Generalization Bounds for Meta-Learning and Applications
- URL: http://arxiv.org/abs/2005.04372v4
- Date: Fri, 15 Jan 2021 12:00:37 GMT
- Title: Information-Theoretic Generalization Bounds for Meta-Learning and Applications
- Authors: Sharu Theresa Jose, Osvaldo Simeone
- Abstract summary: A key performance measure for meta-learning is the meta-generalization gap.
This paper presents novel information-theoretic upper bounds on the meta-generalization gap.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta-learning, or "learning to learn", refers to techniques that infer an
inductive bias from data corresponding to multiple related tasks with the goal
of improving the sample efficiency for new, previously unobserved, tasks. A key
performance measure for meta-learning is the meta-generalization gap, that is,
the difference between the average loss measured on the meta-training data and
on a new, randomly selected task. This paper presents novel
information-theoretic upper bounds on the meta-generalization gap. Two broad
classes of meta-learning algorithms are considered that use either separate
within-task training and test sets, like MAML, or joint within-task training
and test sets, like Reptile. Extending the existing work for conventional
learning, an upper bound on the meta-generalization gap is derived for the
former class that depends on the mutual information (MI) between the output of
the meta-learning algorithm and its input meta-training data. For the latter,
the derived bound includes an additional MI between the output of the per-task
learning procedure and the corresponding data set to capture within-task
uncertainty. Tighter bounds are then developed, under given technical
conditions, for the two classes via novel Individual Task MI (ITMI) bounds.
Applications of the derived bounds are finally discussed, including a broad
class of noisy iterative algorithms for meta-learning.
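For orientation, the following is a schematic rendering of the quantities involved, not the paper's exact statements: the constants, the subgaussianity assumptions, and the precise conditioning inside the MI terms are as derived in the paper. Here U denotes the output of the meta-learner (the inferred inductive bias), Z_{1:N} the meta-training data for N tasks, W_i the output of the per-task learner on task i's data Z_i of size m, and sigma a subgaussianity parameter of the loss.

```latex
% Meta-generalization gap: average loss on a new task drawn from the
% task environment minus the average loss on the meta-training data.
\Delta \;=\; \mathbb{E}_{\mathrm{new\ task}}\!\left[ L(U) \right]
       \;-\; L_{\mathrm{meta\text{-}train}}(U)

% Separate within-task training/test sets (MAML-like); schematic form:
\left| \mathbb{E}[\Delta] \right|
  \;\le\; \sqrt{\frac{2\sigma^{2}}{N}\, I\!\left( U;\, Z_{1:N} \right)}

% Joint within-task sets (Reptile-like); an additional per-task MI term
% captures within-task uncertainty:
\left| \mathbb{E}[\Delta] \right|
  \;\lesssim\; \sqrt{\frac{2\sigma^{2}}{N}\, I\!\left( U;\, Z_{1:N} \right)}
  \;+\; \sqrt{\frac{2\sigma^{2}}{m}\, I\!\left( W_{i};\, Z_{i} \right)}
```

The "noisy iterative algorithms" application concerns meta-updates whose injected noise keeps these MI terms finite and tractable. Below is a minimal sketch, assuming a Reptile-style outer loop with additive Gaussian noise in the spirit of SGLD; all function names and hyperparameters are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def noisy_meta_step(u, task_batch, grad_fn, inner_steps=5, inner_lr=0.01,
                    meta_lr=0.1, noise_std=0.01, rng=None):
    """One outer iteration of a noisy iterative meta-learning scheme.

    Illustrative sketch only (not the paper's algorithm): a Reptile-style
    meta-update with additive Gaussian noise on the iterate. The injected
    noise is what keeps the mutual information between the meta-learner
    output U and the meta-training data finite, so that MI-based bounds
    of the form above can be evaluated.
    """
    rng = rng or np.random.default_rng()
    directions = []
    for z in task_batch:                      # per-task adaptation
        w = u.copy()
        for _ in range(inner_steps):
            w -= inner_lr * grad_fn(w, z)     # within-task SGD step
        directions.append(w - u)              # Reptile-style direction
    u = u + meta_lr * np.mean(directions, axis=0)         # meta-update
    return u + rng.normal(0.0, noise_std, size=u.shape)   # inject noise
```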
Related papers
- Set-based Meta-Interpolation for Few-Task Meta-Learning (arXiv, 2022-05-20)
We propose a novel domain-agnostic task augmentation method, Meta-Interpolation, to densify the meta-training task distribution.
We empirically validate the efficacy of Meta-Interpolation on eight datasets spanning various domains.
- Meta-Learning with Fewer Tasks through Task Interpolation (arXiv, 2021-06-04)
Current meta-learning algorithms require a large number of meta-training tasks, which may not be accessible in real-world scenarios.
Meta-Learning with Task Interpolation (MLTI) effectively generates additional tasks by randomly sampling a pair of tasks and interpolating the corresponding features and labels (see the sketch after this list).
Empirically, on eight datasets from diverse domains, we find that the general MLTI framework is compatible with representative meta-learning algorithms and consistently outperforms other state-of-the-art strategies.
- An Information-Theoretic Analysis of the Impact of Task Similarity on Meta-Learning (arXiv, 2021-01-21)
We present novel information-theoretic bounds on the average absolute value of the meta-generalization gap.
Our bounds explicitly capture the impact of task relatedness, the number of tasks, and the number of data samples per task on the meta-generalization gap.
- Transfer Meta-Learning: Information-Theoretic Bounds and Information Meta-Risk Minimization (arXiv, 2020-11-04)
Meta-learning automatically infers an inductive bias by observing data from a number of related tasks.
We introduce the problem of transfer meta-learning, in which meta-test tasks are drawn from a target task environment that may differ from the source environment observed during meta-training.
- Improving Generalization in Meta-learning via Task Augmentation (arXiv, 2020-07-26)
We propose two task augmentation methods, MetaMix and Channel Shuffle.
Both MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets.
- Incremental Meta-Learning via Indirect Discriminant Alignment (arXiv, 2020-02-11)
We develop a notion of incremental learning during the meta-training phase of meta-learning.
Our approach performs favorably at test time as compared to training a model with the full meta-training set.
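The task interpolation idea from the MLTI entry above can be sketched in a few lines. This is a mixup-style illustration of interpolating the features and one-hot labels of a sampled task pair, assuming dense feature arrays; the function name and the Beta-distributed mixing coefficient are assumptions, not the authors' reference implementation.

```python
import numpy as np

def interpolate_tasks(task_a, task_b, alpha=0.5, rng=None):
    """Create a synthetic task by convexly combining two sampled tasks.

    Illustrative mixup-style sketch of the feature/label interpolation
    described in the MLTI paper; not the authors' implementation.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in [0, 1]
    x_a, y_a = task_a                     # (features, one-hot labels)
    x_b, y_b = task_b
    n = min(len(x_a), len(x_b))           # align the number of examples
    x_new = lam * x_a[:n] + (1 - lam) * x_b[:n]
    y_new = lam * y_a[:n] + (1 - lam) * y_b[:n]
    return x_new, y_new

# Usage: densify the meta-training task distribution with synthetic tasks.
# rng = np.random.default_rng()
# i, j = rng.choice(len(tasks), size=2, replace=False)
# tasks.append(interpolate_tasks(tasks[i], tasks[j]))
```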