Information Theoretic Meta Learning with Gaussian Processes
- URL: http://arxiv.org/abs/2009.03228v3
- Date: Mon, 5 Jul 2021 12:26:24 GMT
- Title: Information Theoretic Meta Learning with Gaussian Processes
- Authors: Michalis K. Titsias and Francisco J. R. Ruiz and Sotirios Nikoloutsopoulos and Alexandre Galashov
- Abstract summary: We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck.
By making use of variational approximations to the mutual information, we derive a general and tractable framework for meta learning.
- Score: 74.54485310507336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We formulate meta learning using information theoretic concepts; namely,
mutual information and the information bottleneck. The idea is to learn a
stochastic representation or encoding of the task description, given by a
training set, that is highly informative about predicting the validation set.
By making use of variational approximations to the mutual information, we
derive a general and tractable framework for meta learning. This framework
unifies existing gradient-based algorithms and also allows us to derive new
algorithms. In particular, we develop a memory-based algorithm that uses
Gaussian processes to obtain non-parametric encoding representations. We
demonstrate our method on a few-shot regression problem and on four few-shot
classification problems, obtaining competitive accuracy when compared to
existing baselines.
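To make the framework concrete, here is a plausible form of the objective the abstract describes; the notation (Z, D^tr, D^val, and the trade-off weight beta) is an assumption made here, not taken from the paper. A stochastic encoding Z of a task's training set D^tr is learned to be predictive of the validation set D^val, while an information bottleneck penalizes how much Z memorizes D^tr:

```latex
\max_{\theta} \; I\big(Z; D^{\mathrm{val}}\big) - \beta \, I\big(Z; D^{\mathrm{tr}}\big),
\qquad Z \sim p_{\theta}\big(z \mid D^{\mathrm{tr}}\big)
```

The "variational approximations to the mutual information" plausibly refer to the standard variational lower bound, in which the intractable conditional is replaced by a learned decoder q_phi (the entropy term it leaves out is constant with respect to the parameters):

```latex
I\big(Z; D^{\mathrm{val}}\big) \;\geq\;
\mathbb{E}_{p_{\theta}}\big[\log q_{\phi}\big(D^{\mathrm{val}} \mid Z\big)\big] + \mathrm{const}
```

For the memory-based variant, a minimal Python sketch shows how a Gaussian process posterior can serve as a non-parametric task encoding. This is an illustration under assumed kernel and hyperparameter choices, not the authors' implementation:

```python
# Illustrative sketch (not the authors' implementation): a Gaussian process
# posterior acts as a non-parametric encoding of a few-shot regression task.
# The kernel choice and hyperparameters below are assumptions.
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between 1-D inputs a and b."""
    sq_dists = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / lengthscale ** 2)

def gp_task_encoding(x_tr, y_tr, x_val, noise=0.1):
    """Encode a task's training (support) set as the GP posterior over the
    validation inputs; returns the posterior mean and covariance of f(x_val)."""
    k_tt = rbf_kernel(x_tr, x_tr) + noise ** 2 * np.eye(len(x_tr))
    k_tv = rbf_kernel(x_tr, x_val)
    k_vv = rbf_kernel(x_val, x_val)
    chol = np.linalg.cholesky(k_tt)           # stable inversion of k_tt
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_tr))
    v = np.linalg.solve(chol, k_tv)
    mean = k_tv.T @ alpha                     # posterior mean at x_val
    cov = k_vv - v.T @ v                      # posterior covariance at x_val
    return mean, cov

# Toy few-shot regression task: a 5-point support set from a noisy sine.
rng = np.random.default_rng(0)
x_tr = rng.uniform(-3, 3, 5)
y_tr = np.sin(x_tr) + 0.1 * rng.standard_normal(5)
x_val = np.linspace(-3, 3, 50)
mean, cov = gp_task_encoding(x_tr, y_tr, x_val)
print(mean.shape, cov.shape)  # (50,) (50, 50)
```

The posterior mean and covariance over the validation inputs play the role of the stochastic representation: they depend on the entire support set through the kernel rather than on a fixed-dimensional parametric summary.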
Related papers
- Improved Graph-based semi-supervised learning Schemes [0.0]
In this work, we improve the accuracy of several known algorithms to address the classification of large datasets when few labels are available.
Our framework lies in the realm of graph-based semi-supervised learning.
arXiv Detail & Related papers (2024-06-30T16:50:08Z)
- Semantic Information for Object Detection [0.0]
We introduce a novel method for extracting a knowledge graph from a dataset of images provided with instance-level annotations.
We investigate the effectiveness of knowledge-aware re-optimization on the Faster-RCNN and DETR object detection models.
arXiv Detail & Related papers (2023-08-17T13:53:29Z)
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z)
- General-Purpose In-Context Learning by Meta-Learning Transformers [45.63069059498147]
We show that Transformers and other black-box models can be meta-trained to act as general-purpose in-context learners.
We characterize transitions between algorithms that generalize, algorithms that memorize, and algorithms that fail to meta-train at all.
We propose practical interventions such as biasing the training distribution that improve the meta-training and meta-generalization of general-purpose in-context learning algorithms.
arXiv Detail & Related papers (2022-12-08T18:30:22Z)
- Integrating Semantics and Neighborhood Information with Graph-Driven Generative Models for Document Retrieval [51.823187647843945]
In this paper, we encode the neighborhood information with a graph-induced Gaussian distribution, and propose to integrate the two types of information with a graph-driven generative model.
Under the approximation, we prove that the training objective can be decomposed into terms involving only singleton or pairwise documents, enabling the model to be trained as efficiently as uncorrelated ones.
arXiv Detail & Related papers (2021-05-27T11:29:03Z)
- Feature space approximation for kernel-based supervised learning [2.653409741248232]
The goal is to reduce the size of the training data, resulting in lower storage consumption and computational complexity.
We demonstrate significant improvements over computing data-driven predictions with the full training data set.
The method is applied to classification and regression problems from different application areas such as image recognition, system identification, and oceanographic time series analysis.
arXiv Detail & Related papers (2020-11-25T11:23:58Z)
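As a generic illustration of the feature-space-approximation idea in the entry above, the sketch below fits kernel ridge regression restricted to a small set of landmark points (a Nystroem / subset-of-regressors construction). It is an assumption for illustration, not the paper's actual algorithm:

```python
# Generic landmark (Nystroem-style) sketch -- illustrative only, not the
# algorithm from the paper above. It fits kernel ridge regression in the
# span of m << n landmark points, cutting storage and prediction cost.
import numpy as np

def rbf(a, b, gamma=0.5):
    """Squared-exponential kernel matrix between 1-D inputs a and b."""
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

def fit_landmark_krr(x, y, m=20, reg=1e-3, seed=0):
    """Kernel ridge regression restricted to m randomly chosen landmarks."""
    rng = np.random.default_rng(seed)
    landmarks = rng.choice(x, size=m, replace=False)
    K_nm = rbf(x, landmarks)          # (n, m) cross-kernel block
    K_mm = rbf(landmarks, landmarks)  # (m, m) landmark kernel
    # Reduced normal equations: (K_nm^T K_nm + reg * K_mm) w = K_nm^T y
    w = np.linalg.solve(K_nm.T @ K_nm + reg * K_mm, K_nm.T @ y)
    return landmarks, w

# Toy regression problem: n = 2000 points compressed to 20 landmarks.
rng = np.random.default_rng(1)
x = rng.uniform(-4, 4, 2000)
y = np.sin(x) + 0.05 * rng.standard_normal(2000)
landmarks, w = fit_landmark_krr(x, y)
print(rbf(np.array([0.5]), landmarks) @ w)  # approx sin(0.5) ~= 0.48
```

Restricting the model to m << n landmarks cuts storage from the full n x n kernel matrix to an n x m block during fitting, and each prediction needs only m kernel evaluations.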
- Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
- Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning.
We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)
- Rethinking Few-Shot Image Classification: a Good Embedding Is All You Need? [72.00712736992618]
We show that a simple baseline: learning a supervised or self-supervised representation on the meta-training set, outperforms state-of-the-art few-shot learning methods.
An additional boost can be achieved through the use of self-distillation.
We believe that our findings motivate a rethinking of few-shot image classification benchmarks and the associated role of meta-learning algorithms.
arXiv Detail & Related papers (2020-03-25T17:58:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.