Unsupervised Meta-Learning via In-Context Learning
- URL: http://arxiv.org/abs/2405.16124v2
- Date: Tue, 01 Oct 2024 06:29:08 GMT
- Title: Unsupervised Meta-Learning via In-Context Learning
- Authors: Anna Vettoruzzo, Lorenzo Braccaioli, Joaquin Vanschoren, Marlena Nowaczyk,
- Abstract summary: We propose a novel approach to unsupervised meta-learning that leverages the generalization abilities of in-supervised learning.
Our method reframes meta-learning as a sequence modeling problem, enabling the transformer encoder to learn task context from support images.
- Score: 3.4165401459803335
- License:
- Abstract: Unsupervised meta-learning aims to learn feature representations from unsupervised datasets that can transfer to downstream tasks with limited labeled data. In this paper, we propose a novel approach to unsupervised meta-learning that leverages the generalization abilities of in-context learning observed in transformer architectures. Our method reframes meta-learning as a sequence modeling problem, enabling the transformer encoder to learn task context from support images and utilize it to predict query images. At the core of our approach lies the creation of diverse tasks generated using a combination of data augmentations and a mixing strategy that challenges the model during training while fostering generalization to unseen tasks at test time. Experimental results on benchmark datasets showcase the superiority of our approach over existing unsupervised meta-learning baselines, establishing it as the new state-of-the-art in the field. Remarkably, our method achieves competitive results with supervised and self-supervised approaches, underscoring the efficacy of the model in leveraging generalization over memorization.
Related papers
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z) - Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z) - ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP)
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z) - Predictive Experience Replay for Continual Visual Control and
Forecasting [62.06183102362871]
We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting.
Our model remarkably outperforms the naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
arXiv Detail & Related papers (2023-03-12T05:08:03Z) - Meta-Learning with Fewer Tasks through Task Interpolation [67.03769747726666]
Current meta-learning algorithms require a large number of meta-training tasks, which may not be accessible in real-world scenarios.
By meta-learning with task gradient (MLTI), our approach effectively generates additional tasks by randomly sampling a pair of tasks and interpolating the corresponding features and labels.
Empirically, in our experiments on eight datasets from diverse domains, we find that the proposed general MLTI framework is compatible with representative meta-learning algorithms and consistently outperforms other state-of-the-art strategies.
arXiv Detail & Related papers (2021-06-04T20:15:34Z) - Continual Learning From Unlabeled Data Via Deep Clustering [7.704949298975352]
Continual learning aims to learn new tasks incrementally using less computation and memory resources instead of retraining the model from scratch whenever new task arrives.
We introduce a new framework to make continual learning feasible in unsupervised mode by using pseudo label obtained from cluster assignments to update model.
arXiv Detail & Related papers (2021-04-14T23:46:17Z) - On Data Efficiency of Meta-learning [17.739215706060605]
We study the often overlooked aspect of the modern meta-learning algorithms -- their data efficiency.
We introduce a new simple framework for evaluating meta-learning methods under a limit on the available supervision.
We propose active meta-learning, which incorporates active data selection into learning-to-learn, leading to better performance of all methods in the limited supervision regime.
arXiv Detail & Related papers (2021-01-30T01:44:12Z) - Self-Supervised Prototypical Transfer Learning for Few-Shot
Classification [11.96734018295146]
Self-supervised transfer learning approach ProtoTransfer outperforms state-of-the-art unsupervised meta-learning methods on few-shot tasks.
In few-shot experiments with domain shift, our approach even has comparable performance to supervised methods, but requires orders of magnitude fewer labels.
arXiv Detail & Related papers (2020-06-19T19:00:11Z) - Unsupervised Meta-Learning through Latent-Space Interpolation in
Generative Models [11.943374020641214]
We describe an approach that generates meta-tasks using generative models.
We find that the proposed approach, LAtent Space Interpolation Unsupervised Meta-learning (LASIUM), outperforms or is competitive with current unsupervised learning baselines.
arXiv Detail & Related papers (2020-06-18T02:10:56Z) - Revisiting Meta-Learning as Supervised Learning [69.2067288158133]
We aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning.
By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning.
This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning.
arXiv Detail & Related papers (2020-02-03T06:13:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.