Wandering Within a World: Online Contextualized Few-Shot Learning
- URL: http://arxiv.org/abs/2007.04546v3
- Date: Thu, 22 Apr 2021 20:15:19 GMT
- Title: Wandering Within a World: Online Contextualized Few-Shot Learning
- Authors: Mengye Ren, Michael L. Iuzzolino, Michael C. Mozer, Richard S. Zemel
- Abstract summary: We aim to bridge the gap between typical human and machine-learning environments by extending the standard framework of few-shot learning to an online setting.
We propose a new few-shot learning dataset based on large scale indoor imagery that mimics the visual experience of an agent wandering within a world, together with a contextual prototypical memory model.
- Score: 62.28521610606054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We aim to bridge the gap between typical human and machine-learning
environments by extending the standard framework of few-shot learning to an
online, continual setting. In this setting, episodes do not have separate
training and testing phases, and instead models are evaluated online while
learning novel classes. As in the real world, where the presence of
spatiotemporal context helps us retrieve skills learned in the past, our online
few-shot learning setting also features an underlying context that changes
throughout time. Object classes are correlated within a context and inferring
the correct context can lead to better performance. Building upon this setting,
we propose a new few-shot learning dataset based on large scale indoor imagery
that mimics the visual experience of an agent wandering within a world.
Furthermore, we convert popular few-shot learning approaches into online
versions and we also propose a new contextual prototypical memory model that
can make use of spatiotemporal contextual information from the recent past.
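To make the online evaluation protocol concrete, below is a minimal Python sketch of a prototype-based learner evaluated online: at every step the model first predicts (possibly flagging a never-before-seen class), is scored, and only then receives the ground-truth label to update its class prototypes. The names (`PrototypeMemory`, `run_online_episode`), the additive use of the context vector, and the fixed `unknown_threshold` are illustrative assumptions, not the paper's contextual prototypical memory, which learns how to encode and mix spatiotemporal context.

```python
# Minimal sketch (assumption: feature vectors are given; the paper's model uses a
# learned CNN encoder and a learned spatiotemporal context representation).
# It illustrates the online protocol: predict first, get evaluated, then learn.
import numpy as np


class PrototypeMemory:
    """Running class prototypes, optionally shifted by a context vector."""

    def __init__(self, dim, unknown_threshold=10.0):
        self.dim = dim
        self.unknown_threshold = unknown_threshold  # distance above which we answer "new class" (illustrative constant)
        self.sums = {}    # class id -> running sum of features
        self.counts = {}  # class id -> number of examples seen

    def predict(self, x, context=None):
        # Hypothetical context use: additively shift the query by the context
        # embedding (the paper instead learns how to incorporate recent context).
        if context is not None:
            x = x + context
        if not self.sums:
            return "new"
        protos = {c: s / self.counts[c] for c, s in self.sums.items()}
        dists = {c: np.linalg.norm(x - p) for c, p in protos.items()}
        c_best = min(dists, key=dists.get)
        return "new" if dists[c_best] > self.unknown_threshold else c_best

    def update(self, x, label):
        # Online update, applied only after the ground-truth label is revealed.
        if label not in self.sums:
            self.sums[label] = np.zeros(self.dim)
            self.counts[label] = 0
        self.sums[label] += x
        self.counts[label] += 1


def run_online_episode(stream, dim=64):
    """stream: iterable of (feature_vector, label, context_vector) tuples."""
    memory = PrototypeMemory(dim)
    correct, total = 0, 0
    for x, label, context in stream:
        pred = memory.predict(x, context)           # evaluated before seeing the label
        is_novel = label not in memory.counts
        correct += int(pred == ("new" if is_novel else label))
        total += 1
        memory.update(x, label)                     # learn from the example afterwards
    return correct / max(total, 1)
```

Feeding `run_online_episode` a stream of (feature, label, context) tuples from a single wandering sequence yields the online accuracy over that sequence, which is the kind of metric an online, continual few-shot setting reports in place of separate train/test accuracies.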
Related papers
- Improving In-Context Learning in Diffusion Models with Visual Context-Modulated Prompts [83.03471704115786]
We introduce improved Prompt Diffusion (iPromptDiff) in this study.
iPromptDiff integrates an end-to-end trained vision encoder that converts visual context into an embedding vector.
We show that a diffusion-based vision foundation model, when equipped with this visual context-modulated text guidance and a standard ControlNet structure, exhibits versatility and robustness across a variety of training tasks.
arXiv Detail & Related papers (2023-12-03T14:15:52Z) - Learning to Model the World with Language [100.76069091703505]
To interact with humans and act in the world, agents need to understand the range of language that people use and relate it to the visual world.
Our key idea is that agents should interpret such diverse language as a signal that helps them predict the future.
We instantiate this in Dynalang, an agent that learns a multimodal world model to predict future text and image representations.
arXiv Detail & Related papers (2023-07-31T17:57:49Z) - Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
The models learned to bridge the gap between such modalities, coupled with large-scale training data, facilitate contextual reasoning, generalization, and prompt capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene, or manipulating the robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z) - SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z) - FILM: How can Few-Shot Image Classification Benefit from Pre-Trained
Language Models? [14.582209994281374]
Few-shot learning aims to train models that can be generalized to novel classes with only a few samples.
We propose a novel few-shot learning framework that uses pre-trained language models based on contrastive learning.
arXiv Detail & Related papers (2023-07-09T08:07:43Z) - Lifelong Wandering: A realistic few-shot online continual learning
setting [23.134299907227796]
Online few-shot learning describes a setting where models are trained and evaluated on a stream of data while learning emerging classes.
While prior work in this setting has achieved very promising performance on instance classification when learning from data streams composed of a single indoor environment, we propose to extend this setting to consider object classification on a series of several indoor environments.
In this work, we benchmark several existing methods and adapted baselines within our setting, and show there exists a trade-off between catastrophic forgetting and online performance.
arXiv Detail & Related papers (2022-06-16T05:39:08Z) - Meta-Learning and Self-Supervised Pretraining for Real World Image
Translation [5.469808405577674]
We explore the image-to-image translation problem in order to formulate a novel multi-task few-shot image generation benchmark.
We present several baselines for the few-shot problem and discuss trade-offs between different approaches.
arXiv Detail & Related papers (2021-12-22T14:48:22Z) - Deploying Lifelong Open-Domain Dialogue Learning [48.12600947313494]
In this work, we build and deploy a role-playing game, whereby human players converse with learning agents situated in an open-domain fantasy world.
We show that by training models on the conversations they have with humans in the game, the models progressively improve, as measured by automatic metrics and online engagement scores.
Learning from these game conversations is shown to be more efficient than crowdsourced data when applied to conversations with real users, as well as far cheaper to collect.
arXiv Detail & Related papers (2020-08-18T17:57:26Z) - Insights from the Future for Continual Learning [45.58831178202245]
We propose prescient continual learning, a novel experimental setting that incorporates existing information about the classes prior to any training data.
Our setting adds future classes, with no training samples at all.
A generative model of the representation space, in concert with a careful adjustment of the losses, allows us to exploit insights from future classes to constrain the spatial arrangement of the past and current classes.
arXiv Detail & Related papers (2020-06-24T14:05:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.