Learning Task Informed Abstractions
- URL: http://arxiv.org/abs/2106.15612v2
- Date: Wed, 30 Jun 2021 01:45:03 GMT
- Title: Learning Task Informed Abstractions
- Authors: Xiang Fu, Ge Yang, Pulkit Agrawal, Tommi Jaakkola
- Abstract summary: We propose learning Task Informed Abstractions (TIA) that explicitly separates reward-correlated visual features from distractors.
TIA leads to significant performance gains over state-of-the-art methods on many visual control tasks.
- Score: 10.920599910769276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current model-based reinforcement learning methods struggle when operating
from complex visual scenes due to their inability to prioritize task-relevant
features. To mitigate this problem, we propose learning Task Informed
Abstractions (TIA) that explicitly separates reward-correlated visual features
from distractors. For learning TIA, we introduce the formalism of Task Informed
MDP (TiMDP) that is realized by training two models that learn visual features
via cooperative reconstruction, but one model is adversarially dissociated from
the reward signal. Empirical evaluation shows that TIA leads to significant
performance gains over state-of-the-art methods on many visual control tasks
where natural and unconstrained visual distractions pose a formidable
challenge.
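The training recipe in the abstract (two latent models, cooperative reconstruction, one model adversarially cut off from reward) can be pictured with a minimal PyTorch-style sketch. All module names and shapes below are illustrative assumptions, and a gradient-reversal layer stands in for the paper's adversarial optimization:

```python
# Sketch of a TIA-style objective, not the authors' implementation.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates gradients on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad):
        return -grad

task_enc = nn.Linear(64, 16)        # stand-in for the task-informed encoder
dist_enc = nn.Linear(64, 16)        # stand-in for the distractor encoder
decoder = nn.Linear(32, 64)         # joint (cooperative) reconstruction head
reward_head = nn.Linear(16, 1)      # reward predicted from task features only
probe_head = nn.Linear(16, 1)       # probes reward info in distractor features

obs = torch.randn(8, 64)            # fake batch of flattened observations
reward = torch.randn(8, 1)

z_task, z_dist = task_enc(obs), dist_enc(obs)

# Cooperative reconstruction: both latents must jointly explain the pixels.
recon_loss = ((decoder(torch.cat([z_task, z_dist], dim=-1)) - obs) ** 2).mean()

# The task latent must predict reward...
reward_loss = ((reward_head(z_task) - reward) ** 2).mean()

# ...while the distractor latent is pushed away from reward: the probe tries
# to predict reward, but the encoder receives the reversed gradient.
adv_loss = ((probe_head(GradReverse.apply(z_dist)) - reward) ** 2).mean()

(recon_loss + reward_loss + adv_loss).backward()
```

The key asymmetry is that only the task latent feeds the real reward head, while the distractor latent is penalized whenever reward information leaks into it.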
Related papers
- Make the Pertinent Salient: Task-Relevant Reconstruction for Visual Control with Distractions [14.274653873720334]
We propose a simple yet effective auxiliary task to facilitate representation learning in distracting environments.
Under the assumption that task-relevant components of image observations are straightforward to identify with prior knowledge, we use a segmentation mask on image observations to reconstruct only the task-relevant components.
In modified DeepMind Control suite (DMC) and Meta-World tasks with added visual distractions, the proposed method (SD) achieves significantly better sample efficiency and greater final performance than prior work.
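One way to read this auxiliary task: reconstruction error is computed only under the task-relevance mask, so the decoder earns nothing for modeling distractors. A minimal sketch, assuming a pre-computed binary mask and toy shapes (neither is from the paper):

```python
# Masked-reconstruction sketch; the mask source and shapes are assumptions.
import torch
import torch.nn as nn

encoder = nn.Linear(64 * 64, 128)   # illustrative encoder over flattened pixels
decoder = nn.Linear(128, 64 * 64)

obs = torch.rand(8, 64 * 64)                      # fake image batch
mask = (torch.rand(8, 64 * 64) > 0.5).float()     # stand-in segmentation mask

recon = decoder(encoder(obs))
# Only task-relevant pixels contribute gradient to the representation.
loss = (mask * (recon - obs) ** 2).sum() / mask.sum().clamp(min=1.0)
loss.backward()
```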
arXiv Detail & Related papers (2024-10-13T19:24:07Z)
- Visual In-Context Learning for Large Vision-Language Models [62.5507897575317]
In Large Vision-Language Models (LVLMs), the efficacy of In-Context Learning (ICL) remains limited by challenges in cross-modal interactions and representation disparities.
We introduce a novel Visual In-Context Learning (VICL) method comprising Visual Demonstration Retrieval, Intent-Oriented Image Summarization, and Intent-Oriented Demonstration Composition.
Our approach retrieves images via a "Retrieval & Rerank" paradigm, summarizes images with task intent and task-specific visual parsing, and composes language-based demonstrations.
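The three stages chain into a single prompt-construction pipeline; the skeleton below only fixes the interfaces, with placeholder bodies that are not the authors' implementation:

```python
from typing import List

def retrieve_and_rerank(query_image: str, pool: List[str], k: int = 4) -> List[str]:
    """Stage 1: coarse retrieval of candidate demonstrations, then rerank."""
    candidates = sorted(pool)[: k * 4]      # placeholder for similarity search
    return candidates[:k]                   # placeholder for the reranker

def summarize_with_intent(image: str, task_intent: str) -> str:
    """Stage 2: intent-oriented image summarization (placeholder)."""
    return f"[{task_intent}] summary of {image}"

def compose_demonstrations(summaries: List[str], query: str) -> str:
    """Stage 3: summaries composed into one language-based ICL prompt."""
    return "\n".join(summaries) + f"\nQuery: {query}"

demos = retrieve_and_rerank("query.png", ["a.png", "b.png", "c.png", "d.png"])
prompt = compose_demonstrations(
    [summarize_with_intent(d, "counting") for d in demos], "query.png"
)
```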
arXiv Detail & Related papers (2024-02-18T12:43:38Z)
- Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation on NYU Depth V2 and KITTI, and in semantic segmentation on Cityscapes.
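The mechanism, in outline, is a small set of trainable prompt embeddings that query a frozen backbone for task-relevant features. The sketch below substitutes a linear layer for the pre-trained diffusion model and picks an arbitrary injection point, so it conveys the shape of the idea rather than the actual architecture:

```python
import torch
import torch.nn as nn

class MetaPromptedBackbone(nn.Module):
    """A frozen feature extractor plus learnable 'meta prompt' embeddings.
    The linear backbone is a stand-in for a pre-trained diffusion model."""
    def __init__(self, feat_dim=256, n_prompts=8):
        super().__init__()
        self.backbone = nn.Linear(512, feat_dim)          # placeholder, frozen
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Only the prompts and this small attention head are trained.
        self.meta_prompts = nn.Parameter(torch.randn(n_prompts, feat_dim))
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)

    def forward(self, x):                                  # x: (B, T, 512)
        feats = self.backbone(x)                           # (B, T, feat_dim)
        prompts = self.meta_prompts.expand(x.size(0), -1, -1)
        # Prompts query the frozen features for task-relevant content.
        out, _ = self.attn(prompts, feats, feats)          # (B, n_prompts, feat_dim)
        return out

model = MetaPromptedBackbone()
features = model(torch.randn(2, 10, 512))                  # -> (2, 8, 256)
```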
arXiv Detail & Related papers (2023-12-22T14:40:55Z)
- Sequential Action-Induced Invariant Representation for Reinforcement Learning [1.2046159151610263]
How to accurately learn task-relevant state representations from high-dimensional observations with visual distractions is a challenging problem in visual reinforcement learning.
We propose a Sequential Action-induced invariant Representation (SAR) method, in which the encoder is optimized by an auxiliary learner to only preserve the components that follow the control signals of sequential actions.
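An inverse-dynamics-style auxiliary loss is one common way to realize "keep only the components that follow the control signals"; the sketch below is in that spirit and is not the SAR algorithm itself (single-step rather than sequential actions, toy shapes):

```python
import torch
import torch.nn as nn

encoder = nn.Linear(64, 32)            # illustrative observation encoder
aux_learner = nn.Linear(64, 4)         # predicts the action linking two latents

obs_t = torch.randn(16, 64)
obs_t1 = torch.randn(16, 64)
action = torch.randn(16, 4)

z_t, z_t1 = encoder(obs_t), encoder(obs_t1)
# The auxiliary learner can only succeed if the latents retain the components
# that respond to control, so distractor motion carries no useful signal.
pred_action = aux_learner(torch.cat([z_t, z_t1], dim=-1))
aux_loss = ((pred_action - action) ** 2).mean()
aux_loss.backward()
```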
arXiv Detail & Related papers (2023-09-22T05:31:55Z)
- SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models [22.472167814814448]
We propose a new model-based imitation learning algorithm named Separated Model-based Adversarial Imitation Learning (SeMAIL).
Our method achieves near-expert performance on various visual control tasks with complex observations and the more challenging tasks with different backgrounds from expert observations.
arXiv Detail & Related papers (2023-06-19T04:33:44Z)
- Task Formulation Matters When Learning Continually: A Case Study in Visual Question Answering [58.82325933356066]
Continual learning aims to train a model incrementally on a sequence of tasks without forgetting previous knowledge.
We present a detailed study of how different settings affect performance for Visual Question Answering.
arXiv Detail & Related papers (2022-09-30T19:12:58Z)
- Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem [60.0878532426877]
We propose a novel collaborative learning scheme from the viewpoint of visual perturbation calibration.
Specifically, we devise a visual controller to construct two sorts of curated images with different perturbation extents.
The experimental results on two diagnostic VQA-CP benchmark datasets evidently demonstrate its effectiveness.
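In rough outline, two perturbed copies of each image with different extents let a consistency-style objective separate visual evidence from language priors. The sketch below uses additive noise as a stand-in for the learned visual controller and a simple KL-based objective, both assumptions rather than the paper's scheme:

```python
import torch
import torch.nn.functional as F

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))

def perturb(images, extent):
    """Stand-in 'visual controller': additive noise scaled by `extent`.
    The paper learns to construct curated images; this is only a proxy."""
    return (images + extent * torch.randn_like(images)).clamp(0, 1)

images = torch.rand(8, 3, 32, 32)
log_clean = F.log_softmax(model(images), dim=-1)
log_mild = F.log_softmax(model(perturb(images, 0.05)), dim=-1)
log_severe = F.log_softmax(model(perturb(images, 0.5)), dim=-1)

# Keep predictions stable under mild perturbation, but penalize a model that
# answers identically when the visual evidence is destroyed (language prior).
consistency = F.kl_div(log_mild, log_clean, reduction="batchmean", log_target=True)
prior_reliance = -F.kl_div(log_severe, log_clean, reduction="batchmean", log_target=True)
(consistency + prior_reliance).backward()
```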
arXiv Detail & Related papers (2022-07-24T23:50:52Z)
- Task-Induced Representation Learning [14.095897879222672]
We evaluate the effectiveness of representation learning approaches for decision making in visually complex environments.
We find that representation learning generally improves sample efficiency on unseen tasks even in visually complex scenes.
arXiv Detail & Related papers (2022-04-25T17:57:10Z)
- Visual Adversarial Imitation Learning using Variational Models [60.69745540036375]
Reward function specification remains a major impediment for learning behaviors through deep reinforcement learning.
Visual demonstrations of desired behaviors often present an easier and more natural way to teach agents.
We develop a variational model-based adversarial imitation learning algorithm.
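The adversarial part can be conveyed with a GAIL-style discriminator over latent features: the discriminator separates expert latents from agent latents, and its output becomes the agent's reward. This is a generic sketch of that pattern (toy modules, flattened observations), not the paper's variational model:

```python
import torch
import torch.nn as nn

encoder = nn.Linear(64, 16)        # stand-in for the variational model's encoder
discriminator = nn.Linear(16, 1)   # expert-vs-agent classifier on latents

expert_obs = torch.randn(32, 64)
agent_obs = torch.randn(32, 64)

d_expert = discriminator(encoder(expert_obs))
d_agent = discriminator(encoder(agent_obs))

# Discriminator objective: separate expert latents from agent latents.
bce = nn.BCEWithLogitsLoss()
d_loss = bce(d_expert, torch.ones_like(d_expert)) + \
         bce(d_agent, torch.zeros_like(d_agent))
d_loss.backward()

# The agent never sees a hand-written reward; it instead maximizes a reward
# derived from the discriminator, e.g. -log(1 - D(agent)) == softplus(logit).
with torch.no_grad():
    learned_reward = torch.nn.functional.softplus(d_agent)
```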
arXiv Detail & Related papers (2021-07-16T00:15:18Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL)
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
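The flavor of a block-diagonal structure regularizer: with tasks and features arranged in matching groups, weights outside the blocks are penalized toward zero. In the sketch below the grouping is fixed by hand for illustration, whereas TFCL induces the grouping during training:

```python
import torch

# Illustrative task-by-feature weight matrix W (5 tasks, 12 features); a
# block-diagonal structure means each task group owns one feature block.
W = torch.randn(5, 12, requires_grad=True)
task_groups = [[0, 1], [2, 3, 4]]                 # assumed grouping
feature_groups = [list(range(6)), list(range(6, 12))]

off_block = torch.ones_like(W)
for tg, fg in zip(task_groups, feature_groups):
    for t in tg:
        for f in fg:
            off_block[t, f] = 0.0                 # inside a block: unpenalized

# Penalize only off-block entries, pushing W toward block-diagonal form.
reg = (off_block * W).abs().sum()
reg.backward()
```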
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.