Generalization with Lossy Affordances: Leveraging Broad Offline Data for
Learning Visuomotor Tasks
- URL: http://arxiv.org/abs/2210.06601v2
- Date: Tue, 18 Apr 2023 07:10:29 GMT
- Title: Generalization with Lossy Affordances: Leveraging Broad Offline Data for
Learning Visuomotor Tasks
- Authors: Kuan Fang, Patrick Yin, Ashvin Nair, Homer Walke, Gengchen Yan, Sergey
Levine
- Abstract summary: We introduce a framework that acquires goal-conditioned policies for unseen temporally extended tasks via offline reinforcement learning on broad data.
When faced with a novel task goal, the framework uses an affordance model to plan a sequence of lossy representations as subgoals that decomposes the original task into easier problems.
We show that our framework can be pre-trained on large-scale datasets of robot experiences from prior work and efficiently fine-tuned for novel tasks, entirely from visual inputs without any manual reward engineering.
- Score: 65.23947618404046
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of broad datasets has proven crucial for generalization
in a wide range of fields. However, how to effectively use diverse
multi-task data for novel downstream tasks remains a grand challenge in
robotics. To tackle this challenge, we introduce a framework that acquires
goal-conditioned policies for unseen temporally extended tasks via offline
reinforcement learning on broad data, in combination with online fine-tuning
guided by subgoals in learned lossy representation space. When faced with a
novel task goal, the framework uses an affordance model to plan a sequence of
lossy representations as subgoals that decomposes the original task into easier
problems. Learned from the broad data, the lossy representation emphasizes
task-relevant information about states and goals while abstracting away
redundant contexts that hinder generalization. It thus enables subgoal planning
for unseen tasks, provides a compact input to the policy, and facilitates
reward shaping during fine-tuning. We show that our framework can be
pre-trained on large-scale datasets of robot experiences from prior work and
efficiently fine-tuned for novel tasks, entirely from visual inputs without any
manual reward engineering.
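As a rough illustration of the subgoal planning and reward shaping described in the abstract, the sketch below plans a short sequence of latent subgoals toward a goal and scores progress by latent distance. It is not the authors' implementation: the names `encode`, `affordance`, `plan_subgoals`, and `shaped_reward`, the random linear encoder, the additive-noise affordance model, and the random-shooting planner are all assumptions chosen to keep the example small and self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, LATENT_DIM = 64, 8

# Hypothetical stand-in for a learned lossy encoder: a fixed random projection
# that compresses an observation into a low-dimensional representation.
W_ENC = rng.normal(size=(LATENT_DIM, OBS_DIM)) / np.sqrt(OBS_DIM)


def encode(obs):
    """Map a flattened observation to a compact latent representation."""
    return W_ENC @ obs


def affordance(z, u):
    """Hypothetical affordance model: propose a nearby latent state from z.

    u is a noise sample; a learned model would be a generative network.
    """
    return z + 0.5 * u


def plan_subgoals(z0, z_goal, horizon=4, num_samples=256):
    """Random-shooting planner (an assumption, swapped in for simplicity).

    Samples candidate subgoal sequences from the affordance model and keeps
    the one whose final subgoal lies closest to the goal representation.
    """
    best_seq, best_cost = None, np.inf
    for _ in range(num_samples):
        z, seq = z0, []
        for _ in range(horizon):
            z = affordance(z, rng.normal(size=LATENT_DIM))
            seq.append(z)
        cost = float(np.linalg.norm(seq[-1] - z_goal))
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return np.stack(best_seq), best_cost


def shaped_reward(z, z_subgoal):
    """Dense reward: negative latent distance to the current subgoal."""
    return -float(np.linalg.norm(z - z_subgoal))


# Usage with random placeholder observations.
obs, goal_obs = rng.normal(size=OBS_DIM), rng.normal(size=OBS_DIM)
subgoals, cost = plan_subgoals(encode(obs), encode(goal_obs))
print(subgoals.shape)  # (4, 8)
```

Planning in the compact latent space rather than in pixel space is what keeps the candidate sequences cheap to sample and compare. In this sketch the same latent distance serves as both the planning cost and the shaped reward.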
Related papers
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation [29.49883684368039]
Offline meta-reinforcement learning (OMRL) allows an agent to tackle novel tasks while relying on a static dataset.
We introduce a novel algorithm to disentangle the impact of behavior policy from task representation learning.
arXiv Detail & Related papers (2024-03-12T02:38:36Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework in which multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks that have few or non-overlapping annotations.
We propose a novel approach in which knowledge exchange between the tasks is enabled via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Disentangled Latent Spaces Facilitate Data-Driven Auxiliary Learning [15.41342100228504]
In deep learning, auxiliary objectives are often used to facilitate learning in situations where data is scarce.
We propose a novel framework, dubbed Detaux, whereby a weakly supervised disentanglement procedure is used to discover new unrelated classification tasks.
arXiv Detail & Related papers (2023-10-13T17:40:39Z)
- CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models [33.78307982736911]
Cross-task generalization has strong research and application value.
We propose a large-scale benchmark that includes 216 existing code-related tasks.
arXiv Detail & Related papers (2023-02-08T13:04:52Z)
- Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping [94.89128390954572]
We propose a novel self-supervised learning phase on the pre-collected dataset to understand the structure and the dynamics of the model.
We evaluate our method on three continuous control tasks, and show that our model significantly outperforms existing approaches.
arXiv Detail & Related papers (2023-01-05T15:07:10Z)
- Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can serve not only as a strong foundation backbone for a wide range of tasks but also as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
- Auxiliary Task Update Decomposition: The Good, The Bad and The Neutral [18.387162887917164]
We formulate a model-agnostic framework that performs fine-grained manipulation of the auxiliary task gradients.
We propose to decompose auxiliary updates into directions which help, damage or leave the primary task loss unchanged.
Our approach consistently outperforms strong and widely used baselines when leveraging out-of-distribution data for Text and Image classification tasks.
arXiv Detail & Related papers (2021-08-25T17:09:48Z)
- COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning [78.13740204156858]
We show that we can reuse prior data to extend new skills simply through dynamic programming.
We demonstrate the effectiveness of our approach by chaining together several behaviors seen in prior datasets for solving a new task.
We train our policies in an end-to-end fashion, mapping high-dimensional image observations to low-level robot control commands.
arXiv Detail & Related papers (2020-10-27T17:57:29Z)