LIMT: Language-Informed Multi-Task Visual World Models
- URL: http://arxiv.org/abs/2407.13466v1
- Date: Thu, 18 Jul 2024 12:40:58 GMT
- Title: LIMT: Language-Informed Multi-Task Visual World Models
- Authors: Elie Aljalbout, Nikolaos Sotirakis, Patrick van der Smagt, Maximilian Karl, Nutan Chen,
- Abstract summary: Multi-task reinforcement learning can be very challenging due to the increased sample complexity and the potentially conflicting task objectives.
We propose a method for learning multi-task visual world models, leveraging pre-trained language models to extract semantically meaningful task representations.
Our results highlight the benefits of using language-driven task representations for world models and a clear advantage of model-based multi-task learning over the more common model-free paradigm.
- Score: 6.128332310539627
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most recent successes in robot reinforcement learning involve learning a specialized single-task agent. However, robots capable of performing multiple tasks can be much more valuable in real-world applications. Multi-task reinforcement learning can be very challenging due to the increased sample complexity and the potentially conflicting task objectives. Previous work on this topic is dominated by model-free approaches. The latter can be very sample inefficient even when learning specialized single-task agents. In this work, we focus on model-based multi-task reinforcement learning. We propose a method for learning multi-task visual world models, leveraging pre-trained language models to extract semantically meaningful task representations. These representations are used by the world model and policy to reason about task similarity in dynamics and behavior. Our results highlight the benefits of using language-driven task representations for world models and a clear advantage of model-based multi-task learning over the more common model-free paradigm.
Related papers
- Deploying Multi-task Online Server with Large Language Model [9.118405878982383]
We present a three-stage multi-task learning framework for large language models.
It involves task filtering, followed by fine-tuning on high-resource tasks, and finally fine-tuning on all tasks.
Our approach, exemplified on different benchmarks, demonstrates that it is able to achieve performance comparable to the single-task method while reducing up to 90.9% of its overhead.
arXiv Detail & Related papers (2024-11-06T03:48:41Z) - Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning [49.92517970237088]
We tackle the problem of training a robot to understand multimodal prompts.
This type of task poses a major challenge to robots' capability to understand the interconnection and complementarity between vision and language signals.
We introduce an effective framework that learns a policy to perform robot manipulation with multimodal prompts.
arXiv Detail & Related papers (2023-10-14T22:24:58Z) - Musketeer: Joint Training for Multi-task Vision Language Model with Task Explanation Prompts [75.75548749888029]
We present a vision-language model whose parameters are jointly trained on all tasks and fully shared among multiple heterogeneous tasks.
With a single model, Musketeer achieves results comparable to or better than strong baselines trained on single tasks, almost uniformly across multiple tasks.
arXiv Detail & Related papers (2023-05-11T17:57:49Z) - OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist
Models [72.8156832931841]
Generalist models are capable of performing diverse multi-modal tasks in a task-agnostic way within a single model.
We release a generalist model learning system, OFASys, built on top of a declarative task interface named multi-modal instruction.
arXiv Detail & Related papers (2022-12-08T17:07:09Z) - Design Perspectives of Multitask Deep Learning Models and Applications [1.3701366534590496]
Multi-task learning has been able to generalize the models even better.
We try to enhance the feature mapping of the multi-tasking models by sharing features among related tasks.
Also, our interest is in learning the task relationships among various tasks for acquiring better benefits from multi-task learning.
arXiv Detail & Related papers (2022-09-27T15:04:31Z) - An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale
Multitask Learning Systems [4.675744559395732]
Multitask learning assumes that models capable of learning from multiple tasks can achieve better quality and efficiency via knowledge transfer.
State of the art ML models rely on high customization for each task and leverage size and data scale rather than scaling the number of tasks.
We propose an evolutionary method that can generate a large scale multitask model and can support the dynamic and continuous addition of new tasks.
arXiv Detail & Related papers (2022-05-25T13:10:47Z) - Zero Experience Required: Plug & Play Modular Transfer Learning for
Semantic Visual Navigation [97.17517060585875]
We present a unified approach to visual navigation using a novel modular transfer learning model.
Our model can effectively leverage its experience from one source task and apply it to multiple target tasks.
Our approach learns faster, generalizes better, and outperforms SoTA models by a significant margin.
arXiv Detail & Related papers (2022-02-05T00:07:21Z) - Towards More Generalizable One-shot Visual Imitation Learning [81.09074706236858]
A general-purpose robot should be able to master a wide range of tasks and quickly learn a novel one by leveraging past experiences.
One-shot imitation learning (OSIL) approaches this goal by training an agent with (pairs of) expert demonstrations.
We push for a higher level of generalization ability by investigating a more ambitious multi-task setup.
arXiv Detail & Related papers (2021-10-26T05:49:46Z) - Boosting a Model Zoo for Multi-Task and Continual Learning [15.110807414130923]
"Model Zoo" is an algorithm that builds an ensemble of models, each of which is very small, and it is trained on a smaller set of tasks.
Model Zoo achieves large gains in prediction accuracy compared to state-of-the-art methods in multi-task and continual learning.
arXiv Detail & Related papers (2021-06-06T04:25:09Z) - MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale [103.7609761511652]
We show how a large-scale collective robotic learning system can acquire a repertoire of behaviors simultaneously.
New tasks can be continuously instantiated from previously learned tasks.
We train and evaluate our system on a set of 12 real-world tasks with data collected from 7 robots.
arXiv Detail & Related papers (2021-04-16T16:38:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.