Diffusion Model is an Effective Planner and Data Synthesizer for
Multi-Task Reinforcement Learning
- URL: http://arxiv.org/abs/2305.18459v2
- Date: Tue, 10 Oct 2023 13:01:41 GMT
- Title: Diffusion Model is an Effective Planner and Data Synthesizer for
Multi-Task Reinforcement Learning
- Authors: Haoran He, Chenjia Bai, Kang Xu, Zhuoran Yang, Weinan Zhang, Dong
Wang, Bin Zhao, Xuelong Li
- Abstract summary: Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
- Score: 101.66860222415512
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have demonstrated highly expressive generative capabilities
in vision and NLP. Recent studies in reinforcement learning (RL) have shown
that diffusion models are also powerful in modeling complex policies or
trajectories in offline datasets. However, these works have been limited to
single-task settings, lacking a generalist agent capable of handling multiple
tasks. In this paper, we investigate the effectiveness of a single diffusion
model in modeling large-scale multi-task offline data, which is challenging
due to diverse and multimodal data distributions.
Specifically, we propose Multi-Task Diffusion Model (\textsc{MTDiff}), a
diffusion-based method that incorporates Transformer backbones and prompt
learning for generative planning and data synthesis in multi-task offline
settings. \textsc{MTDiff} leverages vast amounts of knowledge available in
multi-task data and performs implicit knowledge sharing among tasks. For
generative planning, we find \textsc{MTDiff} outperforms state-of-the-art
algorithms across 50 tasks on Meta-World and 8 maps on Maze2D. For data
synthesis, \textsc{MTDiff} generates high-quality data for test tasks given
a single demonstration as a prompt, augmenting low-quality datasets even for
unseen tasks.
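The abstract's recipe (a Transformer-backbone diffusion model that denoises trajectories conditioned on a task prompt such as a single demonstration) can be made concrete with a short sketch. The PyTorch code below is a minimal, non-authoritative illustration: the class name PromptDiffusionPlanner, the learned positional embedding, the linear noise schedule, and every shape and hyperparameter are assumptions for exposition, not the authors' architecture.

```python
# Minimal sketch of a prompt-conditioned diffusion planner in the spirit of
# MTDiff. All names, shapes, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class PromptDiffusionPlanner(nn.Module):
    """Transformer denoiser over a state trajectory, conditioned on a prompt."""
    def __init__(self, state_dim, horizon, d_model=128, n_heads=4, n_layers=4):
        super().__init__()
        self.horizon = horizon
        self.state_in = nn.Linear(state_dim, d_model)     # embed each trajectory step
        self.prompt_in = nn.Linear(state_dim, d_model)    # embed demonstration steps
        self.time_mlp = nn.Sequential(                    # embed the diffusion timestep
            nn.Linear(1, d_model), nn.SiLU(), nn.Linear(d_model, d_model))
        self.pos = nn.Parameter(torch.zeros(1, horizon, d_model))  # learned positions
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.noise_out = nn.Linear(d_model, state_dim)    # per-step noise prediction

    def forward(self, noisy_traj, t, prompt):
        # noisy_traj: (B, H, state_dim); t: (B,) int timesteps; prompt: (B, P, state_dim)
        tok = self.state_in(noisy_traj) + self.pos
        tok = tok + self.time_mlp(t.float().unsqueeze(-1)).unsqueeze(1)
        ctx = self.prompt_in(prompt)                      # prompt tokens carry task identity
        h = self.backbone(torch.cat([ctx, tok], dim=1))
        return self.noise_out(h[:, ctx.shape[1]:])        # keep only trajectory positions

@torch.no_grad()
def ddpm_sample(model, prompt, state_dim, steps=50):
    """Reverse-diffuse a plan from Gaussian noise, conditioned on the prompt."""
    B = prompt.shape[0]
    betas = torch.linspace(1e-4, 2e-2, steps)             # assumed linear schedule
    alphas = 1.0 - betas
    abar = torch.cumprod(alphas, dim=0)
    x = torch.randn(B, model.horizon, state_dim)
    for t in reversed(range(steps)):
        eps = model(x, torch.full((B,), t), prompt)
        x = (x - betas[t] / (1 - abar[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:
            x = x + betas[t].sqrt() * torch.randn_like(x)  # DDPM posterior noise
    return x  # (B, horizon, state_dim)
```

Under these assumptions, the same denoiser would serve both uses the abstract names: sampling with a prompt from a new task yields a plan for control (generative planning), while sampling many trajectories per prompt yields synthetic transitions to augment a low-quality dataset (data synthesis).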
Related papers
- Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications.
The dataset includes a diverse set of multi-hop questions, in true/false and multiple-choice formats, spanning difficulty levels from easy to hard.
We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data.
arXiv Detail & Related papers (2025-01-16T16:19:53Z) - Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data [35.85909368345219]
We introduce Infinity-MM, a large-scale multimodal instruction dataset.
We perform unified preprocessing, resulting in a dataset with over 40 million samples that ensures diversity and accuracy.
We propose a synthetic instruction generation method based on a tagging system and open-source Vision-Language Models.
arXiv Detail & Related papers (2024-10-24T09:03:48Z) - Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z) - Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner [12.360598915420255]
Diffusion models have demonstrated their capability to model multi-task trajectories.
Existing multi-task planners or policies typically rely on task-specific demonstrations via multi-task imitation, or require task-specific reward labels.
We propose a versatile diffusion planner capable of leveraging large-scale inferior data that contains task-agnostic sub-optimal trajectories.
arXiv Detail & Related papers (2024-09-30T05:05:37Z) - A Multitask Deep Learning Model for Classification and Regression of Hyperspectral Images: Application to the large-scale dataset [44.94304541427113]
We propose a multitask deep learning model to perform multiple classification and regression tasks simultaneously on hyperspectral images.
We validated our approach on a large hyperspectral dataset called TAIGA.
A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-23T11:14:54Z) - MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic [6.46176287368784]
We propose Model Exclusive Task Arithmetic for merging GPT-scale models (a sketch of plain task-vector merging follows this list).
Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs.
arXiv Detail & Related papers (2024-06-17T10:12:45Z) - Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z) - StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized
Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z) - An Efficient General-Purpose Modular Vision Model via Multi-Task
Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently.
Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.