Diffusion Model is an Effective Planner and Data Synthesizer for
Multi-Task Reinforcement Learning
- URL: http://arxiv.org/abs/2305.18459v2
- Date: Tue, 10 Oct 2023 13:01:41 GMT
- Title: Diffusion Model is an Effective Planner and Data Synthesizer for
Multi-Task Reinforcement Learning
- Authors: Haoran He, Chenjia Bai, Kang Xu, Zhuoran Yang, Weinan Zhang, Dong
Wang, Bin Zhao, Xuelong Li
- Abstract summary: Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
- Score: 101.66860222415512
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have demonstrated highly-expressive generative capabilities
in vision and NLP. Recent studies in reinforcement learning (RL) have shown
that diffusion models are also powerful in modeling complex policies or
trajectories in offline datasets. However, these works have been limited to
single-task settings where a generalist agent capable of addressing multi-task
predicaments is absent. In this paper, we aim to investigate the effectiveness
of a single diffusion model in modeling large-scale multi-task offline data,
which can be challenging due to diverse and multimodal data distribution.
Specifically, we propose Multi-Task Diffusion Model (MTDiff), a
diffusion-based method that incorporates Transformer backbones and prompt
learning for generative planning and data synthesis in multi-task offline
settings. MTDiff leverages vast amounts of knowledge available in
multi-task data and performs implicit knowledge sharing among tasks. For
generative planning, we find MTDiff outperforms state-of-the-art
algorithms across 50 tasks on Meta-World and 8 maps on Maze2D. For data
synthesis, MTDiff generates high-quality data for testing tasks given
a single demonstration as a prompt, which enhances the low-quality datasets for
even unseen tasks.
Related papers
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner [12.360598915420255]
We propose SODP, a two-stage framework that learns a Diffusion Planner, which is generalizable for various downstream tasks.
In the pre-training stage, we train a foundation diffusion planner that extracts general planning capabilities by modeling the versatile distribution of multi-task trajectories.
Then for downstream tasks, we adopt RL-based fine-tuning with task-specific rewards to quickly refine the diffusion planner.
arXiv Detail & Related papers (2024-09-30T05:05:37Z)
- AdapMTL: Adaptive Pruning Framework for Multitask Learning Model [5.643658120200373]
AdapMTL is an adaptive pruning framework for multitask models.
It balances sparsity allocation and accuracy performance across multiple tasks.
It showcases superior performance compared to state-of-the-art pruning methods.
arXiv Detail & Related papers (2024-08-07T17:19:15Z)
- A Multitask Deep Learning Model for Classification and Regression of Hyperspectral Images: Application to the large-scale dataset [44.94304541427113]
We propose a multitask deep learning model to perform multiple classification and regression tasks simultaneously on hyperspectral images.
We validated our approach on a large hyperspectral dataset called TAIGA.
A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-23T11:14:54Z)
- MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding [59.41495657570397]
This dataset includes figures such as schematic diagrams, simulated images, macroscopic/microscopic photos, and experimental visualizations.
We developed benchmarks for scientific figure captioning and multiple-choice questions, evaluating six proprietary and over ten open-source models.
The dataset and benchmarks will be released to support further research.
arXiv Detail & Related papers (2024-07-06T00:40:53Z)
- MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic [6.46176287368784]
We propose Model Exclusive Task Arithmetic for merging GPT-scale models.
Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs.
arXiv Detail & Related papers (2024-06-17T10:12:45Z)
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployments: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
- An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently.
Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z)
- Multi-Task Variational Information Bottleneck [8.55293326934818]
Multi-task learning (MTL) is an important subject in machine learning and artificial intelligence.
This article proposes an MTL model based on the architecture of the variational information bottleneck (VIB).
Extensive observations on three public data sets under adversarial attacks show that the proposed model is competitive with the state-of-the-art algorithms.
arXiv Detail & Related papers (2020-07-01T09:06:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.