Related papers: Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

URL: http://arxiv.org/abs/2305.18459v2
Date: Tue, 10 Oct 2023 13:01:41 GMT
Title: Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning
Authors: Haoran He, Chenjia Bai, Kang Xu, Zhuoran Yang, Weinan Zhang, Dong Wang, Bin Zhao, Xuelong Li
Abstract summary: Multi-Task Diffusion Model (textscMTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis. For generative planning, we find textscMTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
Score: 101.66860222415512
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion models have demonstrated highly-expressive generative capabilities in vision and NLP. Recent studies in reinforcement learning (RL) have shown that diffusion models are also powerful in modeling complex policies or trajectories in offline datasets. However, these works have been limited to single-task settings where a generalist agent capable of addressing multi-task predicaments is absent. In this paper, we aim to investigate the effectiveness of a single diffusion model in modeling large-scale multi-task offline data, which can be challenging due to diverse and multimodal data distribution. Specifically, we propose Multi-Task Diffusion Model (\textsc{MTDiff}), a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis in multi-task offline settings. \textsc{MTDiff} leverages vast amounts of knowledge available in multi-task data and performs implicit knowledge sharing among tasks. For generative planning, we find \textsc{MTDiff} outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D. For data synthesis, \textsc{MTDiff} generates high-quality data for testing tasks given a single demonstration as a prompt, which enhances the low-quality datasets for even unseen tasks.

Related papers

MultiMAE Meets Earth Observation: Pre-training Multi-modal Multi-task Masked Autoencoders for Earth Observation Tasks [11.359741665798195]
This paper explores a more flexible multi-modal, multi-task pre-training strategy for Earth Observation (EO) data.<n>Specifically, we adopt a Multi-modal Multi-task Masked Autoencoder (MultiMAE) that we pre-train by reconstructing diverse input modalities.<n>Our approach exhibits significant flexibility, handling diverse input configurations without requiring modality-specific pre-trained models.
arXiv Detail & Related papers (2025-05-20T22:24:36Z)
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning [20.79390984800288]
Large Language Models (LLMs) are increasingly applied across various tasks. We propose MDIT, a novel model-free data method for diverse instruction tuning. Extensive experiments show that our method achieves superior performance in multiple benchmark tasks.
arXiv Detail & Related papers (2025-04-09T21:28:17Z)
Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications. The dataset includes a diverse set of multi-hop questions, including true/false and multiple-choice types, spanning varying difficulty levels from easy to hard. We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data.
arXiv Detail & Related papers (2025-01-16T16:19:53Z)
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data [35.85909368345219]
We introduce Infinity-MM, a large-scale multimodal instruction dataset. We perform unified preprocessing, resulting in a dataset with over 40 million samples that ensures diversity and accuracy. We propose a synthetic instruction generation method based on a tagging system and open-source Vision-Language Models.
arXiv Detail & Related papers (2024-10-24T09:03:48Z)
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data. We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner [12.360598915420255]
We propose textbfSODP, a two-stage framework that learns a textbfDiffusion textbfPlanner, which is generalizable for various downstream tasks. In the pre-training stage, we train a foundation diffusion planner that extracts general planning capabilities by modeling the versatile distribution of multi-task trajectories. Then for downstream tasks, we adopt RL-based fine-tuning with task-specific rewards to fast refine the diffusion planner.
arXiv Detail & Related papers (2024-09-30T05:05:37Z)
AdapMTL: Adaptive Pruning Framework for Multitask Learning Model [5.643658120200373]
AdapMTL is an adaptive pruning framework for multitask models. It balances sparsity allocation and accuracy performance across multiple tasks. It showcases superior performance compared to state-of-the-art pruning methods.
arXiv Detail & Related papers (2024-08-07T17:19:15Z)
A Multitask Deep Learning Model for Classification and Regression of Hyperspectral Images: Application to the large-scale dataset [44.94304541427113]
We propose a multitask deep learning model to perform multiple classification and regression tasks simultaneously on hyperspectral images. We validated our approach on a large hyperspectral dataset called TAIGA. A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-23T11:14:54Z)
MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding [59.41495657570397]
This dataset includes figures such as schematic diagrams, simulated images, macroscopic/microscopic photos, and experimental visualizations. We developed benchmarks for scientific figure captioning and multiple-choice questions, evaluating six proprietary and over ten open-source models. The dataset and benchmarks will be released to support further research.
arXiv Detail & Related papers (2024-07-06T00:40:53Z)
MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic [6.46176287368784]
We propose textbfModel textbfExclusive textbfTask textbfArithmetic for merging textbfGPT-scale models. Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs.
arXiv Detail & Related papers (2024-06-17T10:12:45Z)
Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data. For the first time, we reveal two major challenges hindering their practical deployments: Task-Distribution Shift ( TDS) and Task-Distribution Corruption (TDC)
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging) It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data. Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning. This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models. Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently. Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z)
Multi-Task Variational Information Bottleneck [8.55293326934818]
Multi-task learning (MTL) is an important subject in machine learning and artificial intelligence. This article proposes an MTL model based on the architecture of the variational information bottleneck (VIB) Extensive observations on three public data sets under adversarial attacks show that the proposed model is competitive to the state-of-the-art algorithms.
arXiv Detail & Related papers (2020-07-01T09:06:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.