Related papers: Urban-Focused Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing

URL: http://arxiv.org/abs/2406.14054v1
Date: Thu, 20 Jun 2024 07:24:24 GMT
Title: Urban-Focused Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing
Authors: Xinbo Zhao, Yingxue Zhang, Xin Zhang, Yu Yang, Yiqun Xie, Yanhua Li, Jun Luo
Abstract summary: We introduce MODA -- a Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing approach. We develop a novel model-based multi-task offline RL algorithm. Experiments conducted in a real-world multi-task urban setting validate the effectiveness of MODA.
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Enhancing diverse human decision-making processes in an urban environment is a critical issue across various applications, including ride-sharing vehicle dispatching, public transportation management, and autonomous driving. Offline reinforcement learning (RL) is a promising approach to learn and optimize human urban strategies (or policies) from pre-collected human-generated spatial-temporal urban data. However, standard offline RL faces two significant challenges: (1) data scarcity and data heterogeneity, and (2) distributional shift. In this paper, we introduce MODA -- a Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing approach. MODA addresses the challenges of data scarcity and heterogeneity in a multi-task urban setting through Contrastive Data Sharing among tasks. This technique involves extracting latent representations of human behaviors by contrasting positive and negative data pairs. It then shares data presenting similar representations with the target task, facilitating data augmentation for each task. Moreover, MODA develops a novel model-based multi-task offline RL algorithm. This algorithm constructs a robust Markov Decision Process (MDP) by integrating a dynamics model with a Generative Adversarial Network (GAN). Once the robust MDP is established, any online RL or planning algorithm can be applied. Extensive experiments conducted in a real-world multi-task urban setting validate the effectiveness of MODA. The results demonstrate that MODA exhibits significant improvements compared to state-of-the-art baselines, showcasing its capability in advancing urban decision-making processes. We also made our code available to the research community.
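Illustrative sketch (not the authors' code): the abstract describes sharing data whose latent representations are similar to those of the target task. The snippet below shows one simple way such a selection step could look, assuming the contrastive encoder has already produced an embedding for each sample. The function names, the use of a task centroid, the cosine-similarity measure, and the 0.8 threshold are all assumptions made for illustration; the paper may select data differently.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def share_data(
    embeddings: Dict[str, List[Vector]],
    target: str,
    threshold: float = 0.8,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[str, int]]:
    """Return (task, sample_index) pairs from other tasks whose embedding
    is close to the centroid of the target task's embeddings."""
    center = centroid(embeddings[target])
    shared = []
    for task, vectors in embeddings.items():
        if task == target:
            continue  # only borrow samples from the other tasks
        for index, vector in enumerate(vectors):
            if cosine(vector, center) >= threshold:
                shared.append((task, index))
    return shared


# Toy embeddings for two hypothetical urban tasks.
toy_embeddings = {
    "dispatch": [[1.0, 0.0], [0.9, 0.1]],
    "transit": [[1.0, 0.1], [0.0, 1.0]],
}
selected = share_data(toy_embeddings, target="dispatch")
print(selected)
```

In this toy example only the first "transit" sample points in roughly the same direction as the "dispatch" centroid, so it is the only one selected to augment the target task's dataset.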

Related papers

MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning [20.79390984800288]
Large Language Models (LLMs) are increasingly applied across various tasks. We propose MDIT, a novel model-free data interpolation method for diverse instruction tuning. Extensive experiments show that our method achieves superior performance in multiple benchmark tasks.
arXiv Detail & Related papers (2025-04-09T21:28:17Z)
Collaborative Imputation of Urban Time Series through Cross-city Meta-learning [54.438991949772145]
We propose a novel collaborative imputation paradigm leveraging meta-learned implicit neural representations (INRs). We then introduce a cross-city collaborative learning scheme through model-agnostic meta-learning. Experiments on a diverse urban dataset from 20 global cities demonstrate our model's superior imputation performance and generalizability.
arXiv Detail & Related papers (2025-01-20T07:12:40Z)
Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications. The dataset includes a diverse set of multi-hop questions, including true/false and multiple-choice types, spanning varying difficulty levels from easy to hard. We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data.
arXiv Detail & Related papers (2025-01-16T16:19:53Z)
Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL [57.202733701029594]
Decision Mamba is a novel multi-grained state space model with a self-evolving policy learning strategy. To mitigate the overfitting issue on noisy trajectories, a self-evolving policy is proposed by using progressive regularization. The policy evolves by using its own past knowledge to refine the suboptimal actions, thus enhancing its robustness on noisy demonstrations.
arXiv Detail & Related papers (2024-06-08T10:12:00Z)
MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications. Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders. We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
MEL: Efficient Multi-Task Evolutionary Learning for High-Dimensional Feature Selection [11.934379476825551]
We propose a novel approach called PSO-based Multi-task Evolutionary Learning (MEL). By incorporating information sharing between different feature selection tasks, MEL achieves enhanced learning ability and efficiency. We evaluate the effectiveness of MEL through extensive experiments on 22 high-dimensional datasets.
arXiv Detail & Related papers (2024-02-14T06:51:49Z)
M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation [0.7564784873669823]
We propose Multimodal Contrastive Unsupervised Reinforcement Learning (M2CURL). Our approach employs a novel multimodal self-supervised learning technique that learns efficient representations and contributes to faster convergence of RL algorithms. We evaluate M2CURL on the Tactile Gym 2 simulator and show that it significantly enhances learning efficiency in different manipulation tasks.
arXiv Detail & Related papers (2024-01-30T14:09:35Z)
UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation [59.77275587857252]
A holistic human dataset inevitably has insufficient and low-resolution information on local parts. We propose to use multi-source datasets with various resolution images to jointly learn a high-resolution human generative model.
arXiv Detail & Related papers (2023-09-25T17:58:46Z)
Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task. The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance. We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis. For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
Distilled Mid-Fusion Transformer Networks for Multi-Modal Human Activity Recognition [34.424960016807795]
Multi-modal Human Activity Recognition can utilize complementary information to build models that generalize well. Although deep learning methods have shown promising results, their potential in extracting salient multi-modal spatial-temporal features has not been fully explored. A knowledge distillation-based Multi-modal Mid-Fusion approach, DMFT, is proposed to conduct informative feature extraction and fusion to resolve the Multi-modal Human Activity Recognition task efficiently.
arXiv Detail & Related papers (2023-05-05T19:26:06Z)
A Transformer Framework for Data Fusion and Multi-Task Learning in Smart Cities [99.56635097352628]
This paper proposes a Transformer-based AI system for emerging smart cities. It supports virtually any input data and output task types present in S&CCs. It is demonstrated through learning diverse task sets representative of S&CC environments.
arXiv Detail & Related papers (2022-11-18T20:43:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.