Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2409.12001v1
- Date: Wed, 18 Sep 2024 14:13:24 GMT
- Title: Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning
- Authors: Claude Formanek, Louise Beyers, Callum Rhys Tilbury, Jonathan P. Shock, Arnu Pretorius,
- Abstract summary: offline multi-agent reinforcement learning (MARL) is an exciting direction of research that uses static datasets to find optimal control policies for multi-agent systems.
Though the field is by definition data-driven, efforts have thus far neglected data in their drive to achieve state-of-the-art results.
We show how the majority of works generate their own datasets without consistent methodology and provide sparse information about the characteristics of these datasets.
- Score: 3.623224034411137
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Offline multi-agent reinforcement learning (MARL) is an exciting direction of research that uses static datasets to find optimal control policies for multi-agent systems. Though the field is by definition data-driven, efforts have thus far neglected data in their drive to achieve state-of-the-art results. We first substantiate this claim by surveying the literature, showing how the majority of works generate their own datasets without consistent methodology and provide sparse information about the characteristics of these datasets. We then show why neglecting the nature of the data is problematic, through salient examples of how tightly algorithmic performance is coupled to the dataset used, necessitating a common foundation for experiments in the field. In response, we take a big step towards improving data usage and data awareness in offline MARL, with three key contributions: (1) a clear guideline for generating novel datasets; (2) a standardisation of over 80 existing datasets, hosted in a publicly available repository, using a consistent storage format and easy-to-use API; and (3) a suite of analysis tools that allow us to understand these datasets better, aiding further development.
Related papers
- A Survey on Data Synthesis and Augmentation for Large Language Models [35.59526251210408]
This paper reviews and summarizes data generation techniques throughout the lifecycle of Large Language Models.
We discuss the current constraints faced by these methods and investigate potential pathways for future development and research.
arXiv Detail & Related papers (2024-10-16T16:12:39Z) - Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models [79.65071553905021]
We propose Data Advisor, a method for generating data that takes into account the characteristics of the desired dataset.
Data Advisor monitors the status of the generated data, identifies weaknesses in the current dataset, and advises the next iteration of data generation.
arXiv Detail & Related papers (2024-10-07T17:59:58Z) - SEGAN: semi-supervised learning approach for missing data imputation [12.552699799009037]
This paper proposes a missing data completion model SEGAN based on semi-supervised learning.
In the SEGAN model, the classifier enables the generator to make more full use of known data and its label information when predicting missing data values.
This paper theoretically proves that the SEGAN model can learn the real known data distribution characteristics when reaching Nash equilibrium.
arXiv Detail & Related papers (2024-05-21T11:42:20Z) - infoVerse: A Universal Framework for Dataset Characterization with
Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z) - TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual
Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z) - The Challenges of Exploration for Offline Reinforcement Learning [8.484491887821473]
We study the two processes of reinforcement learning: collecting informative experience and inferring optimal behaviour.
The task-agnostic setting for data collection, where the task is not known a priori, is of particular interest.
We use this decoupled framework to strengthen intuitions about exploration and the data prerequisites for effective offline RL.
arXiv Detail & Related papers (2022-01-27T23:59:56Z) - Unsupervised Domain Adaptive Learning via Synthetic Data for Person
Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z) - Dual-Teacher: Integrating Intra-domain and Inter-domain Teachers for
Annotation-efficient Cardiac Segmentation [65.81546955181781]
We propose a novel semi-supervised domain adaptation approach, namely Dual-Teacher.
The student model learns the knowledge of unlabeled target data and labeled source data by two teacher models.
We demonstrate that our approach is able to concurrently utilize unlabeled data and cross-modality data with superior performance.
arXiv Detail & Related papers (2020-07-13T10:00:44Z) - D4RL: Datasets for Deep Data-Driven Reinforcement Learning [119.49182500071288]
We introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.
By moving beyond simple benchmark tasks and data collected by partially-trained RL agents, we reveal important and unappreciated deficiencies of existing algorithms.
arXiv Detail & Related papers (2020-04-15T17:18:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.