Related papers: Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning

Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning

URL: http://arxiv.org/abs/2409.12001v1
Date: Wed, 18 Sep 2024 14:13:24 GMT
Title: Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning
Authors: Claude Formanek, Louise Beyers, Callum Rhys Tilbury, Jonathan P. Shock, Arnu Pretorius,
Abstract summary: offline multi-agent reinforcement learning (MARL) is an exciting direction of research that uses static datasets to find optimal control policies for multi-agent systems. Though the field is by definition data-driven, efforts have thus far neglected data in their drive to achieve state-of-the-art results. We show how the majority of works generate their own datasets without consistent methodology and provide sparse information about the characteristics of these datasets.
Score: 3.623224034411137
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Offline multi-agent reinforcement learning (MARL) is an exciting direction of research that uses static datasets to find optimal control policies for multi-agent systems. Though the field is by definition data-driven, efforts have thus far neglected data in their drive to achieve state-of-the-art results. We first substantiate this claim by surveying the literature, showing how the majority of works generate their own datasets without consistent methodology and provide sparse information about the characteristics of these datasets. We then show why neglecting the nature of the data is problematic, through salient examples of how tightly algorithmic performance is coupled to the dataset used, necessitating a common foundation for experiments in the field. In response, we take a big step towards improving data usage and data awareness in offline MARL, with three key contributions: (1) a clear guideline for generating novel datasets; (2) a standardisation of over 80 existing datasets, hosted in a publicly available repository, using a consistent storage format and easy-to-use API; and (3) a suite of analysis tools that allow us to understand these datasets better, aiding further development.

Related papers

Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs [66.63911043019294]
Data preparation aims to denoise raw datasets, uncover cross-dataset relationships, and extract valuable insights from them.<n>This paper focuses on the use of LLM techniques to prepare data for diverse downstream tasks.<n>We introduce a task-centric taxonomy that organizes the field into three major tasks: data cleaning, standardization, error processing, imputation, data integration, and data enrichment.
arXiv Detail & Related papers (2026-01-22T12:02:45Z)
Closing the Data Loop: Using OpenDataArena to Engineer Superior Training Datasets [46.480867560675584]
We propose a paradigm shift from ad-hoc curation to a closed-loop dataset engineering framework using OpenDataArena (ODA)<n>We instantiate this methodology through two new datasets: textbfODA-Math460-k, a specialized mathematics reasoning dataset that utilizes a novel two-stage difficulty-aware pipeline to achieve State-of-the-Art (SOTA) results on benchmarks such as AIME and HMMT, and textbfODA-Mixture (100k & 500k), a series of multi-domain instruction datasets built via an Anchor-and-
arXiv Detail & Related papers (2025-12-30T17:46:38Z)
Scaling Generalist Data-Analytic Agents [95.05161133349242]
DataMind is a scalable data synthesis and agent training recipe designed to build generalist data-analytic agents.<n>DataMind tackles three key challenges in building open-source data-analytic agents.
arXiv Detail & Related papers (2025-09-29T17:23:08Z)
AutoData: A Multi-Agent System for Open Web Data Collection [37.832257245199365]
AutoData is a novel multi-agent system for Automated web Data collection that requires minimal human intervention.<n>Instruct2DS is a new benchmark dataset supporting live data collection from web sources across three domains: academic, finance, and sports.
arXiv Detail & Related papers (2025-05-21T04:32:35Z)
Generating Skyline Datasets for Data Science Models [11.454081868173725]
This paper introduces MODis, a framework that discovers datasets by optimizing multiple user-defined, model-performance measures. We derive three feasible algorithms to generate skyline datasets. We experimentally verify the efficiency and effectiveness of our skyline data discovery algorithms.
arXiv Detail & Related papers (2025-02-16T20:33:59Z)
Large Language Models and Synthetic Data for Monitoring Dataset Mentions in Research Papers [0.0]
This paper presents a machine learning framework that automates dataset mention detection across research domains. We employ zero-shot extraction from research papers, an LLM-as-a-Judge for quality assessment, and a reasoning agent for refinement to generate a weakly supervised synthetic dataset. At inference, a ModernBERT-based classifier efficiently filters dataset mentions, reducing computational overhead while maintaining high recall.
arXiv Detail & Related papers (2025-02-14T16:16:02Z)
Making Sense of Data in the Wild: Data Analysis Automation at Scale [0.1747623282473278]
We propose a novel approach that combines intelligent agents with retrieval augmented generation to automate data analysis, dataset curation and indexing at scale. We demonstrate that our approach results in more detailed dataset descriptions, higher hit rates and greater diversity in dataset retrieval tasks.
arXiv Detail & Related papers (2025-01-27T10:04:10Z)
A Survey on Data Synthesis and Augmentation for Large Language Models [35.59526251210408]
This paper reviews and summarizes data generation techniques throughout the lifecycle of Large Language Models. We discuss the current constraints faced by these methods and investigate potential pathways for future development and research.
arXiv Detail & Related papers (2024-10-16T16:12:39Z)
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models [79.65071553905021]
We propose Data Advisor, a method for generating data that takes into account the characteristics of the desired dataset. Data Advisor monitors the status of the generated data, identifies weaknesses in the current dataset, and advises the next iteration of data generation.
arXiv Detail & Related papers (2024-10-07T17:59:58Z)
Metadata-based Data Exploration with Retrieval-Augmented Generation for Large Language Models [3.7685718201378746]
This research introduces a new architecture for data exploration which employs a form of Retrieval-Augmented Generation (RAG) to enhance metadata-based data discovery. The proposed framework offers a new method for evaluating semantic similarity among heterogeneous data sources.
arXiv Detail & Related papers (2024-10-05T17:11:37Z)
SEGAN: semi-supervised learning approach for missing data imputation [12.552699799009037]
This paper proposes a missing data completion model SEGAN based on semi-supervised learning. In the SEGAN model, the classifier enables the generator to make more full use of known data and its label information when predicting missing data values. This paper theoretically proves that the SEGAN model can learn the real known data distribution characteristics when reaching Nash equilibrium.
arXiv Detail & Related papers (2024-05-21T11:42:20Z)
infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization. infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information. In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets. We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance. Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models. In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
Dual-Teacher: Integrating Intra-domain and Inter-domain Teachers for Annotation-efficient Cardiac Segmentation [65.81546955181781]
We propose a novel semi-supervised domain adaptation approach, namely Dual-Teacher. The student model learns the knowledge of unlabeled target data and labeled source data by two teacher models. We demonstrate that our approach is able to concurrently utilize unlabeled data and cross-modality data with superior performance.
arXiv Detail & Related papers (2020-07-13T10:00:44Z)
D4RL: Datasets for Deep Data-Driven Reinforcement Learning [119.49182500071288]
We introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL. By moving beyond simple benchmark tasks and data collected by partially-trained RL agents, we reveal important and unappreciated deficiencies of existing algorithms.
arXiv Detail & Related papers (2020-04-15T17:18:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.