RD2Bench: Toward Data-Centric Automatic R&D
- URL: http://arxiv.org/abs/2404.11276v1
- Date: Wed, 17 Apr 2024 11:33:21 GMT
- Title: RD2Bench: Toward Data-Centric Automatic R&D
- Authors: Haotian Chen, Xinjie Shen, Zeqi Ye, Xiao Yang, Xu Yang, Weiqing Liu, Jiang Bian
- Abstract summary: Researchers often seek potential research directions by reading and then verify them through experiments.
The data-driven black-box deep learning method demonstrates its effectiveness in a wide range of real-world scenarios.
We propose a Real-world Data-centric automatic R&D Benchmark, namely RD2Bench.
- Score: 18.570307541212053
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The progress of humanity is driven by successful discoveries accompanied by countless failed experiments. Researchers often seek potential research directions by reading and then verify them through experiments. This process imposes a significant burden on researchers. In the past decade, data-driven black-box deep learning methods have demonstrated their effectiveness in a wide range of real-world scenarios, which exacerbates the experimental burden on researchers and thus leaves potential successful discoveries veiled. Therefore, automating such a research and development (R&D) process is an urgent need. In this paper, we make the first effort to formalize the goal by proposing a Real-world Data-centric automatic R&D Benchmark, namely RD2Bench. RD2Bench benchmarks all the operations in data-centric automatic R&D (D-CARD) as a whole to steer future work directly toward our goal. We focus on evaluating the interaction and synergistic effects of various model capabilities and on helping to select well-performing, trustworthy models. Although RD2Bench is very challenging for the state-of-the-art (SOTA) large language model (LLM) GPT-4, indicating ample research opportunities and the need for more research effort, LLMs show promising potential to bring more significant development to D-CARD: they are able to implement some simple methods without adopting any additional techniques. We appeal to future work to consider developing techniques for tackling automatic R&D, thus opening the opportunity for a potentially revolutionary upgrade to human productivity.
Related papers
- Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective [111.58315434849047]
The robustness of neural information retrieval (IR) models has garnered significant attention.
We view the robustness of IR to be a multifaceted concept, emphasizing its necessity against adversarial attacks, out-of-distribution (OOD) scenarios and performance variance.
We provide an in-depth discussion of existing methods, datasets, and evaluation metrics, shedding light on challenges and future directions in the era of large language models.
arXiv Detail & Related papers (2024-07-09T16:07:01Z) - DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z) - Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning [53.3760591018817]
We propose a new benchmarking environment for aquatic navigation using recent advances in the integration between game engines and Deep Reinforcement Learning.
Specifically, we focus on PPO, one of the most widely accepted algorithms, and we propose advanced training techniques.
Our empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results.
arXiv Detail & Related papers (2024-05-30T23:20:23Z) - Autonomous LLM-driven research from data to human-verifiable research papers [0.0]
We build an automation platform that guides interaction through a complete stepwise process.
In a mode provided with annotated data alone, datapaper raised hypotheses, designed plans, wrote and interpreted analysis codes, generated and interpreted results.
We demonstrate potential for AI-driven acceleration of scientific discovery while enhancing traceability, transparency and verifiability.
arXiv Detail & Related papers (2024-04-24T23:15:49Z) - Data-driven Discovery with Large Generative Models [47.324203863823335]
This position paper urges the Machine Learning (ML) community to exploit the capabilities of large generative models (LGMs).
We demonstrate how LGMs fulfill several desiderata for an ideal data-driven discovery system.
We advocate for fail-proof tool integration, along with active user moderation through feedback mechanisms.
arXiv Detail & Related papers (2024-02-21T08:26:43Z) - SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent robustness recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z) - Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z) - RPT: Toward Transferable Model on Heterogeneous Researcher Data via Pre-Training [19.987304448524043]
We propose a multi-task self-supervised learning-based researcher data pre-training model named RPT.
We divide the researchers' data into semantic document sets and a community graph.
We propose three self-supervised learning objectives to train the whole model.
arXiv Detail & Related papers (2021-10-08T03:42:09Z) - A Survey of Knowledge Tracing: Models, Variants, and Applications [70.69281873057619]
Knowledge Tracing is one of the fundamental tasks for student behavioral data analysis.
We present three types of fundamental KT models with distinct technical routes.
We discuss potential directions for future research in this rapidly growing field.
arXiv Detail & Related papers (2021-05-06T13:05:55Z) - Distributed Deep Reinforcement Learning: An Overview [0.0]
In this article, we provide a survey of the role of the distributed approaches in DRL.
We overview the state of the field, by studying the key research works that have a significant impact on how we can use distributed methods in DRL.
Also, we evaluate these methods on different tasks and compare their performance with each other and with single actor and learner agents.
arXiv Detail & Related papers (2020-11-22T13:24:35Z) - ACDER: Augmented Curiosity-Driven Experience Replay [16.755555854030412]
We propose a novel method called Augmented Curiosity-Driven Experience Replay (ACDER).
ACDER uses a new goal-oriented curiosity-driven exploration to encourage the agent to pursue novel and task-relevant states more purposefully.
Experiments are conducted on four challenging robotic manipulation tasks with binary rewards, including Reach, Push, Pick&Place and Multi-step Push.
arXiv Detail & Related papers (2020-11-16T15:27:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.