DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
- URL: http://arxiv.org/abs/2410.06215v1
- Date: Tue, 8 Oct 2024 17:20:37 GMT
- Title: DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
- Authors: Zaid Khan, Elias Stengel-Eskin, Jaemin Cho, Mohit Bansal
- Abstract summary: We introduce DataEnvGym, a testbed of teacher environments for data generation agents.
DataEnvGym frames data generation as a sequential decision-making task.
The agent's goal is to improve student performance.
We support 3 diverse tasks (math, code, and VQA) and test multiple students and teachers.
- Score: 62.235925602004535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The process of creating training data to teach models is currently driven by humans, who manually analyze model weaknesses and plan how to create data that improves a student model. Recent approaches using LLMs as annotators reduce human effort, but still require humans to interpret feedback from evaluations and control the LLM to produce data the student needs. Automating this labor-intensive process by creating autonomous data generation agents - or teachers - is desirable, but requires environments that can simulate the feedback-driven, iterative, closed loop of data creation. To enable rapid and scalable testing for such agents and their modules, we introduce DataEnvGym, a testbed of teacher environments for data generation agents. DataEnvGym frames data generation as a sequential decision-making task, involving an agent consisting of a data generation policy (which generates a plan for creating training data) and a data generation engine (which transforms the plan into data), inside an environment that provides student feedback. The agent's goal is to improve student performance. Students are iteratively trained and evaluated on generated data, with their feedback (in the form of errors or weak skills) being reported to the agent after each iteration. DataEnvGym includes multiple teacher environment instantiations across 3 levels of structure in the state representation and action space. More structured environments are based on inferred skills and offer more interpretability and curriculum control. We support 3 diverse tasks (math, code, and VQA) and test multiple students and teachers. Example agents in our teaching environments can iteratively improve students across tasks and settings. Moreover, we show that environments teach different skill levels and test variants of key modules, pointing to future work in improving data generation agents, engines, and feedback mechanisms.
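The closed loop described in the abstract (policy produces a plan, engine turns the plan into data, the student is trained and evaluated, feedback returns to the agent) can be illustrated with a toy sketch. This is not DataEnvGym's API: `ToyStudent`, `policy`, `engine`, and `run` are hypothetical names, the "student" is just a table of per-skill accuracies, and the fixed 0.05 gain per example is an assumption made only so the loop runs end to end.

```python
from dataclasses import dataclass


@dataclass
class ToyStudent:
    """Hypothetical stand-in for a student model: per-skill accuracy in [0, 1]."""
    skill_accuracy: dict

    def train(self, data):
        # Assumption: each training example raises its skill's accuracy by 0.05.
        for skill in data:
            current = self.skill_accuracy[skill]
            self.skill_accuracy[skill] = min(1.0, current + 0.05)

    def evaluate(self):
        # Feedback in the "weak skills" form: a copy of the accuracy per skill.
        return dict(self.skill_accuracy)


def policy(feedback, budget):
    """Data generation policy: plan how many examples to create per skill.

    This toy policy spends the whole budget on the weakest skill.
    """
    weakest = min(feedback, key=feedback.get)
    return {weakest: budget}


def engine(plan):
    """Data generation engine: turn the plan into data.

    Each toy "example" is just the name of the skill it targets.
    """
    return [skill for skill, count in plan.items() for _ in range(count)]


def run(student, iterations, budget):
    """Environment loop: evaluate, plan, generate, train, and repeat."""
    history = [student.evaluate()]
    for _ in range(iterations):
        feedback = history[-1]          # student feedback reported to the agent
        plan = policy(feedback, budget)
        data = engine(plan)
        student.train(data)
        history.append(student.evaluate())
    return history


if __name__ == "__main__":
    toy = ToyStudent({"algebra": 0.2, "geometry": 0.6})
    for step, scores in enumerate(run(toy, iterations=2, budget=4)):
        print(step, scores)
```

Splitting the agent into `policy` and `engine` mirrors the separation the abstract draws between planning what data to create and actually producing it, so either part could be swapped out independently in a sketch like this.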
Related papers
- Dynamic Skill Adaptation for Large Language Models [78.31322532135272]
We present Dynamic Skill Adaptation (DSA), an adaptive and dynamic framework to adapt novel and complex skills to Large Language Models (LLMs).
For every skill, we utilize LLMs to generate both textbook-like data, which contains detailed descriptions of skills for pre-training, and exercise-like data, which targets explicit use of the skills to solve problems for instruction-tuning.
Experiments on large language models such as LLAMA and Mistral demonstrate the effectiveness of our proposed methods in adapting math reasoning skills and social study skills.
arXiv Detail & Related papers (2024-12-26T22:04:23Z)
- DialogAgent: An Auto-engagement Agent for Code Question Answering Data Production [5.030384831047144]
We present DialogAgent, an automated tool for generating synthetic training data that closely mimics real developer interactions.
The tool significantly reduces the reliance on manual data generation, increasing efficiency by 4.8 times compared to traditional methods.
arXiv Detail & Related papers (2024-12-11T03:31:36Z)
- Evaluating Language Models as Synthetic Data Generators [74.80905172696366]
AgoraBench is a benchmark that provides standardized settings and metrics to evaluate LMs' data generation abilities.
Through synthesizing 1.26 million training instances using 6 LMs and training 99 student models, we uncover key insights about LMs' data generation capabilities.
arXiv Detail & Related papers (2024-12-04T19:20:32Z)
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- Self-Regulated Data-Free Knowledge Amalgamation for Text Classification [9.169836450935724]
We develop a lightweight student network that can learn from multiple teacher models without accessing their original training data.
To accomplish this, we propose STRATANET, a modeling framework that produces text data tailored to each teacher.
We evaluate our method on three benchmark text classification datasets with varying labels or domains.
arXiv Detail & Related papers (2024-06-16T21:13:30Z)
- GenQA: Generating Millions of Instructions from a Handful of Prompts [67.54980063851605]
Most public instruction finetuning datasets are relatively small compared to the closed source datasets used to train industry models.
In this work, we study methods for generating large instruction datasets from a single prompt.
Our dataset meets or exceeds both WizardLM and Ultrachat on both knowledge-intensive leaderboard tasks as well as conversational evaluations.
arXiv Detail & Related papers (2024-06-14T17:44:08Z)
- Data Interpreter: An LLM Agent For Data Science [43.13678782387546]
Large Language Model (LLM)-based agents have shown effectiveness across many applications.
However, their use in data science scenarios requiring solving long-term interconnected tasks, dynamic data adjustments and domain expertise remains challenging.
We present Data Interpreter, an LLM-based agent designed to automatically solve various data science problems end-to-end.
arXiv Detail & Related papers (2024-02-28T19:49:55Z)
- Intermediate Training on Question Answering Datasets Improves Generative Data Augmentation [32.83012699501051]
We improve generative data augmentation by formulating data generation as a context generation task.
We cast downstream tasks into question answering format and adapt the fine-tuned context generators to the target task domain.
We demonstrate substantial performance improvements in few-shot and zero-shot settings.
arXiv Detail & Related papers (2022-05-25T09:28:21Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.