Large Language Model-based Data Science Agent: A Survey
- URL: http://arxiv.org/abs/2508.02744v1
- Date: Sat, 02 Aug 2025 17:33:18 GMT
- Title: Large Language Model-based Data Science Agent: A Survey
- Authors: Peiran Wang, Yaoning Yu, Ke Chen, Xianyang Zhan, Haohan Wang,
- Abstract summary: This survey presents a comprehensive analysis of LLM-based agents designed for data science tasks.<n>From the agent perspective, we discuss the key design principles, covering agent roles, execution, knowledge, and reflection methods.<n>From the data science perspective, we identify key processes for LLM-based agents, including data preprocessing, model development, evaluation, visualization, etc.
- Score: 14.31246443624872
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid advancement of Large Language Models (LLMs) has driven novel applications across diverse domains, with LLM-based agents emerging as a crucial area of exploration. This survey presents a comprehensive analysis of LLM-based agents designed for data science tasks, summarizing insights from recent studies. From the agent perspective, we discuss the key design principles, covering agent roles, execution, knowledge, and reflection methods. From the data science perspective, we identify key processes for LLM-based agents, including data preprocessing, model development, evaluation, visualization, etc. Our work offers two key contributions: (1) a comprehensive review of recent developments in applying LLMbased agents to data science tasks; (2) a dual-perspective framework that connects general agent design principles with the practical workflows in data science.
Related papers
- A Survey of AIOps in the Era of Large Language Models [60.59720351854515]
We analyzed 183 research papers published between January 2020 and December 2024 to answer four key research questions (RQs)<n>We discuss the state-of-the-art advancements and trends, identify gaps in existing research, and propose promising directions for future exploration.
arXiv Detail & Related papers (2025-06-23T02:40:16Z) - Large Language Model Agent: A Survey on Methodology, Applications and Challenges [88.3032929492409]
Large Language Model (LLM) agents, with goal-driven behaviors and dynamic adaptation capabilities, potentially represent a critical pathway toward artificial general intelligence.<n>This survey systematically deconstructs LLM agent systems through a methodology-centered taxonomy.<n>Our work provides a unified architectural perspective, examining how agents are constructed, how they collaborate, and how they evolve over time.
arXiv Detail & Related papers (2025-03-27T12:50:17Z) - Large Language Models (LLMs) for Source Code Analysis: applications, models and datasets [3.8740749765622167]
Large language models (LLMs) and transformer-based architectures are increasingly utilized for source code analysis.<n>This paper explores the role of LLMs for different code analysis tasks, focusing on three key aspects.
arXiv Detail & Related papers (2025-03-21T19:29:50Z) - The Evolution of LLM Adoption in Industry Data Curation Practices [20.143297690624298]
This paper explores the evolution of large language models (LLMs) among practitioners at a large technology company.<n>Through a series of surveys, interviews, and user studies, we provide a timely snapshot of how organizations are navigating a pivotal moment in LLM evolution.
arXiv Detail & Related papers (2024-12-20T17:34:16Z) - A Survey on Large Language Model-based Agents for Statistics and Data Science [7.240586338370509]
Data science agents powered by Large Language Models (LLMs) have shown significant potential to transform the traditional data analysis paradigm.<n>This survey provides an overview of the evolution, capabilities, and applications of LLM-based data agents.
arXiv Detail & Related papers (2024-12-18T15:03:26Z) - MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs [97.94579295913606]
Multimodal Large Language Models (MLLMs) have garnered increased attention from both industry and academia.<n>In the development process, evaluation is critical since it provides intuitive feedback and guidance on improving models.<n>This work aims to offer researchers an easy grasp of how to effectively evaluate MLLMs according to different needs and to inspire better evaluation methods.
arXiv Detail & Related papers (2024-11-22T18:59:54Z) - GUI Agents with Foundation Models: A Comprehensive Survey [91.97447457550703]
This survey consolidates recent research on (M)LLM-based GUI agents.<n>We identify key challenges and propose future research directions.<n>We hope this survey will inspire further advancements in the field of (M)LLM-based GUI agents.
arXiv Detail & Related papers (2024-11-07T17:28:10Z) - A Survey on Multimodal Benchmarks: In the Era of Large AI Models [13.299775710527962]
Multimodal Large Language Models (MLLMs) have brought substantial advancements in artificial intelligence.
This survey systematically reviews 211 benchmarks that assess MLLMs across four core domains: understanding, reasoning, generation, and application.
arXiv Detail & Related papers (2024-09-21T15:22:26Z) - The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective [53.48484062444108]
We find that the development of models and data is not two separate paths but rather interconnected.
On the one hand, vaster and higher-quality data contribute to better performance of MLLMs; on the other hand, MLLMs can facilitate the development of data.
To promote the data-model co-development for MLLM community, we systematically review existing works related to MLLMs from the data-model co-development perspective.
arXiv Detail & Related papers (2024-07-11T15:08:11Z) - A Survey on the Memory Mechanism of Large Language Model based Agents [66.4963345269611]
Large language model (LLM) based agents have recently attracted much attention from the research and industry communities.
LLM-based agents are featured in their self-evolving capability, which is the basis for solving real-world problems.
The key component to support agent-environment interactions is the memory of the agents.
arXiv Detail & Related papers (2024-04-21T01:49:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.