LAMBDA: A Large Model Based Data Agent
- URL: http://arxiv.org/abs/2407.17535v2
- Date: Sat, 14 Sep 2024 08:03:43 GMT
- Title: LAMBDA: A Large Model Based Data Agent
- Authors: Maojun Sun, Ruijian Han, Binyan Jiang, Houduo Qi, Defeng Sun, Yancheng Yuan, Jian Huang,
- Abstract summary: We introduce LArge Model Based Data Agent (LAMBDA), a novel open-source, code-free multi-agent data analysis system.
LAMBDA is designed to address data analysis challenges in complex data-driven applications.
It has the potential to enhance data analysis paradigms by seamlessly integrating human and artificial intelligence.
- Score: 7.240586338370509
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce LArge Model Based Data Agent (LAMBDA), a novel open-source, code-free multi-agent data analysis system that leverages the power of large models. LAMBDA is designed to address data analysis challenges in complex data-driven applications through innovatively designed data agents that operate iteratively and generatively using natural language. At the core of LAMBDA are two key agent roles: the programmer and the inspector, which are engineered to work together seamlessly. Specifically, the programmer generates code based on the user's instructions and domain-specific knowledge, enhanced by advanced models. Meanwhile, the inspector debugs the code when necessary. To ensure robustness and handle adverse scenarios, LAMBDA features a user interface that allows direct user intervention in the operational loop. Additionally, LAMBDA can flexibly integrate external models and algorithms through our proposed Knowledge Integration Mechanism, catering to the needs of customized data analysis. LAMBDA has demonstrated strong performance on various data analysis tasks. It has the potential to enhance data analysis paradigms by seamlessly integrating human and artificial intelligence, making it more accessible, effective, and efficient for users from diverse backgrounds. The strong performance of LAMBDA in solving data analysis problems is demonstrated using real-world data examples. Videos of several case studies are available at https://xxxlambda.github.io/lambda_webpage.
Related papers
- Capturing and Anticipating User Intents in Data Analytics via Knowledge Graphs [0.061446808540639365]
This work explores the usage of Knowledge Graphs (KG) as a basic framework for capturing a human-centered manner complex analytics.
The data stored in the generated KG can then be exploited to provide assistance (e.g., recommendations) to the users interacting with these systems.
arXiv Detail & Related papers (2024-11-01T20:45:23Z) - GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration [46.663380413396226]
GraphTeam consists of five LLM-based agents from three modules, and the agents with different specialities can collaborate to address complex problems.
Experiments on six graph analysis benchmarks demonstrate that GraphTeam achieves state-of-the-art performance with an average 25.85% improvement over the best baseline in terms of accuracy.
arXiv Detail & Related papers (2024-10-23T17:02:59Z) - Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z) - BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data [61.936320820180875]
Large language models (LLMs) have become increasingly pivotal across various domains.
BabelBench is an innovative benchmark framework that evaluates the proficiency of LLMs in managing multimodal multistructured data with code execution.
Our experimental findings on BabelBench indicate that even cutting-edge models like ChatGPT 4 exhibit substantial room for improvement.
arXiv Detail & Related papers (2024-10-01T15:11:24Z) - Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges.
We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow.
We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z) - AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning [93.96463520716759]
Large language model (LLM) agents have demonstrated impressive capabilities in utilizing external tools and knowledge to boost accuracy and hallucinations.
Here, we introduce AvaTaR, a novel and automated framework that optimize an LLM agent to effectively leverage provided tools, improving performance on a given task.
arXiv Detail & Related papers (2024-06-17T04:20:02Z) - DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation [83.30006900263744]
Data analysis is a crucial analytical process to generate in-depth studies and conclusive insights.
We propose to automatically generate high-quality answer annotations leveraging the code-generation capabilities of LLMs.
Our DACO-RL algorithm is evaluated by human annotators to produce more helpful answers than SFT model in 57.72% cases.
arXiv Detail & Related papers (2024-03-04T22:47:58Z) - Collaborative business intelligence virtual assistant [1.9953434933575993]
This study focuses on the applications of data mining within distributed virtual teams through the interaction of users and a CBI Virtual Assistant.
The proposed virtual assistant for CBI endeavors to enhance data exploration accessibility for a wider range of users and streamline the time and effort required for data analysis.
arXiv Detail & Related papers (2023-12-20T05:34:12Z) - Analytical Engines With Context-Rich Processing: Towards Efficient
Next-Generation Analytics [12.317930859033149]
We envision an analytical engine co-optimized with components that enable context-rich analysis.
We aim for a holistic pipeline cost- and rule-based optimization across relational and model-based operators.
arXiv Detail & Related papers (2022-12-14T21:46:33Z) - A Novel Micro-service Based Platform for Composition, Deployment and
Execution of BDA Applications [0.0]
ALIDA aims to achieve a unified platform that allows both BDA application developers and data analysts to interact with it.
Developers will be able to register new BDA applications through the exposed API and/or through the web user interface.
Data analysts will be able to use the BDA applications provided to create batch/stream through a dashboard user interface to manipulate and subsequently visualize results from one or more sources.
arXiv Detail & Related papers (2022-02-06T20:36:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.