ChatGPT as your Personal Data Scientist
- URL: http://arxiv.org/abs/2305.13657v1
- Date: Tue, 23 May 2023 04:00:16 GMT
- Title: ChatGPT as your Personal Data Scientist
- Authors: Md Mahadi Hassan, Alex Knipper, Shubhra Kanti Karmaker Santu
- Abstract summary: This paper introduces a ChatGPT-based conversational data-science framework to act as a "personal data scientist".
Our model pivots around four dialogue states: Data Visualization, Task Formulation, Prediction Engineering, and Result Summary and Recommendation.
In summary, we developed an end-to-end system that not only proves the viability of the novel concept of conversational data science but also underscores the potency of LLMs in solving complex tasks.
- Score: 0.9689893038619583
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rise of big data has amplified the need for efficient, user-friendly
automated machine learning (AutoML) tools. However, the intricacy of
understanding domain-specific data and defining prediction tasks necessitates
human intervention, making the process time-consuming and preventing full
automation. Instead, envision an intelligent agent capable of assisting users
in conducting AutoML tasks through intuitive, natural conversations without
requiring in-depth knowledge of the underlying machine learning (ML) processes.
This agent's key challenge is to accurately comprehend the user's prediction
goals and, consequently, formulate precise ML tasks, adjust data sets and model
parameters accordingly, and articulate results effectively. In this paper, we
take a pioneering step towards this ambitious goal by introducing a
ChatGPT-based conversational data-science framework to act as a "personal data
scientist". Precisely, we utilize Large Language Models (ChatGPT) to build a
natural interface between the users and the ML models (Scikit-Learn), which in
turn, allows us to approach this ambitious problem with a realistic solution.
Our model pivots around four dialogue states: Data Visualization, Task
Formulation, Prediction Engineering, and Result Summary and Recommendation.
Each state marks a unique conversation phase, impacting the overall user-system
interaction. Multiple LLM instances, serving as "micro-agents", ensure a
cohesive conversation flow, granting us granular control over the
conversation's progression. In summary, we developed an end-to-end system that
not only proves the viability of the novel concept of conversational data
science but also underscores the potency of LLMs in solving complex tasks.
Interestingly, its development spotlighted several critical weaknesses in the
current LLMs (ChatGPT) and highlighted substantial opportunities for
improvement.
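To make the architecture described in the abstract concrete, the sketch below shows how a four-state dialogue loop with per-state "micro-agents" might bridge a user conversation and Scikit-Learn. This is a minimal illustration, not the authors' released code: `call_llm`, `ConversationSession`, and `step` are hypothetical names standing in for the paper's ChatGPT micro-agents, while the pandas and scikit-learn calls are real.

```python
# Minimal sketch of the paper's four dialogue states, assuming a hypothetical
# `call_llm` stand-in for the ChatGPT micro-agents described in the abstract.
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


class DialogueState(Enum):
    DATA_VISUALIZATION = auto()
    TASK_FORMULATION = auto()
    PREDICTION_ENGINEERING = auto()
    RESULT_SUMMARY = auto()


def call_llm(role: str, prompt: str) -> str:
    """Hypothetical micro-agent call; the paper assigns dialogue phases to
    separate LLM instances. Replace with a real ChatGPT client."""
    return f"[{role}] {prompt[:80]}"


@dataclass
class ConversationSession:
    df: pd.DataFrame
    state: DialogueState = DialogueState.DATA_VISUALIZATION
    target: Optional[str] = None


def step(session: ConversationSession, user_msg: str) -> str:
    """Advance the conversation one turn, routing to the current state's micro-agent."""
    if session.state is DialogueState.DATA_VISUALIZATION:
        session.state = DialogueState.TASK_FORMULATION
        return call_llm("viz-agent",
                        f"Summarize columns {list(session.df.columns)} for: {user_msg}")
    if session.state is DialogueState.TASK_FORMULATION:
        # The full system infers the prediction target from free-form dialogue;
        # here we simply assume the user names an existing column.
        session.target = user_msg.strip()
        session.state = DialogueState.PREDICTION_ENGINEERING
        return call_llm("task-agent", f"Formulate a task predicting '{session.target}'")
    if session.state is DialogueState.PREDICTION_ENGINEERING:
        X = session.df.drop(columns=[session.target])
        y = session.df[session.target]
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
        model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
        acc = accuracy_score(y_te, model.predict(X_te))
        session.state = DialogueState.RESULT_SUMMARY
        return call_llm("summary-agent", f"Explain an accuracy of {acc:.2f} to a non-expert")
    return call_llm("recommendation-agent", user_msg)
```

Routing each turn through a state-specific agent reflects the granular control over the conversation's progression that the abstract emphasizes; a production system would replace `call_llm` with real ChatGPT instances and add data cleaning and feature engineering before model fitting.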
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLMs) to lessen this burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk [11.706292228586332]
Large language models (LLMs) are powerful dialogue agents, but specializing them towards fulfilling a specific function can be challenging.
We propose a more effective method for data collection through LLMs engaging in a conversation in various roles.
This approach generates training data via LLM "self-talk" that can be refined and used for supervised fine-tuning.
arXiv Detail & Related papers (2024-01-10T09:49:10Z) - Interactive Planning Using Large Language Models for Partially
Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z) - TaskBench: Benchmarking Large Language Models for Task Automation [82.2932794189585]
We introduce TaskBench, a framework to evaluate the capability of large language models (LLMs) in task automation.
Specifically, task decomposition, tool selection, and parameter prediction are assessed.
Our approach combines automated construction with rigorous human verification, ensuring high consistency with human evaluation.
arXiv Detail & Related papers (2023-11-30T18:02:44Z) - Adapting LLMs for Efficient, Personalized Information Retrieval: Methods
and Implications [0.7832189413179361]
Large Language Models (LLMs) excel in comprehending and generating human-like text.
This paper explores strategies for integrating LLMs with Information Retrieval (IR) systems.
arXiv Detail & Related papers (2023-11-21T02:01:01Z) - AutoML-GPT: Large Language Model for AutoML [5.9145212342776805]
We have established a framework called AutoML-GPT that integrates a comprehensive set of tools and libraries.
Through a conversational interface, users can specify their requirements, constraints, and evaluation metrics.
We have demonstrated that AutoML-GPT significantly reduces the time and effort required for machine learning tasks.
arXiv Detail & Related papers (2023-09-03T09:39:49Z) - AutoML-GPT: Automatic Machine Learning with GPT [74.30699827690596]
We propose developing task-oriented prompts and automatically utilizing large language models (LLMs) to automate the training pipeline.
We present AutoML-GPT, which employs GPT as the bridge to diverse AI models and dynamically trains models with optimized hyperparameters.
This approach achieves remarkable results in computer vision, natural language processing, and other challenging areas.
arXiv Detail & Related papers (2023-05-04T02:09:43Z) - OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge
Collaborative AutoML System [85.8338446357469]
We introduce OmniForce, a human-centered AutoML system that yields both human-assisted ML and ML-assisted human techniques.
We show how OmniForce can put an AutoML system into practice and build adaptive AI in open-environment scenarios.
arXiv Detail & Related papers (2023-03-01T13:35:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.