Related papers: Conversational Data Exploration: A Game-Changer for Designing Data Science Pipelines

Related papers

Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset [47.98539809308384]
We analyze the Asta Interaction dataset, a large-scale resource comprising over 200,000 user queries and interaction logs.<n>We characterize query patterns, engagement behaviors, and how usage evolves with experience.<n>We release the anonymized dataset and analysis with a new query taxonomy to inform future designs of real-world AI research assistants.
arXiv Detail & Related papers (2026-02-26T18:40:28Z)
What's the next frontier for Data-centric AI? Data Savvy Agents [71.76058707995398]
We argue that data-savvy capabilities should be a top priority in the design of agentic systems.<n>We propose four key capabilities to realize this vision: Proactive data acquisition, Sophisticated data processing, Interactive test data synthesis, and Continual adaptation.
arXiv Detail & Related papers (2025-11-02T17:09:29Z)
Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges. We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow. We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
Breaking Resource Barriers in Speech Emotion Recognition via Data Distillation [64.36799373890916]
Speech emotion recognition (SER) plays a crucial role in human-computer interaction.<n>The emergence of edge devices in the Internet of Things presents challenges in constructing intricate deep learning models.<n>We propose a data distillation framework to facilitate efficient development of SER models in IoT applications.
arXiv Detail & Related papers (2024-06-21T13:10:46Z)
A Survey on Recent Advances in Conversational Data Generation [14.237954885530396]
We offer a systematic and comprehensive review of multi-turn conversational data generation. We focus on three types of dialogue systems: open domain, task-oriented, and information-seeking. We examine the evaluation metrics and methods for assessing synthetic conversational data.
arXiv Detail & Related papers (2024-05-12T10:11:12Z)
Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future [59.78608958395464]
We build a Social AI Data Infrastructure, which consists of a comprehensive social AI taxonomy and a data library of 480 NLP datasets. Our infrastructure allows us to analyze existing dataset efforts, and also evaluate language models' performance in different social intelligence aspects. We show there is a need for multifaceted datasets, increased diversity in language and culture, more long-tailed social situations, and more interactive data in future social intelligence data efforts.
arXiv Detail & Related papers (2024-02-28T00:22:42Z)
Capture the Flag: Uncovering Data Insights with Large Language Models [90.47038584812925]
This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data. We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset.
arXiv Detail & Related papers (2023-12-21T14:20:06Z)
Collaborative business intelligence virtual assistant [1.9953434933575993]
This study focuses on the applications of data mining within distributed virtual teams through the interaction of users and a CBI Virtual Assistant. The proposed virtual assistant for CBI endeavors to enhance data exploration accessibility for a wider range of users and streamline the time and effort required for data analysis.
arXiv Detail & Related papers (2023-12-20T05:34:12Z)
AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets [56.052803235932686]
We propose a novel automatic dataset synthesis approach that can generate both large-scale and high-quality recommendation dialogues. In doing so, we exploit: (i) rich personalized user profiles from traditional recommendation datasets, (ii) rich external knowledge from knowledge graphs, and (iii) the conversation ability contained in human-to-human conversational recommendation datasets.
arXiv Detail & Related papers (2023-06-16T05:27:14Z)
On the Potential of Artificial Intelligence Chatbots for Data Exploration of Federated Bioinformatics Knowledge Graphs [0.0]
We present work in progress on the role of artificial intelligence (AI) chatbots, such as ChatGPT, in facilitating data access to federated knowledge graphs. In particular, we provide examples from the field of bioinformatics, to illustrate the potential use of Conversational AI to describe datasets, as well as generate and explain (federated) queries across datasets for the benefit of domain experts.
arXiv Detail & Related papers (2023-04-20T16:16:40Z)
Demonstration of InsightPilot: An LLM-Empowered Automated Data Exploration System [48.62158108517576]
We introduce InsightPilot, an automated data exploration system designed to simplify the data exploration process. InsightPilot automatically selects appropriate analysis intents, such as understanding, summarizing, and explaining. In brief, an IQuery is an abstraction and automation of data analysis operations, which mimics the approach of data analysts.
arXiv Detail & Related papers (2023-04-02T07:27:49Z)
A Vision for Semantically Enriched Data Science [19.604667287258724]
Key areas such as utilizing domain knowledge and data semantics are areas where we have seen little automation. We envision how leveraging "semantic" understanding and reasoning on data in combination with novel tools for data science automation can help with consistent and explainable data augmentation and transformation.
arXiv Detail & Related papers (2023-03-02T16:03:12Z)
DeepShovel: An Online Collaborative Platform for Data Extraction in Geoscience Literature with AI Assistance [48.55345030503826]
Geoscientists need to read a huge amount of literature to locate, extract, and aggregate relevant results and data. DeepShovel is a publicly-available AI-assisted data extraction system to support their needs. A follow-up user evaluation with 14 researchers suggested DeepShovel improved users' efficiency of data extraction for building scientific databases.
arXiv Detail & Related papers (2022-02-21T12:18:08Z)
INODE: Building an End-to-End Data Exploration System in Practice [Extended Vision] [30.411996388471817]
INODE is an end-to-end data exploration system. We demonstrate it in three significant use cases in the fields of Cancer Biomarker Reearch, Research and Innovation Policy Making, and Astrophysics.
arXiv Detail & Related papers (2021-04-09T05:04:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.