Related papers: Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey

URL: http://arxiv.org/abs/2310.17894v3
Date: Mon, 20 May 2024 02:45:37 GMT
Title: Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey
Authors: Weixu Zhang, Yifei Wang, Yuanfeng Song, Victor Junqiu Wei, Yuxing Tian, Yiyan Qi, Jonathan H. Chan, Raymond Chi-Wing Wong, Haiqin Yang
Abstract summary: The rise of large language models (LLMs) has further advanced this field, opening new avenues for natural language processing techniques. We introduce the fundamental concepts and techniques underlying these interfaces with a particular emphasis on semantic parsing. This includes a deep dive into the influence of LLMs, highlighting their strengths, limitations, and potential for future improvements.
Score: 30.836162812277085
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The emergence of natural language processing has revolutionized the way users interact with tabular data, enabling a shift from traditional query languages and manual plotting to more intuitive, language-based interfaces. The rise of large language models (LLMs) such as ChatGPT and its successors has further advanced this field, opening new avenues for natural language processing techniques. This survey presents a comprehensive overview of natural language interfaces for tabular data querying and visualization, which allow users to interact with data using natural language queries. We introduce the fundamental concepts and techniques underlying these interfaces with a particular emphasis on semantic parsing, the key technology facilitating the translation from natural language to SQL queries or data visualization commands. We then delve into the recent advancements in Text-to-SQL and Text-to-Vis problems from the perspectives of datasets, methodologies, metrics, and system designs. This includes a deep dive into the influence of LLMs, highlighting their strengths, limitations, and potential for future improvements. Through this survey, we aim to provide a roadmap for researchers and practitioners interested in developing and applying natural language interfaces for data interaction in the era of large language models.

Related papers

Improving Large Vision-Language Models' Understanding for Field Data [62.917026891829025]
We introduce FieldLVLM, a framework designed to improve large vision-language models' understanding of field data. FieldLVLM consists of two main components: a field-aware language generation strategy and a data-compressed multimodal model tuning. Experimental results on newly proposed benchmark datasets demonstrate that FieldLVLM significantly outperforms existing methods in tasks involving scientific field data.
arXiv Detail & Related papers (2025-07-24T11:28:53Z)
Text-to-TrajVis: Enabling Trajectory Data Visualizations from Natural Language Questions [7.042074641736026]
This paper introduces the Text-to-TrajVis task, which aims to transform natural language questions into trajectory data visualizations. As this is a novel task, there is currently no relevant dataset available in the community.
arXiv Detail & Related papers (2025-04-23T02:15:52Z)
Prompt4Vis: Prompting Large Language Models with Example Mining and Schema Filtering for Tabular Data Visualization [13.425454489560376]
We introduce Prompt4Vis, a framework for generating data visualization queries from natural language. In-context learning is introduced into text-to-vis to generate data visualization queries. Prompt4Vis surpasses the state-of-the-art (SOTA) RGVisNet by approximately 35.9% and 71.3% on dev and test sets, respectively.
arXiv Detail & Related papers (2024-01-29T10:23:47Z)
Natural Language Models for Data Visualization Utilizing nvBench Dataset [6.996262696260261]
We build natural language translation models to construct simplified versions of data and visualization queries in a language called Vega Zero. In this paper, we explore the design and performance of these sequence-to-sequence, transformer-based machine learning model architectures.
arXiv Detail & Related papers (2023-10-02T00:48:01Z)
Using Large Language Models to Generate Engaging Captions for Data Visualizations [51.98253121636079]
Large language models (LLMs) use sophisticated deep learning technology to produce human-like prose. A key challenge lies in designing the most effective prompt for the LLM, a task called prompt engineering. We report on first experiments using the popular LLM GPT-3 and deliver some promising results.
arXiv Detail & Related papers (2022-12-27T23:56:57Z)
Recent Advances in Text-to-SQL: A Survey of What We Have and What We Expect [12.445150614650801]
Text-to-SQL has attracted attention from both the natural language processing and database communities. We review recent progress on text-to-SQL for datasets, methods, and evaluation. We hope that this survey can serve as quick access to existing work and motivate future research.
arXiv Detail & Related papers (2022-08-22T07:18:23Z)
Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation [70.81596088969378]
The Cross-lingual Outline-based Dialogue dataset (termed COD) enables natural language understanding, dialogue state tracking, and end-to-end dialogue modelling and evaluation in 4 diverse languages.
arXiv Detail & Related papers (2022-01-31T18:11:21Z)
Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments [54.405920619915655]
We introduce Mobile app Tasks with Iterative Feedback (MoTIF), a dataset with natural language commands for the greatest number of interactive environments to date. MoTIF is the first to contain natural language requests for interactive environments that are not satisfiable. We perform initial feasibility classification experiments and only reach an F1 score of 37.3, verifying the need for richer vision-language representations.
arXiv Detail & Related papers (2021-04-17T14:48:02Z)
"What Do You Mean by That?" A Parser-Independent Interactive Approach for Enhancing Text-to-SQL [49.85635994436742]
We include a human in the loop and present a novel parser-independent interactive approach (PIIA) that interacts with users using multi-choice questions. PIIA is capable of enhancing text-to-SQL performance with limited interaction turns, as shown by both simulation and human evaluation.
arXiv Detail & Related papers (2020-11-09T02:14:33Z)
Efficient Deployment of Conversational Natural Language Interfaces over Databases [45.52672694140881]
We propose a novel method for accelerating training dataset collection for developing natural language-to-query-language machine learning models. Our system allows one to generate conversational multi-turn data, where multiple turns define a dialogue session.
arXiv Detail & Related papers (2020-05-31T19:16:27Z)
Recent Advances in SQL Query Generation: A Survey [0.0]
With the rise of deep learning techniques, there is extensive ongoing research in designing a suitable natural language interface to relational databases. We describe models with various architectures such as convolutional neural networks, recurrent neural networks, pointer networks, reinforcement learning, etc.
arXiv Detail & Related papers (2020-05-15T17:31:29Z)
A Multi-Perspective Architecture for Semantic Code Search [58.73778219645548]
We propose a novel multi-perspective cross-lingual neural framework for code-text matching. Our experiments on the CoNaLa dataset show that our proposed model yields better performance than previous approaches.
arXiv Detail & Related papers (2020-05-06T04:46:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.