Prompt4Vis: Prompting Large Language Models with Example Mining and
Schema Filtering for Tabular Data Visualization
- URL: http://arxiv.org/abs/2402.07909v1
- Date: Mon, 29 Jan 2024 10:23:47 GMT
- Title: Prompt4Vis: Prompting Large Language Models with Example Mining and
Schema Filtering for Tabular Data Visualization
- Authors: Shuaimin Li, Xuanang Chen, Yuanfeng Song, Yunze Song, Chen Zhang
- Abstract summary: We introduce Prompt4Vis, a framework for generating data visualization queries from natural language.
In-context learning is introduced into text-to-vis to generate data visualization queries.
Prompt4Vis surpasses the state-of-the-art (SOTA) RGVisNet by approximately 35.9% and 71.3% on dev and test sets, respectively.
- Score: 13.425454489560376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data visualization (DV) systems are increasingly recognized for their
profound capability to uncover insights from vast datasets, gaining attention
across both industry and academia. Crafting data queries is an essential
process within certain declarative visualization languages (DVLs, e.g.,
Vega-Lite and ECharts). The evolution of natural language processing (NLP)
technologies has streamlined the use of natural language interfaces to
visualize tabular data, offering a more accessible and intuitive user
experience. However, current methods for converting natural language questions
into data visualization queries, such as Seq2Vis, ncNet, and RGVisNet, despite
utilizing complex neural network architectures, still fall short of
expectations and leave considerable room for improvement.
Large language models (LLMs), such as ChatGPT and GPT-4, have established new
benchmarks in a variety of NLP tasks, fundamentally altering the landscape of
the field. Inspired by these advancements, we introduce a novel framework,
Prompt4Vis, leveraging LLMs and in-context learning to enhance the performance
of generating data visualizations from natural language. Prompt4Vis comprises
two key components: (1) a multi-objective example mining module, designed to
identify genuinely effective examples that strengthen the LLM's in-context
learning for text-to-vis; and (2) a schema filtering module, which simplifies
the database schema. Extensive experiments using
5-fold cross-validation on the NVBench dataset demonstrate the superiority of
Prompt4Vis, which notably surpasses the state-of-the-art (SOTA) RGVisNet by
approximately 35.9% and 71.3% on dev and test sets, respectively. To the best
of our knowledge, Prompt4Vis is the first work that introduces in-context
learning into text-to-vis for generating data visualization queries.
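To make the two components concrete, below is a minimal Python sketch of how such a prompt could be assembled: a toy multi-objective score stands in for the example mining module, and a keyword-based column filter stands in for the schema filtering module. The names (Example, build_prompt, and the similarity and utility callables), as well as the prompt format, are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Example:
    question: str   # natural-language question
    schema: str     # schema text shown to the model
    dv_query: str   # target data visualization query (e.g. a Vega-Zero-style string)

def score_example(ex: Example, question: str,
                  similarity: Callable[[str, str], float],
                  utility: Callable[[Example], float],
                  alpha: float = 0.5) -> float:
    # Toy multi-objective score: blend similarity to the new question with a
    # utility/influence estimate of the demonstration (both supplied by the caller).
    return alpha * similarity(ex.question, question) + (1 - alpha) * utility(ex)

def filter_schema(schema: Dict[str, List[str]], question: str) -> str:
    # Naive schema filtering: keep only columns mentioned in the question,
    # falling back to the full column list if nothing matches.
    q = question.lower()
    lines = []
    for table, cols in schema.items():
        kept = [c for c in cols if c.lower() in q] or cols
        lines.append(f"table {table}({', '.join(kept)})")
    return "\n".join(lines)

def build_prompt(question: str, schema: Dict[str, List[str]], pool: List[Example],
                 similarity: Callable[[str, str], float],
                 utility: Callable[[Example], float], k: int = 3) -> str:
    # Assemble the in-context-learning prompt: the k highest-scoring
    # demonstrations, then the filtered schema and the new question.
    demos = sorted(pool, key=lambda e: score_example(e, question, similarity, utility),
                   reverse=True)[:k]
    parts = ["Translate the question into a data visualization query."]
    for ex in demos:
        parts.append(f"Schema:\n{ex.schema}\nQuestion: {ex.question}\nDV query: {ex.dv_query}")
    parts.append(f"Schema:\n{filter_schema(schema, question)}\nQuestion: {question}\nDV query:")
    return "\n\n".join(parts)
```

The returned prompt string would then be sent to an LLM such as GPT-4, whose completion is parsed as the data visualization query; in the full framework, the similarity and utility signals would come from the paper's learned, multi-objective example-mining criteria rather than user-supplied callables.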
Related papers
- Large Language Models Understand Layout [6.732578061359833]
Large language models (LLMs) demonstrate extraordinary abilities in a wide range of natural language processing (NLP) tasks.
We show that, beyond text understanding capability, LLMs are capable of processing text layouts denoted by spatial markers.
We show that layout understanding ability is beneficial for building efficient visual question-answering (VQA) systems.
arXiv Detail & Related papers (2024-07-08T09:03:12Z)
- VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models [76.94378391979228]
We introduce a new, more demanding task known as Interleaved Image-Text Comprehension (IITC).
This task challenges models to discern and disregard superfluous elements in both images and text to accurately answer questions.
In support of this task, we further craft a new VEGA dataset, tailored for the IITC task on scientific content, and devise a subtask, Image-Text Association (ITA).
arXiv Detail & Related papers (2024-06-14T17:59:40Z)
- Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study [41.84915013818794]
The Natural Language to Visualization (NL2Vis) task aims to transform natural-language descriptions into visual representations for a grounded table.
Many deep learning-based approaches have been developed for NL2Vis, but challenges persist in visualizing data sourced from unseen databases or spanning multiple tables.
Taking inspiration from the remarkable generation capabilities of Large Language Models (LLMs), this paper conducts an empirical study to evaluate their potential in generating visualizations.
arXiv Detail & Related papers (2024-04-26T03:25:35Z)
- Constructing Multilingual Visual-Text Datasets Revealing Visual Multilingual Ability of Vision Language Models [25.088717058818528]
We introduce nine vision-and-language (VL) tasks and construct multilingual visual-text datasets in four languages: English, Japanese, Swahili, and Urdu.
Our work is the first to conduct such analyses in Swahili and Urdu. It also introduces rationales into VL analysis, which play a vital role in the evaluation.
arXiv Detail & Related papers (2024-03-29T10:53:07Z) - Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey [30.836162812277085]
The rise of large language models (LLMs) has further advanced this field, opening new avenues for natural language processing techniques.
We introduce the fundamental concepts and techniques underlying these interfaces with a particular emphasis on semantic parsing.
This includes a deep dive into the influence of LLMs, highlighting their strengths, limitations, and potential for future improvements.
arXiv Detail & Related papers (2023-10-27T05:01:20Z) - Natural Language Models for Data Visualization Utilizing nvBench Dataset [6.996262696260261]
We build natural language translation models to construct simplified versions of data and visualization queries in a language called Vega Zero.
In this paper, we explore the design and performance of these sequence-to-sequence, transformer-based machine learning model architectures.
arXiv Detail & Related papers (2023-10-02T00:48:01Z) - Harnessing Explanations: LLM-to-LM Interpreter for Enhanced
Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is our use of explanations as features, which can be used to boost GNN performance on downstream tasks.
Our method achieves state-of-the-art results on well-established TAG datasets.
Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
arXiv Detail & Related papers (2023-05-31T03:18:03Z) - Visual Instruction Tuning [79.70923292053097]
We present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data.
By instruction tuning on such generated data, we introduce LLaVA: Large Language and Vision Assistant.
When fine-tuned on Science QA, the synergy of LLaVA and GPT-4 achieves a new state-of-the-art accuracy of 92.53%.
arXiv Detail & Related papers (2023-04-17T17:59:25Z) - Using Large Language Models to Generate Engaging Captions for Data
Visualizations [51.98253121636079]
Large language models (LLMs) use sophisticated deep learning technology to produce human-like prose.
A key challenge lies in designing the most effective prompt for the LLM, a task called prompt engineering.
We report on first experiments using the popular LLM GPT-3 and deliver some promising results.
arXiv Detail & Related papers (2022-12-27T23:56:57Z) - Explaining Patterns in Data with Language Models via Interpretable
Autoprompting [143.4162028260874]
We introduce interpretable autoprompting (iPrompt), an algorithm that generates a natural-language string explaining the data.
iPrompt can yield meaningful insights by accurately recovering ground-truth dataset descriptions.
Experiments with an fMRI dataset show the potential for iPrompt to aid in scientific discovery.
arXiv Detail & Related papers (2022-10-04T18:32:14Z) - XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems
to Improve Language Understanding [73.24847320536813]
This study explores distilling visual information from pretrained multimodal transformers to pretrained language encoders.
Our framework is inspired by the success of cross-modal encoders in vision-language tasks, while the learning objective is altered to cater to the language-heavy characteristics of NLU.
arXiv Detail & Related papers (2022-04-15T03:44:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.