Natural Language Models for Data Visualization Utilizing nvBench Dataset
- URL: http://arxiv.org/abs/2310.00832v1
- Date: Mon, 2 Oct 2023 00:48:01 GMT
- Title: Natural Language Models for Data Visualization Utilizing nvBench Dataset
- Authors: Shuo Wang and Carlos Crespo-Quinones
- Abstract summary: We build natural language translation models to construct simplified versions of data and visualization queries in a language called Vega Zero.
In this paper, we explore the design and performance of these sequence-to-sequence, transformer-based machine learning model architectures.
- Score: 6.996262696260261
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Translation of natural language into syntactically correct commands for data
visualization is an important application of natural language models and could
be leveraged for many different tasks. A closely related effort is the task of
translating natural language into SQL queries, which in turn can be translated
into visualizations using additional information from the natural language
query (Zhong et al., 2017). Contributing to the progress in this area of
research, we built natural language translation models to construct simplified
versions of data and visualization queries in a language called Vega Zero. In
this paper, we explore the design and performance of these sequence-to-sequence,
transformer-based machine learning model architectures, using large language
models such as BERT as encoders to predict visualization commands from natural
language queries, and also apply available T5 sequence-to-sequence models to
the problem for comparison.
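To make the described architecture concrete, here is a minimal sketch (not the authors' released code) of warm-starting a sequence-to-sequence model with a BERT encoder using HuggingFace Transformers; the model names, example query, and expected output are illustrative assumptions:

```python
# Minimal sketch of the BERT-encoder seq2seq setup the abstract describes:
# a pretrained BERT encoder paired with a transformer decoder (here also
# warm-started from BERT, with cross-attention added automatically), to be
# fine-tuned on (natural language query, Vega Zero command) pairs from nvBench.
from transformers import BertTokenizerFast, EncoderDecoderModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# Hypothetical nvBench-style input: the question plus a flattened table schema.
query = "show the number of orders per month ; columns: order_id , order_date"
inputs = tokenizer(query, return_tensors="pt")

# After fine-tuning, generation would emit a Vega Zero token sequence such as
# "mark bar encoding x order_date y aggregate count ..."; untrained output is noise.
generated = model.generate(inputs.input_ids, max_length=64)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```

For the T5 comparison mentioned above, the analogous setup would swap in `T5ForConditionalGeneration.from_pretrained("t5-small")` and fine-tune on the same pairs.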
Related papers
- Prompt4Vis: Prompting Large Language Models with Example Mining and Schema Filtering for Tabular Data Visualization [13.425454489560376]
We introduce Prompt4Vis, a framework for generating data visualization queries from natural language.
In-context learning is introduced into text-to-vis for generating data visualization queries.
Prompt4Vis surpasses the state-of-the-art (SOTA) RGVisNet by approximately 35.9% and 71.3% on the dev and test sets, respectively; a minimal prompt-assembly sketch follows this entry.
arXiv Detail & Related papers (2024-01-29T10:23:47Z)
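As a rough illustration of the in-context learning setup Prompt4Vis describes, the sketch below assembles a few-shot prompt with naive word-overlap "example mining" and keyword-based "schema filtering"; the paper's actual mining and filtering modules are learned, so everything here is an assumption for illustration only:

```python
# Illustrative sketch of Prompt4Vis-style prompting (not the paper's code):
# pick the k most relevant stored exemplars, drop irrelevant schema columns,
# and assemble a few-shot prompt for an LLM.
def build_text_to_vis_prompt(question, schema_columns, exemplars, k=2):
    # Naive "example mining": rank stored exemplars by word overlap with the question.
    def overlap(ex):
        return len(set(ex["question"].lower().split()) & set(question.lower().split()))

    shots = sorted(exemplars, key=overlap, reverse=True)[:k]

    # Naive "schema filtering": keep only columns mentioned in the question.
    cols = [c for c in schema_columns if c.lower() in question.lower()] or schema_columns

    lines = ["Translate the question into a data visualization query."]
    for ex in shots:
        lines.append(f"Q: {ex['question']}\nA: {ex['query']}")
    lines.append(f"Schema: {', '.join(cols)}")
    lines.append(f"Q: {question}\nA:")
    return "\n\n".join(lines)
```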
- Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey [30.836162812277085]
The rise of large language models (LLMs) has further advanced this field, opening new avenues for natural language processing techniques.
We introduce the fundamental concepts and techniques underlying these interfaces with a particular emphasis on semantic parsing.
This includes a deep dive into the influence of LLMs, highlighting their strengths, limitations, and potential for future improvements.
arXiv Detail & Related papers (2023-10-27T05:01:20Z)
- Constructing Multilingual Code Search Dataset Using Neural Machine Translation [48.32329232202801]
We create a multilingual code search dataset in four natural and four programming languages.
Our results show that the model pre-trained on all natural and programming language data performed best in most cases.
arXiv Detail & Related papers (2023-06-27T16:42:36Z)
- Benchmarking Language Models for Code Syntax Understanding [79.11525961219591]
Pre-trained language models have demonstrated impressive performance in both natural language processing and program understanding.
In this work, we perform the first thorough benchmarking of the state-of-the-art pre-trained models for identifying the syntactic structures of programs.
Our findings point out key limitations of existing pre-training methods for programming languages, and suggest the importance of modeling code syntactic structures.
arXiv Detail & Related papers (2022-10-26T04:47:18Z)
- XRICL: Cross-lingual Retrieval-Augmented In-Context Learning for Cross-lingual Text-to-SQL Semantic Parsing [70.40401197026925]
In-context learning using large language models has recently shown surprising results for semantic parsing tasks.
This work introduces the XRICL framework, which learns to retrieve relevant English exemplars for a given query.
We also include global translation exemplars for a target language to facilitate the translation process for large language models; a rough retrieval sketch follows this entry.
arXiv Detail & Related papers (2022-10-25T01:33:49Z)
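The retrieval step XRICL describes might look roughly like the sketch below, which scores a pool of English (question, SQL) exemplars against an English translation of the target-language query; XRICL learns its retriever, so the bag-of-words cosine here is only a stand-in assumption:

```python
# Stand-in sketch of exemplar retrieval for cross-lingual in-context learning
# (not XRICL's learned retriever): score English exemplars against a rough
# English translation of the target-language question and keep the top k.
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    # Bag-of-words cosine similarity between two strings.
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def retrieve_exemplars(query_translation, english_pool, k=4):
    # english_pool: list of {"question": ..., "sql": ...} exemplar dicts.
    scored = sorted(english_pool,
                    key=lambda ex: cosine(query_translation, ex["question"]),
                    reverse=True)
    return scored[:k]
```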
- Explicitly Modeling Syntax in Language Models with Incremental Parsing and a Dynamic Oracle [88.65264818967489]
We propose a new syntax-aware language model: Syntactic Ordered Memory (SOM).
The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model.
Experiments show that SOM can achieve strong results in language modeling, incremental parsing and syntactic generalization tests.
arXiv Detail & Related papers (2020-10-21T17:39:15Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work presents a comparison of a neural model and character language models with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Exploring Software Naturalness through Neural Language Models [56.1315223210742]
The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing.
We explore this hypothesis through the use of a pre-trained transformer-based language model to perform code analysis tasks.
arXiv Detail & Related papers (2020-06-22T21:56:14Z)
- What the [MASK]? Making Sense of Language-Specific BERT Models [39.54532211263058]
This paper presents the current state of the art in language-specific BERT models.
Our aim is to provide an overview of the commonalities and differences between language-specific BERT models and mBERT.
arXiv Detail & Related papers (2020-03-05T20:42:51Z)