Mirror: A Natural Language Interface for Data Querying, Summarization,
and Visualization
- URL: http://arxiv.org/abs/2303.08697v1
- Date: Wed, 15 Mar 2023 15:31:51 GMT
- Title: Mirror: A Natural Language Interface for Data Querying, Summarization,
and Visualization
- Authors: Canwen Xu and Julian McAuley and Penghan Wang
- Abstract summary: Mirror is an open-source platform for data exploration and analysis powered by large language models.
Mirror offers an intuitive natural language interface for querying databases.
Mirror also generates visualizations to facilitate understanding of the data.
- Score: 11.807687905883895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Mirror, an open-source platform for data exploration and analysis
powered by large language models. Mirror offers an intuitive natural language
interface for querying databases, and automatically generates executable SQL
commands to retrieve relevant data and summarize it in natural language. In
addition, users can preview and manually edit the generated SQL commands to
ensure the accuracy of their queries. Mirror also generates visualizations to
facilitate understanding of the data. Designed with flexibility and human input
in mind, Mirror is suitable for both experienced data analysts and
non-technical professionals looking to gain insights from their data.
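The workflow the abstract describes (question to SQL, optional manual edit, execution, natural-language summary) can be sketched roughly as follows. This is a minimal illustration, not Mirror's implementation. `fake_text_to_sql`, `run_query`, `summarize`, and the `sales` table are hypothetical names introduced here, and the language-model call is replaced by a canned query so the example runs offline with only the standard-library `sqlite3` module.

```python
import sqlite3
from typing import Callable, Optional


def fake_text_to_sql(question: str) -> str:
    """Hypothetical stand-in for a language-model call; returns a canned query."""
    return (
        "SELECT region, SUM(amount) AS total "
        "FROM sales GROUP BY region ORDER BY region"
    )


def run_query(
    conn: sqlite3.Connection,
    question: str,
    text_to_sql: Callable[[str], str],
    edit_sql: Optional[Callable[[str], str]] = None,
):
    """Generate SQL for a question, let the user edit it, then execute it."""
    sql = text_to_sql(question)
    if edit_sql is not None:
        # The preview-and-edit step: the caller may rewrite the generated SQL.
        sql = edit_sql(sql)
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    return sql, columns, rows


def summarize(columns, rows) -> str:
    """Template-based summary (assumption: a real system would use a model here)."""
    parts = [
        ", ".join(f"{col}={val}" for col, val in zip(columns, row))
        for row in rows
    ]
    return f"{len(rows)} row(s): " + "; ".join(parts)


def demo_connection() -> sqlite3.Connection:
    """Build a small in-memory database with made-up example data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10), ("south", 5), ("north", 7)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    sql, columns, rows = run_query(
        conn, "What are total sales per region?", fake_text_to_sql
    )
    print(sql)
    print(summarize(columns, rows))
```

Keeping the edit step as an optional callback mirrors the "preview and manually edit" idea: the generated SQL is always available to inspect before anything is executed.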
Related papers
- AMBROSIA: A Benchmark for Parsing Ambiguous Questions into Database Queries [56.82807063333088]
We introduce a new benchmark, AMBROSIA, which we hope will inform and inspire the development of text-to-SQL parsers.
Our dataset contains questions showcasing three different types of ambiguity (scope ambiguity, attachment ambiguity, and vagueness).
In each case, the ambiguity persists even when the database context is provided.
This is achieved through a novel approach that involves controlled generation of databases from scratch.
arXiv Detail & Related papers (2024-06-27T10:43:04Z)
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- Can LLMs Generate Visualizations with Dataless Prompts? [17.280610067626135]
We investigate the ability of large language models to provide accurate data and relevant visualizations in response to such queries.
Specifically, we investigate the ability of GPT-3 and GPT-4 to generate visualizations with dataless prompts, where no data accompanies the query.
arXiv Detail & Related papers (2024-06-22T22:59:09Z)
- Prompt4Vis: Prompting Large Language Models with Example Mining and Schema Filtering for Tabular Data Visualization [13.425454489560376]
We introduce Prompt4Vis, a framework for generating data visualization queries from natural language.
In-context learning is introduced into the text-to-vis task to generate data visualization queries.
Prompt4Vis surpasses the state-of-the-art (SOTA) RGVisNet by approximately 35.9% and 71.3% on dev and test sets, respectively.
arXiv Detail & Related papers (2024-01-29T10:23:47Z)
- Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey [30.836162812277085]
The rise of large language models (LLMs) has further advanced this field, opening new avenues for natural language processing techniques.
We introduce the fundamental concepts and techniques underlying these interfaces with a particular emphasis on semantic parsing.
This includes a deep dive into the influence of LLMs, highlighting their strengths, limitations, and potential for future improvements.
arXiv Detail & Related papers (2023-10-27T05:01:20Z)
- Natural Language Models for Data Visualization Utilizing nvBench Dataset [6.996262696260261]
We build natural language translation models to construct simplified versions of data and visualization queries in a language called Vega Zero.
In this paper, we explore the design and performance of these sequence-to-sequence, transformer-based machine learning model architectures.
arXiv Detail & Related papers (2023-10-02T00:48:01Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces a framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we examine the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- QTSumm: Query-Focused Summarization over Tabular Data [58.62152746690958]
People primarily consult tables to conduct data analysis or answer specific questions.
We define a new query-focused table summarization task, where text generation models have to perform human-like reasoning.
We introduce a new benchmark named QTSumm for this task, which contains 7,111 human-annotated query-summary pairs over 2,934 tables.
arXiv Detail & Related papers (2023-05-23T17:43:51Z)
- Querying Large Language Models with SQL [16.383179496709737]
In many use-cases, information is stored in text but not available in structured data.
With the rise of pre-trained Large Language Models (LLMs), there is now an effective solution to store and use information extracted from massive corpora of text documents.
We present Galois, a prototype based on a traditional database architecture, but with new physical operators for querying the underlying LLM.
arXiv Detail & Related papers (2023-04-02T06:58:14Z)
- XRICL: Cross-lingual Retrieval-Augmented In-Context Learning for Cross-lingual Text-to-SQL Semantic Parsing [70.40401197026925]
In-context learning using large language models has recently shown surprising results for semantic parsing tasks.
This work introduces the XRICL framework, which learns to retrieve relevant English exemplars for a given query.
We also include global translation exemplars for a target language to facilitate the translation process for large language models.
arXiv Detail & Related papers (2022-10-25T01:33:49Z)
- Explaining Patterns in Data with Language Models via Interpretable Autoprompting [143.4162028260874]
We introduce interpretable autoprompting (iPrompt), an algorithm that generates a natural-language string explaining the data.
iPrompt can yield meaningful insights by accurately finding groundtruth dataset descriptions.
Experiments with an fMRI dataset show the potential for iPrompt to aid in scientific discovery.
arXiv Detail & Related papers (2022-10-04T18:32:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.