ColloQL: Robust Cross-Domain Text-to-SQL Over Search Queries
- URL: http://arxiv.org/abs/2010.09927v1
- Date: Mon, 19 Oct 2020 23:53:17 GMT
- Title: ColloQL: Robust Cross-Domain Text-to-SQL Over Search Queries
- Authors: Karthik Radhakrishnan, Arvind Srikantan, Xi Victoria Lin
- Abstract summary: We introduce data augmentation techniques and a sampling-based content-aware BERT model (ColloQL).
ColloQL achieves 84.9% (logical) and 90.7% (execution) accuracy on the WikiSQL dataset.
- Score: 10.273545005890496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Translating natural language utterances to executable queries is a helpful
technique in making the vast amount of data stored in relational databases
accessible to a wider range of non-tech-savvy end users. Prior work in this
area has largely focused on textual input that is linguistically correct and
semantically unambiguous. However, real-world user queries are often succinct,
colloquial, and noisy, resembling the input of a search engine. In this work,
we introduce data augmentation techniques and a sampling-based content-aware
BERT model (ColloQL) to achieve robust text-to-SQL modeling over natural
language search (NLS) questions. Due to the lack of evaluation data, we curate
a new dataset of NLS questions and demonstrate the efficacy of our approach.
ColloQL's superior performance extends to well-formed text, achieving 84.9%
(logical) and 90.7% (execution) accuracy on the WikiSQL dataset, making it, to
the best of our knowledge, the highest performing model that does not use
execution guided decoding.
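The two WikiSQL figures quoted above measure different things: logical accuracy compares the predicted query with the reference query, while execution accuracy compares the results the two queries return. The sketch below illustrates that distinction with Python's built-in sqlite3 module. It is not the paper's evaluation code; the table, the two queries, and the helper names `logical_match` and `execution_match` are hypothetical, and the string comparison is only a crude stand-in for a real logical-form match.

```python
import sqlite3


def logical_match(predicted_sql, gold_sql):
    """Crude stand-in for logical-form match: compare normalized query text."""
    def normalize(query):
        return " ".join(query.lower().split())
    return normalize(predicted_sql) == normalize(gold_sql)


def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries return the same rows (row order ignored)."""
    predicted_rows = sorted(conn.execute(predicted_sql).fetchall())
    gold_rows = sorted(conn.execute(gold_sql).fetchall())
    return predicted_rows == gold_rows


# Hypothetical toy table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "Red", 30), ("Bob", "Blue", 12)],
)

gold = "SELECT name FROM players WHERE points > 20"
pred = "SELECT name FROM players WHERE team = 'Red'"

print(logical_match(pred, gold))          # False: the queries differ in form
print(execution_match(conn, pred, gold))  # True: both return [('Ann',)]
```

A prediction can therefore count as correct under execution accuracy while failing the logical comparison, which is one plausible reason the execution figure is the higher of the two.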
Related papers
- A Survey on Employing Large Language Models for Text-to-SQL Tasks [7.728180183687891]
The increasing volume of data stored in relational databases has led to the need for efficient querying and utilization of this data in various sectors.
To take advantage of the recent developments in Large Language Models (LLMs), a range of new methods have emerged, with a primary focus on prompt engineering and fine-tuning.
arXiv Detail & Related papers (2024-07-21T14:48:23Z)
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- NL2KQL: From Natural Language to Kusto Query [1.7931930942711818]
NL2KQL is an innovative framework that uses large language models (LLMs) to convert natural language queries (NLQs) to Kusto Query Language (KQL) queries.
To validate NL2KQL's performance, we utilize an array of online (based on query execution) and offline (based on query parsing) metrics.
arXiv Detail & Related papers (2024-04-03T01:09:41Z)
- From Text to CQL: Bridging Natural Language and Corpus Search Engine [27.56738323943742]
Corpus Query Language (CQL) is a critical tool for linguistic research and detailed analysis within text corpora.
This paper presents the first text-to-CQL task that aims to automate the translation of natural language into CQL.
arXiv Detail & Related papers (2024-02-21T12:11:28Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces a framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- QTSumm: Query-Focused Summarization over Tabular Data [58.62152746690958]
People primarily consult tables to conduct data analysis or answer specific questions.
We define a new query-focused table summarization task, where text generation models have to perform human-like reasoning.
We introduce a new benchmark named QTSumm for this task, which contains 7,111 human-annotated query-summary pairs over 2,934 tables.
arXiv Detail & Related papers (2023-05-23T17:43:51Z)
- XRICL: Cross-lingual Retrieval-Augmented In-Context Learning for Cross-lingual Text-to-SQL Semantic Parsing [70.40401197026925]
In-context learning using large language models has recently shown surprising results for semantic parsing tasks.
This work introduces the XRICL framework, which learns to retrieve relevant English exemplars for a given query.
We also include global translation exemplars for a target language to facilitate the translation process for large language models.
arXiv Detail & Related papers (2022-10-25T01:33:49Z)
- Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL) and neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
- A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) query based on the evidence provided by databases.
Deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
- SPARQLing Database Queries from Intermediate Question Decompositions [7.475027071883912]
To translate natural language questions into database queries, most approaches rely on a fully annotated training set.
We reduce this burden using intermediate question representations grounded in databases.
Our pipeline consists of two parts: a semantic parser that converts natural language questions into the intermediate representations and a non-trainable transpiler to the SPARQL query language.
arXiv Detail & Related papers (2021-09-13T17:57:12Z)