Improving Text-to-SQL Semantic Parsing with Fine-grained Query
Understanding
- URL: http://arxiv.org/abs/2209.14415v1
- Date: Wed, 28 Sep 2022 21:00:30 GMT
- Title: Improving Text-to-SQL Semantic Parsing with Fine-grained Query
Understanding
- Authors: Jun Wang, Patrick Ng, Alexander Hanbo Li, Jiarong Jiang, Zhiguo Wang,
Ramesh Nallapati, Bing Xiang, Sudipta Sengupta
- Abstract summary: We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL) and neural semantic parser (NSP).
- Score: 84.04706075621013
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most recent research on Text-to-SQL semantic parsing relies on either parser
itself or simple heuristic based approach to understand natural language query
(NLQ). When synthesizing a SQL query, there is no explicit semantic information
of NLQ available to the parser which leads to undesirable generalization
performance. In addition, without lexical-level fine-grained query
understanding, linking between query and database can only rely on fuzzy string
match which leads to suboptimal performance in real applications. In view of
this, in this paper we present a general-purpose, modular neural semantic
parsing framework that is based on token-level fine-grained query
understanding. Our framework consists of three modules: named entity recognizer
(NER), neural entity linker (NEL) and neural semantic parser (NSP). By jointly
modeling query and database, NER model analyzes user intents and identifies
entities in the query. NEL model links typed entities to schema and cell values
in database. Parser model leverages available semantic information and linking
results and synthesizes tree-structured SQL queries based on dynamically
generated grammar. Experiments on SQUALL, a newly released semantic parsing
dataset, show that we can achieve 56.8% execution accuracy on
WikiTableQuestions (WTQ) test set, which outperforms the state-of-the-art model
by 2.7%.
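The three-stage flow described in the abstract (NER, then NEL, then parser) can be illustrated with a toy sketch. This is not the authors' implementation: the neural models are replaced by plain dictionary lookups, the SQL is built from a fixed template rather than a dynamically generated grammar, and the table, column names, and function names (`recognize_entities`, `link_entities`, `synthesize_sql`, `answer`) are all hypothetical, chosen only to show how the modules hand results to one another.

```python
import sqlite3
from dataclasses import dataclass

# Toy table; column names and rows are invented for illustration.
COLUMNS = ["player", "country", "goals"]
ROWS = [("Ana", "Brazil", 12), ("Ben", "Chile", 7), ("Caio", "Brazil", 9)]


@dataclass
class Entity:
    text: str  # surface token found in the question
    kind: str  # "column" or "value"


@dataclass
class Link:
    entity: Entity
    column: str    # schema column the entity resolves to
    value: object  # matching cell value, or None for column mentions


def recognize_entities(question):
    """NER stand-in: tag tokens that mention a column or a cell value."""
    cells = {str(cell).lower() for row in ROWS for cell in row}
    entities = []
    for token in question.lower().replace("?", "").split():
        if token in COLUMNS:
            entities.append(Entity(token, "column"))
        elif token in cells:
            entities.append(Entity(token, "value"))
    return entities


def find_cell(text):
    """Return (column, cell) for the first cell matching text, else None."""
    for row in ROWS:
        for column, cell in zip(COLUMNS, row):
            if str(cell).lower() == text:
                return column, cell
    return None


def link_entities(entities):
    """NEL stand-in: resolve each typed entity to a column or cell value."""
    links = []
    for entity in entities:
        if entity.kind == "column":
            links.append(Link(entity, entity.text, None))
        else:
            match = find_cell(entity.text)
            if match is not None:
                links.append(Link(entity, match[0], match[1]))
    return links


def synthesize_sql(links):
    """Parser stand-in: first column mention is projected, values filter."""
    projected = [l.column for l in links if l.entity.kind == "column"]
    filters = [l for l in links if l.entity.kind == "value"]
    sql = f"SELECT {projected[0] if projected else '*'} FROM t"
    if filters:
        sql += " WHERE " + " AND ".join(f"{l.column} = ?" for l in filters)
    return sql, [l.value for l in filters]


def answer(question):
    """Run the full pipeline and execute the SQL on an in-memory table."""
    sql, params = synthesize_sql(link_entities(recognize_entities(question)))
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE t ({', '.join(COLUMNS)})")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)
    result = [row[0] for row in conn.execute(sql, params)]
    conn.close()
    return sql, result


if __name__ == "__main__":
    print(answer("Which player is from Brazil?"))
```

Comparing the executed result of a predicted query with the gold answer, as `answer` makes possible here, is the idea behind the execution accuracy figure quoted in the abstract; exact string matching in `find_cell` is the kind of brittle linking the paper aims to improve on.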
Related papers
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- NL2KQL: From Natural Language to Kusto Query [1.7931930942711818]
NL2KQL is an innovative framework that uses large language models (LLMs) to convert natural language queries (NLQs) to Kusto Query Language (KQL) queries.
To validate NL2KQL's performance, we utilize an array of online (based on query execution) and offline (based on query parsing) metrics.
arXiv Detail & Related papers (2024-04-03T01:09:41Z)
- Semantic Parsing for Conversational Question Answering over Knowledge Graphs [63.939700311269156]
We develop a dataset where user questions are annotated with SPARQL parses and system answers correspond to execution results thereof.
We present two different semantic parsing approaches and highlight the challenges of the task.
Our dataset and models are released at https://github.com/Edinburgh/SPICE.
arXiv Detail & Related papers (2023-01-28T14:45:11Z)
- Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing [66.55478402233399]
We propose a framework to elicit relational structures via a probing procedure based on the Poincaré distance metric.
Compared with commonly-used rule-based methods for schema linking, we found that probing relations can robustly capture semantic correspondences.
Our framework sets new state-of-the-art performance on three benchmarks.
arXiv Detail & Related papers (2022-06-28T14:05:25Z)
- S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers [66.78665327694625]
We propose S$^2$SQL, injecting Syntax to the question-Schema graph encoder for Text-to-SQL parsers.
We also employ the decoupling constraint to induce diverse edge embedding, which further improves the network's performance.
Experiments on the Spider and robustness setting Spider-Syn demonstrate that the proposed approach outperforms all existing methods when pre-training models are used.
arXiv Detail & Related papers (2022-03-14T09:49:15Z)
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on NL-SQL benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
- SPARQLing Database Queries from Intermediate Question Decompositions [7.475027071883912]
To translate natural language questions into database queries, most approaches rely on a fully annotated training set.
We reduce this burden using intermediate question representations that are grounded in databases.
Our pipeline consists of two parts: a semantic parser that converts natural language questions into the intermediate representations and a non-trainable transpiler to the SPARQL query language.
arXiv Detail & Related papers (2021-09-13T17:57:12Z)
- ShadowGNN: Graph Projection Neural Network for Text-to-SQL Parser [36.12921337235763]
We propose a new architecture, ShadowGNN, which processes schemas at abstract and semantic levels.
On the challenging Text-to-SQL benchmark Spider, empirical results show that ShadowGNN outperforms state-of-the-art models.
arXiv Detail & Related papers (2021-04-10T05:48:28Z)
- A Tale of Two Linkings: Dynamically Gating between Schema Linking and Structural Linking for Text-to-SQL Parsing [25.81069211061945]
In Text-to-SQL semantic parsing, selecting the correct entities for the generated SQL query is both crucial and challenging.
We propose two linking processes to address this challenge: schema linking, which links explicit NL mentions to the database, and structural linking, which links the entities in the output SQL with their structural relationships in the database schema.
Integrating the proposed method with two graph neural network-based semantic parsers together with BERT representations demonstrates substantial gains in parsing accuracy on the challenging Spider dataset.
arXiv Detail & Related papers (2020-09-30T17:32:27Z)