Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic
Parsing
- URL: http://arxiv.org/abs/2012.12627v2
- Date: Thu, 31 Dec 2020 01:02:40 GMT
- Title: Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic
Parsing
- Authors: Xi Victoria Lin and Richard Socher and Caiming Xiong
- Abstract summary: BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question.
BRIDGE attained state-of-the-art performance on popular cross-DB text-to-SQL benchmarks.
Our analysis shows that BRIDGE effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks.
- Score: 110.97778888305506
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present BRIDGE, a powerful sequential architecture for modeling
dependencies between natural language questions and relational databases in
cross-DB semantic parsing. BRIDGE represents the question and DB schema in a
tagged sequence where a subset of the fields are augmented with cell values
mentioned in the question. The hybrid sequence is encoded by BERT with minimal
subsequent layers and the text-DB contextualization is realized via the
fine-tuned deep attention in BERT. Combined with a pointer-generator decoder
with schema-consistency driven search space pruning, BRIDGE attained
state-of-the-art performance on popular cross-DB text-to-SQL benchmarks, Spider
(71.1% dev, 67.5% test with ensemble model) and WikiSQL (92.6% dev, 91.9%
test). Our analysis shows that BRIDGE effectively captures the desired
cross-modal dependencies and has the potential to generalize to more text-DB
related tasks. Our implementation is available at
https://github.com/salesforce/TabularSemanticParsing.
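The tagged-sequence idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag tokens `[T]`, `[C]`, and `[V]`, the `serialize` and `find_mentioned_values` helpers, the toy schema, and the case-insensitive substring matching are all assumptions made for illustration. The abstract only states that a subset of fields is augmented with cell values mentioned in the question.

```python
from typing import Dict, List

# Hypothetical tag tokens marking tables, columns (fields), and matched values.
TABLE_TAG, COLUMN_TAG, VALUE_TAG = "[T]", "[C]", "[V]"


def find_mentioned_values(question: str, cell_values: List[str]) -> List[str]:
    """Return the cell values that occur in the question.

    Assumption: a plain case-insensitive substring test stands in for
    whatever matching procedure the real system uses.
    """
    lowered = question.lower()
    return [value for value in cell_values if value.lower() in lowered]


def serialize(question: str, schema: Dict[str, Dict[str, List[str]]]) -> str:
    """Flatten the question and schema into one tagged sequence.

    `schema` maps table name -> column name -> list of candidate cell values.
    Only columns with a value mentioned in the question get [V] entries.
    """
    parts = ["[CLS]", question, "[SEP]"]
    for table, columns in schema.items():
        parts += [TABLE_TAG, table]
        for column, values in columns.items():
            parts += [COLUMN_TAG, column]
            for value in find_mentioned_values(question, values):
                parts += [VALUE_TAG, value]
    parts.append("[SEP]")
    return " ".join(parts)


# Toy database, invented for this example.
schema = {
    "properties": {
        "property_type_code": ["Apartment", "House", "Field"],
        "property_name": ["park", "the cole"],
    },
    "reference_property_types": {
        "property_type_description": [],
    },
}
question = "Show names of properties that are either houses or apartments"
print(serialize(question, schema))
```

In this sketch only `property_type_code` receives value annotations, because "House" and "Apartment" are the only candidate values that appear in the question. A sequence of this shape is what would then be passed to a BERT-style encoder.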
Related papers
- RSL-SQL: Robust Schema Linking in Text-to-SQL Generation [51.00761167842468]
We propose a novel framework called RSL-SQL that combines bidirectional schema linking, contextual information augmentation, binary selection strategy, and multi-turn self-correction.
Experiments on the BIRD and Spider benchmarks demonstrate that our approach achieves SOTA execution accuracy among open-source solutions, with 67.2% on BIRD and 87.9% on Spider using GPT-4o.
Our approach outperforms a series of GPT-4 based Text-to-SQL systems when adopting DeepSeek (much cheaper) with the same intact prompts.
arXiv Detail & Related papers (2024-10-31T16:22:26Z)
- UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-SQL systems.
It is composed of publicly available text-to-SQL datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z)
- Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing [66.55478402233399]
We propose a framework to elicit relational structures via a probing procedure based on the Poincaré distance metric.
Compared with commonly-used rule-based methods for schema linking, we found that probing relations can robustly capture semantic correspondences.
Our framework sets new state-of-the-art performance on three benchmarks.
arXiv Detail & Related papers (2022-06-28T14:05:25Z)
- S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers [66.78665327694625]
We propose S$^2$SQL, injecting Syntax to the question-Schema graph encoder for Text-to-SQL parsers.
We also employ the decoupling constraint to induce diverse edge embedding, which further improves the network's performance.
Experiments on the Spider and robustness setting Spider-Syn demonstrate that the proposed approach outperforms all existing methods when pre-training models are used.
arXiv Detail & Related papers (2022-03-14T09:49:15Z)
- ShadowGNN: Graph Projection Neural Network for Text-to-SQL Parser [36.12921337235763]
We propose a new architecture, ShadowGNN, which processes schemas at abstract and semantic levels.
On the challenging Text-to-SQL benchmark Spider, empirical results show that ShadowGNN outperforms state-of-the-art models.
arXiv Detail & Related papers (2021-04-10T05:48:28Z)
- Data Augmentation with Hierarchical SQL-to-Question Generation for Cross-domain Text-to-SQL Parsing [40.65143087243074]
This paper presents a simple yet effective data augmentation framework.
First, given a database, we automatically produce a large number of SQL queries based on an abstract syntax tree grammar.
Second, we propose a hierarchical SQL-to-question generation model to obtain high-quality natural language questions.
arXiv Detail & Related papers (2021-03-03T07:37:38Z)
- DBTagger: Multi-Task Learning for Keyword Mapping in NLIDBs Using Bi-Directional Recurrent Neural Networks [0.2578242050187029]
We propose a novel deep learning based supervised approach that utilizes POS tags of NLQs.
We evaluate our approach on eight different datasets and report new state-of-the-art accuracy results, 92.4% on average.
arXiv Detail & Related papers (2021-01-11T22:54:39Z)
- Structure-Grounded Pretraining for Text-to-SQL [75.19554243393814]
We present a novel weakly supervised Structure-Grounded pretraining framework (StruG) for text-to-SQL.
We identify a set of novel prediction tasks: column grounding, value grounding and column-value mapping, and leverage them to pretrain a text-table encoder.
arXiv Detail & Related papers (2020-10-24T04:35:35Z)