Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing
- URL: http://arxiv.org/abs/2206.14017v1
- Date: Tue, 28 Jun 2022 14:05:25 GMT
- Title: Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing
- Authors: Lihan Wang, Bowen Qin, Binyuan Hui, Bowen Li, Min Yang, Bailin Wang, Binhua Li, Fei Huang, Luo Si, Yongbin Li
- Abstract summary: We propose a framework to elicit relational structures via a probing procedure based on the Poincaré distance metric.
Compared with commonly-used rule-based methods for schema linking, we found that probing relations can robustly capture semantic correspondences.
Our framework sets new state-of-the-art performance on three benchmarks.
- Score: 66.55478402233399
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The importance of building text-to-SQL parsers which can be applied to new
databases has long been acknowledged, and a critical step to achieve this goal
is schema linking, i.e., properly recognizing mentions of unseen columns or
tables when generating SQLs. In this work, we propose a novel framework to
elicit relational structures from large-scale pre-trained language models
(PLMs) via a probing procedure based on the Poincaré distance metric, and use the
induced relations to augment current graph-based parsers for better schema
linking. Compared with commonly-used rule-based methods for schema linking, we
found that probing relations can robustly capture semantic correspondences,
even when surface forms of mentions and entities differ. Moreover, our probing
procedure is entirely unsupervised and requires no additional parameters.
Extensive experiments show that our framework sets new state-of-the-art
performance on three benchmarks. We empirically verify that our probing
procedure can indeed find desired relational structures through qualitative
analysis.
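To make the probing idea concrete: the Poincaré distance between two points u, v strictly inside the unit ball is d(u, v) = arcosh(1 + 2·||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2))). The sketch below scores each question-token/schema-item pair by how far the token's contextual representation moves under this metric when that schema item is dropped from the input, keeping large shifts as induced links. This is a minimal sketch of one plausible reading of the abstract, not the paper's exact procedure; the encode function, the projection of embeddings into the unit ball, and the threshold are all assumptions.

import torch

def poincare_distance(u, v, eps=1e-5):
    # d(u, v) = arcosh(1 + 2*||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    # for points strictly inside the unit ball.
    sq = ((u - v) ** 2).sum(-1)
    du = (1 - (u ** 2).sum(-1)).clamp_min(eps)
    dv = (1 - (v ** 2).sum(-1)).clamp_min(eps)
    return torch.acosh((1 + 2 * sq / (du * dv)).clamp_min(1 + eps))

def probe_links(encode, question_tokens, schema_items, threshold):
    # encode(tokens) -> (len(tokens), d) contextual embeddings, assumed to be
    # rescaled to norm < 1 so the Poincaré distance is defined; both `encode`
    # and `threshold` are placeholders, not the paper's API.
    full = encode(question_tokens + schema_items)
    links = set()
    for j, item in enumerate(schema_items):
        # Perturb the input by removing one schema item and re-encode.
        reduced = encode(question_tokens + schema_items[:j] + schema_items[j + 1:])
        for i, tok in enumerate(question_tokens):
            # A large representation shift suggests the token depends on the item.
            if poincare_distance(full[i], reduced[i]).item() > threshold:
                links.add((tok, item))
    return links

Hyperbolic metrics of this kind are a natural fit here because distances grow quickly near the ball's boundary, separating fine-grained relational distinctions that a Euclidean metric tends to compress.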
Related papers
- The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models [0.9149661171430259]
We revisit schema linking when using the latest generation of large language models (LLMs).
We find empirically that newer models are adept at utilizing relevant schema elements during generation even in the presence of large numbers of irrelevant ones.
Instead of filtering contextual information, we highlight techniques such as augmentation, selection, and correction, and adopt them to improve the accuracy of our Text-to-SQL pipeline on the BIRD benchmark.
arXiv Detail & Related papers (2024-08-14T17:59:04Z)
- Schema-Aware Multi-Task Learning for Complex Text-to-SQL [4.913409359995421]
We present a schema-aware multi-task learning framework (named MTSQL) for complicated SQL queries.
Specifically, we design a schema linking discriminator module to distinguish the valid question-schema linkings.
On the decoder side, we define six types of relationships to describe the connections between tables and columns.
arXiv Detail & Related papers (2024-03-09T01:13:37Z)
- STAR: SQL Guided Pre-Training for Context-dependent Text-to-SQL Parsing [64.80483736666123]
We propose a novel pre-training framework, STAR, for context-dependent text-to-SQL parsing.
In addition, we construct a large-scale context-dependent text-to-SQL conversation corpus to pre-train STAR.
Extensive experiments show that STAR achieves new state-of-the-art performance on two downstream benchmarks.
arXiv Detail & Related papers (2022-10-21T11:30:07Z)
- Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: a named entity recognizer (NER), a neural entity linker (NEL), and a neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
- Semantic Enhanced Text-to-SQL Parsing via Iteratively Learning Schema Linking Graph [6.13728903057727]
Generalizability to new databases is of vital importance to text-to-SQL systems, which aim to parse human utterances into SQL statements.
In this paper, we propose a framework named ISESL-SQL to iteratively build a semantic-enhanced schema-linking graph between question tokens and database schemas.
Extensive experiments on three benchmarks demonstrate that ISESL-SQL consistently outperforms the baselines, and further investigations show its generalizability and robustness.
arXiv Detail & Related papers (2022-08-08T03:59:33Z)
- RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL [37.173390754207766]
We propose a Transformer seq2seq architecture augmented with relation-aware self-attention.
Our model is able to incorporate almost all types of existing relations in the literature.
arXiv Detail & Related papers (2022-05-14T06:27:40Z)
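Relation-aware self-attention, popularized for text-to-SQL by RAT-SQL and building on Shaw et al.'s relative-position attention, injects learned embeddings of discrete relation types into the attention computation. Below is a minimal single-head sketch, assuming an (n, n) matrix of relation ids between input positions; RASAT's actual integration into a pretrained seq2seq model differs in detail, and the dimensions and names are illustrative.

import torch
import torch.nn as nn

class RelationAwareSelfAttention(nn.Module):
    # Single-head sketch: relation embeddings bias both the attention logits
    # and the aggregated values.
    def __init__(self, d_model, n_relations):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # One learned key/value vector per discrete relation type,
        # e.g. "question token exactly matches column name".
        self.rel_k = nn.Embedding(n_relations, d_model)
        self.rel_v = nn.Embedding(n_relations, d_model)
        self.scale = d_model ** 0.5

    def forward(self, x, relations):
        # x: (n, d_model) token states; relations: (n, n) long tensor of ids.
        q, k, v = self.q(x), self.k(x), self.v(x)
        rk, rv = self.rel_k(relations), self.rel_v(relations)  # (n, n, d)
        logits = (q @ k.T + (q.unsqueeze(1) * rk).sum(-1)) / self.scale
        attn = logits.softmax(-1)                               # (n, n)
        return attn @ v + (attn.unsqueeze(-1) * rv).sum(1)      # (n, d)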
- Learning to Synthesize Data for Semantic Parsing [57.190817162674875]
We propose a generative model which models the composition of programs and maps a program to an utterance.
Due to the simplicity of the PCFG and the pre-trained BART, our generative model can be efficiently learned from existing data at hand.
We evaluate our method in both in-domain and out-of-domain settings of text-to-SQL parsing on the standard GeoQuery and Spider benchmarks.
arXiv Detail & Related papers (2021-04-12T21:24:02Z)
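The division of labor described above can be pictured as sampling program skeletons from a weighted grammar and then verbalizing each sample with a pretrained seq2seq model such as BART. The toy PCFG sampler below is purely illustrative; the rules and weights are invented and are not the paper's grammar.

import random

# Toy weighted grammar over SQL skeletons; symbols absent from the table
# are treated as terminals.
PCFG = {
    "QUERY": [(["SELECT", "COL", "FROM", "TAB", "COND"], 0.6),
              (["SELECT", "AGG", "(", "COL", ")", "FROM", "TAB"], 0.4)],
    "COND":  [([], 0.5), (["WHERE", "COL", "=", "VAL"], 0.5)],
    "AGG":   [(["COUNT"], 0.5), (["MAX"], 0.5)],
}

def sample(symbol="QUERY"):
    if symbol not in PCFG:  # terminal token
        return [symbol]
    rules, weights = zip(*PCFG[symbol])
    rule = random.choices(rules, weights=weights)[0]  # expansion by weight
    return [tok for sym in rule for tok in sample(sym)]

print(" ".join(sample()))  # e.g. SELECT COL FROM TAB WHERE COL = VAL

Each sampled skeleton would then be instantiated against a concrete schema and passed to the utterance generator, yielding synthetic (utterance, program) training pairs.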
- Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing [110.97778888305506]
BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question.
BRIDGE attained state-of-the-art performance on popular cross-DB text-to-SQL benchmarks.
Our analysis shows that BRIDGE effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks.
arXiv Detail & Related papers (2020-12-23T12:33:52Z)
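A hybrid serialization in BRIDGE's spirit interleaves the question with schema fields and appends matched cell values as anchor texts after the columns they occur in. The tag tokens and helper below are illustrative assumptions, not BRIDGE's exact markup.

def serialize(question, schema, matched_values):
    # schema: {table: [columns]}; matched_values: {(table, column): [values]}
    # maps fields to cell values detected in the question (anchor texts).
    parts = [question]
    for table, columns in schema.items():
        parts.append(f"<table> {table}")
        for col in columns:
            field = f"<column> {col}"
            for val in matched_values.get((table, col), []):
                field += f" <value> {val}"  # attach the matched cell value
            parts.append(field)
    return " ".join(parts)

print(serialize(
    "How many singers are from France?",
    {"singer": ["name", "country"]},
    {("singer", "country"): ["France"]},
))
# -> How many singers are from France? <table> singer <column> name
#    <column> country <value> France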
- A Tale of Two Linkings: Dynamically Gating between Schema Linking and Structural Linking for Text-to-SQL Parsing [25.81069211061945]
In text-to-SQL semantic parsing, selecting the correct entities for the generated SQL query is both crucial and challenging.
We propose two linking processes to address this challenge: schema linking, which links explicit NL mentions to the database, and structural linking, which links entities in the output SQL to their structural relationships in the database schema.
Integrating the proposed method with two graph neural network-based semantic parsers, together with BERT representations, yields substantial gains in parsing accuracy on the challenging Spider dataset.
arXiv Detail & Related papers (2020-09-30T17:32:27Z)
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)