Related papers: Semantic Enhanced Text-to-SQL Parsing via Iteratively Learning Schema Linking Graph

Semantic Enhanced Text-to-SQL Parsing via Iteratively Learning Schema Linking Graph

URL: http://arxiv.org/abs/2208.03903v1
Date: Mon, 8 Aug 2022 03:59:33 GMT
Title: Semantic Enhanced Text-to-SQL Parsing via Iteratively Learning Schema Linking Graph
Authors: Aiwei Liu, Xuming Hu, Li Lin and Lijie Wen
Abstract summary: The generalizability to new databases is of vital importance to Text-to- systems which aim to parse human utterances intosql statements. In this paper, we propose a framework named IS ESL to iteratively build a enhanced semantic schema-linking graph between question tokens and database schemas. Extensive experiments on three benchmarks demonstrate that IS ESL could consistently outperform the baselines and further investigations show its generalizability and robustness.
Score: 6.13728903057727
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The generalizability to new databases is of vital importance to Text-to-SQL systems which aim to parse human utterances into SQL statements. Existing works achieve this goal by leveraging the exact matching method to identify the lexical matching between the question words and the schema items. However, these methods fail in other challenging scenarios, such as the synonym substitution in which the surface form differs between the corresponding question words and schema items. In this paper, we propose a framework named ISESL-SQL to iteratively build a semantic enhanced schema-linking graph between question tokens and database schemas. First, we extract a schema linking graph from PLMs through a probing procedure in an unsupervised manner. Then the schema linking graph is further optimized during the training process through a deep graph learning method. Meanwhile, we also design an auxiliary task called graph regularization to improve the schema information mentioned in the schema-linking graph. Extensive experiments on three benchmarks demonstrate that ISESL-SQL could consistently outperform the baselines and further investigations show its generalizability and robustness.

Related papers

PSM-SQL: Progressive Schema Learning with Multi-granularity Semantics for Text-to-SQL [8.416319689644556]
It is challenging to convert tasks due to the vast number of database schemas with redundancy. We propose a progressive schema linking with multi-granularity semantics (PSM-) PSM- learns the schema semantics at the column, table, and database levels.
arXiv Detail & Related papers (2025-02-07T08:31:57Z)
V-SQL: A View-based Two-stage Text-to-SQL Framework [0.9719868595277401]
Text-to-coupling methods based on large language models (LLMs) have garnered significant attention. The core of mainstream text-to-coupling frameworks is schema linking, which aligns user queries with relevant tables and columns in the database. Previous methods focused on schema linking while to enhance LLMs' understanding of database schema.
arXiv Detail & Related papers (2024-12-17T02:27:50Z)
RSL-SQL: Robust Schema Linking in Text-to-SQL Generation [51.00761167842468]
We propose a novel framework called RSL- that combines bidirectional schema linking, contextual information augmentation, binary selection strategy, and multi-turn self-correction. benchmarks demonstrate that our approach achieves SOTA execution accuracy among open-source solutions, with 67.2% on BIRD and 87.9% on GPT-4ocorrection. Our approach outperforms a series of GPT-4 based Text-to-Seek systems when adopting DeepSeek (much cheaper) with same intact prompts.
arXiv Detail & Related papers (2024-10-31T16:22:26Z)
The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models [0.9149661171430259]
We revisit schema linking when using the latest generation of large language models (LLMs) We find empirically that newer models are adept at utilizing relevant schema elements during generation even in the presence of large numbers of irrelevant ones. Instead of filtering contextual information, we highlight techniques such as augmentation, selection, and correction, and adopt them to improve the accuracy of our Text-to-BIRD pipeline.
arXiv Detail & Related papers (2024-08-14T17:59:04Z)
Schema-Aware Multi-Task Learning for Complex Text-to-SQL [4.913409359995421]
We present a schema-aware multi-task learning framework (named MT) for complicatedsql queries. Specifically, we design a schema linking discriminator module to distinguish the valid question-schema linkings. On the decoder side, we define 6-type relationships to describe the connections between tables and columns.
arXiv Detail & Related papers (2024-03-09T01:13:37Z)
Schema-adaptable Knowledge Graph Construction [47.772335354080795]
Conventional Knowledge Graph Construction (KGC) approaches typically follow the static information extraction paradigm with a closed set of pre-defined schema. We propose a new task called schema-adaptable KGC, which aims to continually extract entity, relation, and event based on a dynamically changing schema graph without re-training.
arXiv Detail & Related papers (2023-05-15T15:06:20Z)
STAR: SQL Guided Pre-Training for Context-dependent Text-to-SQL Parsing [64.80483736666123]
We propose a novel pre-training framework STAR for context-dependent text-to- parsing. In addition, we construct a large-scale context-dependent text-to-the-art conversation corpus to pre-train STAR. Extensive experiments show that STAR achieves new state-of-the-art performance on two downstream benchmarks.
arXiv Detail & Related papers (2022-10-21T11:30:07Z)
Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing [66.55478402233399]
We propose a framework to elicit relational structures via a probing procedure based on Poincar'e distance metric. Compared with commonly-used rule-based methods for schema linking, we found that probing relations can robustly capture semantic correspondences. Our framework sets new state-of-the-art performance on three benchmarks.
arXiv Detail & Related papers (2022-06-28T14:05:25Z)
SADGA: Structure-Aware Dual Graph Aggregation Network for Text-to-SQL [29.328698264910596]
One of the most challenging problems of Text-to-Graph is how to generalize the trained model to the unseen database schemas. We propose a Structure-Aware Dual Graph Aggregation Network (SADGA) for cross-domain Text-to-Graph. We achieve 3rd place on the challenging Text-to-Graph benchmark Spider at the time of writing.
arXiv Detail & Related papers (2021-11-01T01:50:28Z)
ShadowGNN: Graph Projection Neural Network for Text-to-SQL Parser [36.12921337235763]
We propose a new architecture, ShadowGNN, which processes schemas at abstract and semantic levels. On the challenging Text-to-Spider benchmark, empirical results show that ShadowGNN outperforms state-of-the-art models.
arXiv Detail & Related papers (2021-04-10T05:48:28Z)
IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation [61.09660709356527]
We propose a database schema interaction graph encoder to utilize historicalal information of database schema items. We evaluate our model on the benchmark SParC and Co datasets.
arXiv Detail & Related papers (2020-11-11T12:56:21Z)
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing. We construct synthetic question-pairs over high-free tables via a synchronous context-free grammar. To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.