Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for
Text-to-SQL Parsing
- URL: http://arxiv.org/abs/2301.07507v1
- Date: Wed, 18 Jan 2023 13:29:05 GMT
- Title: Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for
Text-to-SQL Parsing
- Authors: Jinyang Li, Binyuan Hui, Reynold Cheng, Bowen Qin, Chenhao Ma, Nan
Huo, Fei Huang, Wenyu Du, Luo Si, Yongbin Li
- Abstract summary: One of the major challenges in text-to-SQL parsing is domain generalization, i.e., how to generalize well to unseen databases.
In this work, we explore ways to further augment the pre-trained text-to-text transformer model with specialized components for text-to-SQL parsing.
To this end, we propose a new architecture, GRAPHIX-T5, augmented with specially-designed graph-aware layers.
- Score: 56.232873134174056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of text-to-SQL parsing, which aims at converting natural language
questions into executable SQL queries, has garnered increasing attention in
recent years, as it can assist end users in efficiently extracting vital
information from databases without the need for technical background. One of
the major challenges in text-to-SQL parsing is domain generalization, i.e., how
to generalize well to unseen databases. Recently, the pre-trained text-to-text
transformer model, namely T5, though not specialized for text-to-SQL parsing,
has achieved state-of-the-art performance on standard benchmarks targeting
domain generalization. In this work, we explore ways to further augment the
pre-trained T5 model with specialized components for text-to-SQL parsing. Such
components are expected to introduce structural inductive bias into text-to-SQL
parsers, thus improving the model's capacity for (potentially multi-hop) reasoning,
which is critical for generating structure-rich SQLs. To this end, we propose a
new architecture, GRAPHIX-T5, a mixed model in which the standard pre-trained
transformer is augmented with specially-designed graph-aware layers.
Extensive experiments and analysis demonstrate the effectiveness of GRAPHIX-T5
across four text-to-SQL benchmarks: SPIDER, SYN, REALISTIC and DK. GRAPHIX-T5
surpasses all other T5-based parsers by a significant margin, achieving new
state-of-the-art performance. Notably, GRAPHIX-T5-large outperforms the
original T5-large by 5.7% on exact match (EM) accuracy and 6.6% on execution
accuracy (EX), and even surpasses T5-3B by 1.2% on EM and 1.5% on EX.
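To make the "mixing" concrete, here is a minimal PyTorch sketch of the idea in the abstract: each block pairs a pre-trained transformer encoder layer (semantic signal) with a graph-aware message-passing layer over the question/table/column graph (structural signal), and sums their outputs. The names (`GraphixBlock`, `GraphAwareLayer`) and the sum-based mixing rule are illustrative assumptions, not the authors' released code.

```python
# A minimal sketch (not the authors' implementation) of mixing a pre-trained
# transformer layer with a graph-aware layer, in the spirit of GRAPHIX-T5.
import torch
import torch.nn as nn

class GraphAwareLayer(nn.Module):
    """One round of message passing over a question/table/column graph.

    `adj` is a [batch, seq, seq] boolean adjacency matrix encoding edges
    such as column-belongs-to-table or question-token-mentions-column.
    """
    def __init__(self, d_model: int):
        super().__init__()
        self.msg = nn.Linear(d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        deg = adj.sum(-1, keepdim=True).clamp(min=1)     # avoid divide-by-zero
        neighbour_mean = (adj.float() @ self.msg(h)) / deg
        return self.norm(h + neighbour_mean)             # residual connection

class GraphixBlock(nn.Module):
    """A pre-trained encoder layer plus a graph layer; outputs are summed so
    the semantic (transformer) and structural (graph) signals are mixed."""
    def __init__(self, pretrained_layer: nn.Module, d_model: int):
        super().__init__()
        self.transformer_layer = pretrained_layer        # e.g. one T5 encoder block
        self.graph_layer = GraphAwareLayer(d_model)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        return self.transformer_layer(h) + self.graph_layer(h, adj)

# Usage with a stand-in for a pre-trained layer:
block = GraphixBlock(nn.TransformerEncoderLayer(64, 4, batch_first=True), 64)
h = torch.randn(2, 10, 64)                               # [batch, seq, d_model]
adj = torch.eye(10, dtype=torch.bool).expand(2, 10, 10)
out = block(h, adj)                                      # [2, 10, 64]
```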
Related papers
- Enhancing LLM Fine-tuning for Text-to-SQLs by SQL Quality Measurement [1.392448435105643]
Text-to-SQL enables non-expert users to effortlessly retrieve desired information from databases using natural language queries.
Current state-of-the-art (SOTA) models like GPT-4 and T5 have shown impressive performance on large-scale benchmarks like BIRD.
This paper proposes a novel approach that only needs SQL quality measurement to enhance text-to-SQL performance.
arXiv Detail & Related papers (2024-10-02T17:21:51Z) - SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation [16.07396492960869]
We introduce a novel Transformer architecture specifically crafted to perform text-to-SQL translation tasks.
Our model predicts queries as abstract syntax trees (ASTs) in an autoregressive way, incorporating structural inductive bias in the encoder and decoder layers.
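As a rough illustration of predicting an AST "in an autoregressive way" (a toy sketch under an assumed grammar, not SQLformer's actual decoder), each step expands a grammar symbol instead of emitting a free-form token, so every output is a well-formed tree:

```python
# Toy grammar-constrained decoding: the "model" chooses how to expand each
# nonterminal. The grammar and the `choose` policy are illustrative.
GRAMMAR = {
    "Query":  ["Select", "From"],
    "Select": ["Column"],
    "From":   ["Table"],
}
TERMINALS = {"Column", "Table"}

def decode(symbol: str, choose) -> object:
    """Depth-first, left-to-right expansion = an autoregressive action sequence."""
    if symbol in TERMINALS:
        return choose(symbol)          # policy picks a concrete schema item
    return {symbol: [decode(child, choose) for child in GRAMMAR[symbol]]}

ast = decode("Query", choose=lambda s: f"<{s.lower()}>")
print(ast)  # {'Query': [{'Select': ['<column>']}, {'From': ['<table>']}]}
```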
arXiv Detail & Related papers (2023-10-27T00:13:59Z) - A Multilingual Translator to SQL with Database Schema Pruning to Improve
Self-Attention [0.0]
We present techniques that allow long text sequences to be handled by transformers limited to 512 input tokens.
In addition, we used a multilingual approach with the mT5-large model fine-tuned on a data-augmented Spider dataset in four languages simultaneously.
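A minimal sketch of the schema-pruning idea (the name-overlap heuristic below is my own, for illustration; the paper's actual procedure may differ): drop tables and columns with no lexical link to the question, so the serialized input fits in the 512-token budget.

```python
# Hypothetical schema pruning before serialization; the overlap heuristic
# is an assumption, not the paper's exact method.
def prune_schema(question: str, schema: dict) -> dict:
    q_words = set(question.lower().split())
    pruned = {}
    for table, columns in schema.items():
        kept = [c for c in columns if set(c.lower().split("_")) & q_words]
        if kept or table.lower() in q_words:
            pruned[table] = kept or columns   # keep all columns if only the table matched
    return pruned

schema = {"singer": ["singer_id", "name", "age"],
          "concert": ["concert_id", "year"]}
print(prune_schema("what is the name of each singer", schema))
# {'singer': ['singer_id', 'name']}
```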
arXiv Detail & Related papers (2023-06-25T14:28:12Z) - UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-SQL Evaluation.
It is composed of publicly available text-to-SQL datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z) - Conversational Text-to-SQL: An Odyssey into State-of-the-Art and
Challenges Ahead [6.966624873109535]
State-of-the-art (SOTA) systems use large, pre-trained and fine-tuned language models, such as the T5-family.
With multi-tasking (MT) over coherent tasks with discrete prompts during training, we improve over specialized text-to-SQL models.
We conduct studies to tease apart errors attributable to domain and compositional generalization.
arXiv Detail & Related papers (2023-02-21T23:15:33Z) - Towards Generalizable and Robust Text-to-SQL Parsing [77.18724939989647]
We propose a novel TKK framework consisting of Task decomposition, Knowledge acquisition, and Knowledge composition to learn text-to-SQL parsing in stages.
We show that our framework is effective in all scenarios and achieves state-of-the-art performance on the Spider, SParC, and CoSQL datasets.
arXiv Detail & Related papers (2022-10-23T09:21:27Z) - Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question decomposition meaning representation (QDMR).
Given questions, their QDMR structures (annotated by non-experts or automatically predicted), and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on annotated NL-SQL benchmark data.
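To give a flavor of how decomposed steps can drive synthesis (a grossly simplified toy mapping for one fixed pattern, not the paper's actual synthesis algorithm), each QDMR step becomes a SQL fragment, and the annotated answer can then be used to filter candidate queries:

```python
# Toy QDMR-to-SQL mapping; real synthesis handles many step types and
# validates candidates by executing them against the annotated answer.
qdmr = ["return singers",
        "return the age of #1",
        "return the average of #2"]

def steps_to_sql(steps):
    table = steps[0].split()[-1]       # "singers"  (step 1: select a table)
    column = steps[1].split()[2]       # "age"      (step 2: project a column)
    assert "average" in steps[2]       # step 3: aggregate over step 2
    return f"SELECT AVG({column}) FROM {table}"

print(steps_to_sql(qdmr))  # SELECT AVG(age) FROM singers
```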
arXiv Detail & Related papers (2021-12-12T20:02:42Z) - Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic
Parsing [110.97778888305506]
BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question.
BRIDGE attained state-of-the-art performance on popular cross-DB text-to-SQL benchmarks.
Our analysis shows that BRIDGE effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks.
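A minimal sketch of the tagged-sequence idea (the tag strings and helper below are illustrative placeholders, not BRIDGE's exact format): the question and schema are flattened into one sequence, and cell values matched in the question are appended after their columns as anchors.

```python
# Hypothetical BRIDGE-style serialization; <table>/<column>/<value> stand in
# for whatever special tokens the real model uses.
def serialize(question: str, schema: dict, cell_matches: dict) -> str:
    parts = [question]
    for table, columns in schema.items():
        parts.append(f"<table> {table}")
        for col in columns:
            parts.append(f"<column> {col}")
            for value in cell_matches.get((table, col), []):
                parts.append(f"<value> {value}")   # cell value seen in the question
    return " ".join(parts)

schema = {"singer": ["name", "country"]}
matches = {("singer", "country"): ["France"]}
print(serialize("list all singers from France", schema, matches))
# list all singers from France <table> singer <column> name <column> country <value> France
```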
arXiv Detail & Related papers (2020-12-23T12:33:52Z) - mT5: A massively multilingual pre-trained text-to-text transformer [60.0210636815514]
"Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on English-language NLP tasks.
We introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages.
arXiv Detail & Related papers (2020-10-22T17:58:14Z)