Leveraging Table Content for Zero-shot Text-to-SQL with Meta-Learning
- URL: http://arxiv.org/abs/2109.05395v1
- Date: Sun, 12 Sep 2021 01:01:28 GMT
- Title: Leveraging Table Content for Zero-shot Text-to-SQL with Meta-Learning
- Authors: Yongrui Chen, Xinnan Guo, Chaojie Wang, Jian Qiu, Guilin Qi, Meng
Wang, Huiying Li
- Abstract summary: Single-table text-to-SQL aims to transform a natural language question into a SQL query according to one single table.
We propose a new approach for the zero-shot text-to-SQL task which does not rely on any additional manual annotations.
We conduct extensive experiments on a public open-domain text-to-SQL dataset (WikiSQL) and a domain-specific dataset (ESQL).
- Score: 25.69875174742935
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Single-table text-to-SQL aims to transform a natural language question into a
SQL query according to one single table. Recent work has made promising
progress on this task using pre-trained language models and a multi-submodule
framework. However, zero-shot tables, i.e., tables unseen in the training set,
are currently the most critical bottleneck restricting the application of
existing approaches to real-world scenarios. Although some work
has utilized auxiliary tasks to help handle zero-shot tables, expensive extra
manual annotation limits their practicality. In this paper, we propose a new
approach for the zero-shot text-to-SQL task which does not rely on any
additional manual annotations. Our approach consists of two parts. First, we
propose a new model that leverages the abundant information of table content to
help establish the mapping between questions and zero-shot tables. Further, we
propose a simple but efficient meta-learning strategy to train our model. The
strategy utilizes the two-step gradient update to force the model to learn a
generalization ability towards zero-shot tables. We conduct extensive
experiments on a public open-domain text-to-SQL dataset WikiSQL and a
domain-specific dataset ESQL. Compared to existing approaches using the same
pre-trained model, our approach achieves significant improvements on both
datasets. Compared to approaches built on larger pre-trained models and on
tabular-specific pre-trained models, our approach remains competitive. More
importantly, on the zero-shot subsets of both datasets, our approach yields
even larger improvements.
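The abstract describes the approach only at a high level, so the following two snippets are illustrative sketches, not the authors' released code. First, "leveraging table content" is commonly realized by linking question spans to columns whose cell values appear in the question before encoding; the helper below (a hypothetical function over a toy table) shows that idea:

```python
# Hedged illustration of question/table-content linking: flag the columns whose
# cell values occur in the question so the encoder knows which columns the
# question likely refers to. Generic technique; not necessarily the exact
# feature used in the paper.
def content_link_features(question: str, table: dict) -> dict:
    """Return, per column, the cell values that occur in the question."""
    q = question.lower()
    matches = {}
    for col, values in table["values"].items():
        hits = [v for v in values if str(v).lower() in q]
        if hits:
            matches[col] = hits
    return matches


toy_table = {
    "columns": ["player", "team", "points"],
    "values": {"player": ["Kobe", "Jordan"], "team": ["Lakers", "Bulls"]},
}
print(content_link_features("How many points did Kobe score for the Lakers?", toy_table))
# -> {'player': ['Kobe'], 'team': ['Lakers']}
```

Second, the "two-step gradient update" reads like a MAML-style meta-learning step: adapt a copy of the model on a support batch drawn from one group of tables, then update the original parameters with the adapted copy's loss on a query batch drawn from different tables. A minimal first-order PyTorch sketch, assuming a generic `model`, `loss_fn`, and dict-style batches (all placeholders, not the paper's exact procedure):

```python
# First-order, MAML-style two-step update (sketch). All names here are
# placeholders; the real model encodes the question together with table
# headers/content and predicts the SQL query.
import copy
import torch


def two_step_meta_update(model, optimizer, support_batch, query_batch,
                         loss_fn, inner_lr=1e-3):
    # Step 1: inner update on the support tables, applied to a copy.
    adapted = copy.deepcopy(model)
    support_loss = loss_fn(adapted(support_batch), support_batch["labels"])
    grads = torch.autograd.grad(support_loss, list(adapted.parameters()),
                                allow_unused=True)
    with torch.no_grad():
        for p, g in zip(adapted.parameters(), grads):
            if g is not None:
                p -= inner_lr * g  # one SGD step on the copy only

    # Step 2: outer update, driven by the adapted copy's loss on unseen tables.
    query_loss = loss_fn(adapted(query_batch), query_batch["labels"])
    optimizer.zero_grad()
    query_loss.backward()
    with torch.no_grad():
        # First-order approximation: reuse the copy's gradients for the
        # original parameters, then take an ordinary optimizer step.
        for p, p_adapted in zip(model.parameters(), adapted.parameters()):
            if p_adapted.grad is not None:
                p.grad = p_adapted.grad.clone()
    optimizer.step()
    return query_loss.item()
```

Because the outer loss is always computed on tables that were not used in the inner step, the optimizer is pushed toward parameters that adapt well to tables the model has never seen, which is the stated goal of the strategy.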
Related papers
- LaTable: Towards Large Tabular Models [63.995130144110156]
Tabular generative foundation models are hard to build due to the heterogeneous feature spaces of different datasets.
LaTable is a novel diffusion model that addresses these challenges and can be trained across different datasets.
We find that LaTable outperforms baselines on in-distribution generation, and that finetuning LaTable can generate out-of-distribution datasets better with fewer samples.
arXiv Detail & Related papers (2024-06-25T16:03:50Z)
- AnnotatedTables: A Large Tabular Dataset with Language Model Annotations [8.602181445598776]
We show how machine learning can be used to automate the annotation of large volumes of diverse tabular data.
We release AnnotatedTables, a collection of 32,119 databases with LLM-generated annotations.
We evaluate the performance of TabPFN, a recent neural classifier trained on Bayesian priors, on 2,720 tables with input-target columns identified by LLMs.
arXiv Detail & Related papers (2024-06-24T06:44:14Z)
- Testing the Limits of Unified Sequence to Sequence LLM Pretraining on Diverse Table Data Tasks [2.690048852269647]
Our work is the first attempt at studying the advantages of a unified approach to table-specific pretraining when scaled from 770M to 11B sequence-to-sequence models.
arXiv Detail & Related papers (2023-10-01T21:06:15Z)
- Few-Shot Data-to-Text Generation via Unified Representation and Multi-Source Learning [114.54944761345594]
We present a novel approach for structured data-to-text generation that addresses the limitations of existing methods.
Our proposed method aims to improve performance in multi-task training, zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2023-08-10T03:09:12Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces a framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- XRICL: Cross-lingual Retrieval-Augmented In-Context Learning for Cross-lingual Text-to-SQL Semantic Parsing [70.40401197026925]
In-context learning using large language models has recently shown surprising results for semantic parsing tasks.
This work introduces the XRICL framework, which learns to retrieve relevant English exemplars for a given query.
We also include global translation exemplars for a target language to facilitate the translation process for large language models.
arXiv Detail & Related papers (2022-10-25T01:33:49Z)
- Augmenting Multi-Turn Text-to-SQL Datasets with Self-Play [46.07002748587857]
We explore augmenting the training datasets using self-play, which leverages contextual information to synthesize new interactions.
We find that self-play improves the accuracy of a strong baseline on SParC and CoSQL, two widely used text-to-SQL datasets.
arXiv Detail & Related papers (2022-10-21T16:40:07Z)
- Making Table Understanding Work in Practice [9.352813774921655]
We discuss three challenges of deploying table understanding models and propose a framework to address them.
We present SigmaTyper which encapsulates a hybrid model trained on GitTables and integrates a lightweight human-in-the-loop approach to customize the model.
arXiv Detail & Related papers (2021-09-11T03:38:24Z)
- IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation [61.09660709356527]
We propose a database schema interaction graph encoder to utilize historical information of database schema items.
We evaluate our model on the benchmark SParC and CoSQL datasets.
arXiv Detail & Related papers (2020-11-11T12:56:21Z)
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar (a toy sketch of this idea appears after this list).
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
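As referenced in the GraPPa entry above, question-SQL pairs can be synthesized with a synchronous context-free grammar that expands a natural-language template and a SQL template in lockstep. The Python sketch below only illustrates the idea; GraPPa's actual grammar, tables, and coverage are far richer, and all names here are made up for illustration.

```python
# Toy synchronous grammar: each production pairs a question template with a
# SQL template; both are filled from the same slot assignment sampled from a
# table. Illustrative only -- not GraPPa's real grammar.
import random

TOY_TABLE = {
    "name": "players",
    "values": {"team": ["Lakers", "Bulls"], "player": ["Kobe", "Jordan"]},
}

PRODUCTIONS = [
    ("how many rows have {col} equal to {val} ?",
     "SELECT COUNT(*) FROM {table} WHERE {col} = '{val}'"),
    ("what is the {agg_word} points of rows where {col} is {val} ?",
     "SELECT {agg}(points) FROM {table} WHERE {col} = '{val}'"),
]

AGGS = [("maximum", "MAX"), ("minimum", "MIN"), ("average", "AVG")]


def sample_pair(table=TOY_TABLE):
    """Sample one synthetic (question, SQL) pair from the synchronous grammar."""
    q_tpl, s_tpl = random.choice(PRODUCTIONS)
    col = random.choice(list(table["values"]))
    agg_word, agg = random.choice(AGGS)
    slots = {
        "table": table["name"],
        "col": col,
        "val": random.choice(table["values"][col]),
        "agg_word": agg_word,
        "agg": agg,
    }
    return q_tpl.format(**slots), s_tpl.format(**slots)


if __name__ == "__main__":
    question, sql = sample_pair()
    print(question)
    print(sql)
```

Pairs generated this way are aligned by construction, which is what makes them usable as pre-training signal for table semantic parsing.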