Multi-Hop Table Retrieval for Open-Domain Text-to-SQL
- URL: http://arxiv.org/abs/2402.10666v2
- Date: Wed, 19 Jun 2024 15:19:16 GMT
- Title: Multi-Hop Table Retrieval for Open-Domain Text-to-SQL
- Authors: Xuanliang Zhang, Dingzirui Wang, Longxu Dou, Qingfu Zhu, Wanxiang Che
- Abstract summary: We propose multi-hop table retrieval with rewrite and beam search (Murre).
We conduct experiments on SpiderUnion and BirdUnion+, reaching new state-of-the-art results with an average improvement of 6.38%.
- Score: 51.48239006107272
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Open-domain text-to-SQL is an important task that retrieves question-relevant tables from massive databases and then generates SQL. However, existing retrieval methods that retrieve in a single hop do not pay attention to the text-to-SQL challenge of schema linking, which is aligning the entities in the question with table entities, reflected in two aspects: similar irrelevant entity and domain mismatch entity. Therefore, we propose our method, the multi-hop table retrieval with rewrite and beam search (Murre). To reduce the effect of the similar irrelevant entity, our method focuses on unretrieved entities at each hop and considers the low-ranked tables by beam search. To alleviate the limitation of domain mismatch entity, Murre rewrites the question based on retrieved tables in multiple hops, decreasing the domain gap with relevant tables. We conduct experiments on SpiderUnion and BirdUnion+, reaching new state-of-the-art results with an average improvement of 6.38%.
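The retrieval loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word-overlap scorer, the `remove_covered` rewrite step, the multiplication of per-hop scores, and all function and table names are hypothetical stand-ins chosen for illustration. A real system would presumably use a learned retriever and a language-model rewriter. The sketch only shows how beam search over hops keeps several candidate table paths alive while each hop focuses on the part of the question that is not yet covered.

```python
from typing import Dict, List, Tuple


def tokenize(text: str) -> set:
    """Lowercase whitespace tokenization (toy stand-in for a real encoder)."""
    return set(text.lower().split())


def overlap_score(question: str, table_text: str) -> float:
    """Fraction of question words found in the table text (assumed scorer)."""
    q = tokenize(question)
    if not q:
        return 0.0
    return len(q & tokenize(table_text)) / len(q)


def remove_covered(question: str, table_text: str) -> str:
    """Toy rewrite step: drop question words the retrieved table already covers."""
    covered = tokenize(table_text)
    return " ".join(w for w in question.lower().split() if w not in covered)


def multi_hop_retrieve(
    question: str,
    tables: Dict[str, str],
    hops: int = 2,
    beam_size: int = 2,
) -> List[Tuple[Tuple[str, ...], float]]:
    """Return the best table paths and their cumulative scores."""
    # Each beam entry: (path of table names, cumulative score, remaining question).
    beams = [((), 1.0, question)]
    for _ in range(hops):
        candidates = []
        for path, score, remaining in beams:
            for name, text in tables.items():
                if name in path:
                    continue  # do not retrieve the same table twice on one path
                hop_score = overlap_score(remaining, text)
                candidates.append(
                    (path + (name,), score * hop_score, remove_covered(remaining, text))
                )
        if not candidates:
            break
        # Keep the top `beam_size` paths, so lower-ranked tables can still survive.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return [(path, score) for path, score, _ in beams]


# Hypothetical toy schema descriptions.
TABLES = {
    "singer": "singer name age country",
    "concert": "concert year stadium",
    "pets": "pet type weight",
}

results = multi_hop_retrieve("singer name concert year", TABLES)
best_path, best_score = results[0]
print(best_path, best_score)
```

In this toy run, the first hop keeps both `singer` and `concert` in the beam, and the second hop scores each remaining table against the rewritten question, so the path covering both tables ranks first.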
Related papers
- Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu).
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z)
- Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval [52.592071689901196]
We introduce a method that uncovers useful join relations for any query and database during table retrieval.
Our method outperforms the state-of-the-art approaches for table retrieval by up to 9.3% in F1 score and for end-to-end QA by up to 5.4% in accuracy.
arXiv Detail & Related papers (2024-04-15T15:55:01Z)
- Schema-Aware Multi-Task Learning for Complex Text-to-SQL [4.913409359995421]
We present a schema-aware multi-task learning framework (named MT) for complicated SQL queries.
Specifically, we design a schema linking discriminator module to distinguish the valid question-schema linkings.
On the decoder side, we define six types of relationships to describe the connections between tables and columns.
arXiv Detail & Related papers (2024-03-09T01:13:37Z)
- Enhancing Open-Domain Table Question Answering via Syntax- and Structure-aware Dense Retrieval [21.585255812861632]
Open-domain table question answering aims to provide answers to a question by retrieving and extracting information from a large collection of tables.
Existing studies of open-domain table QA either directly adopt text retrieval methods or consider the table structure only in the encoding layer for table retrieval.
We propose a syntax- and structure-aware retrieval method for the open-domain table QA task.
arXiv Detail & Related papers (2023-09-19T10:40:09Z)
- Towards Multi-Modal DBMSs for Seamless Querying of Texts and Tables [14.249508312922334]
We propose to extend databases with so-called multi-modal relational operators (MMOps).
MMOps allow text collections to be treated as tables without the need to manually transform the data.
Our MMDB prototype can outperform state-of-the-art approaches such as text-to-table in terms of accuracy and performance.
arXiv Detail & Related papers (2023-04-26T13:31:04Z)
- Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering [78.9863753810787]
A large amount of the world's knowledge is stored in structured databases.
Query languages can answer questions that require complex reasoning, while also offering full explainability.
arXiv Detail & Related papers (2021-08-05T22:04:13Z)
- DBTagger: Multi-Task Learning for Keyword Mapping in NLIDBs Using Bi-Directional Recurrent Neural Networks [0.2578242050187029]
We propose a novel deep learning based supervised approach that utilizes POS tags of NLQs.
We evaluate our approach on eight different datasets and report new state-of-the-art accuracy results, 92.4% on average.
arXiv Detail & Related papers (2021-01-11T22:54:39Z)
- Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing [110.97778888305506]
BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question.
BRIDGE attained state-of-the-art performance on popular cross-DB text-to-SQL benchmarks.
Our analysis shows that BRIDGE effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks.
arXiv Detail & Related papers (2020-12-23T12:33:52Z)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.