Multi-Hop Table Retrieval for Open-Domain Text-to-SQL
- URL: http://arxiv.org/abs/2402.10666v2
- Date: Wed, 19 Jun 2024 15:19:16 GMT
- Title: Multi-Hop Table Retrieval for Open-Domain Text-to-SQL
- Authors: Xuanliang Zhang, Dingzirui Wang, Longxu Dou, Qingfu Zhu, Wanxiang Che,
- Abstract summary: We propose a multi-hop table retrieval with rewrite and beam search (Murre)
We conduct experiments on SpiderUnion and BirdUnion+, reaching new state-of-the-art results with an average improvement of 6.38%.
- Score: 51.48239006107272
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Open-domain text-to-SQL is an important task that retrieves question-relevant tables from massive databases and then generates SQL. However, existing retrieval methods that retrieve in a single hop do not pay attention to the text-to-SQL challenge of schema linking, which is aligning the entities in the question with table entities, reflected in two aspects: similar irrelevant entity and domain mismatch entity. Therefore, we propose our method, the multi-hop table retrieval with rewrite and beam search (Murre). To reduce the effect of the similar irrelevant entity, our method focuses on unretrieved entities at each hop and considers the low-ranked tables by beam search. To alleviate the limitation of domain mismatch entity, Murre rewrites the question based on retrieved tables in multiple hops, decreasing the domain gap with relevant tables. We conduct experiments on SpiderUnion and BirdUnion+, reaching new state-of-the-art results with an average improvement of 6.38%.
Related papers
- Can we Retrieve Everything All at Once? ARM: An Alignment-Oriented LLM-based Retrieval Method [48.14236175156835]
ARM aims to better align the question with the organization of the data collection by exploring relationships among data objects.
It outperforms standard RAG with query decomposition by up to 5.2 pt in execution accuracy and agentic RAG (ReAct) by up to 15.9 pt.
It achieves up to 5.5 pt and 19.3 pt higher F1 match scores compared to these approaches.
arXiv Detail & Related papers (2025-01-30T18:07:19Z) - Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu)
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z) - Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval [52.592071689901196]
We introduce a method that uncovers useful join relations for any query and database during table retrieval.
Our method outperforms the state-of-the-art approaches for table retrieval by up to 9.3% in F1 score and for end-to-end QA by up to 5.4% in accuracy.
arXiv Detail & Related papers (2024-04-15T15:55:01Z) - Denoising Table-Text Retrieval for Open-Domain Question Answering [6.711626456283439]
In table-text open-domain question answering, a retriever system retrieves relevant evidence from tables and text to answer questions.
Previous studies have two common challenges: firstly, their retrievers can be affected by false-positive labels in training datasets.
We propose Denoised Table-Text Retriever (DoTTeR) to overcome these issues.
arXiv Detail & Related papers (2024-03-26T11:44:49Z) - Enhancing Open-Domain Table Question Answering via Syntax- and
Structure-aware Dense Retrieval [21.585255812861632]
Open-domain table question answering aims to provide answers to a question by retrieving and extracting information from a large collection of tables.
Existing studies of open-domain table QA either directly adopt text retrieval methods or consider the table structure only in the encoding layer for table retrieval.
We propose a syntax- and structure-aware retrieval method for the open-domain table QA task.
arXiv Detail & Related papers (2023-09-19T10:40:09Z) - Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open
Domain Question Answering [78.9863753810787]
A large amount of world's knowledge is stored in structured databases.
query languages can answer questions that require complex reasoning, as well as offering full explainability.
arXiv Detail & Related papers (2021-08-05T22:04:13Z) - Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z) - Answering Any-hop Open-domain Questions with Iterative Document
Reranking [62.76025579681472]
We propose a unified QA framework to answer any-hop open-domain questions.
Our method consistently achieves performance comparable to or better than the state-of-the-art on both single-hop and multi-hop open-domain QA datasets.
arXiv Detail & Related papers (2020-09-16T04:31:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.