Denoising Table-Text Retrieval for Open-Domain Question Answering
- URL: http://arxiv.org/abs/2403.17611v1
- Date: Tue, 26 Mar 2024 11:44:49 GMT
- Title: Denoising Table-Text Retrieval for Open-Domain Question Answering
- Authors: Deokhyung Kang, Baikjin Jung, Yunsu Kim, Gary Geunbae Lee
- Abstract summary: In table-text open-domain question answering, a retriever system retrieves relevant evidence from tables and text to answer questions.
Previous studies face two common challenges: their retrievers can be affected by false-positive labels in the training data, and they may struggle to provide appropriate evidence for questions that require reasoning across the table.
We propose Denoised Table-Text Retriever (DoTTeR) to overcome these issues.
- Score: 6.711626456283439
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In table-text open-domain question answering, a retriever system retrieves relevant evidence from tables and text to answer questions. Previous studies in table-text open-domain question answering have two common challenges: firstly, their retrievers can be affected by false-positive labels in training datasets; secondly, they may struggle to provide appropriate evidence for questions that require reasoning across the table. To address these issues, we propose Denoised Table-Text Retriever (DoTTeR). Our approach involves utilizing a denoised training dataset with fewer false positive labels by discarding instances with lower question-relevance scores measured through a false positive detection model. Subsequently, we integrate table-level ranking information into the retriever to assist in finding evidence for questions that demand reasoning across the table. To encode this ranking information, we fine-tune a rank-aware column encoder to identify minimum and maximum values within a column. Experimental results demonstrate that DoTTeR significantly outperforms strong baselines on both retrieval recall and downstream QA tasks. Our code is available at https://github.com/deokhk/DoTTeR.
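The abstract describes two concrete mechanisms: discarding training instances whose question-relevance score, as judged by a false positive detection model, falls below a threshold, and a rank-aware column encoder fine-tuned to identify minimum and maximum values within a column. The following is a minimal sketch of the denoising step only; the scorer, the threshold, and the data format are illustrative assumptions and not taken from the DoTTeR repository.

```python
# Hypothetical sketch of the denoising step described in the abstract; the
# model, threshold, and data layout are assumptions, not DoTTeR's actual code.
from dataclasses import dataclass
from typing import List

from sentence_transformers import CrossEncoder  # stand-in question-relevance scorer


@dataclass
class TrainingInstance:
    question: str
    evidence: str      # linearized table row plus linked passage text
    is_positive: bool  # distant-supervision label; may be a false positive


def denoise(instances: List[TrainingInstance],
            scorer: CrossEncoder,
            threshold: float = 0.5) -> List[TrainingInstance]:
    """Drop 'positive' instances whose question-relevance score is below the threshold."""
    scores = scorer.predict([(inst.question, inst.evidence) for inst in instances])
    kept = []
    for inst, score in zip(instances, scores):
        # Negatives pass through unchanged; only suspected false positives are removed.
        if inst.is_positive and score < threshold:
            continue
        kept.append(inst)
    return kept


if __name__ == "__main__":
    # The threshold depends on the scorer's score range and would be tuned on held-out data.
    scorer = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    data = [
        TrainingInstance("Who won the 2010 final?", "2010 | Spain | 1-0 | ...", True),
        TrainingInstance("Who won the 2010 final?", "2006 | Italy | ...", True),  # likely noise
    ]
    print(f"kept {len(denoise(data, scorer))} of {len(data)} instances")
```

In this sketch an off-the-shelf cross-encoder stands in for the paper's false positive detection model; the rank-aware column encoder mentioned in the abstract is a separate fine-tuning step and is not shown here.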
Related papers
- KET-QA: A Dataset for Knowledge Enhanced Table Question Answering [63.56707527868466]
We propose to use a knowledge base (KB) as the external knowledge source for TableQA.
Every question requires the integration of information from both the table and the sub-graph to be answered.
We design a retriever-reasoner structured pipeline model to extract pertinent information from the vast knowledge sub-graph.
arXiv Detail & Related papers (2024-05-13T18:26:32Z) - Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval [52.592071689901196]
We introduce a method that uncovers useful join relations for any query and database during table retrieval.
Our method outperforms the state-of-the-art approaches for table retrieval by up to 9.3% in F1 score and for end-to-end QA by up to 5.4% in accuracy.
arXiv Detail & Related papers (2024-04-15T15:55:01Z) - MFORT-QA: Multi-hop Few-shot Open Rich Table Question Answering [3.1651118728570635]
In today's fast-paced industry, professionals face the challenge of summarizing a large number of documents and extracting vital information from them on a daily basis.
To address this challenge, the approach of Table Question Answering (QA) has been developed to extract the relevant information.
Recent advancements in Large Language Models (LLMs) have opened up new possibilities for extracting information from tabular data using prompts.
arXiv Detail & Related papers (2024-03-28T03:14:18Z) - MURRE: Multi-Hop Table Retrieval with Removal for Open-Domain Text-to-SQL [51.48239006107272]
Multi-hop table retrieval with removal (MURRE) removes previously retrieved information from the question to guide towards unretrieved relevant tables.
Experiments on two open-domain text-to-SQL datasets demonstrate an average improvement of 5.7% over the previous state-of-the-art results.
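The removal idea summarized above lends itself to a short sketch: after each retrieval hop, question tokens already covered by retrieved tables are dropped, so the next hop is steered toward relevant tables that have not been retrieved yet. The retriever interface and the token-overlap masking below are illustrative assumptions, not MURRE's actual procedure.

```python
# Hypothetical sketch of multi-hop table retrieval with removal; the retriever
# interface and the masking strategy are assumptions for illustration only.
from typing import Callable, List, Set


def multi_hop_retrieve(question: str,
                       retrieve: Callable[[str, int], List[str]],
                       hops: int = 2,
                       top_k: int = 3) -> List[str]:
    """Retrieve tables over several hops, removing already-covered tokens from the query."""
    query = question
    retrieved: List[str] = []
    seen: Set[str] = set()
    for _ in range(hops):
        for table in retrieve(query, top_k):
            if table not in seen:
                seen.add(table)
                retrieved.append(table)
        # "Removal": drop question tokens already covered by retrieved tables so the
        # next hop favors relevant tables that are still missing.
        covered = {tok.lower() for table in retrieved for tok in table.split()}
        query = " ".join(tok for tok in question.split() if tok.lower() not in covered)
        if not query:
            break
    return retrieved
```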
arXiv Detail & Related papers (2024-02-16T13:14:35Z) - Open-Set Knowledge-Based Visual Question Answering with Inference Paths [79.55742631375063]
The purpose of Knowledge-Based Visual Question Answering (KB-VQA) is to provide a correct answer to the question with the aid of external knowledge bases.
We propose a new retriever-ranker paradigm of KB-VQA, Graph pATH rankER (GATHER for brevity).
Specifically, it contains graph constructing, pruning, and path-level ranking, which not only retrieves accurate answers but also provides inference paths that explain the reasoning process.
arXiv Detail & Related papers (2023-10-12T09:12:50Z) - S$^3$HQA: A Three-Stage Approach for Multi-hop Text-Table Hybrid Question Answering [27.66777544627217]
Existing models mainly adopt a retriever-reader framework, which has several deficiencies.
We propose a three-stage TextTableQA framework, S3HQA, which comprises a retriever, a selector, and a reasoner.
When trained on the full dataset, our approach outperforms all baseline methods, ranking first on the HybridQA leaderboard.
arXiv Detail & Related papers (2023-05-19T15:01:48Z) - Mixed-modality Representation Learning and Pre-training for Joint Table-and-Text Retrieval in OpenQA [85.17249272519626]
An optimized OpenQA Table-Text Retriever (OTTeR) is proposed.
We conduct retrieval-centric mixed-modality synthetic pre-training.
OTTeR substantially improves the performance of table-and-text retrieval on the OTT-QA dataset.
arXiv Detail & Related papers (2022-10-11T07:04:39Z) - Open Domain Question Answering over Tables via Dense Retrieval [8.963951462217421]
We present an effective pre-training procedure for our retriever and improve retrieval quality with mined hard negatives.
We find that our retriever improves retrieval results from 72.0 to 81.1 (at rank 10) and end-to-end QA results from 33.8 to 37.7 exact match over a BERT-based retriever.
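As a rough illustration of hard-negative mining for a dense table retriever, the sketch below treats highly ranked tables that are neither the gold table nor contain the gold answer as negatives; this criterion and the retriever interface are assumptions, not necessarily the paper's exact procedure.

```python
# Hypothetical sketch of hard-negative mining from a retriever's own top results;
# the selection criterion and interface are assumptions, not the paper's code.
from typing import Callable, List, Tuple


def mine_hard_negatives(question: str,
                        answer: str,
                        gold_table_id: str,
                        retrieve: Callable[[str, int], List[Tuple[str, str]]],
                        top_k: int = 50,
                        max_negatives: int = 5) -> List[str]:
    """Return ids of highly ranked tables that are neither the gold table nor contain the answer."""
    negatives: List[str] = []
    for table_id, table_text in retrieve(question, top_k):
        if table_id == gold_table_id or answer.lower() in table_text.lower():
            continue  # skip the gold table and accidental positives
        negatives.append(table_id)
        if len(negatives) == max_negatives:
            break
    return negatives
```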
arXiv Detail & Related papers (2021-03-22T17:01:04Z) - Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)