In Situ Answer Sentence Selection at Web-scale
- URL: http://arxiv.org/abs/2201.05984v1
- Date: Sun, 16 Jan 2022 06:36:00 GMT
- Title: In Situ Answer Sentence Selection at Web-scale
- Authors: Zeyu Zhang, Thuy Vu, Alessandro Moschitti
- Abstract summary: Passage-based Extracting Answer Sentence In-place (PEASI) is a novel design for AS2 optimized for the Web-scale setting.
We train PEASI in a multi-task learning framework that encourages feature sharing between the components: passage reranker and passage-based answer sentence extractor.
Experiments show PEASI effectively outperforms the current state-of-the-art setting for AS2, i.e., a point-wise model for ranking sentences independently, by 6.51% in accuracy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current answer sentence selection (AS2) applied in open-domain question
answering (ODQA) selects answers by ranking a large set of possible candidates,
i.e., sentences, extracted from the retrieved text. In this paper, we present
Passage-based Extracting Answer Sentence In-place (PEASI), a novel design for
AS2 optimized for the Web-scale setting, which instead computes the answer
without processing each candidate individually. Specifically, we design a
Transformer-based framework that jointly (i) reranks passages retrieved for a
question and (ii) identifies a probable answer from the top passages in place.
We train PEASI in a multi-task learning framework that encourages feature
sharing between the components: passage reranker and passage-based answer
sentence extractor. To facilitate our development, we construct a new
Web-sourced large-scale QA dataset consisting of 800,000+ labeled
passages/sentences for 60,000+ questions. The experiments show that our
proposed design effectively outperforms the current state-of-the-art setting
for AS2, i.e., a point-wise model for ranking sentences independently, by 6.51%
in accuracy, from 48.86% to 55.37%. In addition, PEASI is exceptionally
efficient in computing answer sentences, requiring only ~20% of the inferences
needed by the standard setting, i.e., reranking all possible candidates. We
believe the release of PEASI, both the dataset and our proposed design, can
contribute to advancing the research and development in deploying question
answering services at Web scale.
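The efficiency claim above can be made concrete with a back-of-the-envelope sketch. The snippet below is purely illustrative, not code from the paper: it stubs out model calls and only counts inferences, assuming the standard point-wise setting scores every candidate sentence while a PEASI-style pipeline runs one pass per retrieved passage plus one extraction pass per top-ranked passage. All function names and the example numbers are hypothetical.

```python
# Schematic inference-count comparison: point-wise AS2 vs a
# PEASI-style passage-level pipeline. Model calls are stubbed out;
# names and figures are illustrative, not taken from the paper's code.

def pointwise_inferences(num_sentences: int) -> int:
    # Standard AS2 setting: one model inference per candidate sentence.
    return num_sentences

def peasi_style_inferences(num_passages: int, top_k: int) -> int:
    # PEASI-style setting: one inference per passage for reranking,
    # then one in-place extraction pass over each of the top-k passages.
    return num_passages + top_k

# Hypothetical retrieval result: 100 passages of 10 sentences each,
# i.e., 1,000 candidate sentences in total.
standard = pointwise_inferences(1000)   # 1,000 inferences
passage_based = peasi_style_inferences(100, 5)  # 105 inferences
print(f"relative cost: {passage_based / standard:.2f}")
```

Under these assumed numbers the passage-level pipeline uses roughly a tenth of the inferences; the exact ratio depends on passage length and how many top passages are extracted from, which is consistent in spirit with the ~20% figure reported in the abstract.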
Related papers
- Question-Context Alignment and Answer-Context Dependencies for Effective Answer Sentence Selection (2023-06-03):
  We propose to improve candidate scoring by explicitly incorporating the dependencies between question-context and answer-context into the final representation of a candidate. Our proposed model achieves significant improvements on popular AS2 benchmarks, i.e., WikiQA and WDRASS, obtaining new state of the art on all benchmarks.
- Context-Aware Transformer Pre-Training for Answer Sentence Selection (2023-05-24):
  We propose three pre-training objectives designed to mimic the downstream fine-tuning task of contextual AS2. Our experiments show that our pre-training approaches can improve baseline contextual AS2 accuracy by up to 8% on some datasets.
- Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection (2022-05-20):
  We propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents. Our experiments on three public and one industrial AS2 dataset demonstrate the empirical superiority of our pre-trained transformers over baseline models.
- Exploring Dense Retrieval for Dialogue Response Selection (2021-10-13):
  We present a solution to directly select proper responses from a large corpus, or even a nonparallel corpus, using a dense retrieval model. For the re-rank setting, the superiority is quite surprising given its simplicity; for the full-rank setting, we are the first to perform such an evaluation.
- Modeling Context in Answer Sentence Selection Systems on a Latency Budget (2021-01-28):
  We present an approach to efficiently incorporate contextual information in AS2 models. For each answer candidate, we first use unsupervised similarity techniques to extract relevant sentences from its source document. Our best approach, which leverages a multi-way attention architecture to efficiently encode context, improves 6% to 11% over the non-contextual state of the art in AS2 with minimal impact on system latency.
- Context-based Transformer Models for Answer Sentence Selection (2020-06-01):
  In this paper, we analyze the role of contextual information in the sentence selection task. We propose a Transformer-based architecture that leverages two types of context, local and global. The results show that combining local and global contexts in a Transformer model significantly improves accuracy in Answer Sentence Selection.
- A Study on Efficiency, Accuracy and Document Structure for Answer Sentence Selection (2020-03-04):
  In this paper, we argue that by exploiting the intrinsic structure of the original rank together with an effective word-relatedness encoder, we can achieve competitive results. Our model takes 9.5 seconds to train on the WikiQA dataset, i.e., very fast in comparison with the ~18 minutes required by a standard BERT-base fine-tuning.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.