Reference-based Weak Supervision for Answer Sentence Selection using Web Data
- URL: http://arxiv.org/abs/2104.08943v1
- Date: Sun, 18 Apr 2021 19:41:17 GMT
- Title: Reference-based Weak Supervision for Answer Sentence Selection using Web Data
- Authors: Vivek Krishnamurthy, Thuy Vu, Alessandro Moschitti
- Abstract summary: We introduce Reference-based Weak Supervision (RWS), a fully automatic large-scale data pipeline.
RWS harvests high-quality weakly-supervised answers from abundant Web data.
Our experiments indicate that the produced data consistently bolsters TANDA.
- Score: 87.18646699292293
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Answer sentence selection (AS2) modeling requires annotated data, i.e.,
hand-labeled question-answer pairs. We present a strategy to collect weakly
supervised answers for a question based on its reference to improve AS2
modeling. Specifically, we introduce Reference-based Weak Supervision (RWS), a
fully automatic large-scale data pipeline that harvests high-quality
weakly-supervised answers from abundant Web data requiring only a
question-reference pair as input. We study the efficacy and robustness of RWS
in the setting of TANDA, a recent state-of-the-art fine-tuning approach
specialized for AS2. Our experiments indicate that the produced data
consistently bolsters TANDA. We achieve the state of the art in terms of P@1,
90.1%, and MAP, 92.9%, on WikiQA.
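For illustration only, the sketch below shows one way a reference-based weak-labeling step and the reported metrics could look in code: each Web-retrieved candidate sentence is scored by token-overlap F1 against the reference answer and thresholded into a weak positive/negative label, and P@1 and MAP are then computed over the ranked candidates. The similarity function, the 0.5 threshold, and the toy data are assumptions made for this sketch, not the actual RWS pipeline.

```python
# Minimal sketch of reference-based weak labeling for AS2 (assumptions only,
# not the paper's RWS pipeline): token-overlap F1 against the reference answer
# serves as a weak relevance signal; P@1 and MAP are computed over the ranking.
from collections import Counter
from typing import List, Tuple


def overlap_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between a candidate sentence and the reference answer."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    common = sum((Counter(cand) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(cand), common / len(ref)
    return 2 * precision * recall / (precision + recall)


def weak_label(candidates: List[str], reference: str,
               threshold: float = 0.5) -> List[Tuple[str, int]]:
    """Assign weak positive (1) / negative (0) labels by thresholding the score."""
    return [(c, int(overlap_f1(c, reference) >= threshold)) for c in candidates]


def precision_at_1(ranked_labels: List[int]) -> float:
    """P@1: is the top-ranked candidate a correct answer?"""
    return float(ranked_labels[0] == 1)


def average_precision(ranked_labels: List[int]) -> float:
    """AP for one question; MAP is the mean of AP over all questions."""
    hits, ap = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == 1:
            hits += 1
            ap += hits / rank
    return ap / hits if hits else 0.0


if __name__ == "__main__":
    reference = "Barack Obama was born in Honolulu, Hawaii, in 1961."
    candidates = [
        "Obama was born in 1961 in Honolulu, Hawaii.",
        "Honolulu is the capital of Hawaii.",
        "He served two terms as president.",
    ]
    labeled = weak_label(candidates, reference)
    # Rank by the same overlap score, as a stand-in for an AS2 model's score.
    ranked = sorted(labeled, key=lambda x: overlap_f1(x[0], reference), reverse=True)
    labels = [label for _, label in ranked]
    print("weak labels:", labeled)
    print("P@1:", precision_at_1(labels), "AP:", average_precision(labels))
```

In the paper's setting, weak labels of this kind would augment the supervised question-answer pairs used in the TANDA fine-tuning stages; here the overlap score simply stands in for a trained model's ranking score.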
Related papers
- Extracting Psychological Indicators Using Question Answering [0.0]
We propose a method for extracting text spans that may indicate one of the BIG5 psychological traits using a question-answering task with examples that have no answer for the asked question.
We use a RoBERTa model fine-tuned on the SQuAD 2.0 dataset (a minimal sketch of this no-answer setup appears after this list).
arXiv Detail & Related papers (2023-05-24T08:41:23Z)
- Knowledge Transfer from Answer Ranking to Answer Generation [97.38378660163414]
We propose to train a GenQA model by transferring knowledge from a trained AS2 model.
We also propose to use the AS2 model prediction scores for loss weighting and score-conditioned input/output shaping.
arXiv Detail & Related papers (2022-10-23T21:51:27Z)
- Question-Answer Sentence Graph for Joint Modeling Answer Selection [122.29142965960138]
We train and integrate state-of-the-art (SOTA) models for computing scores between question-question, question-answer, and answer-answer pairs.
Online inference is then performed to solve the AS2 task on unseen queries.
arXiv Detail & Related papers (2022-02-16T05:59:53Z)
- Joint Models for Answer Verification in Question Answering Systems [85.93456768689404]
We build a three-way multi-classifier, which decides if an answer supports, refutes, or is neutral with respect to another one.
We tested our models on WikiQA, TREC-QA, and a real-world dataset.
arXiv Detail & Related papers (2021-07-09T05:34:36Z)
- Modeling Context in Answer Sentence Selection Systems on a Latency Budget [87.45819843513598]
We present an approach to efficiently incorporate contextual information in AS2 models.
For each answer candidate, we first use unsupervised similarity techniques to extract relevant sentences from its source document.
Our best approach, which leverages a multi-way attention architecture to efficiently encode context, improves 6% to 11% over the non-contextual state of the art in AS2 with minimal impact on system latency.
arXiv Detail & Related papers (2021-01-28T16:24:48Z)
- Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA).
First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA).
Second, we take advantage of the QA model to extract more appropriate answers, which iteratively refines data over RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)
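As a rough illustration of the no-answer extractive QA setup referenced in the first related paper above (a RoBERTa model fine-tuned on SQuAD 2.0), the sketch below uses the Hugging Face Transformers question-answering pipeline. The specific checkpoint name and the example texts are assumptions for the sketch, not details taken from that paper.

```python
# Minimal sketch (assumed setup, not the authors' code): extractive QA with a
# no-answer option, using a public RoBERTa checkpoint fine-tuned on SQuAD 2.0.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

result = qa(
    question="Does the respondent describe openness to new experiences?",
    context="The respondent reports a strict daily routine and a dislike of surprises.",
    handle_impossible_answer=True,  # permit an empty span when no answer exists
)
# Under SQuAD 2.0 conventions, an empty 'answer' string signals "no answer".
print(result)
```

Spans returned for answerable questions can then be treated as candidate indicators, while empty answers mark questions the text does not address.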