Pre-training Transformer Models with Sentence-Level Objectives for
Answer Sentence Selection
- URL: http://arxiv.org/abs/2205.10455v1
- Date: Fri, 20 May 2022 22:39:00 GMT
- Title: Pre-training Transformer Models with Sentence-Level Objectives for
Answer Sentence Selection
- Authors: Luca Di Liello, Siddhant Garg, Luca Soldaini, Alessandro Moschitti
- Abstract summary: We propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents.
Our experiments on three public and one industrial AS2 datasets demonstrate the empirical superiority of our pre-trained transformers over baseline models.
- Score: 99.59693674455582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An important task for designing QA systems is answer sentence selection
(AS2): selecting the sentence containing (or constituting) the answer to a
question from a set of retrieved relevant documents. In this paper, we propose
three novel sentence-level transformer pre-training objectives that incorporate
paragraph-level semantics within and across documents, to improve the
performance of transformers for AS2, and mitigate the requirement of large
labeled datasets. Our experiments on three public and one industrial AS2
datasets demonstrate the empirical superiority of our pre-trained transformers
over baseline models such as RoBERTa and ELECTRA for AS2.
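For readers unfamiliar with the task setup, below is a minimal, illustrative sketch of how AS2 is typically run at inference time with a transformer cross-encoder: each (question, candidate sentence) pair is scored for answer relevance and the candidates are ranked by that score. The checkpoint name, label convention, and Hugging Face transformers usage are assumptions for illustration only, not the authors' released code or pre-trained models.

```python
# Minimal AS2 inference sketch (illustrative only, not the authors' released code).
# Assumes a Hugging Face sequence-classification checkpoint where the positive class
# (label index 1) indicates that the candidate sentence answers the question.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-base"  # placeholder; in practice a checkpoint fine-tuned for AS2

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def rank_candidates(question: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Score each (question, sentence) pair and rank sentences by answer likelihood."""
    batch = tokenizer(
        [question] * len(candidates),  # question paired with every candidate
        candidates,
        padding=True,
        truncation=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**batch).logits
    # Probability of the "answers the question" class (assumed to be label index 1).
    scores = torch.softmax(logits, dim=-1)[:, 1].tolist()
    return sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)

question = "Where was Barack Obama born?"
candidates = [
    "Barack Obama served as the 44th President of the United States.",
    "Obama was born in Honolulu, Hawaii.",
    "The capital of Hawaii is Honolulu.",
]
print(rank_candidates(question, candidates)[0])  # top-ranked candidate and its score
```

In practice the classification head would first be fine-tuned on labeled AS2 pairs (e.g., from the public datasets mentioned above) before the scores become meaningful.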
Related papers
- Datasets for Multilingual Answer Sentence Selection [59.28492975191415]
We introduce new high-quality datasets for AS2 in five European languages (French, German, Italian, Portuguese, and Spanish).
Results indicate that our datasets are pivotal in producing robust and powerful multilingual AS2 models.
arXiv Detail & Related papers (2024-06-14T16:50:29Z)
- Context-Aware Transformer Pre-Training for Answer Sentence Selection [102.7383811376319]
We propose three pre-training objectives designed to mimic the downstream fine-tuning task of contextual AS2.
Our experiments show that our pre-training approaches can improve baseline contextual AS2 accuracy by up to 8% on some datasets.
arXiv Detail & Related papers (2023-05-24T17:10:45Z)
- Paragraph-based Transformer Pre-training for Multi-Sentence Inference [99.59693674455582]
We show that popular pre-trained transformers perform poorly when fine-tuned on multi-candidate inference tasks.
We then propose a new pre-training objective that models the paragraph-level semantics across multiple input sentences.
arXiv Detail & Related papers (2022-05-02T21:41:14Z)
- Boosting Transformers for Job Expression Extraction and Classification in a Low-Resource Setting [12.489741131691737]
We present our approaches to tackle the extraction and classification of job expressions in Spanish texts.
Being neither language nor domain experts, we experiment with the multilingual XLM-R transformer model.
Our results show strong improvements of up to 5.3 F1 points with these methods compared to a fine-tuned XLM-R model.
arXiv Detail & Related papers (2021-09-17T15:21:02Z)
- Answer Generation for Retrieval-based Question Answering Systems [80.28727681633096]
We train a sequence-to-sequence transformer model to generate an answer from a candidate set; a minimal illustrative sketch of this setup appears after this list.
Our tests on three English AS2 datasets show an improvement of up to 32 absolute accuracy points over the state of the art.
arXiv Detail & Related papers (2021-06-02T05:45:49Z)
- Utilizing Bidirectional Encoder Representations from Transformers for Answer Selection [16.048329028104643]
We adopt a transformer-based model for the language modeling task on a large dataset and fine-tune it for downstream tasks.
We find that fine-tuning the BERT model for the answer selection task is very effective, observing a maximum improvement of 13.1% on the QA datasets and 18.7% on the CQA datasets.
arXiv Detail & Related papers (2020-11-14T03:15:26Z)
- Context-based Transformer Models for Answer Sentence Selection [109.96739477808134]
In this paper, we analyze the role of the contextual information in the sentence selection task.
We propose a Transformer-based architecture that leverages two types of contexts, local and global.
The results show that the combination of local and global contexts in a Transformer model significantly improves the accuracy in Answer Sentence Selection.
arXiv Detail & Related papers (2020-06-01T21:52:19Z)
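As referenced in the Answer Generation entry above, the following is a hedged sketch of generating an answer from a question plus a set of retrieved candidate sentences with a sequence-to-sequence transformer. The t5-small checkpoint and the prompt format are hypothetical choices for illustration; the cited paper's exact model and input encoding may differ.

```python
# Illustrative sketch of answer generation from retrieved candidates (not the cited
# paper's released code). Assumes a generic T5 checkpoint; the prompt format below
# is a hypothetical choice made only for demonstration.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "t5-small"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def generate_answer(question: str, candidates: list[str], max_new_tokens: int = 32) -> str:
    """Concatenate the question with top-ranked candidate sentences and generate an answer."""
    prompt = "question: " + question + " candidates: " + " ".join(candidates)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(generate_answer(
    "Where was Barack Obama born?",
    ["Obama was born in Honolulu, Hawaii.", "Barack Obama was the 44th U.S. President."],
))
```

A generation-based reader like this is complementary to the selection-based approach sketched earlier: the ranker picks supporting sentences, while the generator composes the final answer string.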