Paragraph-based Transformer Pre-training for Multi-Sentence Inference
- URL: http://arxiv.org/abs/2205.01228v1
- Date: Mon, 2 May 2022 21:41:14 GMT
- Title: Paragraph-based Transformer Pre-training for Multi-Sentence Inference
- Authors: Luca Di Liello, Siddhant Garg, Luca Soldaini, Alessandro Moschitti
- Abstract summary: We show that popular pre-trained transformers perform poorly when used for fine-tuning on multi-candidate inference tasks.
We then propose a new pre-training objective that models the paragraph-level semantics across multiple input sentences.
- Score: 99.59693674455582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inference tasks such as answer sentence selection (AS2) or fact verification
are typically solved by fine-tuning transformer-based models as individual
sentence-pair classifiers. Recent studies show that these tasks benefit from
modeling dependencies across multiple candidate sentences jointly. In this
paper, we first show that popular pre-trained transformers perform poorly when
used for fine-tuning on multi-candidate inference tasks. We then propose a new
pre-training objective that models the paragraph-level semantics across
multiple input sentences. Our evaluation on three AS2 and one fact verification
datasets demonstrates the superiority of our pre-training technique over the
traditional ones for transformers used as joint models for multi-candidate
inference tasks, as well as when used as cross-encoders for sentence-pair
formulations of these tasks.
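To make the two formulations in the abstract concrete, here is a minimal sketch (not the paper's implementation) contrasting independent sentence-pair cross-encoding with a joint multi-candidate input that lets self-attention span all candidates. The checkpoint, example texts, and separator scheme are illustrative assumptions.
```python
# Minimal sketch, not the paper's code: sentence-pair (cross-encoder) scoring vs. a
# joint multi-candidate input for answer sentence selection (AS2).
# The checkpoint and example texts below are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()  # the classification head is untrained here; real use would fine-tune it

question = "Who wrote The Old Man and the Sea?"
candidates = [
    "The Old Man and the Sea was written by Ernest Hemingway in 1951.",
    "The novella is set in a small fishing village near Havana.",
    "Hemingway received the Pulitzer Prize for Fiction in 1953.",
]

# (a) Sentence-pair formulation: each candidate is scored in isolation.
pair_batch = tokenizer([question] * len(candidates), candidates,
                       padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    pair_scores = model(**pair_batch).logits.softmax(dim=-1)[:, 1]

# (b) Joint multi-candidate formulation: all candidates share one forward pass,
# so self-attention can model dependencies across candidates. A real joint model
# would add a per-candidate classification head on top of this encoding.
joint_text = f" {tokenizer.sep_token} ".join(candidates)
joint_batch = tokenizer(question, joint_text, truncation=True, return_tensors="pt")
with torch.no_grad():
    joint_output = model(**joint_batch)

print(pair_scores.tolist())
```
The abstract's point is that encoders pre-trained with standard objectives struggle when fine-tuned on inputs like (b), which is what the proposed paragraph-level pre-training objective targets.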
Related papers
- Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection [99.59693674455582]
We propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents.
Our experiments on three public and one industrial AS2 datasets demonstrate the empirical superiority of our pre-trained transformers over baseline models.
arXiv Detail & Related papers (2022-05-20T22:39:00Z)
- Dependency Learning for Legal Judgment Prediction with a Unified Text-to-Text Transformer [13.896506220470748]
Legal Judgment Prediction (LJP) involves a series of sub-tasks, such as predicting violated law articles, charges, and terms of penalty.
We propose leveraging a unified text-to-text Transformer for LJP.
We show that this unified transformer, albeit pretrained on general-domain text, outperforms pretrained models tailored specifically for the legal domain.
arXiv Detail & Related papers (2021-12-13T01:38:37Z)
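As a concrete illustration of the unified text-to-text formulation mentioned in the entry above, here is a hedged sketch of casting one LJP sub-task as conditional generation; the prompt prefix, checkpoint, and example facts are assumptions rather than the paper's setup.
```python
# Hedged sketch of a text-to-text formulation for one LJP sub-task; the prompt
# format, checkpoint, and example are illustrative assumptions, not the paper's setup.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

facts = "The defendant entered a locked shed at night and took a bicycle without permission."
prompt = f"predict charge: {facts}"  # other sub-tasks would use other task prefixes

input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output_ids = model.generate(input_ids, max_new_tokens=8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```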
- Discriminative and Generative Transformer-based Models For Situation Entity Classification [8.029049649310211]
We re-examine the situation entity (SE) classification task with varying amounts of available training data.
We exploit a Transformer-based variational autoencoder to encode sentences into a lower dimensional latent space.
arXiv Detail & Related papers (2021-09-15T17:07:07Z)
- Consistency Regularization for Cross-Lingual Fine-Tuning [61.08704789561351]
We propose to improve cross-lingual fine-tuning with consistency regularization.
Specifically, we use example consistency regularization to penalize the prediction sensitivity to four types of data augmentations.
Experimental results on the XTREME benchmark show that our method significantly improves cross-lingual fine-tuning across various tasks.
arXiv Detail & Related papers (2021-06-15T15:35:44Z)
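Here is a minimal sketch of the example consistency regularization idea summarized in the entry above: penalize divergence between predictions on an example and on an augmented copy. The symmetric-KL form and the loss weighting are assumptions, not the paper's exact objective.
```python
# Hedged sketch of example consistency regularization: penalize how much the
# prediction changes under data augmentation. The symmetric KL form and the
# loss weighting below are assumptions, not the paper's exact objective.
import torch
import torch.nn.functional as F

def consistency_loss(logits_orig: torch.Tensor, logits_aug: torch.Tensor) -> torch.Tensor:
    """Symmetric KL divergence between predictions on original and augmented inputs."""
    log_p = F.log_softmax(logits_orig, dim=-1)
    log_q = F.log_softmax(logits_aug, dim=-1)
    kl_pq = F.kl_div(log_q, log_p.exp(), reduction="batchmean")  # KL(p || q)
    kl_qp = F.kl_div(log_p, log_q.exp(), reduction="batchmean")  # KL(q || p)
    return 0.5 * (kl_pq + kl_qp)

# Example usage with dummy logits for a 3-class task:
logits_orig = torch.randn(8, 3)
logits_aug = logits_orig + 0.1 * torch.randn(8, 3)   # stands in for an augmented view
task_loss = torch.tensor(0.0)                        # placeholder for the supervised loss
total_loss = task_loss + 1.0 * consistency_loss(logits_orig, logits_aug)
```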
- Utilizing Bidirectional Encoder Representations from Transformers for Answer Selection [16.048329028104643]
We adopt a transformer-based model for the language modeling task on a large dataset and fine-tune it for downstream tasks.
We find that fine-tuning the BERT model for the answer selection task is very effective and observe a maximum improvement of 13.1% in the QA datasets and 18.7% in the CQA datasets.
arXiv Detail & Related papers (2020-11-14T03:15:26Z)
- Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
arXiv Detail & Related papers (2020-10-10T14:03:20Z)
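The entry above describes a controller trained with reinforcement learning to pick embedding concatenations; below is an illustrative sketch of that idea using a REINFORCE-style update over a binary selection mask. The embedding names, reward stub, and hyperparameters are assumptions.
```python
# Illustrative sketch of an RL controller that searches over embedding concatenations.
# The embedding names, the reward stub, and all hyperparameters are assumptions; the
# real method trains a task model on each sampled concatenation to obtain the reward.
import torch

embedding_names = ["word", "char", "elmo", "bert", "flair"]    # hypothetical candidates
logits = torch.zeros(len(embedding_names), requires_grad=True)  # controller parameters
optimizer = torch.optim.Adam([logits], lr=0.1)

def dev_accuracy(mask: torch.Tensor) -> float:
    """Placeholder: train a task model on the selected concatenation, return dev accuracy."""
    return float(torch.rand(()))

for step in range(20):
    probs = torch.sigmoid(logits)
    mask = torch.bernoulli(probs.detach())        # sample which embeddings to concatenate
    reward = dev_accuracy(mask)                   # accuracy of the resulting task model
    log_prob = (mask * torch.log(probs + 1e-8)
                + (1 - mask) * torch.log(1 - probs + 1e-8)).sum()
    loss = -reward * log_prob                     # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("selection probabilities:", torch.sigmoid(logits).tolist())
```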
- The Cascade Transformer: an Application for Efficient Answer Sentence Selection [116.09532365093659]
We introduce the Cascade Transformer, a technique to adapt transformer-based models into a cascade of rankers.
When compared to a state-of-the-art transformer model, our approach reduces computation by 37% with almost no impact on accuracy.
arXiv Detail & Related papers (2020-05-05T23:32:01Z)
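Below is a toy sketch of the cascading idea summarized in the entry above: intermediate layers score candidates and prune the weakest before the deeper, more expensive layers run. Dimensions, pruning points, and keep rates are assumptions, not the paper's configuration.
```python
# Toy sketch of the cascade-of-rankers idea: intermediate layers score candidates and
# drop the lowest-ranked ones so deeper layers process fewer inputs. Dimensions,
# pruning points, and keep rates are assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class ToyCascadeRanker(nn.Module):
    def __init__(self, d_model=64, n_layers=6, prune_at=(1, 3), keep_rate=0.5):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(n_layers)
        ])
        self.scorers = nn.ModuleDict({str(i): nn.Linear(d_model, 1) for i in range(n_layers)})
        self.prune_at, self.keep_rate = set(prune_at), keep_rate

    def forward(self, x):                          # x: (num_candidates, seq_len, d_model)
        alive = torch.arange(x.size(0))            # indices of surviving candidates
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i in self.prune_at:                 # rank by the first-token score and prune
                scores = self.scorers[str(i)](x[:, 0]).squeeze(-1)
                k = max(1, int(self.keep_rate * x.size(0)))
                top = scores.topk(k).indices
                x, alive = x[top], alive[top]
        final_scores = self.scorers[str(len(self.layers) - 1)](x[:, 0]).squeeze(-1)
        return alive, final_scores                 # surviving candidate ids and their scores

ranker = ToyCascadeRanker().eval()
candidates = torch.randn(16, 32, 64)               # 16 toy candidate encodings
with torch.no_grad():
    ids, scores = ranker(candidates)
```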
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)
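The entry above describes casting plausibility ranking in a full-text format; the sketch below ranks candidate endings by a language model's likelihood of the complete text, with no task-specific head. Using a causal LM as the scorer, plus the checkpoint and examples, are assumptions rather than the paper's exact method.
```python
# Hedged sketch of scoring plausibility in a full-text format: rank candidate endings
# by the language-model likelihood of the complete text, with no classification head.
# The causal-LM scorer, checkpoint, and examples are assumptions, not the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

premise = "The man poured water on the campfire."
endings = ["The fire went out.", "The fire grew larger."]

def full_text_score(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)          # average cross-entropy over the full sequence
    return -out.loss.item()                   # higher score = more plausible

best = max(endings, key=lambda e: full_text_score(f"{premise} {e}"))
print(best)
```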
This list is automatically generated from the titles and abstracts of the papers on this site.