Efficient Unsupervised Sentence Compression by Fine-tuning Transformers with Reinforcement Learning
- URL: http://arxiv.org/abs/2205.08221v1
- Date: Tue, 17 May 2022 10:34:28 GMT
- Title: Efficient Unsupervised Sentence Compression by Fine-tuning Transformers with Reinforcement Learning
- Authors: Demian Gholipour Ghalandari, Chris Hokamp, Georgiana Ifrim
- Abstract summary: Sentence compression reduces the length of text by removing non-essential content.
Unsupervised objective-driven methods for sentence compression can be used to create customized models.
- Score: 10.380414189465347
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentence compression reduces the length of text by removing non-essential
content while preserving important facts and grammaticality. Unsupervised
objective-driven methods for sentence compression can be used to create
customized models without the need for ground-truth training data, while
allowing flexibility in the objective function(s) that are used for learning
and inference. Recent unsupervised sentence compression approaches use custom
objectives to guide discrete search; however, guided search is expensive at
inference time. In this work, we explore the use of reinforcement learning to
train effective sentence compression models that are also fast when generating
predictions. In particular, we cast the task as binary sequence labelling and
fine-tune a pre-trained transformer using a simple policy gradient approach.
Our approach outperforms other unsupervised models while also being more
efficient at inference time.
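A minimal sketch of the recipe described above: compression cast as token-level keep/drop labelling, fine-tuned with a simple policy-gradient (REINFORCE) update. The backbone checkpoint, the toy reward function, and the hyperparameters below are illustrative assumptions rather than the authors' configuration; the paper's actual unsupervised objective would also score properties such as fluency and faithfulness of the compression.

```python
# Sketch only: policy-gradient fine-tuning of a token-classification model for
# sentence compression cast as binary sequence labelling (keep = 1, drop = 0).
# Backbone, reward, and hyperparameters are assumptions for illustration.
import torch
from torch.distributions import Bernoulli
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL = "distilbert-base-uncased"  # assumed backbone, not the paper's choice
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForTokenClassification.from_pretrained(MODEL, num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)


def reward_fn(kept_mask: torch.Tensor) -> float:
    """Toy stand-in for the unsupervised objective.

    Rewards compressions that keep roughly 60% of the tokens; a real objective
    would also score fluency/faithfulness (e.g., with a language model).
    """
    ratio = kept_mask.float().mean().item()
    return 1.0 - abs(ratio - 0.6)


def train_step(sentence: str):
    enc = tokenizer(sentence, return_tensors="pt")
    logits = model(**enc).logits                       # (1, seq_len, 2)
    keep_prob = torch.softmax(logits, dim=-1)[0, :, 1]

    # Sample a keep/drop action per token and record its log-probability.
    dist = Bernoulli(probs=keep_prob)
    actions = dist.sample()
    log_prob = dist.log_prob(actions).sum()

    # REINFORCE: push up the log-probability of compressions with high reward.
    reward = reward_fn(actions)
    loss = -reward * log_prob

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), reward

# At inference time no search is needed: a single forward pass plus a 0.5
# threshold on keep_prob yields the compression, which is where the speed
# advantage over search-based methods comes from.
```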
Related papers
- Once-for-All Sequence Compression for Self-Supervised Speech Models [62.60723685118747]
We introduce a once-for-all sequence compression framework for self-supervised speech models.
The framework is evaluated on various tasks, showing marginal degradation compared to the fixed compressing rate variants.
We also explore adaptive compressing rate learning, demonstrating the ability to select task-specific preferred frame periods without needing a grid search.
arXiv Detail & Related papers (2022-11-04T09:19:13Z) - Sentence Representation Learning with Generative Objective rather than
Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative learning approach achieves strong performance improvements and outperforms current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z) - Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs [16.968490007064872]
Applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning.
We show theoretically that this targeted dropout of decoder inputs removes the pointwise mutual information they provide, which the model compensates for by making greater use of the latent space.
Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence performance and the information captured in the latent space.
arXiv Detail & Related papers (2022-09-26T11:21:19Z) - Learning Non-Autoregressive Models from Search for Unsupervised Sentence
Summarization [20.87460375478907]
Text summarization aims to generate a short summary for an input text.
In this work, we propose a Non-Autoregressive Unsupervised Summarization (NAUS) approach.
Experiments show that NAUS achieves state-of-the-art performance for unsupervised summarization.
arXiv Detail & Related papers (2022-05-28T21:09:23Z) - Accordion: Adaptive Gradient Communication via Critical Learning Regime
Identification [12.517161466778655]
Distributed model training suffers from communication bottlenecks due to frequent model updates transmitted across compute nodes.
To alleviate these bottlenecks, practitioners use gradient compression techniques like sparsification, quantization, or low-rank updates.
Aggressive compression, however, can degrade final model accuracy; in this work, we show that such performance degradation from choosing a high compression ratio is not fundamental.
An adaptive compression strategy can reduce communication while maintaining final test accuracy.
arXiv Detail & Related papers (2020-10-29T16:41:44Z) - Unsupervised Extractive Summarization by Pre-training Hierarchical
Transformers [107.12125265675483]
Unsupervised extractive document summarization aims to select important sentences from a document without using labeled summaries during training.
Existing methods are mostly graph-based with sentences as nodes and edge weights measured by sentence similarities.
We find that transformer attentions can be used to rank sentences for unsupervised extractive summarization.
arXiv Detail & Related papers (2020-10-16T08:44:09Z) - Self-training Improves Pre-training for Natural Language Understanding [63.78927366363178]
We study self-training as another way to leverage unlabeled data through semi-supervised learning.
We introduce SentAugment, a data augmentation method which computes task-specific query embeddings from labeled data.
Our approach leads to scalable and effective self-training with improvements of up to 2.6% on standard text classification benchmarks.
arXiv Detail & Related papers (2020-10-05T17:52:25Z) - A Simple but Tough-to-Beat Data Augmentation Approach for Natural
Language Understanding and Generation [53.8171136907856]
We introduce a set of simple yet effective data augmentation strategies dubbed cutoff.
cutoff relies on sampling consistency and thus adds little computational overhead.
cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
arXiv Detail & Related papers (2020-09-29T07:08:35Z) - Open-set Short Utterance Forensic Speaker Verification using
Teacher-Student Network with Explicit Inductive Bias [59.788358876316295]
We propose a pipeline solution to improve speaker verification on a small actual forensic field dataset.
By leveraging large-scale out-of-domain datasets, a knowledge distillation based objective function is proposed for teacher-student learning.
We show that the proposed objective function can efficiently improve the performance of teacher-student learning on short utterances.
arXiv Detail & Related papers (2020-09-21T00:58:40Z) - End-to-end Learning of Compressible Features [35.40108701875527]
Pre-trained convolutional neural networks (CNNs) are powerful off-the-shelf feature generators and have been shown to perform very well on a variety of tasks.
Unfortunately, the generated features are high dimensional and expensive to store.
We propose a learned method that jointly optimizes for compressibility along with the task objective.
arXiv Detail & Related papers (2020-07-23T05:17:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.