Improving Sentence Similarity Estimation for Unsupervised Extractive
Summarization
- URL: http://arxiv.org/abs/2302.12490v1
- Date: Fri, 24 Feb 2023 07:10:33 GMT
- Title: Improving Sentence Similarity Estimation for Unsupervised Extractive
Summarization
- Authors: Shichao Sun, Ruifeng Yuan, Wenjie Li, Sujian Li
- Abstract summary: We propose two novel strategies to improve sentence similarity estimation for unsupervised extractive summarization.
We use contrastive learning to optimize a document-level objective: sentences from the same document should be more similar than those from different documents.
We also use mutual learning to enhance the relationship between sentence similarity estimation and sentence salience ranking.
- Score: 21.602394765472386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised extractive summarization aims to extract salient sentences from
a document as the summary without labeled data. Recent literature mostly
studies how to leverage sentence similarity to rank sentences in the order of
salience. However, sentence similarity estimation using pre-trained language
models mostly takes little account of document-level information and has a weak
correlation with sentence salience ranking. In this paper, we propose two
novel strategies to improve sentence similarity estimation for unsupervised
extractive summarization. We use contrastive learning to optimize a
document-level objective: sentences from the same document should be more
similar than those from different documents. Moreover, we use mutual learning to
enhance the relationship between sentence similarity estimation and sentence
salience ranking, where an extra signal amplifier is used to refine the pivotal
information. Experimental results demonstrate the effectiveness of our
strategies.
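The document-level objective described in the abstract can be pictured as an InfoNCE-style contrastive loss in which sentence pairs drawn from the same document act as positives and cross-document pairs as negatives. Below is a minimal NumPy sketch of that idea; the function name, temperature value, and averaging scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def doc_contrastive_loss(embeddings, doc_ids, temperature=0.1):
    """InfoNCE-style loss: sentence pairs from the same document (positives)
    should score higher than cross-document pairs (negatives).
    embeddings: (n, d) array of sentence embeddings; doc_ids: length-n labels.
    NOTE: illustrative sketch, not the paper's implementation."""
    # cosine similarities, scaled by temperature
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = emb @ emb.T / temperature
    n = len(doc_ids)
    losses = []
    for i in range(n):
        # drop self-similarity from both logits and labels
        logits = np.delete(sim[i], i)
        labels = np.delete(np.array(doc_ids) == doc_ids[i], i)
        if not labels.any():
            continue  # sentence has no same-document partner
        log_denom = np.log(np.exp(logits).sum())
        # average the -log p(positive) terms over all positives for sentence i
        losses.append(-(logits[labels] - log_denom).mean())
    return float(np.mean(losses))
```

Minimizing this loss pulls same-document sentence embeddings together relative to sentences from other documents, which is the document-level signal the paper argues vanilla pre-trained encoders lack.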
Related papers
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
Extensive experiments are conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
- Enhancing Coherence of Extractive Summarization with Multitask Learning [40.349019691412465]
This study proposes a multitask learning architecture for extractive summarization with coherence boosting.
The architecture contains an extractive summarizer and coherent discriminator module.
Experiments show that our proposed method significantly improves the proportion of consecutive sentences in the extracted summaries.
arXiv Detail & Related papers (2023-05-22T09:20:58Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm to further explore the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- Text Summarization with Oracle Expectation [88.39032981994535]
Extractive summarization produces summaries by identifying and concatenating the most important sentences in a document.
Most summarization datasets do not come with gold labels indicating whether document sentences are summary-worthy.
We propose a simple yet effective labeling algorithm that creates soft, expectation-based sentence labels.
arXiv Detail & Related papers (2022-09-26T14:10:08Z)
- Unsupervised Extractive Summarization using Pointwise Mutual Information [5.544401446569243]
We propose new metrics of relevance and redundancy using pointwise mutual information (PMI) between sentences.
We show that our method outperforms similarity-based methods on datasets in a range of domains including news, medical journal articles, and personal anecdotes.
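The PMI idea above scores a sentence by how much information it shares with the rest of the document. The paper estimates PMI between sentences with a language model; the toy sketch below instead uses document-level word co-occurrence counts as a crude stand-in, so the function names and the count-based estimator are assumptions for illustration only.

```python
import math
from collections import Counter
from itertools import combinations

def word_pmi_table(documents):
    """Estimate word-pair PMI from document-level co-occurrence counts.
    documents: list of documents, each a list of sentence strings.
    A crude corpus-statistics stand-in for the paper's LM-based estimate."""
    n_docs = len(documents)
    word_df = Counter()   # document frequency of each word
    pair_df = Counter()   # document frequency of each word pair
    for doc in documents:
        words = set(w for sent in doc for w in sent.lower().split())
        word_df.update(words)
        pair_df.update(frozenset(p) for p in combinations(sorted(words), 2))
    def pmi(w1, w2):
        joint = pair_df[frozenset((w1, w2))] / n_docs
        if joint == 0:
            return 0.0  # never co-occur: back off to zero
        # PMI = log p(w1, w2) / (p(w1) p(w2))
        return math.log(joint * n_docs * n_docs / (word_df[w1] * word_df[w2]))
    return pmi

def sentence_relevance(sentence, others, pmi):
    """Relevance of a sentence: mean PMI between its words and the
    words of the remaining sentences (hypothetical aggregation)."""
    s_words = sentence.lower().split()
    o_words = [w for s in others for w in s.lower().split()]
    scores = [pmi(a, b) for a in s_words for b in o_words if a != b]
    return sum(scores) / len(scores) if scores else 0.0
```

Words that co-occur more often than chance get positive PMI, so sentences sharing informative vocabulary with the rest of the document score as more relevant; redundancy can be penalized symmetrically by the PMI between a candidate and the already-selected summary.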
arXiv Detail & Related papers (2021-02-11T21:05:50Z)
- Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding.
Given a multi-sentence narrative, decide whether there exist any semantic discrepancies in the narrative flow.
arXiv Detail & Related papers (2020-12-21T07:18:08Z)
- Unsupervised Extractive Summarization by Pre-training Hierarchical Transformers [107.12125265675483]
Unsupervised extractive document summarization aims to select important sentences from a document without using labeled summaries during training.
Existing methods are mostly graph-based with sentences as nodes and edge weights measured by sentence similarities.
We find that transformer attentions can be used to rank sentences for unsupervised extractive summarization.
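The graph-based setup this entry describes, sentences as nodes and similarities as edge weights, is typically ranked with a PageRank-style power iteration. Here is a short NumPy sketch of that generic ranking step; the function name and damping value are illustrative assumptions, not taken from any of the papers listed.

```python
import numpy as np

def rank_sentences(sim, damping=0.85, iters=50):
    """PageRank-style power iteration over a sentence-similarity graph:
    nodes are sentences, edge weights are pairwise similarities.
    sim: (n, n) symmetric nonnegative similarity matrix."""
    n = sim.shape[0]
    w = sim.astype(float).copy()
    np.fill_diagonal(w, 0.0)        # no self-loops
    col = w.sum(axis=0)
    col[col == 0] = 1.0             # guard isolated nodes
    p = w / col                     # column-stochastic transition matrix
    r = np.full(n, 1.0 / n)         # uniform initial scores
    for _ in range(iters):
        r = (1 - damping) / n + damping * p @ r
    return r                        # higher score = more salient sentence
```

Sentences with strong similarity edges to many other sentences accumulate the most score, which is the graph-centrality notion of salience that the attention-based ranking in this paper replaces.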
arXiv Detail & Related papers (2020-10-16T08:44:09Z)
- Unsupervised Summarization by Jointly Extracting Sentences and Keywords [12.387378783627762]
RepRank is an unsupervised graph-based ranking model for extractive multi-document summarization.
We show that salient sentences and keywords can be extracted in a joint and mutual reinforcement process using our learned representations.
Experiment results with multiple benchmark datasets show that RepRank achieved the best or comparable performance in ROUGE.
arXiv Detail & Related papers (2020-09-16T05:58:00Z)
- Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction [31.648764677078837]
Automatic sentence summarization produces a shorter version of a sentence, while preserving its most important information.
We model these two aspects in an unsupervised objective function, consisting of language modeling and semantic similarity metrics.
Our proposed method achieves a new state-of-the-art for unsupervised sentence summarization according to ROUGE scores.
arXiv Detail & Related papers (2020-05-04T19:01:55Z)
- Combining Word Embeddings and N-grams for Unsupervised Document Summarization [2.1591018627187286]
Graph-based extractive document summarization relies on the quality of the sentence similarity graph.
We employ off-the-shelf deep embedding features and tf-idf features, and introduce a new text similarity metric.
Our approach can outperform the tf-idf based approach and achieve state-of-the-art performance on the DUC04 dataset.
arXiv Detail & Related papers (2020-04-25T00:22:46Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.