Contextual Networks and Unsupervised Ranking of Sentences
- URL: http://arxiv.org/abs/2203.04459v1
- Date: Wed, 9 Mar 2022 00:47:20 GMT
- Title: Contextual Networks and Unsupervised Ranking of Sentences
- Authors: Hao Zhang, You Zhou, Jie Wang
- Abstract summary: We devise an unsupervised algorithm called CNATAR (Contextual Network Text Analysis Rank) to score sentences.
We show that CNATAR outperforms the combined ranking of the three human judges provided on SummBank dataset.
We also compare the performance of CNATAR and the latest supervised neural-network summarization models and compute oracle results.
- Score: 9.198786220570096
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We construct a contextual network to represent a document with syntactic and
semantic relations between word-sentence pairs, based on which we devise an
unsupervised algorithm called CNATAR (Contextual Network And Text Analysis
Rank) to score sentences, and rank them through a bi-objective 0-1 knapsack
maximization problem over topic analysis and sentence scores. We show that
CNATAR outperforms the combined ranking of the three human judges provided on
the SummBank dataset under both ROUGE and BLEU metrics, which in turn
significantly outperforms each individual judge's ranking. Moreover, CNATAR
produces the highest ROUGE scores to date on DUC-02, and outperforms previous
supervised algorithms on the CNN/DailyMail and NYT datasets. We also compare
the performance of CNATAR and the latest supervised neural-network
summarization models and compute oracle results.
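The abstract describes two computational steps: scoring sentences over a contextual network and ranking them through a bi-objective 0-1 knapsack maximization over topic analysis and sentence scores. The following is a minimal sketch of that general recipe only, assuming a plain cosine-similarity sentence graph, PageRank-style centrality, and a greedy knapsack heuristic that trades sentence score against word-level novelty as a crude proxy for topic coverage; the function names, weights, and similarity measure are illustrative assumptions and do not reproduce CNATAR itself.

```python
# Illustrative sketch only: a generic graph-based sentence scorer plus a greedy
# bi-objective selection under a word budget. This is NOT the CNATAR algorithm;
# the similarity measure, damping factor, and topic proxy are assumptions.
import math
import re
from collections import Counter

def tokenize(sentence):
    return re.findall(r"[a-z]+", sentence.lower())

def cosine(c1, c2):
    # Cosine similarity between two bag-of-words Counters.
    num = sum(c1[w] * c2[w] for w in set(c1) & set(c2))
    den = math.sqrt(sum(v * v for v in c1.values())) * math.sqrt(sum(v * v for v in c2.values()))
    return num / den if den else 0.0

def score_sentences(sentences, damping=0.85, iters=50):
    # Power-iteration centrality (PageRank-style) over a sentence similarity graph.
    bags = [Counter(tokenize(s)) for s in sentences]
    n = len(sentences)
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    row_sums = [sum(row) or 1.0 for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] / row_sums[j] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores

def select_sentences(sentences, scores, budget_words=100, coverage_weight=0.5):
    # Greedy heuristic for a bi-objective 0-1 knapsack: each step picks the sentence
    # with the best blend of centrality score and coverage of not-yet-seen words
    # (a rough stand-in for topic coverage), per unit length, within the word budget.
    chosen, covered, used = [], set(), 0
    remaining = set(range(len(sentences)))
    while remaining:
        best, best_gain = None, 0.0
        for i in remaining:
            toks = tokenize(sentences[i])
            if used + len(toks) > budget_words:
                continue
            novelty = len(set(toks) - covered) / max(len(toks), 1)
            gain = ((1 - coverage_weight) * scores[i] + coverage_weight * novelty) / max(len(toks), 1)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            break
        chosen.append(best)
        covered |= set(tokenize(sentences[best]))
        used += len(tokenize(sentences[best]))
        remaining.remove(best)
    return [sentences[i] for i in sorted(chosen)]
```

A call such as select_sentences(sents, score_sentences(sents), budget_words=100) would then return a budget-limited extract in document order; the real method additionally uses syntactic and semantic word-sentence relations and an explicit topic analysis, which this sketch does not model.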
Related papers
- Hierarchical Indexing for Retrieval-Augmented Opinion Summarization [60.5923941324953]
We propose a method for unsupervised abstractive opinion summarization that combines the attributability and scalability of extractive approaches with the coherence and fluency of Large Language Models (LLMs).
Our method, HIRO, learns an index structure that maps sentences to a path through a semantically organized discrete hierarchy.
At inference time, we populate the index and use it to identify and retrieve clusters of sentences containing popular opinions from input reviews.
arXiv Detail & Related papers (2024-03-01T10:38:07Z)
- Unsupervised Chunking with Hierarchical RNN [62.15060807493364]
This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner.
We present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions.
Experiments on the CoNLL-2000 dataset reveal a notable improvement over existing unsupervised methods, enhancing phrase F1 score by up to 6 percentage points.
arXiv Detail & Related papers (2023-09-10T02:55:12Z)
- Learning to Paraphrase Sentences to Different Complexity Levels [3.0273878903284275]
Sentence simplification is an active research topic in NLP, but its adjacent tasks of sentence complexification and same-level paraphrasing are not.
To train models on all three tasks, we present two new unsupervised datasets.
arXiv Detail & Related papers (2023-08-04T09:43:37Z)
- Influence of various text embeddings on clustering performance in NLP [0.0]
A clustering approach can be used to relabel text reviews with the correct star ratings by grouping them into distinct clusters.
In this work, we explore the task of choosing different text embeddings to represent these reviews and also explore the impact the embedding choice has on the performance of various classes of clustering algorithms.
arXiv Detail & Related papers (2023-05-04T20:53:19Z)
- RankDNN: Learning to Rank for Few-shot Learning [70.49494297554537]
This paper introduces a new few-shot learning pipeline that casts relevance ranking for image retrieval as binary ranking relation classification.
It provides a new perspective on few-shot learning and is complementary to state-of-the-art methods.
arXiv Detail & Related papers (2022-11-28T13:59:31Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- GNNRank: Learning Global Rankings from Pairwise Comparisons via Directed Graph Neural Networks [68.61934077627085]
We introduce GNNRank, a modeling framework compatible with any GNN capable of learning digraph embeddings.
We show that our methods attain competitive and often superior performance compared with existing approaches.
arXiv Detail & Related papers (2022-02-01T04:19:50Z)
- DisCoDisCo at the DISRPT2021 Shared Task: A System for Discourse Segmentation, Classification, and Connective Detection [4.371388370559826]
Our system, called DisCoDisCo, enhances contextualized word embeddings with hand-crafted features.
Results on relation classification suggest strong performance on the new 2021 benchmark.
A partial evaluation of multiple pre-trained Transformer-based language models indicates that models pre-trained on the Next Sentence Prediction task are optimal for relation classification.
arXiv Detail & Related papers (2021-09-20T18:11:05Z)
- Evaluating Text Coherence at Sentence and Paragraph Levels [17.99797111176988]
We investigate the adaptation of existing sentence ordering methods to a paragraph ordering task.
We also compare the learnability and robustness of existing models by artificially creating mini datasets and noisy datasets.
We conclude that the recurrent graph neural network-based model is an optimal choice for coherence modeling.
arXiv Detail & Related papers (2020-06-05T03:31:49Z)
- An Unsupervised Semantic Sentence Ranking Scheme for Text Documents [9.272728720669846]
Semantic SentenceRank (SSR) is an unsupervised scheme for ranking sentences in a single document according to their relative importance.
It extracts essential words and phrases from a text document, and uses semantic measures to construct, respectively, a semantic phrase graph over phrases and words, and a semantic sentence graph over sentences.
arXiv Detail & Related papers (2020-04-28T20:17:51Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)