Unsupervised Opinion Summarization Using Approximate Geodesics
- URL: http://arxiv.org/abs/2209.07496v3
- Date: Mon, 20 Nov 2023 15:57:43 GMT
- Title: Unsupervised Opinion Summarization Using Approximate Geodesics
- Authors: Somnath Basu Roy Chowdhury, Nicholas Monath, Avinava Dubey, Amr Ahmed,
Snigdha Chaturvedi
- Abstract summary: We introduce Geodesic Summarizer (GeoSumm), a novel system to perform unsupervised extractive opinion summarization.
GeoSumm involves an encoder-decoder based representation learning model, that generates representations of text as a distribution over latent semantic units.
We use these representations to quantify the relevance of review sentences using a novel approximate geodesic distance based scoring mechanism.
- Score: 38.19696482602788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Opinion summarization is the task of creating summaries capturing popular
opinions from user reviews. In this paper, we introduce Geodesic Summarizer
(GeoSumm), a novel system to perform unsupervised extractive opinion
summarization. GeoSumm involves an encoder-decoder based representation
learning model, that generates representations of text as a distribution over
latent semantic units. GeoSumm generates these representations by performing
dictionary learning over pre-trained text representations at multiple decoder
layers. We then use these representations to quantify the relevance of review
sentences using a novel approximate geodesic distance based scoring mechanism.
We use the relevance scores to identify popular opinions in order to compose
general and aspect-specific summaries. Our proposed model, GeoSumm, achieves
state-of-the-art performance on three opinion summarization datasets. We
perform additional experiments to analyze the functioning of our model and
showcase the generalization ability of {\X} across different domains.
Related papers
- Hierarchical Indexing for Retrieval-Augmented Opinion Summarization [60.5923941324953]
We propose a method for unsupervised abstractive opinion summarization that combines the attributability and scalability of extractive approaches with the coherence and fluency of Large Language Models (LLMs)
Our method, HIRO, learns an index structure that maps sentences to a path through a semantically organized discrete hierarchy.
At inference time, we populate the index and use it to identify and retrieve clusters of sentences containing popular opinions from input reviews.
arXiv Detail & Related papers (2024-03-01T10:38:07Z) - Attributable and Scalable Opinion Summarization [79.87892048285819]
We generate abstractive summaries by decoding frequent encodings, and extractive summaries by selecting the sentences assigned to the same frequent encodings.
Our method is attributable, because the model identifies sentences used to generate the summary as part of the summarization process.
It scales easily to many hundreds of input reviews, because aggregation is performed in the latent space rather than over long sequences of tokens.
arXiv Detail & Related papers (2023-05-19T11:30:37Z) - Unsupervised Opinion Summarisation in the Wasserstein Space [22.634245146129857]
We present WassOS, an unsupervised abstractive summarization model which makes use of the Wasserstein distance.
We show that WassOS almost always outperforms the state-of-the-art on ROUGE metrics and consistently produces the best summaries according to human evaluations.
arXiv Detail & Related papers (2022-11-27T19:45:38Z) - Unsupervised Extractive Opinion Summarization Using Sparse Coding [19.598936651505067]
We present Semantic Autoencoder (SemAE) to perform extractive opinion summarization in an unsupervised manner.
SemAE uses dictionary learning to implicitly capture semantic information from the review and learns a latent representation of each sentence over semantic units.
We report strong performance on SPACE and AMAZON datasets, and perform experiments to investigate the functioning of our model.
arXiv Detail & Related papers (2022-03-15T14:03:35Z) - WikiAsp: A Dataset for Multi-domain Aspect-based Summarization [69.13865812754058]
We propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization.
Specifically, we build the dataset using Wikipedia articles from 20 different domains, using the section titles and boundaries of each article as a proxy for aspect annotation.
Results highlight key challenges that existing summarization models face in this setting, such as proper pronoun handling of quoted sources and consistent explanation of time-sensitive events.
arXiv Detail & Related papers (2020-11-16T10:02:52Z) - Hierarchical Bi-Directional Self-Attention Networks for Paper Review
Rating Recommendation [81.55533657694016]
We propose a Hierarchical bi-directional self-attention Network framework (HabNet) for paper review rating prediction and recommendation.
Specifically, we leverage the hierarchical structure of the paper reviews with three levels of encoders: sentence encoder (level one), intra-review encoder (level two) and inter-review encoder (level three)
We are able to identify useful predictors to make the final acceptance decision, as well as to help discover the inconsistency between numerical review ratings and text sentiment conveyed by reviewers.
arXiv Detail & Related papers (2020-11-02T08:07:50Z) - Topic-Guided Abstractive Text Summarization: a Joint Learning Approach [19.623946402970933]
We introduce a new approach for abstractive text summarization, Topic-Guided Abstractive Summarization.
The idea is to incorporate neural topic modeling with a Transformer-based sequence-to-sequence (seq2seq) model in a joint learning framework.
arXiv Detail & Related papers (2020-10-20T14:45:25Z) - Weakly-Supervised Aspect-Based Sentiment Analysis via Joint
Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z) - Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven
Cloze Reward [42.925345819778656]
We present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD.
We propose the use of dual encoders---a sequential document encoder and a graph-structured encoder---to maintain the global context and local characteristics of entities.
Results show that our models produce significantly higher ROUGE scores than a variant without knowledge graph as input on both New York Times and CNN/Daily Mail datasets.
arXiv Detail & Related papers (2020-05-03T18:23:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.