Unsupervised Extractive Opinion Summarization Using Sparse Coding
- URL: http://arxiv.org/abs/2203.07921v1
- Date: Tue, 15 Mar 2022 14:03:35 GMT
- Title: Unsupervised Extractive Opinion Summarization Using Sparse Coding
- Authors: Somnath Basu Roy Chowdhury, Chao Zhao, Snigdha Chaturvedi
- Abstract summary: We present Semantic Autoencoder (SemAE) to perform extractive opinion summarization in an unsupervised manner.
SemAE uses dictionary learning to implicitly capture semantic information from the review and learns a latent representation of each sentence over semantic units.
We report strong performance on SPACE and AMAZON datasets, and perform experiments to investigate the functioning of our model.
- Score: 19.598936651505067
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Opinion summarization is the task of automatically generating summaries that
encapsulate information from multiple user reviews. We present Semantic
Autoencoder (SemAE) to perform extractive opinion summarization in an
unsupervised manner. SemAE uses dictionary learning to implicitly capture
semantic information from the review and learns a latent representation of each
sentence over semantic units. A semantic unit is supposed to capture an
abstract semantic concept. Our extractive summarization algorithm leverages the
representations to identify representative opinions among hundreds of reviews.
SemAE is also able to perform controllable summarization to generate
aspect-specific summaries. We report strong performance on SPACE and AMAZON
datasets, and perform experiments to investigate the functioning of our model.
Our code is publicly available at https://github.com/brcsomnath/SemAE.
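As a rough, purely illustrative sketch of the sparse-coding idea (not the authors' implementation, which learns the dictionary jointly with a neural autoencoder): learn a dictionary of semantic units over precomputed sentence embeddings, encode each sentence as a sparse code over those units, and extract the sentences whose codes lie closest to the corpus mean. All function names and parameters below are assumptions.

```python
# Illustrative sparse-coding extractive summarizer (not SemAE's code).
# Assumes `embeddings` is an (n_sentences, dim) array from any
# pretrained sentence encoder.
import numpy as np
from sklearn.decomposition import DictionaryLearning

def extract_summary(embeddings, sentences, n_units=32, k=5):
    # Learn a dictionary of "semantic units" and sparse sentence codes.
    learner = DictionaryLearning(n_components=n_units,
                                 transform_algorithm="lasso_lars",
                                 transform_alpha=0.1, random_state=0)
    codes = learner.fit_transform(embeddings)          # (n, n_units)
    mean_code = codes.mean(axis=0)                     # dominant opinions
    # Rank sentences by cosine similarity between their code and the mean.
    denom = np.linalg.norm(codes, axis=1) * np.linalg.norm(mean_code) + 1e-12
    scores = codes @ mean_code / denom
    return [sentences[i] for i in np.argsort(-scores)[:k]]
```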
Related papers
- AugSumm: towards generalizable speech summarization using synthetic labels from large language model [61.73741195292997]
Abstractive speech summarization (SSUM) aims to generate human-like summaries from speech.
Conventional SSUM models are mostly trained and evaluated with a single ground-truth (GT) human-annotated deterministic summary.
We propose AugSumm, a method to leverage large language models (LLMs) as a proxy for human annotators to generate augmented summaries.
arXiv Detail & Related papers (2024-01-10T18:39:46Z)
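A hypothetical sketch of the AugSumm-style augmentation step: use an LLM to produce alternative ground-truth summaries for training. `ask_llm` is a stand-in for any chat-completion client, and the prompt wording is invented, not taken from the paper.

```python
# Hypothetical AugSumm-style augmentation (illustrative only).
# `ask_llm` is any callable that maps a prompt string to generated text.
def augment_summaries(transcript: str, gt_summary: str, ask_llm, n: int = 3):
    prompt = ("Paraphrase the following summary of a speech transcript, "
              "preserving all facts.\n\n"
              f"Transcript: {transcript}\nSummary: {gt_summary}")
    # Sample several synthetic ground-truth summaries for training.
    return [ask_llm(prompt) for _ in range(n)]
```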
- Attributable and Scalable Opinion Summarization [79.87892048285819]
We generate abstractive summaries by decoding frequent encodings, and extractive summaries by selecting the sentences assigned to the same frequent encodings.
Our method is attributable, because the model identifies sentences used to generate the summary as part of the summarization process.
It scales easily to many hundreds of input reviews, because aggregation is performed in the latent space rather than over long sequences of tokens.
arXiv Detail & Related papers (2023-05-19T11:30:37Z)
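A loose sketch of selecting by frequent encodings, with k-means cluster ids standing in for the paper's learned discrete sentence encodings; names and parameters are illustrative.

```python
# Frequency-based extractive selection (k-means as a stand-in for the
# paper's learned discrete encodings; illustrative only).
import numpy as np
from collections import Counter
from sklearn.cluster import KMeans

def extract_by_frequent_codes(embeddings, sentences, n_codes=16, k=5):
    codes = KMeans(n_clusters=n_codes, n_init=10,
                   random_state=0).fit_predict(embeddings)
    # Codes shared by many sentences correspond to widely held opinions.
    frequent = [c for c, _ in Counter(codes).most_common(k)]
    # One sentence per frequent code; the cluster members make each
    # pick attributable to its supporting input sentences.
    return [sentences[np.flatnonzero(codes == c)[0]] for c in frequent]
```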
- Unsupervised Opinion Summarisation in the Wasserstein Space [22.634245146129857]
We present WassOS, an unsupervised abstractive summarization model which makes use of the Wasserstein distance.
We show that WassOS almost always outperforms the state-of-the-art on ROUGE metrics and consistently produces the best summaries according to human evaluations.
arXiv Detail & Related papers (2022-11-27T19:45:38Z)
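WassOS builds on optimal transport; as a minimal illustration of the underlying primitive (not the model itself, which operates on distributions of latent vectors), SciPy's 1-D Wasserstein distance compares two empirical distributions. The sample values are made up.

```python
# 1-D Wasserstein distance between two empirical distributions
# (illustrative primitive only; the values below are invented).
from scipy.stats import wasserstein_distance

doc_values = [0.1, 0.4, 0.4, 0.9]      # e.g. projected document reps
summary_values = [0.2, 0.5, 0.8]       # e.g. projected summary reps
print(wasserstein_distance(doc_values, summary_values))
```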
- Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees [89.60269205320431]
Current abstractive summarization models either suffer from a lack of clear interpretability or provide incomplete rationales.
We propose the Summarization Program (SP), an interpretable modular framework consisting of an (ordered) list of binary trees.
A Summarization Program contains one root node per summary sentence, and a distinct tree connects each summary sentence to the document sentences.
arXiv Detail & Related papers (2022-09-21T16:50:22Z)
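A rough data-structure sketch of a Summarization Program, assuming binary trees whose leaves are document sentences and whose internal nodes apply neural modules; the module names and example text below are invented for illustration.

```python
# Sketch of a Summarization Program tree (illustrative; module names
# and sentences are invented, not from the paper).
from dataclasses import dataclass
from typing import Optional

@dataclass
class SPNode:
    text: str                       # sentence produced at this node
    module: Optional[str] = None    # neural module applied; None = leaf
    left: Optional["SPNode"] = None
    right: Optional["SPNode"] = None

# One tree per summary sentence: fuse two document sentences into the
# summary sentence at the root.
leaf_a = SPNode("The staff were friendly and helpful.")
leaf_b = SPNode("Check-in was quick.")
root = SPNode("Friendly staff and quick check-in.", "fuse", leaf_a, leaf_b)
```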
- Unsupervised Opinion Summarization Using Approximate Geodesics [38.19696482602788]
We introduce Geodesic Summarizer (GeoSumm), a novel system to perform unsupervised extractive opinion summarization.
GeoSumm involves an encoder-decoder based representation learning model that generates representations of text as a distribution over latent semantic units.
We use these representations to quantify the relevance of review sentences using a novel approximate geodesic distance based scoring mechanism.
arXiv Detail & Related papers (2022-09-15T17:37:08Z)
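A common way to approximate geodesic distances, and one plausible reading of the scoring idea (not necessarily the paper's exact mechanism), is shortest-path length over a k-nearest-neighbor graph of the representations:

```python
# Approximate geodesics via shortest paths on a k-NN graph
# (a standard approximation; illustrative only).
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def geodesic_centrality(reps, k=5):
    graph = kneighbors_graph(reps, n_neighbors=k, mode="distance")
    dist = shortest_path(graph, directed=False)        # (n, n) geodesics
    dist = np.where(np.isinf(dist), np.nan, dist)      # mask unreachable
    # Sentences with small mean geodesic distance to all others are
    # treated as the most representative.
    return -np.nanmean(dist, axis=1)
```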
- Reinforcing Semantic-Symmetry for Document Summarization [15.113768658584979]
Document summarization condenses a long document into a short version with salient information and accurate semantic descriptions.
This paper introduces a new reinforcing semantic-symmetry learning model for document summarization.
A series of experiments have been conducted on two widely used benchmark datasets, CNN/Daily Mail and BigPatent.
arXiv Detail & Related papers (2021-12-14T17:41:37Z)
- Few-Shot Learning for Opinion Summarization [117.70510762845338]
Opinion summarization is the automatic creation of text reflecting subjective information expressed in multiple documents.
In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text.
Our approach substantially outperforms previous extractive and abstractive methods in automatic and human evaluation.
arXiv Detail & Related papers (2020-04-30T15:37:38Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
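A toy instance of the matching formulation (the paper itself trains a Siamese-BERT matcher): score each candidate extractive summary by its similarity to the whole document in a shared semantic space. Embeddings are assumed to be given.

```python
# Summarization as semantic text matching (toy version; embeddings
# come from any shared encoder).
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def best_candidate(doc_emb, candidate_embs, candidates):
    # The best summary is the candidate closest to the document.
    scores = [cosine(doc_emb, c) for c in candidate_embs]
    return candidates[int(np.argmax(scores))]
```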
- At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization [110.54963847339775]
We show that extracting full sentences introduces unnecessary and redundant content.
We propose extracting sub-sentential units based on the constituency parsing tree.
arXiv Detail & Related papers (2020-04-06T13:35:10Z)
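A small illustration of harvesting sub-sentential units from a constituency tree with NLTK; the parse string is hand-written here, whereas in practice it would come from a parser.

```python
# Extract sub-sentential candidate units from a constituency parse
# (hand-written parse for illustration).
from nltk import Tree

parse = Tree.fromstring(
    "(S (NP (DT The) (NN room)) (VP (VBD was) (ADJP (JJ clean)))"
    " (CC but) (S (NP (DT the) (NN service)) (VP (VBD was) (ADJP (JJ slow)))))"
)
# Keep clause- and phrase-level constituents as extraction candidates.
units = [" ".join(t.leaves()) for t in parse.subtrees()
         if t.label() in {"S", "NP", "VP"}]
print(units)
```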