Unsupervised Extractive Opinion Summarization Using Sparse Coding
- URL: http://arxiv.org/abs/2203.07921v1
- Date: Tue, 15 Mar 2022 14:03:35 GMT
- Title: Unsupervised Extractive Opinion Summarization Using Sparse Coding
- Authors: Somnath Basu Roy Chowdhury, Chao Zhao, Snigdha Chaturvedi
- Abstract summary: We present Semantic Autoencoder (SemAE) to perform extractive opinion summarization in an unsupervised manner.
SemAE uses dictionary learning to implicitly capture semantic information from the review and learns a latent representation of each sentence over semantic units.
We report strong performance on SPACE and AMAZON datasets, and perform experiments to investigate the functioning of our model.
- Score: 19.598936651505067
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Opinion summarization is the task of automatically generating summaries that
encapsulate information from multiple user reviews. We present Semantic
Autoencoder (SemAE) to perform extractive opinion summarization in an
unsupervised manner. SemAE uses dictionary learning to implicitly capture
semantic information from the review and learns a latent representation of each
sentence over semantic units. A semantic unit is supposed to capture an
abstract semantic concept. Our extractive summarization algorithm leverages the
representations to identify representative opinions among hundreds of reviews.
SemAE is also able to perform controllable summarization to generate
aspect-specific summaries. We report strong performance on SPACE and AMAZON
datasets, and perform experiments to investigate the functioning of our model.
Our code is publicly available at https://github.com/brcsomnath/SemAE.
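As a rough, purely illustrative sketch of the sparse-coding idea (not the authors' implementation, which learns the dictionary jointly with a neural autoencoder): learn a dictionary of semantic units over precomputed sentence embeddings, encode each sentence as a sparse code over those units, and extract the sentences whose codes lie closest to the corpus mean. All function names and parameters below are assumptions.

```python
# Illustrative sparse-coding extractive summarizer (not SemAE's code).
# Assumes `embeddings` is an (n_sentences, dim) array from any
# pretrained sentence encoder.
import numpy as np
from sklearn.decomposition import DictionaryLearning

def extract_summary(embeddings, sentences, n_units=32, k=5):
    # Learn a dictionary of "semantic units" and sparse sentence codes.
    learner = DictionaryLearning(n_components=n_units,
                                 transform_algorithm="lasso_lars",
                                 transform_alpha=0.1, random_state=0)
    codes = learner.fit_transform(embeddings)          # (n, n_units)
    mean_code = codes.mean(axis=0)                     # dominant opinions
    # Rank sentences by cosine similarity between their code and the mean.
    denom = np.linalg.norm(codes, axis=1) * np.linalg.norm(mean_code) + 1e-12
    scores = codes @ mean_code / denom
    return [sentences[i] for i in np.argsort(-scores)[:k]]
```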
Related papers
- AugSumm: towards generalizable speech summarization using synthetic labels from large language model [61.73741195292997]
Abstractive speech summarization (SSUM) aims to generate human-like summaries from speech.
Conventional SSUM models are mostly trained and evaluated with a single ground-truth (GT) human-annotated deterministic summary.
We propose AugSumm, a method to leverage large language models (LLMs) as a proxy for human annotators to generate augmented summaries.
arXiv Detail & Related papers (2024-01-10T18:39:46Z)
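A hypothetical sketch of the AugSumm-style augmentation step: use an LLM to produce alternative ground-truth summaries for training. `ask_llm` is a stand-in for any chat-completion client, and the prompt wording is invented, not taken from the paper.

```python
# Hypothetical AugSumm-style augmentation (illustrative only).
# `ask_llm` is any callable that maps a prompt string to generated text.
def augment_summaries(transcript: str, gt_summary: str, ask_llm, n: int = 3):
    prompt = ("Paraphrase the following summary of a speech transcript, "
              "preserving all facts.\n\n"
              f"Transcript: {transcript}\nSummary: {gt_summary}")
    # Sample several synthetic ground-truth summaries for training.
    return [ask_llm(prompt) for _ in range(n)]
```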
- Attributable and Scalable Opinion Summarization [79.87892048285819]
We generate abstractive summaries by decoding frequent encodings, and extractive summaries by selecting the sentences assigned to the same frequent encodings.
Our method is attributable, because the model identifies sentences used to generate the summary as part of the summarization process.
It scales easily to many hundreds of input reviews, because aggregation is performed in the latent space rather than over long sequences of tokens.
arXiv Detail & Related papers (2023-05-19T11:30:37Z)
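A loose sketch of selecting by frequent encodings, with k-means cluster ids standing in for the paper's learned discrete sentence encodings; names and parameters are illustrative.

```python
# Frequency-based extractive selection (k-means as a stand-in for the
# paper's learned discrete encodings; illustrative only).
import numpy as np
from collections import Counter
from sklearn.cluster import KMeans

def extract_by_frequent_codes(embeddings, sentences, n_codes=16, k=5):
    codes = KMeans(n_clusters=n_codes, n_init=10,
                   random_state=0).fit_predict(embeddings)
    # Codes shared by many sentences correspond to widely held opinions.
    frequent = [c for c, _ in Counter(codes).most_common(k)]
    # One sentence per frequent code; the cluster members make each
    # pick attributable to its supporting input sentences.
    return [sentences[np.flatnonzero(codes == c)[0]] for c in frequent]
```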
- Unsupervised Opinion Summarisation in the Wasserstein Space [22.634245146129857]
We present WassOS, an unsupervised abstractive summarization model which makes use of the Wasserstein distance.
We show that WassOS almost always outperforms the state-of-the-art on ROUGE metrics and consistently produces the best summaries according to human evaluations.
arXiv Detail & Related papers (2022-11-27T19:45:38Z)
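WassOS builds on optimal transport; as a minimal illustration of the underlying primitive (not the model itself, which operates on distributions of latent vectors), SciPy's 1-D Wasserstein distance compares two empirical distributions. The sample values are made up.

```python
# 1-D Wasserstein distance between two empirical distributions
# (illustrative primitive only; the values below are invented).
from scipy.stats import wasserstein_distance

doc_values = [0.1, 0.4, 0.4, 0.9]      # e.g. projected document reps
summary_values = [0.2, 0.5, 0.8]       # e.g. projected summary reps
print(wasserstein_distance(doc_values, summary_values))
```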
- Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees [89.60269205320431]
Current abstractive summarization models either suffer from a lack of clear interpretability or provide incomplete rationales.
We propose the Summarization Program (SP), an interpretable modular framework consisting of an (ordered) list of binary trees.
A Summarization Program contains one root node per summary sentence, and a distinct tree connects each summary sentence to the document sentences.
arXiv Detail & Related papers (2022-09-21T16:50:22Z)
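A rough data-structure sketch of a Summarization Program, assuming binary trees whose leaves are document sentences and whose internal nodes apply neural modules; the module names and example text below are invented for illustration.

```python
# Sketch of a Summarization Program tree (illustrative; module names
# and sentences are invented, not from the paper).
from dataclasses import dataclass
from typing import Optional

@dataclass
class SPNode:
    text: str                       # sentence produced at this node
    module: Optional[str] = None    # neural module applied; None = leaf
    left: Optional["SPNode"] = None
    right: Optional["SPNode"] = None

# One tree per summary sentence: fuse two document sentences into the
# summary sentence at the root.
leaf_a = SPNode("The staff were friendly and helpful.")
leaf_b = SPNode("Check-in was quick.")
root = SPNode("Friendly staff and quick check-in.", "fuse", leaf_a, leaf_b)
```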
- Unsupervised Opinion Summarization Using Approximate Geodesics [38.19696482602788]
We introduce Geodesic Summarizer (GeoSumm), a novel system to perform unsupervised extractive opinion summarization.
GeoSumm involves an encoder-decoder based representation learning model that generates representations of text as a distribution over latent semantic units.
We use these representations to quantify the relevance of review sentences using a novel approximate geodesic distance based scoring mechanism.
arXiv Detail & Related papers (2022-09-15T17:37:08Z)
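A common way to approximate geodesic distances, and one plausible reading of the scoring idea (not necessarily the paper's exact mechanism), is shortest-path length over a k-nearest-neighbor graph of the representations:

```python
# Approximate geodesics via shortest paths on a k-NN graph
# (a standard approximation; illustrative only).
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def geodesic_centrality(reps, k=5):
    graph = kneighbors_graph(reps, n_neighbors=k, mode="distance")
    dist = shortest_path(graph, directed=False)        # (n, n) geodesics
    dist = np.where(np.isinf(dist), np.nan, dist)      # mask unreachable
    # Sentences with small mean geodesic distance to all others are
    # treated as the most representative.
    return -np.nanmean(dist, axis=1)
```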
- Reinforcing Semantic-Symmetry for Document Summarization [15.113768658584979]
Document summarization condenses a long document into a short version with salient information and accurate semantic descriptions.
This paper introduces a new reinforcing semantic-symmetry learning model for document summarization.
A series of experiments have been conducted on two widely used benchmark datasets, CNN/Daily Mail and BigPatent.
arXiv Detail & Related papers (2021-12-14T17:41:37Z)
- Few-Shot Learning for Opinion Summarization [117.70510762845338]
Opinion summarization is the automatic creation of text reflecting subjective information expressed in multiple documents.
In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text.
Our approach substantially outperforms previous extractive and abstractive methods in automatic and human evaluation.
arXiv Detail & Related papers (2020-04-30T15:37:38Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
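A toy instance of the matching formulation (the paper itself trains a Siamese-BERT matcher): score each candidate extractive summary by its similarity to the whole document in a shared semantic space. Embeddings are assumed to be given.

```python
# Summarization as semantic text matching (toy version; embeddings
# come from any shared encoder).
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def best_candidate(doc_emb, candidate_embs, candidates):
    # The best summary is the candidate closest to the document.
    scores = [cosine(doc_emb, c) for c in candidate_embs]
    return candidates[int(np.argmax(scores))]
```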
- At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization [110.54963847339775]
We show that extracting full sentences introduces unnecessary and redundant content.
We propose extracting sub-sentential units based on the constituency parsing tree.
arXiv Detail & Related papers (2020-04-06T13:35:10Z)
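A small illustration of harvesting sub-sentential units from a constituency tree with NLTK; the parse string is hand-written here, whereas in practice it would come from a parser.

```python
# Extract sub-sentential candidate units from a constituency parse
# (hand-written parse for illustration).
from nltk import Tree

parse = Tree.fromstring(
    "(S (NP (DT The) (NN room)) (VP (VBD was) (ADJP (JJ clean)))"
    " (CC but) (S (NP (DT the) (NN service)) (VP (VBD was) (ADJP (JJ slow)))))"
)
# Keep clause- and phrase-level constituents as extraction candidates.
units = [" ".join(t.leaves()) for t in parse.subtrees()
         if t.label() in {"S", "NP", "VP"}]
print(units)
```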