Simple Yet Effective Synthetic Dataset Construction for Unsupervised
Opinion Summarization
- URL: http://arxiv.org/abs/2303.11660v1
- Date: Tue, 21 Mar 2023 08:08:04 GMT
- Title: Simple Yet Effective Synthetic Dataset Construction for Unsupervised
Opinion Summarization
- Authors: Ming Shen, Jie Ma, Shuai Wang, Yogarshi Vyas, Kalpit Dixit, Miguel
Ballesteros, Yassine Benajiba
- Abstract summary: We propose two simple yet effective unsupervised approaches to generate both aspect-specific and general opinion summaries.
Our first approach, Seed Words Based Leave-One-Out (SW-LOO), identifies aspect-related portions of reviews simply by exact-matching aspect seed words.
Our second approach, Natural Language Inference Based Leave-One-Out (NLI-LOO), identifies aspect-related sentences utilizing an NLI model in a more general setting without using seed words.
- Score: 28.52201592634964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Opinion summarization provides an important solution for summarizing opinions
expressed among a large number of reviews. However, generating aspect-specific
and general summaries is challenging due to the lack of annotated data. In this
work, we propose two simple yet effective unsupervised approaches to generate
both aspect-specific and general opinion summaries by training on synthetic
datasets constructed with aspect-related review contents. Our first approach,
Seed Words Based Leave-One-Out (SW-LOO), identifies aspect-related portions of
reviews simply by exact-matching aspect seed words and outperforms existing
methods by 3.4 ROUGE-L points on SPACE and 0.5 ROUGE-1 point on OPOSUM+ for
aspect-specific opinion summarization. Our second approach, Natural Language
Inference Based Leave-One-Out (NLI-LOO), identifies aspect-related sentences
utilizing an NLI model in a more general setting without using seed words and
outperforms existing approaches by 1.2 ROUGE-L points on SPACE for
aspect-specific opinion summarization and remains competitive on other metrics.
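The seed-word matching and leave-one-out pairing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes sentence-level matching, and it assumes that the matched sentences of the held-out review act as the pseudo-summary while the matched sentences of the remaining reviews act as the input. The seed words and the names `SEED_WORDS`, `aspect_sentences`, and `build_loo_pairs` are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical seed words, for illustration only (not the paper's lists).
SEED_WORDS: Dict[str, List[str]] = {
    "food": ["breakfast", "food", "restaurant"],
    "location": ["location", "walk", "downtown"],
}


def split_sentences(review: str) -> List[str]:
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def aspect_sentences(review: str, seeds: List[str]) -> List[str]:
    """Keep sentences containing at least one seed word as an exact token."""
    seed_set = {w.lower() for w in seeds}
    matched = []
    for sent in split_sentences(review):
        tokens = set(re.findall(r"[a-z']+", sent.lower()))
        if tokens & seed_set:
            matched.append(sent)
    return matched


def build_loo_pairs(
    reviews: List[str], seeds: List[str]
) -> List[Tuple[List[str], str]]:
    """Build (input reviews, pseudo-summary) pairs by leave-one-out.

    Assumption: each review that mentions the aspect is held out once;
    its matched sentences become the target, the others become the input.
    """
    portions = [aspect_sentences(r, seeds) for r in reviews]
    portions = [p for p in portions if p]  # drop reviews without the aspect
    pairs = []
    for i, target in enumerate(portions):
        sources = [" ".join(p) for j, p in enumerate(portions) if j != i]
        if sources:
            pairs.append((sources, " ".join(target)))
    return pairs
```

Under the same assumptions, an NLI-based variant would replace `aspect_sentences` with a filter that keeps the sentences an NLI model judges to be related to the aspect, leaving the pairing step unchanged.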
Related papers
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% ROUGE-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction [85.26780391682894]
We propose Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction (FENICE).
FENICE leverages an NLI-based alignment between information in the source document and a set of atomic facts, referred to as claims, extracted from the summary.
Our metric sets a new state of the art on AGGREFACT, the de-facto benchmark for factuality evaluation.
arXiv Detail & Related papers (2024-03-04T17:57:18Z)
- Aspect-based Meeting Transcript Summarization: A Two-Stage Approach with Weak Supervision on Sentence Classification [91.13086984529706]
Aspect-based meeting transcript summarization aims to produce multiple summaries.
Traditional summarization methods produce one summary mixing information of all aspects.
We propose a two-stage method for aspect-based meeting transcript summarization.
arXiv Detail & Related papers (2023-11-07T19:06:31Z)
- Prompted Opinion Summarization with GPT-3.5 [115.95460650578678]
We show that GPT-3.5 models achieve very strong performance in human evaluation.
We argue that standard evaluation metrics do not reflect this, and introduce three new metrics targeting faithfulness, factuality, and genericity.
arXiv Detail & Related papers (2022-11-29T04:06:21Z)
- Constrained Abstractive Summarization: Preserving Factual Consistency with Constrained Generation [93.87095877617968]
We propose Constrained Abstractive Summarization (CAS), a general setup that preserves the factual consistency of abstractive summarization.
We adopt lexically constrained decoding, a technique generally applicable to autoregressive generative models, to fulfill CAS.
We observe up to 13.8 ROUGE-2 gains when only one manual constraint is used in interactive summarization.
arXiv Detail & Related papers (2020-10-24T00:27:44Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- A Multi-task Learning Framework for Opinion Triplet Extraction [24.983625011760328]
We present a novel view of ABSA as an opinion triplet extraction task.
We propose a multi-task learning framework to jointly extract aspect terms and opinion terms.
We evaluate the proposed framework on four SemEval benchmarks for ABSA.
arXiv Detail & Related papers (2020-10-04T08:31:54Z)
- SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling [25.299937353444854]
We present a proof-of-concept study of a weakly supervised summary evaluation approach that does not require reference summaries.
Massive data in existing summarization datasets are transformed for training by pairing documents with corrupted reference summaries.
arXiv Detail & Related papers (2020-05-13T15:40:13Z)
- Aspect and Opinion Aware Abstractive Review Summarization with Reinforced Hard Typed Decoder [18.894655634326423]
We propose a two-stage reinforcement learning approach, which first predicts the output word type from the three types, and then leverages the predicted word type to generate the final word distribution.
Results on two Amazon product review datasets demonstrate that our method can consistently outperform several strong baseline approaches based on ROUGE scores.
arXiv Detail & Related papers (2020-04-13T03:35:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.