Do You Hear The People Sing? Key Point Analysis via Iterative Clustering
and Abstractive Summarisation
- URL: http://arxiv.org/abs/2305.16000v1
- Date: Thu, 25 May 2023 12:43:29 GMT
- Title: Do You Hear The People Sing? Key Point Analysis via Iterative Clustering
and Abstractive Summarisation
- Authors: Hao Li, Viktor Schlegel, Riza Batista-Navarro, Goran Nenadic
- Abstract summary: Argument summarisation is a promising but currently under-explored field.
One of the main challenges in Key Point Analysis is finding high-quality key point candidates.
Evaluating key points is crucial in ensuring that the automatically generated summaries are useful.
- Score: 12.548947151123555
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Argument summarisation is a promising but currently under-explored field.
Recent work has aimed to provide textual summaries in the form of concise and
salient short texts, i.e., key points (KPs), in a task known as Key Point
Analysis (KPA). One of the main challenges in KPA is finding high-quality key
point candidates from dozens of arguments even in a small corpus. Furthermore,
evaluating key points is crucial in ensuring that the automatically generated
summaries are useful. Although automatic methods for evaluating summarisation
have considerably advanced over the years, they mainly focus on sentence-level
comparison, making it difficult to measure the quality of a summary (a set of
KPs) as a whole. Aggravating this problem is the fact that human evaluation is
costly and unreproducible. To address the above issues, we propose a two-step
abstractive summarisation framework based on neural topic modelling with an
iterative clustering procedure, to generate key points which are aligned with
how humans identify key points. Our experiments show that our framework
advances the state of the art in KPA, with a performance improvement of up to 14
(absolute) percentage points in terms of both ROUGE and our own proposed
evaluation metrics. Furthermore, we evaluate the generated summaries using a
novel set-based evaluation toolkit. Our quantitative analysis demonstrates the
effectiveness of our proposed evaluation metrics in assessing the quality of
generated KPs. Human evaluation further demonstrates the advantages of our
approach and validates that our proposed evaluation metric is more consistent
with human judgment than ROUGE scores.
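The framework is only described at a high level in the abstract, so the sketch below is a rough illustration of the general two-step idea (cluster the arguments, then abstractively summarise each cluster into a key point), not the authors' implementation. The encoder and summariser model names, the silhouette-based choice of cluster count standing in for the iterative clustering procedure, and the helper function names are all assumptions made for illustration.

```python
# Illustrative sketch of a generic "cluster, then summarise" pipeline for
# Key Point Analysis. Model choices and the cluster-selection rule are
# assumptions, not the procedure proposed in the paper.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from transformers import pipeline

def cluster_arguments(arguments, k_range=range(2, 10)):
    """Embed the arguments and keep the clustering with the best silhouette score."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder
    embeddings = encoder.encode(arguments)
    best_labels, best_score = None, -1.0
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)
        score = silhouette_score(embeddings, labels)
        if score > best_score:
            best_labels, best_score = labels, score
    return best_labels

def summarise_clusters(arguments, labels):
    """Generate one short abstractive key point per argument cluster."""
    summariser = pipeline("summarization", model="facebook/bart-large-cnn")  # assumed model
    key_points = []
    for cluster_id in sorted(set(labels)):
        members = [arg for arg, lab in zip(arguments, labels) if lab == cluster_id]
        text = " ".join(members)[:3000]  # crude truncation to stay within the model's input limit
        summary = summariser(text, max_length=30, min_length=5, do_sample=False)
        key_points.append(summary[0]["summary_text"])
    return key_points

arguments = [
    "Homeschooled children miss out on social interaction with peers.",
    "Parents are not always qualified to teach every subject.",
    "Homeschooling lets the curriculum be tailored to the child.",
    "One-to-one teaching at home can move at the child's own pace.",
]
labels = cluster_arguments(arguments, k_range=range(2, 4))
print(summarise_clusters(arguments, labels))
```

The set-based evaluation toolkit is likewise only named in the abstract. One plausible shape for a set-level metric, comparing the generated key points against the reference key points as whole sets rather than sentence by sentence, is a soft precision/recall over pairwise embedding similarities; the formulation below is an assumption for illustration, not the metric proposed in the paper.

```python
# Illustrative set-level comparison of generated vs. reference key points.
# The soft precision/recall formulation is an assumption, not the paper's metric.
import numpy as np
from sentence_transformers import SentenceTransformer, util

def soft_set_f1(generated_kps, reference_kps):
    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder
    gen = encoder.encode(generated_kps, convert_to_tensor=True)
    ref = encoder.encode(reference_kps, convert_to_tensor=True)
    sim = util.cos_sim(gen, ref).cpu().numpy()   # |generated| x |reference| similarity matrix
    precision = float(np.mean(sim.max(axis=1)))  # how well each generated KP is covered by a reference
    recall = float(np.mean(sim.max(axis=0)))     # how well each reference KP is covered by a generation
    return precision, recall, 2 * precision * recall / (precision + recall)
```

Scoring whole sets this way captures coverage and redundancy effects that sentence-level ROUGE comparisons miss, which is the gap the abstract points to.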
Related papers
- JudgeRank: Leveraging Large Language Models for Reasoning-Intensive Reranking [81.88787401178378]
We introduce JudgeRank, a novel agentic reranker that emulates human cognitive processes when assessing document relevance.
We evaluate JudgeRank on the reasoning-intensive BRIGHT benchmark, demonstrating substantial performance improvements over first-stage retrieval methods.
In addition, JudgeRank performs on par with fine-tuned state-of-the-art rerankers on the popular BEIR benchmark, validating its zero-shot generalization capability.
arXiv Detail & Related papers (2024-10-31T18:43:12Z)
- Enhancing Argument Summarization: Prioritizing Exhaustiveness in Key Point Generation and Introducing an Automatic Coverage Evaluation Metric [3.0754290232284562]
The Key Point Analysis (KPA) task formulates argument summarization as representing the summary of a large collection of arguments.
A sub-task of KPA, called Key Point Generation (KPG), focuses on generating these key points given the arguments.
This paper introduces a novel extractive approach for key point generation that outperforms previous state-of-the-art methods for the task.
arXiv Detail & Related papers (2024-04-17T23:00:29Z)
- KPEval: Towards Fine-Grained Semantic-Based Keyphrase Evaluation [69.57018875757622]
We propose KPEval, a comprehensive evaluation framework consisting of four critical aspects: reference agreement, faithfulness, diversity, and utility.
Using KPEval, we re-evaluate 23 keyphrase systems and discover that established model comparison results have blind-spots.
arXiv Detail & Related papers (2023-03-27T17:45:38Z)
- Large Language Models are Diverse Role-Players for Summarization Evaluation [82.31575622685902]
A document summary's quality can be assessed by human annotators on various criteria, both objective ones like grammar and correctness, and subjective ones like informativeness, succinctness, and appeal.
Most automatic evaluation methods, such as BLEU/ROUGE, may not be able to adequately capture the above dimensions.
We propose a new LLM-based evaluation framework that compares generated text and reference text from both objective and subjective aspects.
arXiv Detail & Related papers (2023-03-27T10:40:59Z)
- Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation [136.16507050034755]
Existing human evaluation studies for summarization either exhibit low inter-annotator agreement or have insufficient scale.
We propose a modified summarization salience protocol, Atomic Content Units (ACUs), which is based on fine-grained semantic units.
We curate the Robust Summarization Evaluation (RoSE) benchmark, a large human evaluation dataset consisting of 22,000 summary-level annotations over 28 top-performing systems.
arXiv Detail & Related papers (2022-12-15T17:26:05Z)
- Comparing Methods for Extractive Summarization of Call Centre Dialogue [77.34726150561087]
We experimentally compare several such methods by using them to produce summaries of calls and then evaluating these summaries objectively.
We found that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum received comparatively lower scores in both subjective and objective evaluations.
arXiv Detail & Related papers (2022-09-06T13:16:02Z)
- A Training-free and Reference-free Summarization Evaluation Metric via Centrality-weighted Relevance and Self-referenced Redundancy [60.419107377879925]
We propose a training-free and reference-free summarization evaluation metric.
Our metric consists of a centrality-weighted relevance score and a self-referenced redundancy score.
Our methods can significantly outperform existing methods on both multi-document and single-document summarization evaluation.
arXiv Detail & Related papers (2021-06-26T05:11:27Z)
- Every Bite Is an Experience: Key Point Analysis of Business Reviews [12.364867281334096]
Key Point Analysis (KPA) has been proposed as a summarization framework that provides both a textual and a quantitative summary of the main points in the data.
We show empirically that these novel extensions of KPA substantially improve its performance.
arXiv Detail & Related papers (2021-06-12T12:22:12Z)
- Quantitative Argument Summarization and Beyond: Cross-Domain Key Point Analysis [17.875273745811775]
We develop a method for automatic extraction of key points, which enables fully automatic analysis.
We demonstrate that the applicability of key point analysis goes well beyond argumentation data.
An additional contribution is an in-depth evaluation of argument-to-key point matching models.
arXiv Detail & Related papers (2020-10-11T23:01:51Z)