Comparing Methods for Extractive Summarization of Call Centre Dialogue
- URL: http://arxiv.org/abs/2209.02472v1
- Date: Tue, 6 Sep 2022 13:16:02 GMT
- Title: Comparing Methods for Extractive Summarization of Call Centre Dialogue
- Authors: Alexandra N. Uma, Dmitry Sityaev
- Abstract summary: We experimentally compare several such methods by using them to produce summaries of calls, and evaluating these summaries objectively.
We found that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum received comparatively lower scores in both subjective and objective evaluations.
- Score: 77.34726150561087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper provides results of evaluating some text summarisation techniques
for the purpose of producing call summaries for contact centre solutions. We
specifically focus on extractive summarisation methods, as they do not require
any labelled data and are fairly quick and easy to implement for production
use. We experimentally compare several such methods by using them to produce
summaries of calls, and evaluating these summaries objectively (using ROUGE-L)
and subjectively (by aggregating the judgements of several annotators). We
found that TopicSum and Lead-N outperform the other summarisation methods,
whilst BERTSum received comparatively lower scores in both subjective and
objective evaluations. The results demonstrate that even such simple
heuristics-based methods like Lead-N can produce meaningful and useful
summaries of call centre dialogues.
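To make the comparison concrete, the sketch below illustrates the Lead-N heuristic and an objective ROUGE-L (F1) score computed via longest common subsequence. This is not the authors' implementation: the dialogue turns, the reference summary, and the choice of N=3 are invented for illustration, and production systems would typically use an established ROUGE library rather than this hand-rolled scorer.

```python
# Minimal sketch (assumptions noted above): Lead-N extractive "summary"
# of a call transcript, scored against a reference with ROUGE-L F1.

def lead_n_summary(turns, n=3):
    """Return the first n dialogue turns as the extractive summary."""
    return turns[:n]

def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 via longest common subsequence over whitespace tokens."""
    c, r = candidate.split(), reference.split()
    # Dynamic programme for LCS length.
    dp = [[0] * (len(r) + 1) for _ in range(len(c) + 1)]
    for i, ct in enumerate(c, 1):
        for j, rt in enumerate(r, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if ct == rt else max(dp[i - 1][j], dp[i][j - 1])
    lcs = dp[len(c)][len(r)]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

# Illustrative call-centre dialogue and reference (invented for this sketch).
turns = [
    "Agent: Thank you for calling, how can I help?",
    "Caller: I was double-billed for my broadband last month.",
    "Agent: I can see the duplicate charge and will refund it today.",
    "Caller: Great, thank you.",
]
reference = "Caller reports a duplicate broadband charge and the agent issues a refund."

summary = " ".join(lead_n_summary(turns, n=3))
print(f"ROUGE-L F1: {rouge_l_f1(summary, reference):.3f}")
```

The paper additionally aggregates subjective annotator judgements alongside ROUGE-L; the sketch covers only the objective side of that evaluation.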
Related papers
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- A Comparative Study of Quality Evaluation Methods for Text Summarization [0.5512295869673147]
This paper proposes a novel method based on large language models (LLMs) for evaluating text summarization.
Our results show that LLM-based evaluation aligns closely with human evaluation, while widely used automatic metrics such as ROUGE-2, BERTScore, and SummaC do not and also lack consistency.
arXiv Detail & Related papers (2024-06-30T16:12:37Z)
- CASPR: Automated Evaluation Metric for Contrastive Summarization [4.310460539747285]
We propose an automated evaluation metric CASPR to better measure contrast between a pair of summaries.
Our results on a prior dataset CoCoTRIP demonstrate that CASPR can more reliably capture the contrastiveness of the summary pairs compared to the baselines.
arXiv Detail & Related papers (2024-04-23T23:27:29Z)
- Hierarchical Indexing for Retrieval-Augmented Opinion Summarization [60.5923941324953]
We propose a method for unsupervised abstractive opinion summarization that combines the attributability and scalability of extractive approaches with the coherence and fluency of Large Language Models (LLMs).
Our method, HIRO, learns an index structure that maps sentences to a path through a semantically organized discrete hierarchy.
At inference time, we populate the index and use it to identify and retrieve clusters of sentences containing popular opinions from input reviews.
arXiv Detail & Related papers (2024-03-01T10:38:07Z)
- Little Giants: Exploring the Potential of Small LLMs as Evaluation Metrics in Summarization in the Eval4NLP 2023 Shared Task [53.163534619649866]
This paper focuses on assessing the effectiveness of prompt-based techniques to empower Large Language Models to handle the task of quality estimation.
We conducted systematic experiments with various prompting techniques, including standard prompting, prompts informed by annotator instructions, and innovative chain-of-thought prompting.
Our work reveals that combining these approaches using a "small", open source model (orca_mini_v3_7B) yields competitive results.
arXiv Detail & Related papers (2023-11-01T17:44:35Z)
- Do You Hear The People Sing? Key Point Analysis via Iterative Clustering and Abstractive Summarisation [12.548947151123555]
Argument summarisation is a promising but currently under-explored field.
One of the main challenges in Key Point Analysis is finding high-quality key point candidates.
Evaluating key points is crucial in ensuring that the automatically generated summaries are useful.
arXiv Detail & Related papers (2023-05-25T12:43:29Z)
- A Training-free and Reference-free Summarization Evaluation Metric via Centrality-weighted Relevance and Self-referenced Redundancy [60.419107377879925]
We propose a training-free and reference-free summarization evaluation metric.
Our metric consists of a centrality-weighted relevance score and a self-referenced redundancy score.
Our methods can significantly outperform existing methods on both multi-document and single-document summarization evaluation.
arXiv Detail & Related papers (2021-06-26T05:11:27Z)
- Understanding the Extent to which Summarization Evaluation Metrics Measure the Information Quality of Summaries [74.28810048824519]
We analyze the token alignments used by ROUGE and BERTScore to compare summaries.
We argue that their scores largely cannot be interpreted as measuring information overlap.
arXiv Detail & Related papers (2020-10-23T15:55:15Z)
- SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling [25.299937353444854]
We present a proof-of-concept study of a weakly supervised summary evaluation approach that does not require reference summaries.
Massive amounts of data in existing summarization datasets are transformed into training pairs by coupling documents with corrupted reference summaries.
arXiv Detail & Related papers (2020-05-13T15:40:13Z)