ArgCMV: An Argument Summarization Benchmark for the LLM-era
- URL: http://arxiv.org/abs/2508.19580v1
- Date: Wed, 27 Aug 2025 05:26:36 GMT
- Title: ArgCMV: An Argument Summarization Benchmark for the LLM-era
- Authors: Omkar Gurjar, Agam Goyal, Eshwar Chandrasekharan
- Abstract summary: Key point extraction is an important task in argument summarization. Existing approaches for KP extraction have been mostly evaluated on the popular ArgKP21 dataset. Using SoTA large language models (LLMs), we curate a new argument key point extraction dataset called ArgCMV.
- Score: 7.80304437242923
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Key point extraction is an important task in argument summarization which involves extracting high-level short summaries from arguments. Existing approaches for KP extraction have been mostly evaluated on the popular ArgKP21 dataset. In this paper, we highlight some of the major limitations of the ArgKP21 dataset and demonstrate the need for new benchmarks that are more representative of actual human conversations. Using SoTA large language models (LLMs), we curate a new argument key point extraction dataset called ArgCMV comprising around 12K arguments from actual online human debates spread across over 3K topics. Our dataset exhibits higher complexity than ArgKP21, with longer, co-referencing arguments, a higher proportion of subjective discourse units, and a larger range of topics. We show that existing methods do not adapt well to ArgCMV and provide extensive benchmark results by experimenting with existing baselines and the latest open-source models. This work introduces a novel KP extraction dataset for long-context online discussions, setting the stage for the next generation of LLM-driven summarization research.
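A core step in KP extraction benchmarks such as ArgKP21 and ArgCMV is matching each argument to its closest key point. The sketch below illustrates this with cosine similarity over sentence embeddings; the key points, arguments, and embedding vectors are toy placeholders, not data from either dataset, and a real system would obtain embeddings from a sentence encoder or LLM.

```python
# Toy sketch of argument-to-key-point matching: each argument is
# assigned to the key point whose embedding is most similar under
# cosine similarity. Embeddings here are hand-made placeholders.
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def match_arguments(arg_embs, kp_embs):
    """Map each argument index to the index of its best-matching key point."""
    return {
        i: max(range(len(kp_embs)), key=lambda j: cosine(e, kp_embs[j]))
        for i, e in enumerate(arg_embs)
    }


if __name__ == "__main__":
    key_points = ["School uniforms reduce bullying",
                  "Uniforms limit self-expression"]
    kp_embs = [[1.0, 0.1], [0.1, 1.0]]        # toy key-point embeddings
    arg_embs = [[0.9, 0.2], [0.05, 0.8]]      # two toy argument embeddings
    print(match_arguments(arg_embs, kp_embs))  # {0: 0, 1: 1}
```

The longer, co-referencing arguments in ArgCMV make this matching harder than on ArgKP21, since a single argument embedding must capture discourse spanning several sentences.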
Related papers
- Enhancing Argument Summarization: Prioritizing Exhaustiveness in Key Point Generation and Introducing an Automatic Coverage Evaluation Metric [3.0754290232284562]
The Key Point Analysis (KPA) task formulates argument summarization as producing a concise summary of a large collection of arguments.
A sub-task of KPA, called Key Point Generation (KPG), focuses on generating these key points given the arguments.
This paper introduces a novel extractive approach for key point generation that outperforms previous state-of-the-art methods for the task.
arXiv Detail & Related papers (2024-04-17T23:00:29Z) - Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning [61.73411954056032]
Key Point Analysis (KPA) continues to be a significant and unresolved issue within the field of argument mining.
We propose a novel approach for KPA with pairwise generation and graph partitioning.
arXiv Detail & Related papers (2024-04-17T13:44:29Z) - Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation [62.069374456021016]
We present the ArgTersely benchmark for sentence-level counter-argument generation.
We also propose Arg-LlaMA for generating high-quality counter-arguments.
arXiv Detail & Related papers (2023-12-21T06:51:34Z) - Exploring the Potential of Large Language Models in Computational Argumentation [54.85665903448207]
Large language models (LLMs) have demonstrated impressive capabilities in understanding context and generating natural language.
This work assesses LLMs, such as ChatGPT, Flan models, and LLaMA2 models, in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-11-15T15:12:15Z) - Indicative Summarization of Long Discussions [37.80285705350554]
This paper presents a novel unsupervised approach using large language models (LLMs) to generate indicative summaries for long discussions.
Our approach first clusters argument sentences, generates cluster labels as abstractive summaries, and classifies the generated cluster labels into argumentation frames.
Based on an extensively optimized prompt engineering approach, we evaluate 19 LLMs for generative cluster labeling and frame classification.
arXiv Detail & Related papers (2023-11-03T12:44:59Z) - ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses a great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z) - Diversity Over Size: On the Effect of Sample and Topic Sizes for Topic-Dependent Argument Mining Datasets [49.65208986436848]
We investigate the effect of Argument Mining dataset composition in few- and zero-shot settings.
Our findings show that, while fine-tuning is mandatory to achieve acceptable model performance, using carefully composed training samples and reducing the training sample size by up to almost 90% can still yield 95% of the maximum performance.
arXiv Detail & Related papers (2022-05-23T17:14:32Z) - IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks [59.457948080207174]
In this work, we introduce a comprehensive and large dataset named IAM, which can be applied to a series of argument mining tasks.
Nearly 70k sentences in the dataset are fully annotated based on their argument properties.
We propose two new integrated argument mining tasks associated with the debate preparation process: (1) claim extraction with stance classification (CESC) and (2) claim-evidence pair extraction (CEPE).
arXiv Detail & Related papers (2022-03-23T08:07:32Z) - ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining [61.82562838486632]
We crowdsource four new datasets covering diverse online conversation forms: news comments, discussion forums, community question-answering forums, and email threads.
We benchmark state-of-the-art models on our datasets and analyze characteristics associated with the data.
arXiv Detail & Related papers (2021-06-01T22:17:13Z) - Quantitative Argument Summarization and Beyond: Cross-Domain Key Point Analysis [17.875273745811775]
We develop a method for automatic extraction of key points, which enables fully automatic analysis.
We demonstrate that the applicability of key point analysis goes well beyond argumentation data.
An additional contribution is an in-depth evaluation of argument-to-key point matching models.
arXiv Detail & Related papers (2020-10-11T23:01:51Z) - From Arguments to Key Points: Towards Automatic Argument Summarization [17.875273745811775]
We show that a small number of key points per topic is typically sufficient for covering the vast majority of the arguments.
Furthermore, we found that a domain expert can often predict these key points in advance.
arXiv Detail & Related papers (2020-05-04T16:24:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.