SAD: A Large-Scale Strategic Argumentative Dialogue Dataset
- URL: http://arxiv.org/abs/2601.07423v1
- Date: Mon, 12 Jan 2026 11:11:37 GMT
- Title: SAD: A Large-Scale Strategic Argumentative Dialogue Dataset
- Authors: Yongkang Liu, Jiayang Yu, Mingyang Wang, Yiqun Zhang, Ercong Nie, Shi Feng, Daling Wang, Kaisong Song, Hinrich Schütze
- Abstract summary: In practice, argumentation is often realized as multi-turn dialogue. We present the first large-scale Strategic Argumentative Dialogue dataset, consisting of 392,822 examples.
- Score: 60.33125467375306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Argumentation generation has attracted substantial research interest due to its central role in human reasoning and decision-making. However, most existing argumentative corpora focus on non-interactive, single-turn settings, either generating arguments from a given topic or refuting an existing argument. In practice, argumentation is often realized as multi-turn dialogue, where speakers defend their stances and employ diverse argumentative strategies to strengthen persuasiveness. To support deeper modeling of argumentation dialogue, we present the first large-scale Strategic Argumentative Dialogue dataset, SAD, consisting of 392,822 examples. Grounded in argumentation theories, we annotate each utterance with five strategy types, allowing multiple strategies per utterance. Unlike prior datasets, SAD requires models to generate contextually appropriate arguments conditioned on the dialogue history, a specified stance on the topic, and targeted argumentation strategies. We further benchmark a range of pretrained generative models on SAD and present in-depth analysis of strategy usage patterns in argumentation.
Related papers
- A Generalizable Rhetorical Strategy Annotation Model Using LLM-based Debate Simulation and Labelling [35.2732875767252]
We propose a novel framework that leverages large language models (LLMs) to automatically generate and label synthetic debate data based on a four-part rhetorical typology (causal, empirical, emotional, moral).
Our model achieves high performance and strong generalization across topical domains.
We illustrate two applications with the fine-tuned model: (1) improving persuasiveness prediction by incorporating rhetorical strategy labels, and (2) analyzing temporal and partisan shifts in rhetorical strategies in U.S. Presidential debates (1960-2020).
arXiv Detail & Related papers (2025-10-16T18:51:23Z)
- Towards Comprehensive Argument Analysis in Education: Dataset, Tasks, and Method [14.718309497236694]
We propose 14 fine-grained relation types from both vertical and horizontal dimensions.
We conduct experiments on three tasks: argument component detection, relation prediction, and automated essay grading.
The findings highlight the importance of fine-grained argumentative annotations for argumentative writing quality assessment and encourage multi-dimensional argument analysis.
arXiv Detail & Related papers (2025-05-17T14:36:51Z)
- DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs [54.4857963044859]
We propose DialogueReason, a reasoning paradigm that uncovers the lost roles in monologue-style reasoning models.
Our work consists of an analysis of monologue reasoning patterns and the development of a dialogue-based reasoning approach.
arXiv Detail & Related papers (2025-05-11T16:39:58Z)
- An Empirical Analysis of Diversity in Argument Summarization [4.128725138940779]
We introduce three aspects of diversity: those of opinions, annotators, and sources.
We evaluate approaches to a popular argument summarization task called Key Point Analysis.
arXiv Detail & Related papers (2024-02-02T16:26:52Z)
- A Unifying Framework for Learning Argumentation Semantics [47.84663434179473]
We present a novel framework, which uses an Inductive Logic Programming approach to learn the acceptability semantics for several abstract and structured argumentation frameworks in an interpretable way.
Our framework outperforms existing argumentation solvers, thus opening up new future research directions in the area of formal argumentation and human-machine dialogues.
arXiv Detail & Related papers (2023-10-18T20:18:05Z)
- Strategic Argumentation Dialogues for Persuasion: Framework and Experiments Based on Modelling the Beliefs and Concerns of the Persuadee [6.091096843566857]
Two key dimensions for determining whether an argument is good in a particular dialogue are the degree to which the intended audience believes the argument and counterarguments, and the impact that the argument has on the concerns of the intended audience.
We present a framework for modelling persuadees in terms of their beliefs and concerns, and for harnessing these models in optimizing the choice of move in persuasion dialogues.
arXiv Detail & Related papers (2021-01-28T08:49:24Z)
- I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling [104.09033240889106]
We introduce the DialoguE COntradiction DEtection task (DECODE) and a new conversational dataset containing both human-human and human-bot contradictory dialogues.
We then compare a structured utterance-based approach of using pre-trained Transformer models for contradiction detection with the typical unstructured approach.
arXiv Detail & Related papers (2020-12-24T18:47:49Z)
- Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually with reasoning over dialogue turns with the help of the back-end data.
Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
arXiv Detail & Related papers (2020-05-27T02:05:33Z)
- Aspect-Controlled Neural Argument Generation [65.91772010586605]
We train a language model for argument generation that can be controlled on a fine-grained level to generate sentence-level arguments for a given topic, stance, and aspect.
Our evaluation shows that our generation model is able to generate high-quality, aspect-specific arguments.
These arguments can be used to improve the performance of stance detection models via data augmentation and to generate counter-arguments.
arXiv Detail & Related papers (2020-04-30T20:17:22Z)
- AMPERSAND: Argument Mining for PERSuAsive oNline Discussions [41.06165177604387]
We propose a computational model for argument mining in online persuasive discussion forums.
Our approach relies on identifying relations between components of arguments in a discussion thread.
Our models obtain significant improvements compared to recent state-of-the-art approaches.
arXiv Detail & Related papers (2020-04-30T10:33:40Z)
- Dialogue-Based Relation Extraction [53.2896545819799]
We present the first human-annotated dialogue-based relation extraction (RE) dataset DialogRE.
We argue that speaker-related information plays a critical role in the proposed task, based on an analysis of similarities and differences between dialogue-based and traditional RE tasks.
Experimental results demonstrate that a speaker-aware extension on the best-performing model leads to gains in both the standard and conversational evaluation settings.
arXiv Detail & Related papers (2020-04-17T03:51:57Z)
- The Role of Pragmatic and Discourse Context in Determining Argument Impact [39.70446357000737]
This paper presents a new dataset to initiate the study of this aspect of argumentation.
It consists of a diverse collection of arguments covering 741 controversial topics and comprising over 47,000 claims.
We propose predictive models that incorporate the pragmatic and discourse context of argumentative claims and show that they outperform models that rely on claim-specific linguistic features for predicting the perceived impact of individual claims within a particular line of argument.
arXiv Detail & Related papers (2020-04-06T23:00:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.