A Generalizable Rhetorical Strategy Annotation Model Using LLM-based Debate Simulation and Labelling
- URL: http://arxiv.org/abs/2510.15081v1
- Date: Thu, 16 Oct 2025 18:51:23 GMT
- Title: A Generalizable Rhetorical Strategy Annotation Model Using LLM-based Debate Simulation and Labelling
- Authors: Shiyu Ji, Farnoosh Hashemi, Joice Chen, Juanwen Pan, Weicheng Ma, Hefan Zhang, Sophia Pan, Ming Cheng, Shubham Mohole, Saeed Hassanpour, Soroush Vosoughi, Michael Macy,
- Abstract summary: We propose a novel framework that leverages large language models (LLMs) to automatically generate and label synthetic debate data based on a four-part rhetorical typology (causal, empirical, emotional, moral). Our model achieves high performance and strong generalization across topical domains. We illustrate two applications with the fine-tuned model: (1) the improvement in persuasiveness prediction from incorporating rhetorical strategy labels, and (2) analyzing temporal and partisan shifts in rhetorical strategies in U.S. Presidential debates (1960-2020).
- Score: 35.2732875767252
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Rhetorical strategies are central to persuasive communication, from political discourse and marketing to legal argumentation. However, analysis of rhetorical strategies has been limited by reliance on human annotation, which is costly, inconsistent, and difficult to scale. The associated datasets are often limited to specific topics and strategies, posing challenges for robust model development. We propose a novel framework that leverages large language models (LLMs) to automatically generate and label synthetic debate data based on a four-part rhetorical typology (causal, empirical, emotional, moral). We fine-tune transformer-based classifiers on this LLM-labeled dataset and validate their performance against human-labeled data on this dataset and on multiple external corpora. Our model achieves high performance and strong generalization across topical domains. We illustrate two applications of the fine-tuned model: (1) the improvement in persuasiveness prediction from incorporating rhetorical strategy labels, and (2) an analysis of temporal and partisan shifts in rhetorical strategies in U.S. Presidential debates (1960-2020), revealing an increased use of affective over cognitive arguments.
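The generate-and-label pipeline described in the abstract can be illustrated with a minimal sketch of the labelling step. The prompt wording, function names, and parsing logic below are illustrative assumptions, not the paper's actual implementation; only the four-part typology (causal, empirical, emotional, moral) comes from the abstract.

```python
# Hypothetical sketch of the LLM-labelling step: build a prompt asking an LLM
# to tag a debate sentence with one of the four rhetorical strategies, and
# parse the model's free-text reply into a canonical label.

STRATEGIES = ("causal", "empirical", "emotional", "moral")

def build_label_prompt(sentence: str) -> str:
    """Construct a zero-shot labelling prompt for one debate sentence."""
    options = ", ".join(STRATEGIES)
    return (
        "Classify the rhetorical strategy of the following debate sentence.\n"
        f"Choose exactly one of: {options}.\n"
        f"Sentence: {sentence}\n"
        "Answer with the single strategy word only."
    )

def parse_label(reply: str) -> str:
    """Map a raw LLM reply to a canonical strategy label, or 'unknown'."""
    reply = reply.strip().lower()
    for strategy in STRATEGIES:
        if strategy in reply:
            return strategy
    return "unknown"
```

Labels parsed this way would then serve as training targets when fine-tuning a transformer-based classifier on the synthetic debate sentences.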
Related papers
- SAD: A Large-Scale Strategic Argumentative Dialogue Dataset [60.33125467375306]
In practice, argumentation is often realized as multi-turn dialogue. We present the first large-scale Strategic Argumentative Dialogue dataset, consisting of 392,822 examples.
arXiv Detail & Related papers (2026-01-12T11:11:37Z) - AI-Salesman: Towards Reliable Large Language Model Driven Telemarketing [79.0112532518727]
We release TeleSalesCorpus, the first real-world-grounded dialogue dataset for this domain. We then propose AI-Salesman, a novel framework featuring a dual-stage architecture. We show that our proposed AI-Salesman significantly outperforms baseline models in both automatic metrics and comprehensive human evaluations.
arXiv Detail & Related papers (2025-11-15T09:44:42Z) - Latent Topic Synthesis: Leveraging LLMs for Electoral Ad Analysis [51.95395936342771]
We introduce an end-to-end framework for automatically generating an interpretable topic taxonomy from an unlabeled corpus. We apply this framework to a large corpus of Meta political ads from the month ahead of the 2024 U.S. Presidential election. Our approach uncovers latent discourse structures, synthesizes semantically rich topic labels, and annotates topics with moral framing dimensions.
arXiv Detail & Related papers (2025-10-16T20:30:20Z) - Joint Effects of Argumentation Theory, Audio Modality and Data Enrichment on LLM-Based Fallacy Classification [0.038233569758620044]
This study investigates how context and emotional tone metadata influence large language model (LLM) reasoning and performance in fallacy classification tasks. Using data from U.S. presidential debates, we classify six fallacy types through various prompting strategies applied to the Qwen-3 (8B) model.
arXiv Detail & Related papers (2025-09-14T06:35:34Z) - Simulation of Language Evolution under Regulated Social Media Platforms: A Synergistic Approach of Large Language Models and Genetic Algorithms [6.550725258692423]
Social media platforms frequently impose restrictive policies to moderate user content, prompting the emergence of creative evasion language strategies. This paper presents a multi-agent framework based on Large Language Models (LLMs) to simulate the iterative evolution of language strategies under regulatory constraints.
arXiv Detail & Related papers (2025-02-26T14:59:27Z) - Steering Conversational Large Language Models for Long Emotional Support Conversations [4.984018914962973]
We focus on the steerability of the Llama-2 and Llama-3 suite of models, examining their ability to maintain these strategies throughout interactions.
To assess this, we introduce the Strategy Relevant Attention (SRA) metric, which quantifies the model's adherence to the prompted strategy through attention maps.
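The SRA metric described above can be sketched in a hedged form, assuming it is roughly the share of attention mass that generated tokens place on the prompt tokens spelling out the strategy; the exact formulation in the paper may differ, and all names here are illustrative.

```python
# Illustrative SRA-style score: the mean attention mass that generated tokens
# place on the prompt positions describing the prompted strategy. This is an
# assumed simplification, not the paper's exact definition.

def sra_score(attention, strategy_positions):
    """attention: rows are generated tokens, columns are prompt tokens,
    each row summing to 1. strategy_positions: indices of the prompt
    tokens that describe the prompted strategy."""
    if not attention:
        return 0.0
    per_token = [sum(row[i] for i in strategy_positions) for row in attention]
    return sum(per_token) / len(per_token)
```

Under this reading, a model that keeps "attending" to the strategy instructions throughout generation scores high, while one that drifts away scores low.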
arXiv Detail & Related papers (2024-02-16T05:03:01Z) - Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z) - How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z) - DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z) - Persuasion Strategies in Advertisements [68.70313043201882]
We introduce an extensive vocabulary of persuasion strategies and build the first ad image corpus annotated with persuasion strategies.
We then formulate the task of persuasion strategy prediction with multi-modal learning.
We conduct a real-world case study on 1600 advertising campaigns of 30 Fortune-500 companies.
arXiv Detail & Related papers (2022-08-20T07:33:13Z) - RESPER: Computationally Modelling Resisting Strategies in Persuasive Conversations [0.7505101297221454]
We propose a generalised framework for identifying resisting strategies in persuasive conversations.
Our experiments reveal the asymmetry of power roles in non-collaborative goal-directed conversations.
We also investigate the role of different resisting strategies on the conversation outcome.
arXiv Detail & Related papers (2021-01-26T03:44:17Z) - Weakly-Supervised Hierarchical Models for Predicting Persuasive Strategies in Good-faith Textual Requests [22.58861442978803]
We introduce a large-scale multi-domain text corpus for modeling persuasive strategies in good-faith text requests.
We design a hierarchical weakly-supervised latent variable model that can leverage partially labeled data to predict such associated persuasive strategies for each sentence.
Experimental results showed that our proposed method outperformed existing semi-supervised baselines significantly.
arXiv Detail & Related papers (2021-01-16T02:31:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.