"Reasoning" with Rhetoric: On the Style-Evidence Tradeoff in LLM-Generated Counter-Arguments
- URL: http://arxiv.org/abs/2402.08498v6
- Date: Fri, 23 May 2025 13:44:41 GMT
- Title: "Reasoning" with Rhetoric: On the Style-Evidence Tradeoff in LLM-Generated Counter-Arguments
- Authors: Preetika Verma, Kokil Jaidka, Svetlana Churina
- Abstract summary: Large language models (LLMs) play a key role in generating evidence-based and stylistic counter-arguments. Previous research often neglects the balance between evidentiality and style, which are crucial for persuasive arguments. We evaluated the effectiveness of stylized evidence-based counter-argument generation in Counterfire, a new dataset of 38,000 counter-arguments.
- Score: 11.243184875465788
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) play a key role in generating evidence-based and stylistic counter-arguments, yet their effectiveness in real-world applications has been underexplored. Previous research often neglects the balance between evidentiality and style, which are crucial for persuasive arguments. To address this, we evaluated the effectiveness of stylized evidence-based counter-argument generation in Counterfire, a new dataset of 38,000 counter-arguments generated by revising counter-arguments to Reddit's ChangeMyView community to follow different discursive styles. We evaluated generic and stylized counter-arguments from basic and fine-tuned models such as GPT-3.5, PaLM-2, and Koala-13B, as well as newer models (GPT-4o, Claude Haiku, LLaMA-3.1), focusing on rhetorical quality and persuasiveness. Our findings reveal that humans prefer stylized counter-arguments over the original outputs, with GPT-3.5 Turbo performing well, though still not reaching human standards of rhetorical quality or persuasiveness, indicating a persistent style-evidence tradeoff in counter-argument generation by LLMs. We conclude with an examination of ethical considerations in LLM persuasion research, addressing potential risks of deceptive practices and the need for transparent deployment methodologies to safeguard against misuse in public discourse. The code and dataset are available at https://github.com/Preetika764/Style_control/.
Related papers
- Debating Truth: Debate-driven Claim Verification with Multiple Large Language Model Agents [13.626715532559079]
We propose DebateCV, the first claim verification framework that adopts a debate-driven methodology using multiple LLM agents. In our framework, two Debaters take opposing stances on a claim and engage in multi-round argumentation, while a Moderator evaluates the arguments and renders a verdict with justifications. Experimental results show that our method outperforms existing claim verification methods under varying levels of evidence quality.
arXiv Detail & Related papers (2025-07-25T09:19:25Z)
- Mitigating Manipulation and Enhancing Persuasion: A Reflective Multi-Agent Approach for Legal Argument Generation [4.329583019758787]
Large Language Models (LLMs) are increasingly explored for legal argument generation. LLMs pose significant risks of manipulation through hallucination and ungrounded persuasion. This paper introduces a novel reflective multi-agent method designed to address these challenges.
arXiv Detail & Related papers (2025-06-03T15:28:30Z)
- Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation [65.92001420372007]
This paper systematically evaluates state-of-the-art MLLMs across diverse benchmarks. We introduce the first benchmark, GaslightingBench, specifically designed to evaluate the vulnerability of MLLMs to negation arguments.
arXiv Detail & Related papers (2025-01-31T10:37:48Z)
- Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions [51.51850981481236]
We introduce POATE, a novel jailbreak technique that harnesses contrastive reasoning to provoke unethical responses. POATE crafts semantically opposing intents and integrates them with adversarial templates, steering models toward harmful outputs with remarkable subtlety. To counter this, we propose Intent-Aware CoT and Reverse Thinking CoT, which decompose queries to detect malicious intent and reason in reverse to evaluate and reject harmful responses.
arXiv Detail & Related papers (2025-01-03T15:40:03Z)
- Missci: Reconstructing Fallacies in Misrepresented Science [84.32990746227385]
Health-related misinformation on social networks can lead to poor decision-making and real-world dangers.
Missci is a novel argumentation theoretical model for fallacious reasoning.
We present Missci as a dataset to test the critical reasoning abilities of large language models.
arXiv Detail & Related papers (2024-06-05T12:11:10Z)
- What Evidence Do Language Models Find Convincing? [94.90663008214918]
We build a dataset that pairs controversial queries with a series of real-world evidence documents that contain different facts.
We use this dataset to perform sensitivity and counterfactual analyses to explore which text features most affect LLM predictions.
Overall, we find that current models rely heavily on the relevance of a website to the query, while largely ignoring stylistic features that humans find important.
arXiv Detail & Related papers (2024-02-19T02:15:34Z)
- Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation [62.069374456021016]
We present the ArgTersely benchmark for sentence-level counter-argument generation.
We also propose Arg-LlaMA for generating high-quality counter-arguments.
arXiv Detail & Related papers (2023-12-21T06:51:34Z)
- CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation [87.44350003888646]
Eval-Instruct can acquire pointwise grading critiques with pseudo references and revise these critiques via multi-path prompting.
CritiqueLLM is empirically shown to outperform ChatGPT and all the open-source baselines.
arXiv Detail & Related papers (2023-11-30T16:52:42Z)
- Exploring Jiu-Jitsu Argumentation for Writing Peer Review Rebuttals [70.22179850619519]
In many domains of argumentation, people's arguments are driven by so-called attitude roots.
Recent work in psychology suggests that instead of directly countering surface-level reasoning, one should follow an argumentation style inspired by the Jiu-Jitsu 'soft' combat system.
We are the first to explore Jiu-Jitsu argumentation for peer review by proposing the novel task of attitude and theme-guided rebuttal generation.
arXiv Detail & Related papers (2023-11-07T13:54:01Z)
- Contextualizing Argument Quality Assessment with Relevant Knowledge [11.367297319588411]
SPARK is a novel method for scoring argument quality based on contextualization via relevant knowledge.
We devise four augmentations that leverage large language models to provide feedback, infer hidden assumptions, supply a similar-quality argument, or give a counter-argument.
arXiv Detail & Related papers (2023-05-20T21:04:58Z)
- ArgU: A Controllable Factual Argument Generator [0.0]
ArgU is a neural argument generator capable of producing factual arguments from input facts and real-world concepts.
We have compiled and released an annotated corpus of 69,428 arguments spanning six topics and six argument schemes.
arXiv Detail & Related papers (2023-05-09T10:49:45Z)
- Persua: A Visual Interactive System to Enhance the Persuasiveness of Arguments in Online Discussion [52.49981085431061]
Enhancing people's ability to write persuasive arguments could contribute to the effectiveness and civility of online communication.
We derived four design goals for a tool that helps users improve the persuasiveness of arguments in online discussions.
Persua is an interactive visual system that provides example-based guidance on persuasive strategies to enhance the persuasiveness of arguments.
arXiv Detail & Related papers (2022-04-16T08:07:53Z)
- Argument Undermining: Counter-Argument Generation by Attacking Weak Premises [31.463885580010192]
We explore argument undermining, that is, countering an argument by attacking one of its premises.
We propose a pipeline approach that first assesses the premises' strength and then generates a counter-argument targeting the weak ones.
arXiv Detail & Related papers (2021-05-25T08:39:14Z)
- Aspect-Controlled Neural Argument Generation [65.91772010586605]
We train a language model for argument generation that can be controlled on a fine-grained level to generate sentence-level arguments for a given topic, stance, and aspect.
Our evaluation shows that our generation model is able to generate high-quality, aspect-specific arguments.
These arguments can be used to improve the performance of stance detection models via data augmentation and to generate counter-arguments.
arXiv Detail & Related papers (2020-04-30T20:17:22Z)
- AMPERSAND: Argument Mining for PERSuAsive oNline Discussions [41.06165177604387]
We propose a computational model for argument mining in online persuasive discussion forums.
Our approach relies on identifying relations between components of arguments in a discussion thread.
Our models obtain significant improvements compared to recent state-of-the-art approaches.
arXiv Detail & Related papers (2020-04-30T10:33:40Z)
- What Changed Your Mind: The Roles of Dynamic Topics and Discourse in Argumentation Process [78.4766663287415]
This paper presents a study that automatically analyzes the key factors in argument persuasiveness.
We propose a novel neural model that is able to track the changes of latent topics and discourse in argumentative conversations.
arXiv Detail & Related papers (2020-02-10T04:27:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.