Strategic Discourse Assessment: The Crooked Path to Innocence
- URL: http://arxiv.org/abs/2506.01195v2
- Date: Tue, 02 Sep 2025 14:36:01 GMT
- Title: Strategic Discourse Assessment: The Crooked Path to Innocence
- Authors: Anshun Asher Zheng, Junyi Jessy Li, David I. Beaver,
- Abstract summary: We introduce SDA, a framework grounded in Gricean and game-theoretic pragmatics to assess strategic use of language.<n>Our approach incorporates two key adaptations: a commitment-based taxonomy of discourse moves, and the use of estimable proxies grounded in Gricean maxims.<n>We derive three interpretable metrics-Benefit at Turn (BAT), Penalty at Turn (PAT), and Normalized Relative Benefit at Turn (NRBAT)-to quantify the perceived strategic effects of discourse moves.
- Score: 23.337770980661677
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language is often used strategically, particularly in high-stakes, adversarial settings, yet most work on pragmatics and LLMs centers on cooperativity. This leaves a gap in the systematic understanding of strategic communication in adversarial settings. To address this, we introduce SDA (Strategic Discourse Assessment), a framework grounded in Gricean and game-theoretic pragmatics to assess strategic use of language. It adapts the ME Game jury function to make it empirically estimable for analyzing dialogue. Our approach incorporates two key adaptations: a commitment-based taxonomy of discourse moves, which provides a finer-grained account of strategic effects, and the use of estimable proxies grounded in Gricean maxims to operationalize abstract constructs such as credibility. Together, these adaptations build on discourse theory by treating discourse as the strategic management of commitments, enabling systematic evaluation of how conversational moves advance or undermine discourse goals. We further derive three interpretable metrics-Benefit at Turn (BAT), Penalty at Turn (PAT), and Normalized Relative Benefit at Turn (NRBAT)-to quantify the perceived strategic effects of discourse moves. We also present CPD (the Crooked Path Dataset), an annotated dataset of real courtroom cross-examinations, to demonstrate the framework's effectiveness. Using these tools, we evaluate a range of LLMs and show that LLMs generally exhibit limited pragmatic understanding of strategic language. While model size shows an increase in performance on our metrics, reasoning ability does not help and largely hurts, introducing overcomplication and internal confusion.
Related papers
- On Theoretically-Driven LLM Agents for Multi-Dimensional Discourse Analysis [0.0]
This paper presents a comparative multi-agent framework designed to quantify the benefits of incorporating explicit theoretical knowledge for this task.<n>We utilise a dataset of annotated political debates to establish a new standard encompassing four distinct rephrase functions.<n>We evaluate two parallel LLM-based agent systems: one enhanced by argumentation theory via Retrieval-Augmented Generation (RAG), and an identical zero-shot baseline.
arXiv Detail & Related papers (2026-02-14T10:30:39Z) - More at Stake: How Payoff and Language Shape LLM Agent Strategies in Cooperation Dilemmas [1.6487772637295166]
LLMs increasingly act as autonomous agents in interactive and multi-agent settings.<n>We investigate how payoff magnitude and linguistic context shape LLM strategies in repeated social dilemmas.<n>Across models and languages, we observe consistent behavioral patterns, including incentive-sensitive conditional strategies and cross-linguistic divergence.
arXiv Detail & Related papers (2026-01-27T01:36:50Z) - Understanding Textual Capability Degradation in Speech LLMs via Parameter Importance Analysis [54.53152524778821]
integration of speech into Large Language Models (LLMs) has substantially expanded their capabilities, but often at the cost of weakening their core textual competence.<n>We propose an analytical framework based on parameter importance estimation, which reveals that fine-tuning for speech introduces a textual importance distribution shift.<n>We investigate two mitigation strategies: layer-wise learning rate scheduling and Low-Rank Adaptation (LoRA)<n> Experimental results show that both approaches better maintain textual competence than full fine-tuning, while also improving downstream spoken question answering performance.
arXiv Detail & Related papers (2025-09-28T09:04:40Z) - CHBench: A Cognitive Hierarchy Benchmark for Evaluating Strategic Reasoning Capability of LLMs [10.29314561183905]
Game-playing ability serves as an indicator for evaluating the strategic reasoning capability of large language models.<n>We propose CHBench, a novel evaluation framework inspired by the cognitive hierarchy models from behavioral economics.
arXiv Detail & Related papers (2025-08-16T07:10:26Z) - LLM Agents for Bargaining with Utility-based Feedback [23.357706450282002]
We introduce a comprehensive framework centered on utility-based feedback.<n>Our contributions are threefold: (1) BargainArena, a novel benchmark dataset; (2) human-aligned, economically-grounded evaluation metrics inspired by utility theory; and (3) a structured feedback mechanism enabling LLMs to iteratively refine their bargaining strategies.
arXiv Detail & Related papers (2025-05-29T02:07:27Z) - EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning [69.55982246413046]
We propose explicit policy optimization (EPO) for strategic reasoning.<n>We train the strategic reasoning model via multi-turn reinforcement learning (RL),utilizing process rewards and iterative self-play.<n>Our findings reveal various collaborative reasoning mechanisms emergent in EPO and its effectiveness in generating novel strategies.
arXiv Detail & Related papers (2025-02-18T03:15:55Z) - Scoring with Large Language Models: A Study on Measuring Empathy of Responses in Dialogues [3.2162648244439684]
We develop a framework for investigating how effective Large Language Models are at measuring and scoring empathy of responses in dialogues.<n>Our strategy is to approximate the performance of state-of-the-art and fine-tuned LLMs with explicit and explainable features.<n>Our results show that when only using embeddings, it is possible to achieve performance close to that of generic LLMs.
arXiv Detail & Related papers (2024-12-28T20:37:57Z) - Bridging LLMs and KGs without Fine-Tuning: Intermediate Probing Meets Subgraph-Aware Entity Descriptions [49.36683223327633]
Large Language Models (LLMs) encapsulate extensive world knowledge and exhibit powerful context modeling capabilities.<n>We propose a novel framework that synergizes the strengths of LLMs with robust knowledge representation to enable effective and efficient KGC.<n>We achieve a 47% relative improvement over previous methods based on non-fine-tuned LLMs and, to our knowledge, are the first to achieve classification performance comparable to fine-tuned LLMs.
arXiv Detail & Related papers (2024-08-13T10:15:55Z) - ICLEval: Evaluating In-Context Learning Ability of Large Language Models [68.7494310749199]
In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs.<n>Existing evaluation frameworks primarily focus on language abilities and knowledge, often overlooking the assessment of ICL ability.<n>We introduce the ICLEval benchmark to evaluate the ICL abilities of LLMs, which encompasses two key sub-abilities: exact copying and rule learning.
arXiv Detail & Related papers (2024-06-21T08:06:10Z) - On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models [25.029579061612456]
Large Language Models (LLMs) are increasingly being employed in real-world applications in critical domains such as healthcare.
It is important to ensure that the Chain-of-Thought (CoT) reasoning generated by these models faithfully captures their underlying behavior.
arXiv Detail & Related papers (2024-06-15T13:16:44Z) - LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models [75.89014602596673]
Strategic reasoning requires understanding and predicting adversary actions in multi-agent settings while adjusting strategies accordingly.
We explore the scopes, applications, methodologies, and evaluation metrics related to strategic reasoning with Large Language Models.
It underscores the importance of strategic reasoning as a critical cognitive capability and offers insights into future research directions and potential improvements.
arXiv Detail & Related papers (2024-04-01T16:50:54Z) - Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks.<n>This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks.<n>We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
arXiv Detail & Related papers (2024-02-03T14:20:20Z) - K-Level Reasoning: Establishing Higher Order Beliefs in Large Language Models for Strategic Reasoning [76.3114831562989]
It requires Large Language Model (LLM) agents to adapt their strategies dynamically in multi-agent environments.
We propose a novel framework: "K-Level Reasoning with Large Language Models (K-R)"
arXiv Detail & Related papers (2024-02-02T16:07:05Z) - Sparsity-Guided Holistic Explanation for LLMs with Interpretable
Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
The enigmatic black-box'' nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications.
We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z) - In-Context Exemplars as Clues to Retrieving from Large Associative
Memory [1.2952137350423816]
In-context learning (ICL) enables large language models (LLMs) to learn patterns from in-context exemplars without training.
How to choose exemplars remains unclear due to the lack of understanding of how in-context learning works.
Our study sheds new light on the mechanism of ICL by connecting it to memory retrieval.
arXiv Detail & Related papers (2023-11-06T20:13:29Z) - From Heuristic to Analytic: Cognitively Motivated Strategies for
Coherent Physical Commonsense Reasoning [66.98861219674039]
Heuristic-Analytic Reasoning (HAR) strategies drastically improve the coherence of rationalizations for model decisions.
Our findings suggest that human-like reasoning strategies can effectively improve the coherence and reliability of PLM reasoning.
arXiv Detail & Related papers (2023-10-24T19:46:04Z) - Improving Open Information Extraction with Large Language Models: A
Study on Demonstration Uncertainty [52.72790059506241]
Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z) - The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters
for Implicature Resolution by LLMs [26.118193748582197]
We evaluate four categories of widely used state-of-the-art models.
We find that, despite only evaluating on utterances that require a binary inference, models in three of these categories perform close to random.
These results suggest that certain fine-tuning strategies are far better at inducing pragmatic understanding in models.
arXiv Detail & Related papers (2022-10-26T19:04:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.