How Persuasive is Your Context?
- URL: http://arxiv.org/abs/2509.17879v1
- Date: Mon, 22 Sep 2025 15:15:40 GMT
- Title: How Persuasive is Your Context?
- Authors: Tu Nguyen, Kevin Du, Alexander Miserlis Hoyle, Ryan Cotterell
- Abstract summary: We introduce targeted persuasion score (TPS) to quantify how persuasive a given context is to an LM. TPS measures how much a context shifts a model's original answer distribution toward a target distribution. Empirically, through a series of experiments, we show that TPS captures a more nuanced notion of persuasiveness than previously proposed metrics.
- Score: 85.2011141143185
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Two central capabilities of language models (LMs) are: (i) drawing on prior knowledge about entities, which allows them to answer queries such as "What's the official language of Austria?", and (ii) adapting to new information provided in context, e.g., "Pretend the official language of Austria is Tagalog.", that is pre-pended to the question. In this article, we introduce targeted persuasion score (TPS), designed to quantify how persuasive a given context is to an LM where persuasion is operationalized as the ability of the context to alter the LM's answer to the question. In contrast to evaluating persuasiveness only by inspecting the greedily decoded answer under the model, TPS provides a more fine-grained view of model behavior. Based on the Wasserstein distance, TPS measures how much a context shifts a model's original answer distribution toward a target distribution. Empirically, through a series of experiments, we show that TPS captures a more nuanced notion of persuasiveness than previously proposed metrics.
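To make the idea of "shifting an answer distribution toward a target" concrete, here is a minimal Python sketch. It is not the paper's definition of TPS: the function names, the 0/1 ground metric between answers, the normalization by the original distance, and the toy probabilities are all assumptions made for illustration. Under a 0/1 ground metric, the Wasserstein-1 distance between two discrete distributions equals their total variation distance, which keeps the example dependency-free.

```python
def wasserstein_discrete(p, q):
    """Wasserstein-1 distance between two answer distributions (dicts mapping
    answer -> probability) under an assumed 0/1 ground metric, where it
    coincides with total variation distance."""
    answers = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in answers)


def persuasion_shift(original, with_context, target):
    """Hypothetical score: the fraction of the original distance to the target
    that the context closes. Positive means the context moved the model toward
    the target; negative means it moved the model away."""
    before = wasserstein_discrete(original, target)
    after = wasserstein_discrete(with_context, target)
    if before == 0.0:
        return 0.0  # the model already matched the target without context
    return (before - after) / before


# Toy numbers (made up) for "What's the official language of Austria?"
original = {"German": 0.90, "English": 0.09, "Tagalog": 0.01}
with_context = {"German": 0.50, "English": 0.05, "Tagalog": 0.45}
target = {"Tagalog": 1.0}  # point mass on the answer the context argues for

shift = persuasion_shift(original, with_context, target)
greedy_answer = max(with_context, key=with_context.get)
print(f"shift toward target: {shift:.3f}; greedy answer: {greedy_answer}")
```

In this toy case the greedy answer stays "German" even though a large share of probability mass has moved to "Tagalog", which is the kind of change a distribution-level score can register and an answer-only check would miss.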
Related papers
- Align to the Pivot: Dual Alignment with Self-Feedback for Multilingual Math Reasoning [71.4175109189942]
We present Pivot-Aligned Self-Feedback Multilingual Reasoning (PASMR). This approach designates the model's primary language as the pivot language. It establishes a cross-lingual self-feedback mechanism without relying on external correct answers or reward models.
arXiv Detail & Related papers (2026-01-25T03:20:00Z) - Seeing to Act, Prompting to Specify: A Bayesian Factorization of Vision Language Action Policy [59.44168425139687]
BayesVLA is a Bayesian factorization that decomposes the policy into a visual-action prior, supporting seeing-to-act, and a language-conditioned likelihood, enabling prompt-to-specify. Experiments show superior generalization to unseen instructions, objects, and environments compared to existing methods.
arXiv Detail & Related papers (2025-12-12T01:59:23Z) - Context is Enough: Empirical Validation of $\textit{Sequentiality}$ on Essays [1.338174941551702]
We show that the contextual version of sequentiality aligns more closely with human assessments of discourse-level traits. Our findings support the use of context-based sequentiality as a validated, interpretable, and complementary feature for automated essay scoring and related NLP tasks.
arXiv Detail & Related papers (2025-11-12T10:31:07Z) - "Lost-in-the-Later": Framework for Quantifying Contextual Grounding in Large Language Models [4.712325494028972]
We introduce CoPE, a novel evaluation framework that measures contextual knowledge across models and languages. We analyze how large language models integrate context, prioritize information, and incorporate PK in open-ended question answering. We find that reasoning models, as well as non-reasoning models prompted with chain-of-thought (CoT), use context even less than non-reasoning models without CoT and fail to mitigate the lost-in-the-later effect.
arXiv Detail & Related papers (2025-07-07T19:13:20Z) - Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers [12.94303673025761]
We analyze latent representations (latents) during a word-translation task in transformer-based language models. We find that the output language is encoded in the latent at an earlier layer than the concept to be translated. We show that patching with the mean representation of a concept across different languages does not affect the models' ability to translate it.
arXiv Detail & Related papers (2024-11-13T16:26:19Z) - Uncovering Differences in Persuasive Language in Russian versus English Wikipedia [40.61046400448044]
We study how differences in persuasive language across Wikipedia articles, written in either English or Russian, can uncover each culture's distinct perspective on different subjects.
We develop a large language model (LLM) powered system to identify instances of persuasive language in multilingual texts.
arXiv Detail & Related papers (2024-09-27T21:23:19Z) - Measuring and Benchmarking Large Language Models' Capabilities to Generate Persuasive Language [41.052284715017606]
We study the ability of Large Language Models (LLMs) to produce persuasive text. As opposed to prior work, which focuses on particular domains or types of persuasion, we conduct a general study across various domains. We construct the new dataset Persuasive-Pairs, consisting of pairs of a short text and its rewrite by an LLM to amplify or diminish persuasive language.
arXiv Detail & Related papers (2024-06-25T17:40:47Z) - Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases [79.07111754406841]
This work proposes using contrastive evaluation to measure the ability of direct S2TT systems to disambiguate utterances where prosody plays a crucial role.
Our results clearly demonstrate the value of direct translation systems over cascade translation models.
arXiv Detail & Related papers (2024-02-01T14:46:35Z) - DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z) - Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often misinterpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z) - GreaseLM: Graph REASoning Enhanced Language Models for Question Answering [159.9645181522436]
GreaseLM is a new model that fuses encoded representations from pretrained LMs and graph neural networks over multiple layers of modality interaction operations.
We show that GreaseLM can more reliably answer questions that require reasoning over both situational constraints and structured knowledge, even outperforming models 8x larger.
arXiv Detail & Related papers (2022-01-21T19:00:05Z) - Do Context-Aware Translation Models Pay the Right Attention? [61.25804242929533]
Context-aware machine translation models are designed to leverage contextual information, but often fail to do so.
In this paper, we ask several questions: What contexts do human translators use to resolve ambiguous words?
We introduce SCAT (Supporting Context for Ambiguous Translations), a new English-French dataset comprising supporting context words for 14K translations.
Using SCAT, we perform an in-depth analysis of the context used to disambiguate, examining positional and lexical characteristics of the supporting words.
arXiv Detail & Related papers (2021-05-14T17:32:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.