Related papers: On the Same Wavelength? Evaluating Pragmatic Reasoning in Language Models across Broad Concepts

On the Same Wavelength? Evaluating Pragmatic Reasoning in Language Models across Broad Concepts

URL: http://arxiv.org/abs/2509.06952v2
Date: Sat, 27 Sep 2025 03:48:29 GMT
Title: On the Same Wavelength? Evaluating Pragmatic Reasoning in Language Models across Broad Concepts
Authors: Linlu Qiu, Cedegao E. Zhang, Joshua B. Tenenbaum, Yoon Kim, Roger P. Levy,
Abstract summary: We study a range of LMs on both language comprehension and language production.<n>We find that state-of-the-art LMs, but not smaller ones, achieve strong performance on language comprehension.
Score: 69.69818198773244
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Language use is shaped by pragmatics -- i.e., reasoning about communicative goals and norms in context. As language models (LMs) are increasingly used as conversational agents, it becomes ever more important to understand their pragmatic reasoning abilities. We propose an evaluation framework derived from Wavelength, a popular communication game where a speaker and a listener communicate about a broad range of concepts in a granular manner. We study a range of LMs on both language comprehension and language production using direct and Chain-of-Thought (CoT) prompting, and further explore a Rational Speech Act (RSA) approach to incorporating Bayesian pragmatic reasoning into LM inference. We find that state-of-the-art LMs, but not smaller ones, achieve strong performance on language comprehension, obtaining similar-to-human accuracy and exhibiting high correlations with human judgments even without CoT prompting or RSA. On language production, CoT can outperform direct prompting, and using RSA provides significant improvements over both approaches. Our study helps identify the strengths and limitations in LMs' pragmatic reasoning abilities and demonstrates the potential for improving them with RSA, opening up future avenues for understanding conceptual representation, language understanding, and social reasoning in LMs and humans.

Related papers

Pragmatic Theories Enhance Understanding of Implied Meanings in LLMs [8.05894553698894]
The ability to accurately interpret implied meanings plays a crucial role in human communication and language use.<n>This study demonstrates that providing language models with pragmatic theories as prompts is an effective in-context learning approach.
arXiv Detail & Related papers (2025-10-30T08:35:52Z)
Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents [80.5213198675411]
Large language models (LLMs) have dramatically enhanced the field of language intelligence. LLMs leverage the intriguing chain-of-thought (CoT) reasoning techniques, obliging them to formulate intermediate steps en route to deriving an answer. Recent research endeavors have extended CoT reasoning methodologies to nurture the development of autonomous language agents.
arXiv Detail & Related papers (2023-11-20T14:30:55Z)
From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning [66.98861219674039]
Heuristic-Analytic Reasoning (HAR) strategies drastically improve the coherence of rationalizations for model decisions. Our findings suggest that human-like reasoning strategies can effectively improve the coherence and reliability of PLM reasoning.
arXiv Detail & Related papers (2023-10-24T19:46:04Z)
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems. LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning. We study single-task training, multi-task training, and "chain-of-thought" knowledge distillation fine-tuning technique to assess the performance of model.
arXiv Detail & Related papers (2023-10-02T01:00:50Z)
Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners [75.85554779782048]
Large Language Models (LLMs) have excited the natural language and machine learning community over recent years. Despite of numerous successful applications, the underlying mechanism of such in-context capabilities still remains unclear. In this work, we hypothesize that the learned textitsemantics of language tokens do the most heavy lifting during the reasoning process.
arXiv Detail & Related papers (2023-05-24T07:33:34Z)
ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT [72.83383437501577]
Large language models (LLMs) have recently demonstrated significant potential in mathematical abilities. LLMs currently have difficulty in bridging perception, language understanding and reasoning capabilities. This paper presents a novel method for integrating LLMs into the abductive learning framework.
arXiv Detail & Related papers (2023-04-21T16:23:47Z)
The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs [26.118193748582197]
We evaluate four categories of widely used state-of-the-art models. We find that, despite only evaluating on utterances that require a binary inference, models in three of these categories perform close to random. These results suggest that certain fine-tuning strategies are far better at inducing pragmatic understanding in models.
arXiv Detail & Related papers (2022-10-26T19:04:23Z)
Learning to refer informatively by amortizing pragmatic reasoning [35.71540493379324]
We explore the idea that speakers might learn to amortize the cost of Rational Speech Acts over time. We find that our amortized model is able to quickly generate language that is effective and concise across a range of contexts.
arXiv Detail & Related papers (2020-05-31T02:52:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.