A fine-grained comparison of pragmatic language understanding in humans and language models
- URL: http://arxiv.org/abs/2212.06801v2
- Date: Tue, 23 May 2023 18:35:34 GMT
- Title: A fine-grained comparison of pragmatic language understanding in humans and language models
- Authors: Jennifer Hu, Sammy Floyd, Olessia Jouravlev, Evelina Fedorenko, Edward Gibson
- Abstract summary: We compare language models and humans on seven pragmatic phenomena.
We find that the largest models achieve high accuracy and match human error patterns.
We also find preliminary evidence that models and humans are sensitive to similar linguistic cues.
- Score: 2.231167375820083
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pragmatics and non-literal language understanding are essential to human
communication, and present a long-standing challenge for artificial language
models. We perform a fine-grained comparison of language models and humans on
seven pragmatic phenomena, using zero-shot prompting on an expert-curated set
of English materials. We ask whether models (1) select pragmatic
interpretations of speaker utterances, (2) exhibit error patterns similar to
those of humans, and (3) use similar linguistic cues as humans to solve the tasks. We
find that the largest models achieve high accuracy and match human error
patterns: within incorrect responses, models favor literal interpretations over
heuristic-based distractors. We also find preliminary evidence that models and
humans are sensitive to similar linguistic cues. Our results suggest that
pragmatic behaviors can emerge in models without explicitly constructed
representations of mental states. However, models tend to struggle with
phenomena relying on social expectation violations.
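To make this evaluation setup concrete, below is a minimal sketch of zero-shot multiple-choice scoring with a causal language model. The item, answer options, and choice of GPT-2 are illustrative assumptions, not the paper's expert-curated materials or models.

```python
# Minimal sketch of zero-shot multiple-choice scoring with a causal LM.
# The example item is invented, not taken from the paper's materials.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def answer_logprob(prompt: str, answer: str) -> float:
    """Sum of token log-probabilities of `answer` given `prompt`.
    Assumes tokenization splits cleanly at the prompt/answer boundary
    (true here because the prompt ends in a newline)."""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Log-probabilities at each position for predicting the *next* token.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    return sum(
        log_probs[pos - 1, full_ids[0, pos]].item()
        for pos in range(prompt_len, full_ids.shape[1])
    )

prompt = (
    "Dan said, 'I finished the marathon in last place.'\n"
    "Ann replied, 'Well, at least you finished.'\n"
    "What did Ann mean?\n"
)
options = [
    "Ann is consoling Dan by pointing out a positive side.",    # pragmatic reading
    "Ann is merely stating that Dan crossed the finish line.",  # literal reading
]
scores = {opt: answer_logprob(prompt, opt) for opt in options}
print(max(scores, key=scores.get))  # the model's preferred interpretation
```

Whichever option receives the higher conditional log-probability counts as the model's selected interpretation; accuracy and error patterns can then be tabulated against human choices.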
Related papers
- A Psycholinguistic Evaluation of Language Models' Sensitivity to Argument Roles [0.06554326244334868]
We evaluate large language models' sensitivity to argument roles by replicating psycholinguistic studies on human argument role processing.
We find that language models are able to distinguish verbs that appear in plausible and implausible contexts, where plausibility is determined by the relation between the verb and its preceding arguments.
However, this sensitivity does not pattern with human real-time behavior, indicating that language models' capacity to detect verb plausibility does not arise from the same mechanism that underlies human real-time sentence processing.
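As a rough illustration of how such plausibility sensitivity can be probed (a sketch under assumed materials, not the paper's protocol), one can compare a verb's log-probability after a plausible context and after its role-reversed counterpart:

```python
# Sketch of a role-reversal plausibility comparison: the verb "served"
# should be more probable when the roles are plausible. Items invented.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def verb_logprob(context: str, verb: str) -> float:
    """Log-probability of the verb's first token given the context."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    verb_id = tokenizer(verb).input_ids[0]
    with torch.no_grad():
        logits = model(ctx_ids).logits[0, -1]
    return torch.log_softmax(logits, dim=-1)[verb_id].item()

plausible = "The customer that the waiter had"    # waiter serves customer
implausible = "The waiter that the customer had"  # roles reversed
for ctx in (plausible, implausible):
    print(ctx, "... served:", round(verb_logprob(ctx, " served"), 2))
```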
arXiv Detail & Related papers (2024-10-21T16:05:58Z)
- Perceptions of Linguistic Uncertainty by Language Models and Humans [26.69714008538173]
We investigate how language models map linguistic expressions of uncertainty to numerical responses.
We find that 7 out of 10 models are able to map uncertainty expressions to probabilistic responses in a human-like manner.
However, models are substantially more susceptible than humans to bias based on their prior knowledge.
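A minimal sketch of such an elicitation, assuming a hypothetical `query_model` client in place of whatever chat API is under test (the statements and template are invented for illustration):

```python
# Sketch of eliciting numeric probabilities for linguistic uncertainty
# expressions. `query_model` is a hypothetical stand-in for a real
# chat-model API; plug in an actual client to run this.
import re

def query_model(prompt: str) -> str:
    raise NotImplementedError("replace with a real model client")

EXPRESSIONS = ["almost certain", "likely", "about even", "unlikely", "almost impossible"]

TEMPLATE = (
    "Someone says it is {expr} that it will rain tomorrow.\n"
    "On a scale from 0 to 100, what probability of rain does that express?\n"
    "Answer with a single number."
)

def elicit(expr: str) -> float:
    """Ask the model for a number and parse the first numeral in its reply."""
    reply = query_model(TEMPLATE.format(expr=expr))
    match = re.search(r"\d+(\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no number in model reply: {reply!r}")
    return float(match.group())

# Human-likeness can then be checked by comparing the elicited numbers
# against human ratings collected for the same expressions.
```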
arXiv Detail & Related papers (2024-07-22T17:26:12Z)
- UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations [62.71847873326847]
We investigate language models' ability to model unusual, unexpected, and unlikely situations.
Given a piece of context with an unexpected outcome, this task requires reasoning abductively to generate an explanation.
We release a new English language corpus called UNcommonsense.
arXiv Detail & Related papers (2023-11-14T19:00:55Z)
- Beyond the limitations of any imaginable mechanism: large language models and psycholinguistics [0.0]
Large language models provide a model for language.
They are useful practically, as a tool; comparatively, as a source of illustration; and philosophically, as a basis for recasting the relationship between language and thought.
arXiv Detail & Related papers (2023-02-28T20:49:38Z)
- Chain of Hindsight Aligns Language Models with Feedback [62.68665658130472]
We propose a novel technique, Chain of Hindsight, that is easy to optimize and can learn from any form of feedback, regardless of its polarity.
We convert all types of feedback into sequences of sentences, which are then used to fine-tune the model.
By doing so, the model is trained to generate outputs based on feedback, while learning to identify and correct negative attributes or errors.
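As a rough sketch of the data construction described above (the template strings are illustrative, not the paper's exact ones), feedback of both polarities can be verbalized and packed into one training sequence:

```python
# Sketch of turning ranked model outputs into "chain of hindsight"
# training strings. Templates are illustrative, not the paper's own.
def hindsight_example(prompt: str, good: str, bad: str) -> str:
    """Verbalize a preference pair as one fine-tuning sequence: the model
    sees both answers annotated by feedback, and at training time the
    loss is typically applied only to the answer tokens."""
    return (
        f"{prompt}\n"
        f"A helpful answer: {good}\n"
        f"An unhelpful answer: {bad}"
    )

pairs = [
    ("Explain photosynthesis to a child.",
     "Plants use sunlight to turn air and water into food.",
     "Photosynthesis is the conversion of photons."),
]
training_data = [hindsight_example(p, g, b) for p, g, b in pairs]
print(training_data[0])
```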
arXiv Detail & Related papers (2023-02-06T10:28:16Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Testing the Ability of Language Models to Interpret Figurative Language [69.59943454934799]
Figurative and metaphorical language are commonplace in discourse.
It remains an open question to what extent modern language models can interpret nonliteral phrases.
We introduce Fig-QA, a Winograd-style nonliteral language understanding task.
arXiv Detail & Related papers (2022-04-26T23:42:22Z)
- Interpreting Language Models with Contrastive Explanations [99.7035899290924]
Language models must consider various features to predict a token, such as its part of speech, number, tense, or semantics.
Existing explanation methods conflate evidence for all these features into a single explanation, which is less interpretable for human understanding.
We show that contrastive explanations are quantifiably better than non-contrastive explanations in verifying major grammatical phenomena.
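Concretely, a contrastive explanation attributes the gap between the target token's score and a foil token's score back to the input tokens. The sketch below implements one simple variant (gradient-times-input on GPT-2); the example sentence and token pair are illustrative, and the paper evaluates several such attribution methods.

```python
# Sketch of a contrastive explanation: attribute the target-vs-foil
# logit difference to input tokens via gradient x input.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def contrastive_saliency(text: str, target: str, foil: str):
    ids = tokenizer(text, return_tensors="pt").input_ids
    # Work on input embeddings so gradients can flow to each token.
    embeds = model.get_input_embeddings()(ids).detach().requires_grad_(True)
    logits = model(inputs_embeds=embeds).logits[0, -1]
    target_id = tokenizer(target).input_ids[0]
    foil_id = tokenizer(foil).input_ids[0]
    # "Why `target` rather than `foil`?": explain the score difference.
    contrast = logits[target_id] - logits[foil_id]
    contrast.backward()
    saliency = (embeds.grad * embeds).sum(-1)[0]
    return list(zip(tokenizer.convert_ids_to_tokens(ids[0]), saliency.tolist()))

# E.g., why " are" and not " is" after a plural subject:
for token, score in contrastive_saliency("The keys to the cabinet", " are", " is"):
    print(f"{token:>12} {score:+.3f}")
```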
arXiv Detail & Related papers (2022-02-21T18:32:24Z)
- It's not Rocket Science: Interpreting Figurative Language in Narratives [48.84507467131819]
We study the interpretation of two types of non-compositional figurative language: idioms and similes.
Our experiments show that models based solely on pre-trained language models perform substantially worse than humans on these tasks.
We additionally propose knowledge-enhanced models, adopting human strategies for interpreting figurative language.
arXiv Detail & Related papers (2021-08-31T21:46:35Z)
- Uncovering Constraint-Based Behavior in Neural Models via Targeted Fine-Tuning [9.391375268580806]
We show that competing linguistic processes within a language obscure underlying linguistic knowledge.
While human behavior has been found to be similar across languages, we find cross-linguistic variation in model behavior.
Our results suggest that models need to learn both the linguistic constraints in a language and their relative ranking, with mismatches in either producing non-human-like behavior.
arXiv Detail & Related papers (2021-06-02T14:52:11Z)
- The Sensitivity of Language Models and Humans to Winograd Schema Perturbations [36.47219885590433]
We show that large-scale pretrained language models are sensitive to linguistic perturbations that minimally affect human understanding.
Our results highlight interesting differences between humans and language models.
arXiv Detail & Related papers (2020-05-04T09:44:54Z)