Psychologically-informed chain-of-thought prompts for metaphor
understanding in large language models
- URL: http://arxiv.org/abs/2209.08141v2
- Date: Fri, 19 May 2023 18:17:24 GMT
- Title: Psychologically-informed chain-of-thought prompts for metaphor
understanding in large language models
- Authors: Ben Prystawski, Paul Thibodeau, Christopher Potts, Noah D. Goodman
- Abstract summary: We use chain-of-thought prompts to introduce structures from probabilistic models into large language models.
Our prompts lead language models to infer latent variables and reason about their relationships in order to choose appropriate paraphrases for metaphors.
- Score: 29.993190226231793
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Probabilistic models of language understanding are valuable tools for
investigating human language use. However, they need to be hand-designed for a
particular domain. In contrast, large language models (LLMs) are trained on
text that spans a wide array of domains, but they lack the structure and
interpretability of probabilistic models. In this paper, we use
chain-of-thought prompts to introduce structures from probabilistic models into
LLMs. We explore this approach in the case of metaphor understanding. Our
chain-of-thought prompts lead language models to infer latent variables and
reason about their relationships in order to choose appropriate paraphrases for
metaphors. The latent variables and relationships chosen are informed by
theories of metaphor understanding from cognitive psychology. We apply these
prompts to the two largest versions of GPT-3 and show that they can improve
performance in a paraphrase selection task.
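To make the approach concrete, here is a minimal, hypothetical sketch of how such a chain-of-thought prompt might be assembled. The latent variables (the speaker's intended topic and the property being transferred) follow the abstract's description, but the wording, the few-shot item, and the helper function are illustrative assumptions rather than the authors' released prompts.

```python
# Hypothetical sketch of a psychologically-informed chain-of-thought prompt for
# metaphor paraphrase selection. The latent variables (the speaker's topic and
# the transferred property) follow the paper's high-level description; the
# exact wording and example items are illustrative, not the authors' prompts.

FEW_SHOT_EXAMPLE = """\
Metaphor: "My lawyer is a shark."
Question: Which paraphrase best captures the speaker's meaning?
(a) My lawyer is a fish.  (b) My lawyer is aggressive.
Reasoning: The speaker is talking about the lawyer (the topic), not about
sharks. Sharks are stereotypically aggressive and ruthless, so the speaker is
transferring that property to the lawyer.
Answer: (b)
"""

def build_prompt(metaphor: str, options: list[str]) -> str:
    """Assemble a chain-of-thought prompt that asks the model to infer the
    latent topic and transferred property before choosing a paraphrase."""
    letters = "abcdefghij"
    option_text = "  ".join(f"({letters[i]}) {o}" for i, o in enumerate(options))
    return (
        FEW_SHOT_EXAMPLE
        + f'\nMetaphor: "{metaphor}"\n'
        + "Question: Which paraphrase best captures the speaker's meaning?\n"
        + option_text + "\n"
        + "Reasoning:"
    )

# Usage: send build_prompt(...) to whichever LLM interface is available and
# read off the letter following "Answer:" in the completion.
print(build_prompt("That surgeon is a butcher.",
                   ["The surgeon sells meat.", "The surgeon is careless."]))
```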
Related papers
- What Kinds of Tokens Benefit from Distant Text? An Analysis on Long Context Language Modeling [27.75379365518913]
We study which kinds of words benefit more from long contexts in language models.
We find that content words (e.g., nouns, adjectives) and the initial tokens of words benefit the most.
We also observe that language models become more confident with longer contexts, resulting in sharper probability distributions.
arXiv Detail & Related papers (2024-06-17T06:07:29Z) - Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI make it possible to mitigate the limited interpretability of Transformer-based similarity models by leveraging improved explanation methods.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - Towards a Fully Interpretable and More Scalable RSA Model for Metaphor Understanding [0.8437187555622164]
The Rational Speech Act (RSA) model provides a flexible framework to model pragmatic reasoning in computational terms.
Here, we introduce a new RSA framework for metaphor understanding that addresses the interpretability and scalability limitations of earlier approaches by providing an explicit formula (the standard RSA recursion is sketched after this list for reference).
The model was tested against 24 metaphors, not limited to the conventional "John is a shark" type.
arXiv Detail & Related papers (2024-04-03T18:09:33Z) - From Word Models to World Models: Translating from Natural Language to
the Probabilistic Language of Thought [124.40905824051079]
We propose rational meaning construction, a computational framework for language-informed thinking.
We frame linguistic meaning as a context-sensitive mapping from natural language into a probabilistic language of thought.
We show that LLMs can generate context-sensitive translations that capture pragmatically-appropriate linguistic meanings.
We extend our framework to integrate cognitively-motivated symbolic modules.
arXiv Detail & Related papers (2023-06-22T05:14:00Z) - Large Language Models are In-Context Semantic Reasoners rather than
Symbolic Reasoners [75.85554779782048]
Large Language Models (LLMs) have excited the natural language and machine learning community over recent years.
Despite numerous successful applications, the underlying mechanism of such in-context capabilities remains unclear.
In this work, we hypothesize that the learned semantics of language tokens do most of the heavy lifting during the reasoning process.
arXiv Detail & Related papers (2023-05-24T07:33:34Z) - Black-box language model explanation by context length probing [7.526153863886609]
We present context length probing, a novel explanation technique for causal language models.
The technique is model-agnostic and does not rely on access to model internals beyond computing token-level probabilities.
We apply context length probing to large pre-trained language models and offer some initial analyses and insights.
arXiv Detail & Related papers (2022-12-30T16:24:10Z) - The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters
for Implicature Resolution by LLMs [26.118193748582197]
We evaluate four categories of widely used state-of-the-art models.
We find that, despite evaluating only on utterances that require a binary inference, models in three of these categories perform close to random.
These results suggest that certain fine-tuning strategies are far better at inducing pragmatic understanding in models.
arXiv Detail & Related papers (2022-10-26T19:04:23Z) - Structured, flexible, and robust: benchmarking and improving large
language models towards more human-like behavior in out-of-distribution
reasoning tasks [39.39138995087475]
We ask how much of human-like thinking can be captured by learning statistical patterns in language alone.
Our benchmark contains two problem-solving domains (planning and explanation generation) and is designed to require generalization.
We find that humans are far more robust than LLMs on this benchmark.
arXiv Detail & Related papers (2022-05-11T18:14:33Z) - Testing the Ability of Language Models to Interpret Figurative Language [69.59943454934799]
Figurative and metaphorical language are commonplace in discourse.
It remains an open question to what extent modern language models can interpret nonliteral phrases.
We introduce Fig-QA, a Winograd-style nonliteral language understanding task.
arXiv Detail & Related papers (2022-04-26T23:42:22Z) - Chain of Thought Prompting Elicits Reasoning in Large Language Models [56.811278668446825]
This paper explores the ability of language models to generate a coherent chain of thought.
Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks.
arXiv Detail & Related papers (2022-01-28T02:33:07Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
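For reference, the standard Rational Speech Act recursion that RSA-based metaphor work builds on is sketched below. This is the textbook formulation with a literal listener, pragmatic speaker, and pragmatic listener; it is an assumed background sketch, not necessarily the exact formula introduced in the RSA paper listed above.

```latex
% Standard RSA recursion (textbook form; assumed, not the paper's exact formula).
% u: utterance, m: meaning, [[u]](m): literal truth value, P(m): prior over
% meanings, alpha: speaker rationality parameter.
\begin{align*}
  P_{L_0}(m \mid u) &\propto [\![u]\!](m)\, P(m)
    && \text{(literal listener)} \\
  P_{S_1}(u \mid m) &\propto \exp\!\big(\alpha \log P_{L_0}(m \mid u)\big)
    && \text{(pragmatic speaker)} \\
  P_{L_1}(m \mid u) &\propto P_{S_1}(u \mid m)\, P(m)
    && \text{(pragmatic listener)}
\end{align*}
```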