Systematicity in GPT-3's Interpretation of Novel English Noun Compounds
- URL: http://arxiv.org/abs/2210.09492v1
- Date: Tue, 18 Oct 2022 00:25:24 GMT
- Title: Systematicity in GPT-3's Interpretation of Novel English Noun Compounds
- Authors: Siyan Li, Riley Carlson, Christopher Potts
- Abstract summary: We compare Levin et al.'s experimental data with GPT-3 generations, finding a high degree of similarity.
We fail to find convincing evidence that GPT-3 is reasoning about more than just individual lexical items.
These results highlight the importance of controlling for low-level distributional regularities when assessing whether a large language model latently encodes a deeper theory.
- Score: 7.039267642892591
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Levin et al. (2019) show experimentally that the interpretations of novel
English noun compounds (e.g., stew skillet), while not fully compositional, are
highly predictable based on whether the modifier and head refer to artifacts or
natural kinds. Is the large language model GPT-3 governed by the same
interpretive principles? To address this question, we first compare Levin et
al.'s experimental data with GPT-3 generations, finding a high degree of
similarity. However, this evidence is consistent with GPT-3 reasoning only about
specific lexical items rather than the more abstract conceptual categories of
Levin et al.'s theory. To probe more deeply, we construct prompts that require
the relevant kind of conceptual reasoning. Here, we fail to find convincing
evidence that GPT-3 is reasoning about more than just individual lexical items.
These results highlight the importance of controlling for low-level
distributional regularities when assessing whether a large language model
latently encodes a deeper theory.
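As a rough, hypothetical illustration of the comparison described in the abstract, the sketch below prompts a GPT-3-style completion model to paraphrase a novel noun compound and checks whether the paraphrase contains the relation predicted from the modifier/head categories (artifact vs. natural kind). The prompt wording, the compounds other than "stew skillet", the category-to-relation mapping, and all helper names are illustrative assumptions, not the paper's actual experimental materials; the API call assumes the legacy openai<1.0 completions interface.

```python
# Hypothetical sketch: compare a GPT-3 paraphrase of a novel noun compound
# against the relation predicted from the modifier/head categories.
# Prompts, compounds (other than "stew skillet"), and mappings are
# illustrative assumptions, not the paper's experimental materials.
import os

import openai  # assumes the legacy openai<1.0 completions interface

openai.api_key = os.environ["OPENAI_API_KEY"]

# (modifier, head) -> (modifier category, head category), Levin et al. style.
COMPOUNDS = {
    ("stew", "skillet"): ("artifact", "artifact"),
    ("acorn", "raccoon"): ("natural kind", "natural kind"),
}

# Rough mapping from category pairs to an expected paraphrase relation.
EXPECTED_RELATION = {
    ("artifact", "artifact"): "used for",      # "a skillet used for stew"
    ("natural kind", "natural kind"): "eats",  # "a raccoon that eats acorns"
}

def paraphrase(modifier: str, head: str) -> str:
    """Ask the model for a one-sentence paraphrase of the compound."""
    prompt = (
        f'What does the phrase "{modifier} {head}" most likely mean? '
        "Answer in one sentence."
    )
    resp = openai.Completion.create(
        model="text-davinci-002",  # GPT-3-era engine; adjust for newer APIs
        prompt=prompt,
        max_tokens=40,
        temperature=0.7,
    )
    return resp["choices"][0]["text"].strip().lower()

def matches_prediction(modifier: str, head: str) -> bool:
    """Does the generated paraphrase contain the category-predicted relation?"""
    expected = EXPECTED_RELATION[COMPOUNDS[(modifier, head)]]
    return expected in paraphrase(modifier, head)

if __name__ == "__main__":
    for modifier, head in COMPOUNDS:
        print(modifier, head, "->", matches_prediction(modifier, head))
```

A fuller comparison would aggregate agreement over many compounds and contrast it with the distribution of human paraphrases, and would also vary the prompts so that success requires reasoning about the conceptual categories rather than the specific nouns.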
Related papers
- Graph-Guided Textual Explanation Generation Framework [57.2027753204786]
Natural language explanations (NLEs) are commonly used to provide plausible free-text explanations of a model's reasoning about its predictions.
We propose G-Tex, a Graph-Guided Textual Explanation Generation framework designed to enhance the faithfulness of NLEs.
arXiv Detail & Related papers (2024-12-16T19:35:55Z) - Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs [87.34281749422756]
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks.
However, their mastery of underlying inferential rules still falls short of human capabilities.
We propose a logic scaffolding inferential rule generation framework to construct an inferential rule base, ULogic.
arXiv Detail & Related papers (2024-02-18T03:38:51Z) - BOOST: Harnessing Black-Box Control to Boost Commonsense in LMs'
Generation [60.77990074569754]
We present a computation-efficient framework that steers a frozen Pre-Trained Language Model towards more commonsensical generation.
Specifically, we first construct a reference-free evaluator that assigns a commonsensical score to a sentence.
We then use the scorer as the oracle for commonsense knowledge, and extend the controllable generation method called NADO to train an auxiliary head.
arXiv Detail & Related papers (2023-10-25T23:32:12Z) - LINC: A Neurosymbolic Approach for Logical Reasoning by Combining
Language Models with First-Order Logic Provers [60.009969929857704]
Logical reasoning is an important task for artificial intelligence with potential impacts on science, mathematics, and society.
In this work, we reformulate such tasks as modular neurosymbolic programming, which we call LINC.
We observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate.
arXiv Detail & Related papers (2023-10-23T17:58:40Z) - Testing the Predictions of Surprisal Theory in 11 Languages [77.45204595614]
We investigate the relationship between surprisal and reading times in eleven different languages.
By focusing on a more diverse set of languages, we argue that these results offer the most robust link to date between information theory and incremental language processing across languages (a minimal surprisal computation is sketched after this list).
arXiv Detail & Related papers (2023-07-07T15:37:50Z) - FLamE: Few-shot Learning from Natural Language Explanations [12.496665033682202]
We present FLamE, a framework for learning from natural language explanations.
Experiments on natural language inference demonstrate effectiveness over strong baselines.
Human evaluation surprisingly reveals that the majority of generated explanations do not adequately justify classification decisions.
arXiv Detail & Related papers (2023-06-13T18:01:46Z) - Testing Causal Models of Word Meaning in GPT-3 and -4 [18.654373173232205]
This paper evaluates the lexical representations of GPT-3 and GPT-4 through the lens of HIPE theory.
We find no evidence that GPT-3 encodes the causal structure hypothesized by HIPE, but do find evidence that GPT-4 encodes such structure.
arXiv Detail & Related papers (2023-05-24T02:03:23Z) - NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning [59.16962123636579]
This paper proposes a new take on Prolog-based inference engines.
We replace handcrafted rules with a combination of neural language modeling, guided generation, and semi dense retrieval.
Our implementation, NELLIE, is the first system to demonstrate fully interpretable, end-to-end grounded QA.
arXiv Detail & Related papers (2022-09-16T00:54:44Z) - On Reality and the Limits of Language Data: Aligning LLMs with Human Norms [10.02997544238235]
Large Language Models (LLMs) harness linguistic associations in vast natural language data for practical applications.
We explore whether these associations align with human norms using a novel and tightly controlled reasoning test (ART), comparing human norms against versions of GPT-3.
Our findings highlight the categories of common-sense relations that models could learn directly from data, as well as areas of weakness.
arXiv Detail & Related papers (2022-08-25T10:21:23Z) - Smoothing Entailment Graphs with Language Models [15.499215600170238]
We present a theory of optimal smoothing of Entailment Graphs built by Open Relation Extraction (ORE).
We demonstrate an efficient, open-domain, and unsupervised smoothing method using an off-the-shelf Language Model to find approximations of missing premise predicates.
In a QA task we show that EG smoothing is most useful for answering questions with lesser supporting text, where missing premise predicates are more costly.
arXiv Detail & Related papers (2022-07-30T22:15:22Z) - The Unreliability of Explanations in Few-Shot In-Context Learning [50.77996380021221]
We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference.
We show that explanations judged as good by humans--those that are logically consistent with the input--usually indicate more accurate predictions.
We present a framework for calibrating model predictions based on the reliability of the explanations.
arXiv Detail & Related papers (2022-05-06T17:57:58Z)
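For the surprisal paper listed above, a minimal sketch of the standard surprisal computation, assuming an autoregressive Hugging Face model; the sentence and per-token reading times are placeholder values (real data would come from an eye-tracking or self-paced reading corpus aligned to the same units).

```python
# Minimal sketch: per-token surprisal from an autoregressive LM, correlated
# with placeholder reading times. Assumes the Hugging Face transformers
# library; the sentence and reading_times values are illustrative only.
import torch
from scipy.stats import pearsonr
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

sentence = "The old man the boats."
enc = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    logits = model(**enc).logits  # shape: (1, seq_len, vocab_size)

# Surprisal of token t is -log2 P(token_t | tokens_<t); the first token has
# no left context under this simple setup, so positions start at 1.
log_probs = torch.log_softmax(logits, dim=-1)
ids = enc["input_ids"][0]
surprisals = [
    (-log_probs[0, t - 1, ids[t]] / torch.log(torch.tensor(2.0))).item()
    for t in range(1, len(ids))
]

# Placeholder per-token reading times in milliseconds.
reading_times = [310.0, 355.0, 298.0, 402.0, 275.0]

# Correlate surprisal with reading times over the aligned positions.
n = min(len(surprisals), len(reading_times))
print(pearsonr(surprisals[:n], reading_times[:n]))
```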
This list is automatically generated from the titles and abstracts of the papers in this site.