Probing Neural Language Models for Human Tacit Assumptions
- URL: http://arxiv.org/abs/2004.04877v2
- Date: Tue, 16 Jun 2020 15:55:51 GMT
- Title: Probing Neural Language Models for Human Tacit Assumptions
- Authors: Nathaniel Weir, Adam Poliak, Benjamin Van Durme
- Abstract summary: Humans carry stereotypic tacit assumptions (STAs), or propositional beliefs about generic concepts.
We construct a diagnostic set of word prediction prompts to evaluate whether recent neural contextualized language models trained on large text corpora capture STAs.
- Score: 36.63841251126978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans carry stereotypic tacit assumptions (STAs) (Prince, 1978), or
propositional beliefs about generic concepts. Such associations are crucial for
understanding natural language. We construct a diagnostic set of word
prediction prompts to evaluate whether recent neural contextualized language
models trained on large text corpora capture STAs. Our prompts are based on
human responses in a psychological study of conceptual associations. We find
models to be profoundly effective at retrieving concepts given associated
properties. Our results demonstrate empirical evidence that stereotypic
conceptual representations are captured in neural models derived from
semi-supervised linguistic exposure.
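As an illustration of the probing setup the abstract describes, the sketch below queries a masked language model with a property-listing cloze prompt and checks whether the associated concept ranks among the top predictions. The specific prompt, the `bert-base-uncased` checkpoint, and the use of the Hugging Face `transformers` fill-mask pipeline are assumptions for illustration, not the authors' released code or exact prompt set.

```python
# Minimal sketch of a property-based cloze probe for stereotypic tacit
# assumptions (STAs). Prompt wording and model choice are illustrative.
from transformers import pipeline

# Masked-word prediction with a pretrained contextualized language model.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Properties of a concept are listed; a model that captures the association
# should retrieve the concept (e.g., "bear") at the masked position.
prompt = "A [MASK] has fur, is big, and has claws."

for prediction in fill_mask(prompt, top_k=5):
    print(f"{prediction['token_str']:>10}  p={prediction['score']:.3f}")
```

A probe of this kind can then be scored by, for example, the rank or probability the model assigns to the human-elicited concept across many such prompts.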
Related papers
- Large Language Models as Neurolinguistic Subjects: Identifying Internal Representations for Form and Meaning [49.60849499134362]
This study investigates the linguistic understanding of Large Language Models (LLMs) regarding signifier (form) and signified (meaning).
Traditional psycholinguistic evaluations often reflect statistical biases that may misrepresent LLMs' true linguistic capabilities.
We introduce a neurolinguistic approach, utilizing a novel method that combines minimal pair and diagnostic probing to analyze activation patterns across model layers.
arXiv Detail & Related papers (2024-11-12T04:16:44Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Human-like Few-Shot Learning via Bayesian Reasoning over Natural Language [7.11993673836973]
Humans can efficiently learn a broad range of concepts.
We introduce a model of inductive learning that seeks to be human-like in that sense.
arXiv Detail & Related papers (2023-06-05T11:46:45Z)
- Language Models are Bounded Pragmatic Speakers: Understanding RLHF from a Bayesian Cognitive Modeling Perspective [2.8282906214258805]
This paper formulates a probabilistic cognitive model called the bounded pragmatic speaker.
We demonstrate that large language models fine-tuned with reinforcement learning from human feedback embody a model of thought that resembles a fast-and-slow model.
arXiv Detail & Related papers (2023-05-28T16:04:48Z)
- Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models [57.08925810659545]
We conduct a comparative analysis of the visual representations in existing vision-and-language models and vision-only models.
Our empirical observations suggest that vision-and-language models are better at label prediction tasks.
We hope our study sheds light on the role of language in visual learning, and serves as an empirical guide for various pretrained models.
arXiv Detail & Related papers (2022-12-01T05:00:18Z)
- ConceptX: A Framework for Latent Concept Analysis [21.760620298330235]
We present ConceptX, a human-in-the-loop framework for interpreting and annotating the latent representational space of pre-trained Language Models (pLMs).
We use an unsupervised method to discover concepts learned in these models and enable a graphical interface for humans to generate explanations for the concepts.
arXiv Detail & Related papers (2022-11-12T11:31:09Z)
- Learnable Visual Words for Interpretable Image Recognition [70.85686267987744]
We propose the Learnable Visual Words (LVW) to interpret the model prediction behaviors with two novel modules.
The semantic visual words learning relaxes the category-specific constraint, enabling the general visual words shared across different categories.
Our experiments on six visual benchmarks demonstrate the superior effectiveness of our proposed LVW in both accuracy and model interpretation.
arXiv Detail & Related papers (2022-05-22T03:24:45Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Schrödinger's Tree -- On Syntax and Neural Language Models [10.296219074343785]
Language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities.
We observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form.
We outline the implications of the different types of research questions exhibited in studies on syntax.
arXiv Detail & Related papers (2021-10-17T18:25:23Z)