Related papers: A hierarchical Bayesian model for syntactic priming

URL: http://arxiv.org/abs/2405.15964v1
Date: Fri, 24 May 2024 22:26:53 GMT
Title: A hierarchical Bayesian model for syntactic priming
Authors: Weijie Xu, Richard Futrell
Abstract summary: The effect of syntactic priming exhibits three well-documented empirical properties. We show how these three phenomena can be reconciled in a general learning framework. We also discuss the model's implications for the lexical basis of syntactic priming.
Score: 5.765747251519448
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The effect of syntactic priming exhibits three well-documented empirical properties: the lexical boost, the inverse frequency effect, and the asymmetrical decay. We aim to show how these three empirical phenomena can be reconciled in a general learning framework, the hierarchical Bayesian model (HBM). The model represents syntactic knowledge in a hierarchical structure of syntactic statistics, where a lower level represents the verb-specific biases of syntactic decisions, and a higher level represents the abstract bias as an aggregation of verb-specific biases. This knowledge is updated in response to experience by Bayesian inference. In simulations, we show that the HBM captures the above-mentioned properties of syntactic priming. The results indicate that some properties of priming which are usually explained by a residual activation account can also be explained by an implicit learning account. We also discuss the model's implications for the lexical basis of syntactic priming.
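The two-level structure described in the abstract can be illustrated with a toy sketch. This is not the authors' HBM: it is a crude beta-binomial approximation in which a verb-general ("abstract") bias is pooled from all observations and each verb's bias is shrunk toward it. The class name, the DO/PO structure labels, the verbs, and all prior values are assumptions made up for illustration. The sketch does not model decay.

```python
class TwoLevelPrimingSketch:
    """Toy two-level beta-binomial learner (hypothetical, not the paper's model)."""

    def __init__(self, prior_do=1.0, prior_po=3.0, concentration=4.0):
        # Assumed pseudo-counts for the abstract bias: DO is the rarer structure.
        self.prior_do = prior_do
        self.prior_po = prior_po
        # Assumed strength with which a verb's bias is tied to the abstract bias.
        self.concentration = concentration
        # Verb-specific counts: verb -> [DO count, PO count].
        self.counts = {}

    def abstract_bias(self):
        """Higher level: P(DO) pooled over the prior and all verbs' observations."""
        total_do = self.prior_do + sum(c[0] for c in self.counts.values())
        total_po = self.prior_po + sum(c[1] for c in self.counts.values())
        return total_do / (total_do + total_po)

    def verb_bias(self, verb):
        """Lower level: posterior-mean P(DO) for one verb, shrunk toward the abstract bias."""
        do, po = self.counts.get(verb, (0, 0))
        mu = self.abstract_bias()
        return (self.concentration * mu + do) / (self.concentration + do + po)

    def observe(self, verb, structure):
        """Update the counts after processing one prime sentence ('DO' or 'PO')."""
        if structure not in ("DO", "PO"):
            raise ValueError("structure must be 'DO' or 'PO'")
        entry = self.counts.setdefault(verb, [0, 0])
        entry[0 if structure == "DO" else 1] += 1


def priming_shift(prime_verb, structure, target_verb):
    """Change in the probability of `structure` for `target_verb` after one prime."""
    model = TwoLevelPrimingSketch()

    def prob():
        p_do = model.verb_bias(target_verb)
        return p_do if structure == "DO" else 1.0 - p_do

    before = prob()
    model.observe(prime_verb, structure)
    return prob() - before


if __name__ == "__main__":
    print("same-verb DO prime:", priming_shift("give", "DO", "give"))
    print("different-verb DO prime:", priming_shift("give", "DO", "send"))
    print("same-verb PO prime:", priming_shift("give", "PO", "give"))
```

Under these assumed priors, a prime that shares its verb with the target shifts the target's probability more than a prime with a different verb (a lexical-boost-like pattern), and a prime in the rarer structure shifts its probability more than a prime in the more frequent one (an inverse-frequency-like pattern). Both follow from count-based updating alone in this toy setting.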

Related papers

Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs [36.89422086121058]
We show that errors result from a structured yet flawed mechanism that we term class-based (mis)generalization. Experiments on Llama-3, Mistral, and Pythia reveal that this behavior is reflected in the model's internal computations.
arXiv Detail & Related papers (2025-05-28T17:47:52Z)
On the Role of Model Prior in Real-World Inductive Reasoning [7.962140902232628]
In real-world applications, Large Language Models' hypothesis generation is shaped by task-specific model priors. Removing demonstrations results in minimal loss of hypothesis quality and downstream usage. These insights advance our understanding of the dynamics of hypothesis generation in LLMs.
arXiv Detail & Related papers (2024-12-18T09:22:08Z)
Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers [49.80959223722325]
We study the distinction between feed-forward and attention layers in large language models. We find that feed-forward layers tend to learn simple distributional associations such as bigrams, while attention layers focus on in-context reasoning.
arXiv Detail & Related papers (2024-06-05T08:51:08Z)
Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial. We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments. The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z)
Dual Mechanism Priming Effects in Hindi Word Order [14.88833412862455]
We test the hypothesis that priming is driven by multiple different sources. We permute the preverbal constituents of corpus sentences, and then use a logistic regression model to predict which sentences actually occurred in the corpus. By showing that different priming influences are separable from one another, our results support the hypothesis that multiple different cognitive mechanisms underlie priming.
arXiv Detail & Related papers (2022-10-25T11:49:22Z)
Does BERT really agree ? Fine-grained Analysis of Lexical Dependence on a Syntactic Task [70.29624135819884]
We study the extent to which BERT is able to perform lexically-independent subject-verb number agreement (NA) on targeted syntactic templates. Our results on nonce sentences suggest that the model generalizes well for simple templates, but fails to perform lexically-independent syntactic generalization when as little as one attractor is present.
arXiv Detail & Related papers (2022-04-14T11:33:15Z)
Syntactic Persistence in Language Models: Priming as a Window into Abstract Language Representations [0.38498574327875945]
We investigate the extent to which modern, neural language models are susceptible to syntactic priming. We introduce a novel metric and release Prime-LM, a large corpus where we control for various linguistic factors which interact with priming strength. We report surprisingly strong priming effects when priming with multiple sentences, each with different words and meaning but with identical syntactic structure.
arXiv Detail & Related papers (2021-09-30T10:38:38Z)
A comprehensive comparative evaluation and analysis of Distributional Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT. The results show that the alleged superiority of predict based models is more apparent than real, and surely not ubiquitous. We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
arXiv Detail & Related papers (2021-05-20T15:18:06Z)
Decomposing lexical and compositional syntax and semantics with deep language models [82.81964713263483]
The activations of language transformers like GPT2 have been shown to linearly map onto brain activity during speech comprehension. Here, we propose a taxonomy to factorize the high-dimensional activations of language models into four classes: lexical, compositional, syntactic, and semantic representations. The results highlight two findings. First, compositional representations recruit a more widespread cortical network than lexical ones, and encompass the bilateral temporal, parietal and prefrontal cortices.
arXiv Detail & Related papers (2021-03-02T10:24:05Z)
Exploring Lexical Irregularities in Hypothesis-Only Models of Natural Language Inference [5.283529004179579]
Natural Language Inference (NLI), or Recognizing Textual Entailment (RTE), is the task of predicting the entailment relation between a pair of sentences. Models that understand entailment should encode both the premise and the hypothesis. Experiments by Poliak et al. revealed a strong preference of these models towards patterns observed only in the hypothesis.
arXiv Detail & Related papers (2021-01-19T01:08:06Z)
Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models [47.42249565529833]
Humans can learn structural properties about a word from minimal experience. We assess the ability of modern neural language models to reproduce this behavior in English.
arXiv Detail & Related papers (2020-10-12T14:12:37Z)
Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining. We distill the approximate marginal distribution over words in context from the syntactic LM. Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
arXiv Detail & Related papers (2020-05-27T16:44:01Z)
Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language? [41.649440404203595]
We introduce a method for evaluating whether neural models can learn systematicity of monotonicity inference in natural language. We consider four aspects of monotonicity inferences and test whether the models can systematically interpret lexical and logical phenomena on different training/test splits.
arXiv Detail & Related papers (2020-04-30T14:48:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.