Modeling structure-building in the brain with CCG parsing and large
language models
- URL: http://arxiv.org/abs/2210.16147v3
- Date: Sun, 16 Apr 2023 21:49:47 GMT
- Authors: Miloš Stanojević, Jonathan R. Brennan, Donald Dunagan,
Mark Steedman, and John T. Hale
- Abstract summary: Combinatory Categorial Grammars (CCGs) are sufficiently expressive, directly compositional models of grammar.
We evaluate whether a more expressive CCG provides a better model than a context-free grammar for human neural signals collected with fMRI.
- Score: 9.17816011606258
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To model behavioral and neural correlates of language comprehension in
naturalistic environments, researchers have turned to broad-coverage tools from
natural-language processing and machine learning. Where syntactic structure is
explicitly modeled, prior work has relied predominantly on context-free
grammars (CFG), yet such formalisms are not sufficiently expressive for human
languages. Combinatory Categorial Grammars (CCGs) are sufficiently expressive,
directly compositional models of grammar with a flexible constituency that
affords incremental interpretation. In this work we evaluate whether a more
expressive CCG provides a better model than a CFG for human neural signals
collected with fMRI while participants listen to an audiobook story. We further
test between variants of CCG that differ in how they handle optional adjuncts.
These evaluations are carried out against a baseline that includes estimates of
next-word predictability from a Transformer neural network language model. Such
a comparison reveals unique contributions of CCG structure-building
predominantly in the left posterior temporal lobe: CCG-derived measures offer a
superior fit to neural signals compared to those derived from a CFG. These
effects are spatially distinct from bilateral superior temporal effects that
are unique to predictability. Neural effects for structure-building are thus
separable from predictability during naturalistic listening, and those effects
are best characterized by a grammar whose expressive power is motivated on
independent linguistic grounds.
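The flexible constituency mentioned above can be illustrated with a toy sketch (not the authors' implementation): type-raising plus forward composition lets a CCG build a left-branching derivation word by word, so that a prefix like "Mary likes" is itself a typed constituent available for incremental interpretation. Categories here are atoms (strings) or tuples `('/', X, Y)` / `('\\', X, Y)` standing for X/Y and X\Y; the lexicon entries are illustrative placeholders.

```python
# Toy CCG combinators: forward application (>), forward composition (>B),
# and forward type-raising (>T). Categories are atoms or ('/',X,Y)/('\\',X,Y).

def forward_apply(f, a):
    """X/Y  Y  =>  X  (forward functional application, >)."""
    if isinstance(f, tuple) and f[0] == '/' and f[2] == a:
        return f[1]
    raise ValueError("forward application does not apply")

def forward_compose(f, g):
    """X/Y  Y/Z  =>  X/Z  (forward composition, >B)."""
    if (isinstance(f, tuple) and f[0] == '/' and
            isinstance(g, tuple) and g[0] == '/' and f[2] == g[1]):
        return ('/', f[1], g[2])
    raise ValueError("forward composition does not apply")

def type_raise(a, t):
    """a  =>  T/(T\\a)  (forward type-raising, >T)."""
    return ('/', t, ('\\', t, a))

# Lexicon for "Mary likes music": NP, (S\NP)/NP, NP
NP, S = 'NP', 'S'
likes = ('/', ('\\', S, NP), NP)

mary = type_raise(NP, S)                    # S/(S\NP)
mary_likes = forward_compose(mary, likes)   # S/NP: "Mary likes" is a constituent
sentence = forward_apply(mary_likes, NP)    # S
print(mary_likes, sentence)                 # ('/', 'S', 'NP') S
```

The intermediate category S/NP for "Mary likes" is exactly the kind of non-standard constituent that a CFG derivation of the same sentence lacks, which is what makes word-by-word structure-building measures from CCG and CFG diverge.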
Related papers
- Investigating the Timescales of Language Processing with EEG and Language Models (2024-06-28)
  This study explores the temporal dynamics of language processing by examining the alignment between word representations from a pre-trained language model and EEG data.
  Using a Temporal Response Function (TRF) model, we investigate how neural activity corresponds to model representations across different layers.
  Our analysis reveals patterns in TRFs from distinct layers, highlighting varying contributions to lexical and compositional processing.
- Modeling Bilingual Sentence Processing: Evaluating RNN and Transformer Architectures for Cross-Language Structural Priming (2024-05-15)
  This study evaluates how well Recurrent Neural Network (RNN) and Transformer architectures replicate cross-language structural priming.
  We examine how these models handle the robust phenomenon of structural priming, where exposure to a particular sentence structure increases the likelihood of selecting a similar structure subsequently.
- Language Generation from Brain Recordings (2023-11-16)
  We propose a generative language BCI that utilizes the capacity of a large language model and a semantic brain decoder.
  The proposed model can generate coherent language sequences aligned with the semantic content of visual or auditory language stimuli.
  Our findings demonstrate the potential and feasibility of employing BCIs in direct language generation.
- Physics of Language Models: Part 1, Learning Hierarchical Language Structures (2023-05-23)
  Transformer-based language models are effective but complex, and understanding their inner workings is a significant challenge.
  We introduce a family of synthetic CFGs that produce hierarchical rules, capable of generating lengthy sentences.
  We demonstrate that generative models like GPT can accurately learn this CFG language and generate sentences based on it.
- Constructing Word-Context-Coupled Space Aligned with Associative Knowledge Relations for Interpretable Language Modeling (2023-05-19)
  The black-box structure of the deep neural network in pre-trained language models seriously limits the interpretability of the language modeling process.
  A Word-Context-Coupled Space (W2CSpace) is proposed by introducing alignment processing between uninterpretable neural representations and interpretable statistical logic.
  Our language model achieves better performance and highly credible interpretability compared to related state-of-the-art methods.
- Self-supervised models of audio effectively explain human cortical responses to speech (2022-05-27)
  We capitalize on the progress of self-supervised speech representation learning to create new state-of-the-art models of the human auditory system.
  These results show that self-supervised models effectively capture the hierarchy of information relevant to different stages of speech processing in human cortex.
- Oracle Linguistic Graphs Complement a Pretrained Transformer Language Model: A Cross-formalism Comparison (2021-12-15)
  We examine the extent to which, in principle, linguistic graph representations can complement and improve neural language modeling.
  We find that, overall, semantic constituency structures are most useful to language modeling performance.
- Factorized Neural Transducer for Efficient Language Model Adaptation (2021-09-27)
  We propose a novel model, the factorized neural Transducer, which factorizes the blank and vocabulary prediction.
  This factorization is expected to transfer improvements in the standalone language model to the Transducer for speech recognition.
  We demonstrate that the proposed factorized neural Transducer yields 15% to 20% WER improvements when out-of-domain text data is used for language model adaptation.
- Demystifying Neural Language Models' Insensitivity to Word-Order (2021-07-29)
  We investigate the insensitivity of natural language models to word order by quantifying perturbations.
  We find that neural language models depend on the local ordering of tokens more than on their global ordering.
- High-order Semantic Role Labeling (2020-10-09)
  This paper introduces a high-order graph structure for the neural semantic role labeling model.
  It enables the model to explicitly consider not only isolated predicate-argument pairs but also the interactions between predicate-argument pairs.
  Experimental results on 7 languages of the CoNLL-2009 benchmark show that high-order structural learning techniques benefit strongly performing SRL models.
- Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans (2020-06-19)
  We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
  Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
  We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.