Estimating the strength and timing of syntactic structure building in naturalistic reading
- URL: http://arxiv.org/abs/2509.23195v1
- Date: Sat, 27 Sep 2025 08:56:12 GMT
- Title: Estimating the strength and timing of syntactic structure building in naturalistic reading
- Authors: Nan Wang, Jiaxuan Li
- Abstract summary: We show that phrase structure construction can precede category detection and dominate lexical influences. These findings support a predictive "tree-scaffolding" account of comprehension.
- Score: 4.261343728593896
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A central question in psycholinguistics is the timing of syntax in sentence processing. Much of the existing evidence comes from violation paradigms, which conflate two separable processes - syntactic category detection and phrase structure construction - and implicitly assume that phrase structure follows category detection. In this study, we use co-registered EEG and eye-tracking data from the ZuCo corpus to disentangle these processes and test their temporal order under naturalistic reading conditions. Analyses of gaze transitions showed that readers preferentially moved between syntactic heads, suggesting that phrase structures, rather than serial word order, organize scanpaths. Bayesian network modeling further revealed that structural depth was the strongest driver of deviations from linear reading, outweighing lexical familiarity and surprisal. Finally, fixation-related potentials demonstrated that syntactic surprisal influences neural activity before word onset (-184 to -10 ms) and during early integration (48 to 300 ms). These findings extend current models of syntactic timing by showing that phrase structure construction can precede category detection and dominate lexical influences, supporting a predictive "tree-scaffolding" account of comprehension.
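The gaze-transition analysis lends itself to a simple sketch. The snippet below is a minimal illustration, not the authors' pipeline: the scanpath, the set of head positions, and the permutation baseline are all hypothetical stand-ins. It estimates how often consecutive fixations connect syntactic heads and compares that rate to a shuffled-scanpath null.

```python
import random

def head_transition_rate(scanpath, head_indices):
    """Fraction of consecutive fixation pairs where both words are syntactic heads.

    scanpath     : list of fixated word indices, in fixation order
    head_indices : set of word indices that head a phrase (from a dependency parse)
    """
    pairs = list(zip(scanpath, scanpath[1:]))
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if a in head_indices and b in head_indices)
    return hits / len(pairs)

def permutation_baseline(scanpath, head_indices, n_perm=1000, seed=0):
    """Null distribution: shuffle fixation order and recompute the rate."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_perm):
        shuffled = scanpath[:]
        rng.shuffle(shuffled)
        rates.append(head_transition_rate(shuffled, head_indices))
    return rates

# Hypothetical example: a 10-word sentence with heads at positions 1, 4, and 7.
scanpath = [0, 1, 4, 2, 3, 4, 7, 5, 6, 7, 8, 9]
heads = {1, 4, 7}
observed = head_transition_rate(scanpath, heads)
null = permutation_baseline(scanpath, heads)
p = sum(r >= observed for r in null) / len(null)
print(f"observed={observed:.3f}, permutation p={p:.3f}")
```

A permutation test of this kind asks whether head-to-head transitions occur more often than expected if fixation order were arbitrary; the paper's actual analysis is richer, adding Bayesian network modeling over structural depth, familiarity, and surprisal.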
Related papers
- Discovering Semantic Latent Structures in Psychological Scales: A Response-Free Pathway to Efficient Simplification [7.405170407676887]
We introduce a topic-modeling framework that operationalizes semantic latent structure for scale simplification. Items are encoded using contextual sentence embeddings and grouped via density-based clustering. We benchmark the framework across DASS, IPIP, and EPOCH, evaluating structural recovery, internal consistency, factor congruence, correlation preservation, and reduction efficiency.
arXiv Detail & Related papers (2026-02-13T03:37:15Z)
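The pipeline in the entry above (contextual sentence embeddings plus density-based clustering) can be sketched with off-the-shelf tools. Below is a minimal illustration assuming the sentence-transformers and scikit-learn packages; the model name, example items, and DBSCAN parameters are placeholder choices, not the paper's configuration.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import DBSCAN

# Hypothetical scale items; real input would be the DASS/IPIP/EPOCH item texts.
items = [
    "I found it hard to wind down.",
    "I was unable to relax.",
    "I felt downhearted and blue.",
    "I felt that life was meaningless.",
]

# Encode items as contextual sentence embeddings (placeholder model choice).
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(items, normalize_embeddings=True)

# Group semantically redundant items via density-based clustering.
# eps/min_samples are illustrative; cosine distance suits normalized embeddings.
labels = DBSCAN(eps=0.4, min_samples=2, metric="cosine").fit_predict(embeddings)

for item, label in zip(items, labels):
    print(label, item)  # items sharing a label are candidates for merging
```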
- Emergent Structured Representations Support Flexible In-Context Inference in Large Language Models [77.98801218316505]
Large language models (LLMs) exhibit emergent behaviors suggestive of human-like reasoning. We investigate the internal processing of LLMs during in-context concept inference.
arXiv Detail & Related papers (2026-02-08T03:14:39Z)
- Vocabulary embeddings organize linguistic structure early in language model training [3.2661767443292646]
Large language models (LLMs) work by manipulating the geometry of input embedding vectors over multiple layers. Here, we ask: how are the input vocabulary representations of language models structured, and how does this structure evolve over training? We run a suite of experiments that correlate the geometric structure of the input embeddings and output embeddings of two open-source models with semantic, syntactic, and frequency-based metrics over the course of training.
arXiv Detail & Related papers (2025-10-08T23:26:22Z)
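The correlational analysis in the entry above can be approximated in a few lines: compare pairwise similarity in embedding space against a reference metric such as word frequency. The sketch below uses random arrays as stand-ins for real embedding checkpoints and frequency counts; it illustrates the shape of the analysis, not the paper's actual experiments.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Stand-ins: a (vocab x dim) embedding matrix and per-word log frequencies.
emb = rng.normal(size=(500, 64))
log_freq = rng.normal(size=500)

# Pairwise cosine similarity of the vocabulary embeddings.
norm = emb / np.linalg.norm(emb, axis=1, keepdims=True)
cos_sim = norm @ norm.T

# A frequency-based "similarity": negative absolute log-frequency difference.
freq_sim = -np.abs(log_freq[:, None] - log_freq[None, :])

# Correlate the two similarity structures over the upper triangle.
iu = np.triu_indices(500, k=1)
rho, p = spearmanr(cos_sim[iu], freq_sim[iu])
print(f"Spearman rho={rho:.3f} (p={p:.3g})")
```

Repeating this correlation at successive training checkpoints would trace how embedding geometry comes to reflect each metric over training.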
- Explainable Chain-of-Thought Reasoning: An Empirical Analysis on State-Aware Reasoning Dynamics [69.00587226225232]
We introduce a state-aware transition framework that abstracts CoT trajectories into structured latent dynamics. To characterize the global structure of reasoning, we model the progression of these latent states as a Markov chain. This abstraction supports a range of analyses, including semantic role identification, temporal pattern visualization, and consistency evaluation.
arXiv Detail & Related papers (2025-08-29T18:53:31Z)
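Modeling reasoning progression as a Markov chain, as the entry above describes, amounts to estimating a transition matrix from labeled state sequences. The sketch below assumes hypothetical state labels; the paper's state inventory and labeling procedure are not reproduced here.

```python
import numpy as np

# Hypothetical reasoning-state sequences extracted from CoT traces.
states = ["setup", "derive", "check", "answer"]
idx = {s: i for i, s in enumerate(states)}
sequences = [
    ["setup", "derive", "derive", "check", "answer"],
    ["setup", "derive", "check", "derive", "answer"],
]

# Count transitions, then row-normalize to get P(next state | current state).
counts = np.zeros((len(states), len(states)))
for seq in sequences:
    for a, b in zip(seq, seq[1:]):
        counts[idx[a], idx[b]] += 1

# Absorbing states (no outgoing transitions) keep a zero row.
row_sums = counts.sum(axis=1, keepdims=True)
transition = np.divide(counts, row_sums,
                       out=np.zeros_like(counts), where=row_sums > 0)
print(transition.round(2))
```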
- Probing Syntax in Large Language Models: Successes and Remaining Challenges [7.9494253785082405]
It remains unclear whether structural and/or statistical factors systematically affect these syntactic representations. We conduct an in-depth analysis of structural probes on three controlled benchmarks.
arXiv Detail & Related papers (2025-08-05T08:41:14Z)
- Derivational Probing: Unveiling the Layer-wise Derivation of Syntactic Structures in Neural Language Models [16.97687131562374]
We propose Derivational Probing to investigate how micro-syntactic structures and macro-syntactic structures are constructed. Our experiments on BERT reveal a clear bottom-up derivation: micro-syntactic structures emerge in lower layers and are gradually integrated into a coherent macro-syntactic structure in higher layers.
arXiv Detail & Related papers (2025-06-27T02:29:30Z)
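Layer-wise probing of the kind described in the entry above starts from per-layer hidden states. The sketch below extracts them with the Hugging Face transformers API; the probe itself, which would be trained to recover micro- vs. macro-syntactic structure, is omitted.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

sentence = "The cat that chased the mouse slept."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states: tuple of (num_layers + 1) tensors, each (batch, tokens, dim);
# index 0 is the embedding layer. A probe would be fit separately per layer.
for layer, h in enumerate(outputs.hidden_states):
    print(f"layer {layer:2d}: {tuple(h.shape)}")
```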
- Detecting Neurocognitive Disorders through Analyses of Topic Evolution and Cross-modal Consistency in Visual-Stimulated Narratives [84.03001845263]
Early detection of neurocognitive disorders (NCDs) is crucial for timely intervention and disease management. We propose two novel dynamic macrostructural approaches to measure cross-modal consistency between speech and visual stimuli. Experimental results validate the effectiveness of the proposed approaches in NCD detection, with TITAN achieving superior performance on both the CU-MARVEL-RABBIT corpus and the ADReSS corpus.
arXiv Detail & Related papers (2025-01-07T12:16:26Z)
- Unsupervised Chunking with Hierarchical RNN [62.15060807493364]
This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner.
We present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions.
Experiments on the CoNLL-2000 dataset reveal a notable improvement over existing unsupervised methods, enhancing phrase F1 score by up to 6 percentage points.
arXiv Detail & Related papers (2023-09-10T02:55:12Z)
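The two-layer hierarchy in the chunking entry above can be sketched in PyTorch. This is a minimal, untrained illustration of the word-to-chunk / chunk-to-sentence idea, not the paper's HRNN: boundary probabilities here come from a simple gating layer, and soft pooling stands in for whatever composition the authors use.

```python
import torch
import torch.nn as nn

class TinyHRNN(nn.Module):
    """Word-level GRU -> soft chunk boundaries -> chunk-level GRU (illustrative)."""
    def __init__(self, vocab_size, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.word_rnn = nn.GRU(dim, dim, batch_first=True)
        self.boundary = nn.Linear(dim, 1)   # P(chunk ends at this word)
        self.chunk_rnn = nn.GRU(dim, dim, batch_first=True)

    def forward(self, token_ids):
        h_word, _ = self.word_rnn(self.embed(token_ids))   # (B, T, dim)
        p_end = torch.sigmoid(self.boundary(h_word))       # (B, T, 1)
        # Soft composition: weight word states by boundary probability,
        # then let the chunk-level GRU read the weighted sequence.
        h_chunk, _ = self.chunk_rnn(h_word * p_end)
        return p_end.squeeze(-1), h_chunk

model = TinyHRNN(vocab_size=1000)
tokens = torch.randint(0, 1000, (2, 7))         # two dummy 7-token sentences
boundaries, sentence_states = model(tokens)
print(boundaries.shape, sentence_states.shape)  # (2, 7) and (2, 7, 64)
```

In an unsupervised setting, the boundary layer would be trained indirectly, e.g. through a language-modeling or reconstruction objective, so that useful chunk boundaries emerge without labels.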
- Modeling structure-building in the brain with CCG parsing and large language models [9.17816011606258]
Combinatory Categorial Grammars (CCGs) are expressive, directly compositional models of grammar.
We evaluate whether a more expressive CCG provides a better model than a context-free grammar for human neural signals collected with fMRI.
arXiv Detail & Related papers (2022-10-28T14:21:29Z)
- Compositional Generalization Requires Compositional Parsers [69.77216620997305]
We compare sequence-to-sequence models and models guided by compositional principles on the recent COGS corpus.
We show structural generalization is a key measure of compositional generalization and requires models that are aware of complex structure.
arXiv Detail & Related papers (2022-02-24T07:36:35Z)
- An Empirical Study: Extensive Deep Temporal Point Process [61.14164208094238]
We first review recent research emphases and difficulties in modeling asynchronous event sequences with deep temporal point processes. We propose a Granger causality discovery framework for exploiting the relations among multiple types of events.
arXiv Detail & Related papers (2021-10-19T10:15:00Z)
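Granger-causal structure over event types, as in the last entry, is often probed by binning event times into count series and testing whether one series helps predict another. The sketch below uses statsmodels on synthetic data; the paper's framework operates on point processes directly, so this is only a coarse proxy.

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(1)

# Synthetic binned counts for two event types; type A drives type B at lag 1.
n = 300
a = rng.poisson(2.0, size=n).astype(float)
b = np.roll(a, 1) + rng.poisson(1.0, size=n)
b[0] = rng.poisson(1.0)

# Column 0 is the series being predicted, column 1 the candidate cause.
data = np.column_stack([b, a])
results = grangercausalitytests(data, maxlag=3)
```

A significant test at some lag suggests that past counts of type A improve prediction of type B beyond B's own history, which is the linear-count analogue of the event-level relations the paper's framework aims to discover.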