Eye-tracking based classification of Mandarin Chinese readers with and
without dyslexia using neural sequence models
- URL: http://arxiv.org/abs/2210.09819v1
- Date: Tue, 18 Oct 2022 12:57:30 GMT
- Authors: Patrick Haller, Andreas Säuberli, Sarah Elisabeth Kiener, Jinger
Pan, Ming Yan, Lena Jäger
- Abstract summary: We propose two simple sequence models that process eye movements on the entire stimulus without the need to aggregate features across the sentence.
We incorporate the linguistic stimulus into the model in two ways: contextualized word embeddings and manually extracted linguistic features.
Our results show that (i) even for a logographic script such as Chinese, sequence models are able to classify dyslexia from eye gaze sequences, reaching state-of-the-art performance, and (ii) incorporating the linguistic stimulus does not improve classification performance.
- Score: 7.639036130018945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Eye movements are known to reflect cognitive processes in reading, and
psychological reading research has shown that eye gaze patterns differ between
readers with and without dyslexia. In recent years, researchers have attempted
to classify readers with dyslexia based on their eye movements using Support
Vector Machines (SVMs). However, these approaches (i) are based on highly
aggregated features averaged over all words read by a participant, thus
disregarding the sequential nature of the eye movements, and (ii) do not
consider the linguistic stimulus and its interaction with the reader's eye
movements. In the present work, we propose two simple sequence models that
process eye movements on the entire stimulus without the need to aggregate
features across the sentence. Additionally, we incorporate the linguistic
stimulus into the model in two ways: contextualized word embeddings and
manually extracted linguistic features. The models are evaluated on a Mandarin
Chinese dataset containing eye movements from children with and without
dyslexia. Our results show that (i) even for a logographic script such as
Chinese, sequence models are able to classify dyslexia from eye gaze sequences,
reaching state-of-the-art performance, and (ii) incorporating the linguistic
stimulus does not help to improve classification performance.
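The paper's central design choice, feeding the raw fixation sequence to the classifier instead of participant-level aggregates, can be illustrated with a minimal sketch. The code below is not the authors' model: it is a toy recurrent classifier over hypothetical per-fixation features (e.g. fixation duration, landing position, incoming saccade length), with randomly initialized weights standing in for trained parameters.

```python
import numpy as np

def rnn_scanpath_classifier(fixations, Wxh, Whh, Why, bh, by):
    """Run a plain tanh RNN over a scanpath and return P(dyslexia).

    fixations: array of shape (n_fixations, n_features), one row per
    fixation. Nothing is averaged across the sentence; the hidden
    state carries the sequential information through every fixation.
    """
    h = np.zeros(Whh.shape[0])
    for x in fixations:
        h = np.tanh(Wxh @ x + Whh @ h + bh)  # update hidden state per fixation
    logit = Why @ h + by                      # classify from the final state
    return 1.0 / (1.0 + np.exp(-logit))       # sigmoid -> probability

# Toy example: 12 fixations, 3 features each, untrained random weights.
rng = np.random.default_rng(0)
n_feat, n_hid = 3, 8
scanpath = rng.normal(size=(12, n_feat))
p = rnn_scanpath_classifier(
    scanpath,
    Wxh=rng.normal(scale=0.1, size=(n_hid, n_feat)),
    Whh=rng.normal(scale=0.1, size=(n_hid, n_hid)),
    Why=rng.normal(scale=0.1, size=n_hid),
    bh=np.zeros(n_hid),
    by=0.0,
)
print(p)  # a probability in (0, 1)
```

In a trained version of this sketch, the weights would be fit on labeled scanpaths; the key contrast with the earlier SVM approaches is simply that the loop above sees every fixation in order.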
Related papers
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually grounded text perturbations such as typos and word-order shuffling, which resonate with human cognitive patterns and allow the perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z) - Caregiver Talk Shapes Toddler Vision: A Computational Study of Dyadic
Play [8.164232628099619]
We propose a computational model of visual representation learning during dyadic play.
We show that utterances with statistics matching those of real caregivers give rise to representations supporting improved category recognition.
arXiv Detail & Related papers (2023-12-07T08:18:40Z) - ScanDL: A Diffusion Model for Generating Synthetic Scanpaths on Texts [0.5520145204626482]
Eye movements in reading play a crucial role in psycholinguistic research.
The scarcity of eye movement data and its unavailability at application time pose a major challenge for this line of research.
We propose ScanDL, a novel discrete sequence-to-sequence diffusion model that generates synthetic scanpaths on texts.
arXiv Detail & Related papers (2023-10-24T07:52:19Z) - Integrating large language models and active inference to understand eye
movements in reading and dyslexia [0.0]
We present a novel computational model employing hierarchical active inference to simulate reading and eye movements.
Our model permits the exploration of maladaptive inference effects on eye movements during reading, such as in dyslexia.
arXiv Detail & Related papers (2023-08-09T13:16:30Z) - Linguistic More: Taking a Further Step toward Efficient and Accurate
Scene Text Recognition [92.6211155264297]
Vision models have gained increasing attention due to their simplicity and efficiency in Scene Text Recognition (STR) task.
Recent vision models suffer from attention drift: the pure vision-based query usually causes poor recognition, summarized in that paper as the linguistic insensitive drift (LID) problem.
We propose a Linguistic Perception Vision model (LPV) which explores the linguistic capability of the vision model for accurate text recognition.
arXiv Detail & Related papers (2023-05-09T02:52:47Z) - Eyettention: An Attention-based Dual-Sequence Model for Predicting Human
Scanpaths during Reading [3.9766585251585282]
We develop Eyettention, the first dual-sequence model that simultaneously processes the sequence of words and the chronological sequence of fixations.
We show that Eyettention outperforms state-of-the-art models in predicting scanpaths.
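The dual-sequence idea, aligning the chronological fixation sequence with the word sequence, can be sketched as a single cross-attention step. This is a hedged illustration, not Eyettention itself: the fixation and word representations below are random placeholders, and the model is reduced to one attention layer.

```python
import numpy as np

def cross_attention(fix_queries, word_keys, word_values):
    """One scaled dot-product cross-attention step: each fixation (query)
    attends over the word sequence (keys/values), yielding a
    word-informed context vector and an alignment distribution
    per fixation."""
    d = word_keys.shape[1]
    scores = fix_queries @ word_keys.T / np.sqrt(d)  # (n_fix, n_words)
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)    # softmax over words
    return weights @ word_values, weights

# Toy example: 5 fixations attending over 7 word embeddings (dim 16).
rng = np.random.default_rng(1)
fix = rng.normal(size=(5, 16))    # placeholder fixation representations
words = rng.normal(size=(7, 16))  # placeholder word embeddings
context, att = cross_attention(fix, words, words)
print(context.shape, att.shape)   # (5, 16) (5, 7)
```

Each row of `att` is a probability distribution over the words, i.e. a soft alignment of one fixation to the text, which is the kind of word-fixation interaction a dual-sequence model can exploit.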
arXiv Detail & Related papers (2023-04-21T07:26:49Z) - Model-based analysis of brain activity reveals the hierarchy of language
in 305 subjects [82.81964713263483]
A popular approach to decompose the neural bases of language consists in correlating, across individuals, the brain responses to different stimuli.
Here, we show that a model-based approach can reach equivalent results within subjects exposed to natural stimuli.
arXiv Detail & Related papers (2021-10-12T15:30:21Z) - From Two to One: A New Scene Text Recognizer with Visual Language
Modeling Network [70.47504933083218]
We propose a Visual Language Modeling Network (VisionLAN), which views the visual and linguistic information as a union.
VisionLAN significantly improves the speed by 39% and adaptively considers the linguistic information to enhance the visual features for accurate recognition.
arXiv Detail & Related papers (2021-08-22T07:56:24Z) - Decomposing lexical and compositional syntax and semantics with deep
language models [82.81964713263483]
The activations of language transformers like GPT2 have been shown to linearly map onto brain activity during speech comprehension.
Here, we propose a taxonomy to factorize the high-dimensional activations of language models into four classes: lexical, compositional, syntactic, and semantic representations.
The results highlight two findings. First, compositional representations recruit a more widespread cortical network than lexical ones, and encompass the bilateral temporal, parietal and prefrontal cortices.
arXiv Detail & Related papers (2021-03-02T10:24:05Z) - Automatic selection of eye tracking variables in visual categorization
in adults and infants [0.4194295877935867]
We propose an automated method for selecting eye tracking variables based on analyses of their usefulness to discriminate learners from non-learners of visual categories.
We found remarkable agreement between these methods in identifying a small set of discriminant variables.
arXiv Detail & Related papers (2020-10-28T15:44:57Z) - A Novel Attention-based Aggregation Function to Combine Vision and
Language [55.7633883960205]
We propose a novel fully-attentive reduction method for vision and language.
Specifically, our approach computes a set of scores for each element of each modality employing a novel variant of cross-attention.
We test our approach on image-text matching and visual question answering, building fair comparisons with other reduction choices.
arXiv Detail & Related papers (2020-04-27T18:09:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.