Mind Your Vision: Multimodal Estimation of Refractive Disorders Using Electrooculography and Eye Tracking
- URL: http://arxiv.org/abs/2505.18538v1
- Date: Sat, 24 May 2025 06:03:45 GMT
- Title: Mind Your Vision: Multimodal Estimation of Refractive Disorders Using Electrooculography and Eye Tracking
- Authors: Xin Wei, Huakun Liu, Yutaro Hirao, Monica Perusquia-Hernandez, Katsutoshi Masai, Hideaki Uchiyama, Kiyoshi Kiyokawa,
- Abstract summary: This study explores a passive method for estimating refractive power using two eye movement recording techniques: electrooculography (EOG) and video-based eye tracking. We trained Long Short-Term Memory (LSTM) models to classify refractive power from unimodal (EOG or eye tracking) and multimodal configurations. Results show that the multimodal model consistently outperforms unimodal models, achieving the highest average accuracy in both settings.
- Score: 12.016546264209536
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Refractive errors are among the most common visual impairments globally, yet their diagnosis often relies on active user participation and clinical oversight. This study explores a passive method for estimating refractive power using two eye movement recording techniques: electrooculography (EOG) and video-based eye tracking. Using a publicly available dataset recorded under varying diopter conditions, we trained Long Short-Term Memory (LSTM) models to classify refractive power from unimodal (EOG or eye tracking) and multimodal configurations. We assess performance in both subject-dependent and subject-independent settings to evaluate model personalization and generalizability across individuals. Results show that the multimodal model consistently outperforms unimodal models, achieving the highest average accuracy in both settings: 96.207% in the subject-dependent scenario and 8.882% in the subject-independent scenario. However, generalization remains limited, with classification accuracy only marginally above chance in the subject-independent evaluations. Statistical comparisons in the subject-dependent setting confirmed that the multimodal model significantly outperformed the EOG and eye-tracking models. However, no statistically significant differences were found in the subject-independent setting. Our findings demonstrate both the potential and current limitations of refractive error estimation from eye movement data, contributing to the development of continuous, non-invasive screening methods using EOG signals and eye-tracking data.
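The abstract describes training LSTM classifiers on unimodal and multimodal eye movement recordings. A minimal late-fusion sketch of the multimodal configuration is below; the channel counts, hidden size, sequence length, and number of diopter classes are illustrative assumptions, not values reported in the paper.

```python
import torch
import torch.nn as nn

class MultimodalLSTMClassifier(nn.Module):
    """Late-fusion LSTM classifier sketch: one LSTM per modality
    (EOG and video-based eye tracking), final hidden states
    concatenated and mapped to diopter-class logits. All sizes
    here are hypothetical."""

    def __init__(self, eog_channels=4, gaze_channels=2,
                 hidden_size=64, num_classes=12):
        super().__init__()
        self.eog_lstm = nn.LSTM(eog_channels, hidden_size, batch_first=True)
        self.gaze_lstm = nn.LSTM(gaze_channels, hidden_size, batch_first=True)
        self.head = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, eog, gaze):
        # Take the last-layer final hidden state of each modality's LSTM,
        # concatenate them, and classify the fused representation.
        _, (h_eog, _) = self.eog_lstm(eog)
        _, (h_gaze, _) = self.gaze_lstm(gaze)
        fused = torch.cat([h_eog[-1], h_gaze[-1]], dim=-1)
        return self.head(fused)

model = MultimodalLSTMClassifier()
eog = torch.randn(8, 250, 4)   # batch of 8 windows, 250 time steps, 4 EOG channels
gaze = torch.randn(8, 250, 2)  # matching gaze (x, y) sequences
logits = model(eog, gaze)
print(tuple(logits.shape))     # (8, 12): one score per diopter class
```

The unimodal baselines mentioned in the abstract correspond to using only one of the two LSTM branches; the subject-dependent vs. subject-independent comparison is a matter of how train/test splits are drawn (within vs. across participants), not of the architecture.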
Related papers
- Benchmarking Foundation Models for Mitotic Figure Classification [0.37334049820361814]
Self-supervised learning techniques have enabled the use of vast amounts of unlabeled data to train large-scale neural networks. In this work, we investigate the use of foundation models for mitotic figure classification. We compare all models against end-to-end-trained baselines, both CNNs and Vision Transformers.
arXiv Detail & Related papers (2025-08-06T13:30:40Z) - CLIP Meets Diffusion: A Synergistic Approach to Anomaly Detection [54.85000884785013]
Anomaly detection is a complex problem due to the ambiguity in defining anomalies, the diversity of anomaly types, and the scarcity of training data. We propose CLIPfusion, a method that leverages both discriminative and generative foundation models. We believe that our method underscores the effectiveness of multi-modal and multi-model fusion in tackling the multifaceted challenges of anomaly detection.
arXiv Detail & Related papers (2025-06-13T13:30:15Z) - MVICAD2: Multi-View Independent Component Analysis with Delays and Dilations [61.59658203704757]
We propose Multi-View Independent Component Analysis with Delays and Dilations (MVICAD2), which allows sources to differ across subjects in both temporal delays and dilations. We present a model with identifiable sources, derive an approximation of its likelihood in closed form, and use regularization and optimization techniques to enhance performance.
arXiv Detail & Related papers (2025-01-13T15:47:02Z) - Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Latent Drifting enables diffusion models to be conditioned on medical images for the complex task of counterfactual image generation. We evaluate our method on three public longitudinal benchmark datasets of brain MRI and chest X-rays for counterfactual image generation.
arXiv Detail & Related papers (2024-12-30T01:59:34Z) - Generalized Robust Fundus Photography-based Vision Loss Estimation for High Myopia [6.193135671460362]
We propose a novel, parameter-efficient framework to enhance the generalized robustness of VF estimation.
Our method significantly outperforms existing approaches in RMSE, MAE, and correlation coefficient for both internal and external validation.
arXiv Detail & Related papers (2024-07-04T07:39:19Z) - Confidence-aware multi-modality learning for eye disease screening [58.861421804458395]
We propose a novel multi-modality evidential fusion pipeline for eye disease screening.
It provides a measure of confidence for each modality and elegantly integrates the multi-modality information.
Experimental results on both public and internal datasets demonstrate that our model excels in robustness.
arXiv Detail & Related papers (2024-05-28T13:27:30Z) - Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z) - Causal Dynamic Variational Autoencoder for Counterfactual Regression in Longitudinal Data [3.3523758554338734]
Estimating treatment effects over time is relevant in many real-world applications, such as precision medicine, epidemiology, economics, and marketing. We take a different perspective by assuming unobserved risk factors, i.e., adjustment variables that affect only the sequence of outcomes. We address the challenges posed by time-varying effects and unobserved adjustment variables.
arXiv Detail & Related papers (2023-10-16T16:32:35Z) - Multimodal Motion Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection [46.8584162860564]
We propose a novel generative model for video anomaly detection (VAD).
We consider skeletal representations and leverage state-of-the-art diffusion probabilistic models to generate multimodal future human poses.
We validate our model on 4 established benchmarks.
arXiv Detail & Related papers (2023-07-14T07:42:45Z) - Assessing Coarse-to-Fine Deep Learning Models for Optic Disc and Cup Segmentation in Fundus Images [0.0]
Coarse-to-fine deep learning algorithms are used to efficiently measure the vertical cup-to-disc ratio (vCDR) in fundus images.
We present a comprehensive analysis of different coarse-to-fine designs for OD/OC segmentation using 5 public databases.
Our analysis shows that these algorithms do not necessarily outperform standard multi-class single-stage models.
arXiv Detail & Related papers (2022-09-28T19:19:16Z) - Learnable Patchmatch and Self-Teaching for Multi-Frame Depth Estimation in Monocular Endoscopy [16.233423010425355]
We propose a novel unsupervised multi-frame monocular depth estimation model. The proposed model integrates a learnable patchmatch module to adaptively increase the discriminative ability in regions with low and homogeneous textures. As a byproduct of the self-teaching paradigm, the proposed model is able to improve the depth predictions when more frames are input at test time.
arXiv Detail & Related papers (2022-05-30T12:11:03Z) - Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.