Multi-scale Spatio-temporal Transformer-based Imbalanced Longitudinal
Learning for Glaucoma Forecasting from Irregular Time Series Images
- URL: http://arxiv.org/abs/2402.13475v1
- Date: Wed, 21 Feb 2024 02:16:59 GMT
- Authors: Xikai Yang, Jian Wu, Xi Wang, Yuchen Yuan, Ning Li Wang, Pheng-Ann
Heng
- Abstract summary: Glaucoma is one of the major eye diseases that leads to progressive optic nerve fiber damage and irreversible blindness.
We introduce the Multi-scale Spatio-temporal Transformer Network (MST-former) based on the transformer architecture tailored for sequential image inputs.
Our method shows excellent generalization capability on the Alzheimer's Disease Neuroimaging Initiative (ADNI) MRI dataset, with an accuracy of 90.3% for mild cognitive impairment and Alzheimer's disease prediction.
- Score: 45.894671834869975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Glaucoma is one of the major eye diseases that leads to progressive optic
nerve fiber damage and irreversible blindness, afflicting millions of
individuals. Glaucoma forecasting supports early screening and
intervention for at-risk patients, which helps prevent further
deterioration of the disease. It leverages a series of historical fundus images
of an eye and forecasts the likelihood of glaucoma occurrence in the future.
However, the irregular sampling nature and the imbalanced class distribution
are two challenges in the development of disease forecasting approaches. To
this end, we introduce the Multi-scale Spatio-temporal Transformer Network
(MST-former) based on the transformer architecture tailored for sequential
image inputs, which can effectively learn representative semantic information
from sequential images on both temporal and spatial dimensions. Specifically,
we employ a multi-scale structure to extract features at various resolutions,
which can largely exploit rich spatial information encoded in each image.
Besides, we design a time distance matrix to scale time attention in a
non-linear manner, which could effectively deal with the irregularly sampled
data. Furthermore, we introduce a temperature-controlled Balanced Softmax
Cross-entropy loss to address the class imbalance issue. Extensive experiments
on the Sequential fundus Images for Glaucoma Forecast (SIGF) dataset
demonstrate the superiority of the proposed MST-former method, achieving an AUC
of 98.6% for glaucoma forecasting. Besides, our method shows excellent
generalization capability on the Alzheimer's Disease Neuroimaging Initiative
(ADNI) MRI dataset, with an accuracy of 90.3% for mild cognitive impairment and
Alzheimer's disease prediction, outperforming the compared method by a large
margin.
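The two mechanisms named in the abstract, time-distance-scaled attention and a temperature-controlled Balanced Softmax loss, can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything specific here is an assumption: the exponential decay `exp(-|t_i - t_j| / tau)`, the multiplicative scaling of the raw scores, the way `temperature` divides the prior-adjusted logits, and all function and parameter names are hypothetical and not taken from the MST-former implementation.

```python
import math

def time_distance_matrix(times, tau=12.0):
    """Assumed non-linear decay: D[i][j] = exp(-|t_i - t_j| / tau)."""
    return [[math.exp(-abs(ti - tj) / tau) for tj in times] for ti in times]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def time_scaled_attention(scores, times, tau=12.0):
    """Scale raw attention scores by time-distance weights, then normalize rows.

    Assumption: scores are non-negative, so a smaller weight always means
    less attention for visits that are far apart in time.
    """
    dist = time_distance_matrix(times, tau)
    return [
        softmax([s * d for s, d in zip(score_row, dist_row)])
        for score_row, dist_row in zip(scores, dist)
    ]

def balanced_softmax_ce(logits, label, class_counts, temperature=1.0):
    """Cross-entropy on logits shifted by the log class prior.

    Assumption: the temperature divides the prior-adjusted logits; the
    paper may place it elsewhere.
    """
    total = sum(class_counts)
    adjusted = [
        (z + math.log(n / total)) / temperature
        for z, n in zip(logits, class_counts)
    ]
    m = max(adjusted)
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[label]

# Three visits at irregular intervals (months) and uniform raw scores.
attn = time_scaled_attention([[1.0] * 3 for _ in range(3)], [0, 6, 30])
# Same logits, 90/10 class split: the minority class incurs the larger loss.
loss_major = balanced_softmax_ce([0.0, 0.0], 0, [90, 10])
loss_minor = balanced_softmax_ce([0.0, 0.0], 1, [90, 10])
```

In this sketch, visits that are closer in time receive more attention, and adding the log prior to the logits makes training penalize errors on the rare class more heavily, which is the general idea behind Balanced Softmax.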
Related papers
- Unscrambling disease progression at scale: fast inference of event permutations with optimal transport [2.9087305408570945]
Disease progression models infer group-level temporal trajectories of change in patients' features as a chronic degenerative condition plays out.
We leverage ideas from optimal transport to model disease progression as a latent permutation matrix of events belonging to the Birkhoff polytope.
Experiments demonstrate the increase in speed, accuracy and robustness to noise in simulation.
arXiv Detail & Related papers (2024-10-18T11:44:29Z)
- Deep Learning to Predict Glaucoma Progression using Structural Changes in the Eye [0.20718016474717196]
Glaucoma is a chronic eye disease characterized by optic neuropathy, leading to irreversible vision loss.
Early detection is crucial to monitor atrophy and develop treatment strategies to prevent further vision impairment.
In this study, we use deep learning models to identify complex disease traits and progression criteria.
arXiv Detail & Related papers (2024-06-09T01:12:41Z)
- Harnessing the power of longitudinal medical imaging for eye disease prognosis using Transformer-based sequence modeling [49.52787013516891]
Our proposed Longitudinal Transformer for Survival Analysis (LTSA) enables dynamic disease prognosis from longitudinal medical imaging.
A temporal attention analysis also suggested that, while the most recent image is typically the most influential, prior imaging still provides additional prognostic value.
arXiv Detail & Related papers (2024-05-14T17:15:28Z)
- AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model [59.08735812631131]
Anomaly inspection plays an important role in industrial manufacture.
Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.
We propose AnomalyDiffusion, a novel diffusion-based few-shot anomaly generation model.
arXiv Detail & Related papers (2023-12-10T05:13:40Z)
- Diagnosing Alzheimer's Disease using Early-Late Multimodal Data Fusion with Jacobian Maps [1.5501208213584152]
Alzheimer's disease (AD) is a prevalent and debilitating neurodegenerative disorder impacting a large aging population.
We propose an efficient early-late fusion (ELF) approach, which leverages a convolutional neural network for automated feature extraction and random forests.
To tackle the challenge of detecting subtle changes in brain volume, we transform images into the Jacobian domain (JD).
arXiv Detail & Related papers (2023-10-25T19:02:57Z)
- Automatic diagnosis of knee osteoarthritis severity using Swin transformer [55.01037422579516]
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic pain and stiffness in the knee joint.
We propose an automated approach that employs the Swin Transformer to predict the severity of KOA.
arXiv Detail & Related papers (2023-07-10T09:49:30Z)
- RADNet: Ensemble Model for Robust Glaucoma Classification in Color Fundus Images [0.0]
Glaucoma is one of the most severe eye diseases, characterized by rapid progression and leading to irreversible blindness.
Regular glaucoma screening of the population should improve early-stage detection; however, the desirable frequency of ophthalmological checkups is often not feasible.
In our work, we propose an advanced image pre-processing technique combined with an ensemble of deep classification networks.
arXiv Detail & Related papers (2022-05-25T16:48:00Z)
- Temporal Context Matters: Enhancing Single Image Prediction with Disease Progression Representations [8.396615243014768]
We present a deep learning approach that leverages temporal progression information to improve clinical outcome predictions from single-timepoint images.
In our method, a self-attention based Temporal Convolutional Network (TCN) is used to learn a representation that is most reflective of the disease trajectory.
A Vision Transformer is pretrained in a self-supervised fashion to extract features from single-timepoint images.
arXiv Detail & Related papers (2022-03-02T22:11:07Z)
- Assessing glaucoma in retinal fundus photographs using Deep Feature Consistent Variational Autoencoders [63.391402501241195]
Glaucoma is challenging to detect since it remains asymptomatic until the symptoms are severe.
Early identification of glaucoma is generally made based on functional, structural, and clinical assessments.
Deep learning methods have partially solved this dilemma by bypassing the marker identification stage and analyzing high-level information directly to classify the data.
arXiv Detail & Related papers (2021-10-04T16:06:49Z)
- Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso [102.84915019938413]
Non-invasive techniques like magnetoencephalography (MEG) or electroencephalography (EEG) promise to detect where and when brain regions activate.
The problem of source localization, or source imaging, however, poses a high-dimensional statistical inference challenge.
We propose an ensemble of clustered desparsified multi-task Lasso (ecd-MTLasso) to deal with this problem.
arXiv Detail & Related papers (2020-09-29T21:17:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.