Rethinking Glaucoma Calibration: Voting-Based Binocular and Metadata Integration
- URL: http://arxiv.org/abs/2503.18642v2
- Date: Sun, 02 Nov 2025 10:01:21 GMT
- Title: Rethinking Glaucoma Calibration: Voting-Based Binocular and Metadata Integration
- Authors: Taejin Jeong, Joohyeok Kim, Jaehoon Joo, Seong Jae Hwang
- Abstract summary: Glaucoma is a major cause of irreversible blindness, with significant diagnostic subjectivity. We propose V-ViT (Voting-based ViT), a framework that enhances calibration by integrating a patient's binocular information and metadata. Our results demonstrate that V-ViT effectively resolves the issue of overconfident predictions in glaucoma diagnosis, providing highly reliable predictions for clinical use.
- Score: 7.317152109491892
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Glaucoma is a major cause of irreversible blindness, with significant diagnostic subjectivity. This inherent uncertainty, combined with the overconfidence of models optimized solely for accuracy, can lead to fatal issues such as overdiagnosis or missing critical diseases. To ensure clinical trust, model calibration is essential for reliable predictions, yet research in this field remains limited. Existing calibration studies have overlooked glaucoma's systemic associations and high diagnostic subjectivity. To overcome these limitations, we propose V-ViT (Voting-based ViT), a framework that enhances calibration by integrating a patient's binocular information and metadata. Furthermore, to mitigate diagnostic subjectivity, V-ViT utilizes an iterative dropout-based Voting System to maximize calibration performance. The proposed framework achieved state-of-the-art performance across all metrics, including the primary calibration metrics. Our results demonstrate that V-ViT effectively resolves the issue of overconfident predictions in glaucoma diagnosis, providing highly reliable predictions for clinical use. Our source code is available at https://github.com/starforTJ/V-ViT.
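The abstract does not spell out the Voting System's mechanics. Purely as an illustration of the general idea behind dropout-based voting (averaging softmax outputs over repeated stochastic forward passes), and not the authors' actual implementation, here is a minimal NumPy sketch; `dropout_vote` and `toy_logits` are hypothetical names, and the toy model stands in for the real binocular/metadata network:

```python
import numpy as np

def dropout_vote(logits_fn, x, n_votes=10, seed=0):
    """Average softmax probabilities over repeated stochastic passes.

    logits_fn emulates a network forward pass with dropout kept active
    at inference time, so each call can return different logits.
    """
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_votes):
        z = logits_fn(x, rng)
        e = np.exp(z - z.max())       # numerically stable softmax
        probs.append(e / e.sum())
    return np.mean(probs, axis=0)     # the "vote": mean class probability

# Toy stand-in for a dropout-enabled classifier (hypothetical).
def toy_logits(x, rng):
    mask = rng.random(x.shape) > 0.3  # emulate dropout on input features
    return np.array([np.sum(x * mask), np.sum((1.0 - x) * mask)])

p = dropout_vote(toy_logits, np.array([0.9, 0.8, 0.1]), n_votes=50)
```

Averaging over many stochastic passes tends to pull extreme per-pass confidences toward the mean, which is why this style of voting is commonly used to reduce overconfidence.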
Related papers
- Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification [60.18369393468405]
Existing verifiers usually underperform owing to a lack of domain knowledge and limited calibration. GLEAN compiles expert-curated protocols into trajectory-informed, well-calibrated correctness signals. We empirically validate GLEAN with agentic clinical diagnosis across three diseases from the MIMIC-IV dataset.
arXiv Detail & Related papers (2026-03-03T09:36:43Z) - Calibrated Bayesian Deep Learning for Explainable Decision Support Systems Based on Medical Imaging [6.826979426009301]
It is imperative that models quantify uncertainty in a manner that correlates with prediction correctness, allowing clinicians to identify unreliable outputs for further review. The present paper proposes a generalizable probabilistic optimization framework grounded in Bayesian deep learning. Specifically, a novel Confidence-Uncertainty Boundary Loss (CUB-Loss) is introduced that imposes penalties on high-certainty errors and low-certainty correct predictions. The proposed framework is validated on three distinct medical imaging tasks: automatic screening of pneumonia, diabetic retinopathy detection, and identification of skin lesions.
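The CUB-Loss itself is not specified in this summary; what it targets is calibration error. As an illustration only, here is a minimal NumPy sketch of Expected Calibration Error (ECE), the standard calibration metric in this line of work. The equal-width binning and the function name are assumptions, not taken from the paper:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: per-bin gap between mean confidence and accuracy,
    weighted by the fraction of samples falling in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# Four predictions at 95% confidence, but only half are correct:
ece = expected_calibration_error([0.95, 0.95, 0.95, 0.95], [1, 1, 0, 0])
```

A perfectly calibrated model has mean confidence equal to accuracy in every bin, giving an ECE of zero; the example above yields 0.45 because the model claims 95% confidence while being right only 50% of the time.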
arXiv Detail & Related papers (2026-02-12T14:03:41Z) - MIRNet: Integrating Constrained Graph-Based Reasoning with Pre-training for Diagnostic Medical Imaging [67.74482877175797]
MIRNet is a novel framework that integrates self-supervised pre-training with constrained graph-based reasoning. We introduce TongueAtlas-4K, a benchmark comprising 4,000 images annotated with 22 diagnostic labels.
arXiv Detail & Related papers (2025-11-13T06:30:41Z) - Enhancing Safety in Diabetic Retinopathy Detection: Uncertainty-Aware Deep Learning Models with Rejection Capabilities [0.0]
Diabetic retinopathy (DR) is a major cause of visual impairment. Deep learning models have demonstrated great success in identifying DR from retinal images. This paper investigates an alternative: uncertainty-aware deep learning models with a rejection mechanism that rejects low-confidence predictions.
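A rejection mechanism of this kind can be sketched generically: abstain whenever the model's confidence falls below a threshold, deferring the case to a human reader. The max-probability confidence score and the 0.8 threshold below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def predict_with_rejection(probs, threshold=0.8):
    """Return (class index, confidence), or (None, confidence) when
    the model's top probability falls below the rejection threshold."""
    probs = np.asarray(probs, dtype=float)
    conf = float(probs.max())
    if conf < threshold:
        return None, conf             # abstain: defer to a human reader
    return int(probs.argmax()), conf

accepted = predict_with_rejection([0.95, 0.05])  # confident, so accepted
rejected = predict_with_rejection([0.55, 0.45])  # uncertain, so rejected
```

The threshold trades coverage for reliability: raising it rejects more cases but makes the accepted predictions more trustworthy.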
arXiv Detail & Related papers (2025-09-26T01:47:43Z) - DRetNet: A Novel Deep Learning Framework for Diabetic Retinopathy Diagnosis [8.234135343778993]
Current DR detection systems struggle with poor-quality images, lack interpretability, and insufficiently integrate domain-specific knowledge. We introduce a novel framework that integrates three innovative contributions. The framework achieves an accuracy of 92.7%, a precision of 92.5%, a recall of 92.6%, an F1-score of 92.5%, an AUC of 97.8%, a mAP of 0.96, and an MCC of 0.85.
arXiv Detail & Related papers (2025-09-01T02:27:16Z) - GlaBoost: A multimodal Structured Framework for Glaucoma Risk Stratification [4.570357976534648]
GlaBoost integrates structured clinical features, fundus image embeddings, and expert-curated textual descriptions for glaucoma risk prediction. Experiments conducted on a real-world annotated dataset demonstrate that GlaBoost significantly outperforms baseline models.
arXiv Detail & Related papers (2025-08-03T22:02:42Z) - Uncertainty-Driven Expert Control: Enhancing the Reliability of Medical Vision-Language Models [52.2001050216955]
Existing methods aim to enhance the performance of Medical Vision-Language Models (MedVLMs) by adjusting model structure, fine-tuning with high-quality data, or through preference fine-tuning. We propose an expert-in-the-loop framework named Expert-Controlled Classifier-Free Guidance (Expert-CFG) to align MedVLM with clinical expertise without additional training.
arXiv Detail & Related papers (2025-07-12T09:03:30Z) - EyecareGPT: Boosting Comprehensive Ophthalmology Understanding with Tailored Dataset, Benchmark and Model [51.66031028717933]
Medical Large Vision-Language Models (Med-LVLMs) demonstrate significant potential in healthcare.
Currently, intelligent ophthalmic diagnosis faces three major challenges: (i) Data; (ii) Benchmark; and (iii) Model.
We propose the Eyecare Kit, which tackles the aforementioned three key challenges with the tailored dataset, benchmark and model.
arXiv Detail & Related papers (2025-04-18T12:09:15Z) - Enhancing Fundus Image-based Glaucoma Screening via Dynamic Global-Local Feature Integration [26.715346685730484]
We propose a self-adaptive attention window that autonomously determines optimal boundaries for enhanced feature extraction.
We also introduce a multi-head attention mechanism to effectively fuse global and local features via feature linear readout.
Experimental results demonstrate that our method achieves superior accuracy and robustness in glaucoma classification.
arXiv Detail & Related papers (2025-04-01T05:28:14Z) - AI-Driven Approaches for Glaucoma Detection -- A Comprehensive Review [0.09320657506524149]
Computer-Aided Diagnosis (CADx) systems have emerged as promising tools to assist clinicians in accurately diagnosing glaucoma early.
This paper aims to provide a comprehensive overview of AI techniques utilized in CADx systems for glaucoma diagnosis.
arXiv Detail & Related papers (2024-10-21T12:26:53Z) - Spatial-aware Transformer-GRU Framework for Enhanced Glaucoma Diagnosis
from 3D OCT Imaging [1.8416014644193066]
We present a novel deep learning framework that leverages the diagnostic value of 3D Optical Coherence Tomography (OCT) imaging for automated glaucoma detection.
We integrate a pre-trained Vision Transformer on retinal data for rich slice-wise feature extraction and a bidirectional Gated Recurrent Unit for capturing inter-slice spatial dependencies.
Experimental results on a large dataset demonstrate the superior performance of the proposed method over state-of-the-art ones.
arXiv Detail & Related papers (2024-03-08T22:25:15Z) - Automatic diagnosis of knee osteoarthritis severity using Swin
transformer [55.01037422579516]
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic pain and stiffness in the knee joint.
We propose an automated approach that employs the Swin Transformer to predict the severity of KOA.
arXiv Detail & Related papers (2023-07-10T09:49:30Z) - Towards Reliable Medical Image Segmentation by Modeling Evidential Calibrated Uncertainty [57.023423137202485]
Concerns regarding the reliability of medical image segmentation persist among clinicians. We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks. By leveraging subjective logic theory, we explicitly model probability and uncertainty for medical image segmentation.
arXiv Detail & Related papers (2023-01-01T05:02:46Z) - RADNet: Ensemble Model for Robust Glaucoma Classification in Color
Fundus Images [0.0]
Glaucoma is one of the most severe eye diseases, characterized by rapid progression and leading to irreversible blindness.
Regular glaucoma screenings of the population would improve early-stage detection; however, the desirable frequency of ophthalmological checkups is often not feasible.
In our work, we propose an advanced image pre-processing technique combined with an ensemble of deep classification networks.
arXiv Detail & Related papers (2022-05-25T16:48:00Z) - Geometric Deep Learning to Identify the Critical 3D Structural Features
of the Optic Nerve Head for Glaucoma Diagnosis [52.06403518904579]
The optic nerve head (ONH) undergoes complex and deep 3D morphological changes during the development and progression of glaucoma.
We used PointNet and dynamic graph convolutional neural network (DGCNN) to diagnose glaucoma from 3D ONH point clouds.
Our approach may have strong potential to be used in clinical applications for the diagnosis and prognosis of a wide range of ophthalmic disorders.
arXiv Detail & Related papers (2022-04-14T12:52:10Z) - GAMMA Challenge:Glaucoma grAding from Multi-Modality imAges [48.98620387924817]
We set up the Glaucoma grAding from Multi-Modality imAges (GAMMA) Challenge to encourage the development of fundus & OCT-based glaucoma grading.
The primary task of the challenge is to grade glaucoma from both the 2D fundus images and 3D OCT scanning volumes.
We have publicly released a glaucoma annotated dataset with both 2D fundus color photography and 3D OCT volumes, which is the first multi-modality dataset for glaucoma grading.
arXiv Detail & Related papers (2022-02-14T06:54:15Z) - Assessing glaucoma in retinal fundus photographs using Deep Feature
Consistent Variational Autoencoders [63.391402501241195]
Glaucoma is challenging to detect since it remains asymptomatic until symptoms become severe.
Early identification of glaucoma is generally made based on functional, structural, and clinical assessments.
Deep learning methods have partially solved this dilemma by bypassing the marker identification stage and analyzing high-level information directly to classify the data.
arXiv Detail & Related papers (2021-10-04T16:06:49Z) - An Interpretable Multiple-Instance Approach for the Detection of
referable Diabetic Retinopathy from Fundus Images [72.94446225783697]
We propose a machine learning system for the detection of referable Diabetic Retinopathy in fundus images.
By extracting local information from image patches and combining it efficiently through an attention mechanism, our system is able to achieve high classification accuracy.
We evaluate our approach on publicly available retinal image datasets, in which it exhibits near state-of-the-art performance.
arXiv Detail & Related papers (2021-03-02T13:14:15Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
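The K-nearest neighbor smoothing idea can be illustrated generically: average each sample's predicted class distribution with those of its nearest neighbors in feature space. The sketch below is not the paper's KNNS method; `knn_smooth` and the toy data are assumptions for illustration:

```python
import numpy as np

def knn_smooth(features, probs, k=3):
    """Smooth per-sample class probabilities by averaging each sample's
    predicted distribution with those of its k nearest neighbors in
    feature space (Euclidean distance; the sample itself counts as one
    of the k, since its self-distance is zero)."""
    features = np.asarray(features, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # pairwise Euclidean distances between all samples
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return probs[nearest].mean(axis=1)

feats = [[0.0], [0.1], [5.0]]                       # two close samples, one far
preds = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
smoothed = knn_smooth(feats, preds, k=2)
```

Smoothing of this kind pulls a sample's prediction toward those of similar samples, which can suppress isolated, inconsistent predictions.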
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Modeling and Enhancing Low-quality Retinal Fundus Images [167.02325845822276]
Low-quality fundus images increase uncertainty in clinical observation and lead to the risk of misdiagnosis.
We propose a clinically oriented fundus enhancement network (cofe-Net) to suppress global degradation factors.
Experiments on both synthetic and real images demonstrate that our algorithm effectively corrects low-quality fundus images without losing retinal details.
arXiv Detail & Related papers (2020-05-12T08:01:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.