Related papers: Guided Context Gating: Learning to leverage salient lesions in retinal fundus images

Guided Context Gating: Learning to leverage salient lesions in retinal fundus images

URL: http://arxiv.org/abs/2406.13126v1
Date: Wed, 19 Jun 2024 00:42:35 GMT
Title: Guided Context Gating: Learning to leverage salient lesions in retinal fundus images
Authors: Teja Krishna Cherukuri, Nagur Shareef Shaik, Dong Hye Ye,
Abstract summary: We propose a novel attention mechanism called Guided Context Gating. It integrates Context Formulation, Channel Correlation, and Guided Gating to learn global context, spatial correlations, and localized lesion context. Experiments on the Zenodo-DR-7 dataset reveal a substantial 2.63% accuracy boost over advanced attention mechanisms.
Score: 1.8789068567093286
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Effectively representing medical images, especially retinal images, presents a considerable challenge due to variations in appearance, size, and contextual information of pathological signs called lesions. Precise discrimination of these lesions is crucial for diagnosing vision-threatening issues such as diabetic retinopathy. While visual attention-based neural networks have been introduced to learn spatial context and channel correlations from retinal images, they often fall short in capturing localized lesion context. Addressing this limitation, we propose a novel attention mechanism called Guided Context Gating, an unique approach that integrates Context Formulation, Channel Correlation, and Guided Gating to learn global context, spatial correlations, and localized lesion context. Our qualitative evaluation against existing attention mechanisms emphasize the superiority of Guided Context Gating in terms of explainability. Notably, experiments on the Zenodo-DR-7 dataset reveal a substantial 2.63% accuracy boost over advanced attention mechanisms & an impressive 6.53% improvement over the state-of-the-art Vision Transformer for assessing the severity grade of retinopathy, even with imbalanced and limited training samples for each class.

Related papers

Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding [50.483761005446]
Current models struggle to associate textual descriptions with disease regions due to inefficient attention mechanisms and a lack of fine-grained token representations.<n>We introduce Disease-Aware Prompting (DAP), which uses the explainability map of a VLM to identify the appropriate image features.<n>DAP improves visual grounding accuracy by 20.74% compared to state-of-the-art methods across three major chest X-ray datasets.
arXiv Detail & Related papers (2025-05-21T05:16:45Z)
From Gaze to Insight: Bridging Human Visual Attention and Vision Language Model Explanation for Weakly-Supervised Medical Image Segmentation [46.99748372216857]
Vision-language models (VLMs) provide semantic context through textual descriptions but lack explanation precision required. We propose a teacher-student framework that integrates both gaze and language supervision, leveraging their complementary strengths. Our method achieves Dice scores of 80.78%, 80.53%, and 84.22%, respectively, improving 3-5% over gaze baselines without increasing the annotation burden.
arXiv Detail & Related papers (2025-04-15T16:32:15Z)
GCS-M3VLT: Guided Context Self-Attention based Multi-modal Medical Vision Language Transformer for Retinal Image Captioning [3.5948668755510136]
We propose a novel vision-language model for retinal image captioning that combines visual and textual features through a guided context self-attention mechanism. Experiments on the DeepEyeNet dataset demonstrate a 0.023 BLEU@4 improvement, along with significant qualitative advancements.
arXiv Detail & Related papers (2024-12-23T03:49:29Z)
Unleashing the Potential of Vision-Language Pre-Training for 3D Zero-Shot Lesion Segmentation via Mask-Attribute Alignment [11.726600999078755]
Malenia is a novel multi-scale lesion-level mask-attribute alignment framework. It is specifically designed for 3D zero-shot lesion segmentation.
arXiv Detail & Related papers (2024-10-21T08:01:58Z)
Knowledge-enhanced Visual-Language Pretraining for Computational Pathology [68.6831438330526]
We consider the problem of visual representation learning for computational pathology, by exploiting large-scale image-text pairs gathered from public resources. We curate a pathology knowledge tree that consists of 50,470 informative attributes for 4,718 diseases requiring pathology diagnosis from 32 human tissues.
arXiv Detail & Related papers (2024-04-15T17:11:25Z)
Foveated Retinotopy Improves Classification and Localization in CNNs [0.0]
We show how incorporating foveated retinotopy may benefit deep convolutional neural networks (CNNs) in image classification tasks. Our findings suggest that foveated retinotopic mapping encodes implicit knowledge about visual object geometry.
arXiv Detail & Related papers (2024-02-23T18:15:37Z)
MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning [48.97640824497327]
We propose a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning. Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with expert knowledge. Notably, MLIP surpasses state-of-the-art methods even with limited annotated data, highlighting the potential of multimodal pre-training in advancing medical representation learning.
arXiv Detail & Related papers (2024-02-03T05:48:50Z)
PALM: Open Fundus Photograph Dataset with Pathologic Myopia Recognition and Anatomical Structure Annotation [41.80715213373843]
Pathologic myopia (PM) is a common myopic retinal degeneration suffered by highly blinding population. This paper provides insights about PALM, our open fundus imaging dataset for pathological recognition and anatomical structure annotation.
arXiv Detail & Related papers (2023-05-13T02:00:06Z)
Incremental Cross-Domain Adaptation for Robust Retinopathy Screening via Bayesian Deep Learning [7.535751594024775]
Retinopathy represents a group of retinal diseases that, if not treated timely, can cause severe visual impairments or even blindness. This paper presents a novel incremental cross-domain adaptation instrument that allows any deep classification model to progressively learn abnormal retinal pathologies. The proposed framework, evaluated on six public datasets, outperforms the state-of-the-art competitors by achieving an overall accuracy and F1 score of 0.9826 and 0.9846, respectively.
arXiv Detail & Related papers (2021-10-18T13:45:21Z)
An Interpretable Multiple-Instance Approach for the Detection of referable Diabetic Retinopathy from Fundus Images [72.94446225783697]
We propose a machine learning system for the detection of referable Diabetic Retinopathy in fundus images. By extracting local information from image patches and combining it efficiently through an attention mechanism, our system is able to achieve high classification accuracy. We evaluate our approach on publicly available retinal image datasets, in which it exhibits near state-of-the-art performance.
arXiv Detail & Related papers (2021-03-02T13:14:15Z)
Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called it Proactive Pseudo-Intervention (PPI) PPI leverages proactive interventions to guard against image features with no causal relevance. We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
Visualization for Histopathology Images using Graph Convolutional Neural Networks [1.8939984161954087]
We adopt an approach to model histology tissue as a graph of nuclei and develop a graph convolutional network framework for disease diagnosis. Our visualization of such networks trained to distinguish between invasive and in-situ breast cancers, and Gleason 3 and 4 prostate cancers generate interpretable visual maps.
arXiv Detail & Related papers (2020-06-16T19:14:19Z)
Improving Robustness using Joint Attention Network For Detecting Retinal Degeneration From Optical Coherence Tomography Images [0.0]
We propose the use of disease-specific feature representation as a novel architecture comprised of two joint networks. Our experimental results on publicly available datasets show the proposed joint-network significantly improves the accuracy and robustness of state-of-the-art retinal disease classification networks on unseen datasets.
arXiv Detail & Related papers (2020-05-16T20:32:49Z)
Modeling and Enhancing Low-quality Retinal Fundus Images [167.02325845822276]
Low-quality fundus images increase uncertainty in clinical observation and lead to the risk of misdiagnosis. We propose a clinically oriented fundus enhancement network (cofe-Net) to suppress global degradation factors. Experiments on both synthetic and real images demonstrate that our algorithm effectively corrects low-quality fundus images without losing retinal details.
arXiv Detail & Related papers (2020-05-12T08:01:16Z)
Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights. It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness. In recent years, there has been a significant effort to automate the diagnosis using deep learning. This paper builds upon the success of previous models and develops a novel architecture, which combines object segmentation and convolutional neural networks (CNN) Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.