Gaze-directed Vision GNN for Mitigating Shortcut Learning in Medical Image
- URL: http://arxiv.org/abs/2406.14050v1
- Date: Thu, 20 Jun 2024 07:16:41 GMT
- Title: Gaze-directed Vision GNN for Mitigating Shortcut Learning in Medical Image
- Authors: Shaoxuan Wu, Xiao Zhang, Bin Wang, Zhuo Jin, Hansheng Li, Jun Feng,
- Abstract summary: We propose a novel gaze-directed Vision GNN (called GD-ViG) to leverage the visual patterns of radiologists from gaze as expert knowledge.
The experiments on two public medical image datasets demonstrate that GD-ViG outperforms the state-of-the-art methods.
- Score: 6.31072075551707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks have demonstrated remarkable performance in medical image analysis. However, their susceptibility to spurious correlations due to shortcut learning raises concerns about network interpretability and reliability. Furthermore, shortcut learning is exacerbated in medical contexts where disease indicators are often subtle and sparse. In this paper, we propose a novel gaze-directed Vision GNN (called GD-ViG) to leverage the visual patterns of radiologists from gaze as expert knowledge, directing the network toward disease-relevant regions and thereby mitigating shortcut learning. GD-ViG consists of a gaze map generator (GMG) and a gaze-directed classifier (GDC). Combining the global modelling ability of GNNs with the locality of CNNs, GMG generates the gaze map based on radiologists' visual patterns. Notably, it eliminates the need for real gaze data during inference, enhancing the network's practical applicability. Utilizing gaze as the expert knowledge, the GDC directs the construction of graph structures by incorporating both feature distances and gaze distances, enabling the network to focus on disease-relevant foregrounds, thereby avoiding shortcut learning and improving the network's interpretability. Experiments on two public medical image datasets demonstrate that GD-ViG outperforms the state-of-the-art methods and effectively mitigates shortcut learning. Our code is available at https://github.com/SX-SS/GD-ViG.
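The abstract says the GDC builds its graph from both feature distances and gaze distances, but it does not give the formula. As a rough illustration only, the sketch below assumes the two distances are simply added, with a hypothetical weight `alpha` on the gaze term, and then picks the k nearest neighbours per node. The function names, the additive combination, and `alpha` are assumptions made for this example and are not taken from the paper or its repository.

```python
def pairwise_sq_dist(vectors):
    """Squared Euclidean distance between every pair of vectors."""
    n = len(vectors)
    return [
        [sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j])) for j in range(n)]
        for i in range(n)
    ]


def gaze_directed_knn(features, gaze, k, alpha=1.0):
    """Return the k nearest neighbours of each node under a combined distance.

    features: one feature vector per image patch (graph node)
    gaze:     one scalar gaze intensity per patch, e.g. read from a gaze map
    alpha:    hypothetical weight on the gaze term (assumption, not from the paper)
    """
    feat_d = pairwise_sq_dist(features)
    n = len(features)
    neighbours = []
    for i in range(n):
        scored = []
        for j in range(n):
            if j == i:
                continue  # no self-loops in this sketch
            gaze_d = (gaze[i] - gaze[j]) ** 2
            # Assumed combination: feature distance plus weighted gaze distance.
            scored.append((feat_d[i][j] + alpha * gaze_d, j))
        scored.sort()
        neighbours.append([j for _, j in scored[:k]])
    return neighbours


# Toy data: patch 1 is closest to patch 0 in feature space,
# but patch 2 has a gaze intensity much closer to patch 0's.
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.2], [5.0, 5.0]]
gaze = [0.9, 0.1, 0.8, 0.0]

feature_only = gaze_directed_knn(features, gaze, k=1, alpha=0.0)
gaze_directed = gaze_directed_knn(features, gaze, k=1, alpha=1.0)
print(feature_only[0], gaze_directed[0])
```

In this toy setup, turning on the gaze term changes the neighbour chosen for patch 0 from the patch that is nearest in feature space to the one with similar gaze intensity. This is the kind of steering toward gaze-consistent regions that the abstract describes, under the assumptions stated above.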
Related papers
- Compact & Capable: Harnessing Graph Neural Networks and Edge Convolution for Medical Image Classification [0.0]
We introduce a novel model that combines GNNs and edge convolution, leveraging the interconnectedness of RGB channel feature values to strongly represent connections between crucial graph nodes.
Our proposed model performs on par with state-of-the-art Deep Neural Networks (DNNs) but does so with 1000 times fewer parameters, resulting in reduced training time and data requirements.
arXiv Detail & Related papers (2023-07-24T13:39:21Z)
- Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective [53.999128831324576]
Graph neural networks (GNNs) have pioneered advancements in graph representation learning.
This study investigates the role of graph convolution within the context of feature learning theory.
arXiv Detail & Related papers (2023-06-24T10:21:11Z)
- GazeGNN: A Gaze-Guided Graph Neural Network for Chest X-ray Classification [9.266556662553345]
We propose a novel gaze-guided graph neural network (GNN), GazeGNN, to leverage raw eye-gaze data without converting it into visual attention maps (VAMs).
We develop a real-time, real-world, end-to-end disease classification algorithm for the first time in the literature.
arXiv Detail & Related papers (2023-05-29T17:01:54Z)
- Simple and Efficient Heterogeneous Graph Neural Network [55.56564522532328]
Heterogeneous graph neural networks (HGNNs) have a powerful capability to embed the rich structural and semantic information of a heterogeneous graph into node representations.
Existing HGNNs inherit many mechanisms from graph neural networks (GNNs) over homogeneous graphs, especially the attention mechanism and the multi-layer structure.
This paper conducts an in-depth and detailed study of these mechanisms and proposes the Simple and Efficient Heterogeneous Graph Neural Network (SeHGNN).
arXiv Detail & Related papers (2022-07-06T10:01:46Z)
- Rectify ViT Shortcut Learning by Visual Saliency [40.55418820114868]
Shortcut learning is common but harmful to deep learning models.
In this work, we propose a novel and effective saliency-guided vision transformer (SGT) model to rectify shortcut learning.
arXiv Detail & Related papers (2022-06-17T05:54:07Z)
- Automatic Relation-aware Graph Network Proliferation [182.30735195376792]
We propose Automatic Relation-aware Graph Network Proliferation (ARGNP) for efficiently searching GNNs.
These operations can extract hierarchical node/relational information and provide anisotropic guidance for message passing on a graph.
Experiments on six datasets for four graph learning tasks demonstrate that GNNs produced by our method are superior to the current state-of-the-art hand-crafted and search-based GNNs.
arXiv Detail & Related papers (2022-05-31T10:38:04Z)
- Eye-gaze-guided Vision Transformer for Rectifying Shortcut Learning [42.674679049746175]
We propose to infuse human experts' intelligence and domain knowledge into the training of deep neural networks.
We propose a novel eye-gaze-guided vision transformer (EG-ViT) for diagnosis with limited medical image data.
arXiv Detail & Related papers (2022-05-25T03:29:10Z)
- Follow My Eye: Using Gaze to Supervise Computer-Aided Diagnosis [54.60796004113496]
We demonstrate that the eye movement of radiologists reading medical images can be a new form of supervision to train the DNN-based computer-aided diagnosis (CAD) system.
We record the tracks of the radiologists' gaze when they are reading images.
The gaze information is processed and then used to supervise the DNN's attention via an Attention Consistency module.
arXiv Detail & Related papers (2022-04-06T08:31:05Z)
- Gaze-Guided Class Activation Mapping: Leveraging Human Attention for Network Attention in Chest X-rays Classification [3.8637285238278434]
This paper describes a gaze-guided class activation mapping (GG-CAM) method to directly regulate the formation of network attention.
GG-CAM is a lightweight (3 additional trainable parameters for regulating the learning process) and generic extension that can be easily applied to most classification convolutional neural networks (CNNs).
Comparative experiments suggest that two standard CNNs with the GG-CAM extension achieve significantly greater classification performance.
arXiv Detail & Related papers (2022-02-15T00:33:23Z)
- Node Masking: Making Graph Neural Networks Generalize and Scale Better [71.51292866945471]
Graph Neural Networks (GNNs) have received a lot of interest in recent times.
In this paper, we utilize some theoretical tools to better visualize the operations performed by state-of-the-art spatial GNNs.
We introduce a simple concept, Node Masking, that allows them to generalize and scale better.
arXiv Detail & Related papers (2020-01-17T06:26:40Z)
- Understanding Graph Isomorphism Network for rs-fMRI Functional Connectivity Analysis [49.05541693243502]
We develop a framework for analyzing fMRI data using the Graph Isomorphism Network (GIN).
One of the important contributions of this paper is the observation that the GIN is a dual representation of a convolutional neural network (CNN) in the graph space.
We exploit CNN-based saliency map techniques for the GNN, which we tailor to the proposed GIN with one-hot encoding.
arXiv Detail & Related papers (2020-01-10T23:40:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.