RadFormer: Transformers with Global-Local Attention for Interpretable
and Accurate Gallbladder Cancer Detection
- URL: http://arxiv.org/abs/2211.04793v1
- Date: Wed, 9 Nov 2022 10:40:35 GMT
- Authors: Soumen Basu, Mayank Gupta, Pratyaksha Rana, Pankaj Gupta, Chetan Arora
- Abstract summary: We propose a novel deep neural network architecture for learning interpretable representations for medical image analysis.
Our architecture generates global attention over a region of interest, then learns bag-of-words-style deep feature embeddings with local attention.
Our experiments indicate that the detection accuracy of our model exceeds that of human radiologists.
- Score: 17.694219750908413
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We propose a novel deep neural network architecture to learn interpretable
representations for medical image analysis. Our architecture generates global
attention over a region of interest, then learns bag-of-words-style deep
feature embeddings with local attention. The global and local feature maps are
combined using a contemporary transformer architecture for highly accurate
Gallbladder Cancer (GBC) detection from Ultrasound (USG) images. Our
experiments indicate that the detection accuracy of our model exceeds that of human
radiologists, advocating its use as a second reader for GBC diagnosis. The bag-of-words
embeddings allow our model to be probed for interpretable
explanations of GBC detection consistent with those reported in the medical
literature. We show that the proposed model not only helps explain the decisions
of neural network models but also aids in the discovery of new visual features
relevant to the diagnosis of GBC. Source code and the model will be available at
https://github.com/sbasu276/RadFormer
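The global-local combination described in the abstract can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation: the function names, dimensions, and the simple concatenation step are all hypothetical; it only shows the general idea of a global query attending over local bag-of-words embeddings.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def global_local_fusion(global_feat, local_feats):
    """Attend from a global ROI feature (query) over local
    bag-of-words embeddings (keys/values), then concatenate the
    attended local summary with the global feature.

    global_feat: list[float] of length d
    local_feats: list of list[float], each of length d
    """
    d = len(global_feat)
    # scaled dot-product attention scores of the global query vs. each local embedding
    scores = [dot(global_feat, lf) / math.sqrt(d) for lf in local_feats]
    weights = softmax(scores)
    # weighted sum of local embeddings under the attention distribution
    attended = [sum(w * lf[i] for w, lf in zip(weights, local_feats))
                for i in range(d)]
    return global_feat + attended  # fused 2d-dimensional representation

# toy example: one 4-dim global feature, three local embeddings
g = [1.0, 0.0, 0.5, -0.5]
locals_ = [[0.9, 0.1, 0.4, -0.4],
           [-1.0, 0.2, 0.0, 1.0],
           [0.5, 0.5, 0.5, 0.5]]
fused = global_local_fusion(g, locals_)
print(len(fused))  # 8
```

In the paper itself the fusion is done by a transformer rather than this concatenation, but the attention mechanics are the same in spirit.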
Related papers
- Gaze-directed Vision GNN for Mitigating Shortcut Learning in Medical Image [6.31072075551707]
We propose a novel gaze-directed Vision GNN (called GD-ViG) to leverage the visual patterns of radiologists from gaze as expert knowledge.
The experiments on two public medical image datasets demonstrate that GD-ViG outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2024-06-20T07:16:41Z)
- AWGUNET: Attention-Aided Wavelet Guided U-Net for Nuclei Segmentation in Histopathology Images [26.333686941245197]
We present a segmentation approach that combines the U-Net architecture with a DenseNet-121 backbone.
Our model introduces the Wavelet-guided channel attention module to enhance cell boundary delineation.
The experimental results on two publicly available histopathology datasets, MoNuSeg and TNBC, underscore the superiority of the proposed model.
arXiv Detail & Related papers (2024-06-12T17:10:27Z)
- DeepLOC: Deep Learning-based Bone Pathology Localization and Classification in Wrist X-ray Images [1.45543311565555]
This paper presents a novel approach for bone pathology localization and classification in wrist X-ray images.
The proposed methodology addresses two critical challenges in wrist X-ray analysis: accurate localization of bone pathologies and precise classification of abnormalities.
arXiv Detail & Related papers (2023-08-24T12:06:10Z)
- Medical Image Captioning via Generative Pretrained Transformers [57.308920993032274]
We combine two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records.
The proposed model is tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO.
arXiv Detail & Related papers (2022-09-28T10:27:10Z)
- Focused Decoding Enables 3D Anatomical Detection by Transformers [64.36530874341666]
We propose a novel Detection Transformer for 3D anatomical structure detection, dubbed Focused Decoder.
Focused Decoder leverages information from an anatomical region atlas to simultaneously deploy query anchors and restrict the cross-attention's field of view.
We evaluate the proposed approach on two publicly available CT datasets and demonstrate that Focused Decoder not only provides strong detection results, alleviating the need for vast amounts of annotated data, but also offers highly intuitive explainability via its attention weights.
arXiv Detail & Related papers (2022-07-21T22:17:21Z)
- Radiomics-Guided Global-Local Transformer for Weakly Supervised Pathology Localization in Chest X-Rays [65.88435151891369]
Radiomics-Guided Transformer (RGT) fuses global image information with local knowledge-guided radiomics information.
RGT consists of an image Transformer branch, a radiomics Transformer branch, and fusion layers that aggregate image and radiomic information.
arXiv Detail & Related papers (2022-07-10T06:32:56Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
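To make the DWT-based encoding concrete, here is a minimal sketch of a single level of a 2D Haar discrete wavelet transform in plain Python. This is an illustrative assumption, not the paper's actual encoding pipeline: the paper proposes its own DWT-based identification and encoding method, of which this shows only the underlying sub-band decomposition.

```python
def haar_dwt2(img):
    """One level of a 2D Haar discrete wavelet transform.

    img: 2D list of floats with even height and width.
    Returns (LL, LH, HL, HH) sub-bands, each half the size of img.
    LL holds the low-frequency approximation; LH, HL, and HH hold
    the high-frequency detail content that such methods aim to preserve.
    """
    h, w = len(img), len(img[0])
    LL, LH, HL, HH = ([[0.0] * (w // 2) for _ in range(h // 2)]
                      for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            # one 2x2 block of pixels
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            LL[i // 2][j // 2] = (a + b + c + d) / 2.0  # approximation
            LH[i // 2][j // 2] = (a - b + c - d) / 2.0  # horizontal detail
            HL[i // 2][j // 2] = (a + b - c - d) / 2.0  # vertical detail
            HH[i // 2][j // 2] = (a - b - c + d) / 2.0  # diagonal detail
    return LL, LH, HL, HH
```

For a constant image the three detail bands are exactly zero, which is why edges and fine texture, the diagnostically relevant high-frequency content, show up only in LH, HL, and HH.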
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- Surpassing the Human Accuracy: Detecting Gallbladder Cancer from USG Images with Curriculum Learning [17.694219750908413]
We explore the potential of CNN-based models for gallbladder cancer detection from ultrasound (USG) images.
USG is the most common diagnostic modality for GB diseases due to its low cost and accessibility.
We propose GBCNet to tackle the challenges in our problem.
arXiv Detail & Related papers (2022-04-25T04:43:33Z)
- Explainable multiple abnormality classification of chest CT volumes with AxialNet and HiResCAM [89.2175350956813]
We introduce the challenging new task of explainable multiple abnormality classification in volumetric medical images.
We propose a multiple instance learning convolutional neural network, AxialNet, that allows identification of top slices for each abnormality.
We then aim to improve the model's learning through a novel mask loss that leverages HiResCAM and 3D allowed regions.
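The multiple-instance-learning idea of identifying top slices can be sketched in a few lines of plain Python. This is a hypothetical illustration of max-pooling MIL over per-slice scores, not AxialNet's actual architecture; the function name and the choice of k are assumptions.

```python
def top_slices(slice_scores, k=3):
    """Rank volume slices by per-slice abnormality score.

    Returns the volume-level score (max-pooling MIL: the volume is
    positive if its most abnormal slice is) and the indices of the
    k highest-scoring slices, for inspection by a reader.
    """
    volume_score = max(slice_scores)
    ranked = sorted(range(len(slice_scores)),
                    key=lambda i: slice_scores[i], reverse=True)
    return volume_score, ranked[:k]

# toy example: five slices, abnormality strongest in slice 3
scores = [0.1, 0.7, 0.3, 0.9, 0.2]
vol, top = top_slices(scores, k=2)
print(vol, top)  # 0.9 [3, 1]
```

Exposing the top-slice indices is what makes such a model explainable: a radiologist can check exactly which slices drove the volume-level prediction.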
arXiv Detail & Related papers (2021-11-24T01:14:33Z)
- Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation [107.3538598876467]
We propose an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) to mimic radiologists' working patterns.
ASGK integrates internal visual feature fusion and external medical linguistic information to guide medical knowledge transfer and learning.
arXiv Detail & Related papers (2020-06-06T01:00:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.