CheXFusion: Effective Fusion of Multi-View Features using Transformers
for Long-Tailed Chest X-Ray Classification
- URL: http://arxiv.org/abs/2308.03968v1
- Date: Tue, 8 Aug 2023 00:46:01 GMT
- Title: CheXFusion: Effective Fusion of Multi-View Features using Transformers
for Long-Tailed Chest X-Ray Classification
- Authors: Dongkyun Kim
- Abstract summary: This paper introduces our solution to the ICCV CVAMD 2023 Shared Task on CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays.
Our approach introduces CheXFusion, a transformer-based fusion module incorporating multi-view images.
Our solution achieves state-of-the-art results with 0.372 mAP in the MIMIC-CXR test set, securing 1st place in the competition.
- Score: 4.708378681950648
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical image classification poses unique challenges due to the long-tailed
distribution of diseases, the co-occurrence of diagnostic findings, and the
multiple views available for each study or patient. This paper introduces our
solution to the ICCV CVAMD 2023 Shared Task on CXR-LT: Multi-Label Long-Tailed
Classification on Chest X-Rays. Our approach introduces CheXFusion, a
transformer-based fusion module incorporating multi-view images. The fusion
module, guided by self-attention and cross-attention mechanisms, efficiently
aggregates multi-view features while considering label co-occurrence.
Furthermore, we explore data balancing and self-training methods to optimize
the model's performance. Our solution achieves state-of-the-art results with
0.372 mAP in the MIMIC-CXR test set, securing 1st place in the competition. Our
success in the task underscores the significance of considering multi-view
settings, class imbalance, and label co-occurrence in medical image
classification. Public code is available at
https://github.com/dongkyuk/CXR-LT-public-solution
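As a rough illustration of the attention-based aggregation the abstract describes, the sketch below pools a variable number of per-view feature vectors into one vector with single-head scaled dot-product cross-attention. It is not the author's implementation: the function names, the toy feature values, and the notion of a "label query" vector are assumptions made for illustration, and the learned projections, multiple heads, and self-attention step of a real transformer module are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_pool(query, view_features):
    """Fuse per-view feature vectors into one vector for a single query.

    query: list of d floats (hypothetical label query, assumed given)
    view_features: one d-dimensional list of floats per view
    Returns (fused_vector, attention_weights).
    """
    d = len(query)
    # Scaled dot-product score between the query and each view.
    scores = [
        sum(q * k for q, k in zip(query, feat)) / math.sqrt(d)
        for feat in view_features
    ]
    weights = softmax(scores)
    # Weighted sum of the view features (keys double as values here).
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, view_features))
        for i in range(d)
    ]
    return fused, weights


# Toy study with two views (for example, one frontal and one lateral image).
views = [[1.0, 0.0], [0.0, 1.0]]
label_query = [2.0, 0.0]
fused, weights = cross_attention_pool(label_query, views)
```

Because the weights come from a softmax over however many views are present, the same function handles studies with one, two, or more images; the view whose features align best with the query receives the largest weight.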
Related papers
- LTCXNet: Advancing Chest X-Ray Analysis with Solutions for Long-Tailed Multi-Label Classification and Fairness Challenges [4.351007758390175]
The Pruned MIMIC-CXR-LT dataset is designed to represent a long-tailed, multi-label data scenario.
We introduce LTCXNet, a novel framework that integrates the ConvNeXt model, ML-Decoder, and strategic data augmentation.
arXiv Detail & Related papers (2024-11-16T08:59:20Z)
- Random Token Fusion for Multi-View Medical Diagnosis [2.3458652461211935]
In multi-view medical datasets, deep learning models often fuse information from different imaging perspectives to improve diagnosis performance.
Existing approaches are prone to overfitting and rely heavily on view-specific features, which can lead to trivial solutions.
In this work, we introduce a novel technique designed to enhance image analysis using multi-view medical transformers.
arXiv Detail & Related papers (2024-10-21T10:19:45Z)
- MultiFusionNet: Multilayer Multimodal Fusion of Deep Neural Networks for
Chest X-Ray Image Classification [16.479941416339265]
Automated systems utilizing convolutional neural networks (CNNs) have shown promise in improving the accuracy and efficiency of chest X-ray image classification.
We propose a novel deep learning-based multilayer multimodal fusion model that emphasizes extracting features from different layers and fusing them.
The proposed model achieves significantly higher accuracies of 97.21% and 99.60% for three-class and two-class classification, respectively.
arXiv Detail & Related papers (2024-01-01T11:50:01Z)
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image Segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
- C^2M-DoT: Cross-modal consistent multi-view medical report generation
with domain transfer network [67.97926983664676]
We propose a cross-modal consistent multi-view medical report generation method with a domain transfer network (C2M-DoT).
C2M-DoT substantially outperforms state-of-the-art baselines in all metrics.
arXiv Detail & Related papers (2023-10-09T02:31:36Z)
- Multi-Scale Feature Fusion using Parallel-Attention Block for COVID-19
Chest X-ray Diagnosis [2.15242029196761]
Under the global COVID-19 crisis, accurate diagnosis of COVID-19 from Chest X-ray (CXR) images is critical.
We propose a novel multi-feature fusion network using parallel attention blocks to fuse the original CXR images and local-phase feature-enhanced CXR images at multiple scales.
arXiv Detail & Related papers (2023-04-25T16:56:12Z)
- Stain-invariant self supervised learning for histopathology image
analysis [74.98663573628743]
We present a self-supervised algorithm for several classification tasks within hematoxylin and eosin stained images of breast cancer.
Our method achieves state-of-the-art performance on several publicly available breast cancer datasets.
arXiv Detail & Related papers (2022-11-14T18:16:36Z)
- TransFusion: Multi-view Divergent Fusion for Medical Image Segmentation
with Transformers [8.139069987207494]
We present TransFusion, a Transformer-based architecture to merge divergent multi-view imaging information using convolutional layers and powerful attention mechanisms.
In particular, the Divergent Fusion Attention (DiFA) module is proposed for rich cross-view context modeling and semantic dependency mining.
arXiv Detail & Related papers (2022-03-21T04:02:54Z)
- Improving Classification Model Performance on Chest X-Rays through Lung
Segmentation [63.45024974079371]
We propose a deep learning approach to enhance abnormal chest x-ray (CXR) identification performance through segmentations.
Our approach is designed in a cascaded manner and incorporates two modules: a deep neural network with criss-cross attention modules (XLSor) for localizing the lung region in CXR images, and a CXR classification model whose backbone is a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR data sets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
- Multi-label Thoracic Disease Image Classification with Cross-Attention
Networks [65.37531731899837]
We propose a novel scheme of Cross-Attention Networks (CAN) for automated thoracic disease classification from chest x-ray images.
We also design a new loss function that goes beyond cross-entropy loss to aid the cross-attention process and to overcome both the imbalance between classes and the easy-dominated samples within each class.
arXiv Detail & Related papers (2020-07-21T14:37:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.