DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention
- URL: http://arxiv.org/abs/2407.20843v2
- Date: Thu, 1 Aug 2024 07:43:11 GMT
- Title: DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention
- Authors: Wei Wang, Jixing He, Xin Wang
- Abstract summary: We propose DFE-IANet, a novel network based on both spectral transformation and feature interaction.
With only 4M parameters, DFE-IANet outperforms recent and classical networks in efficiency.
DFE-IANet achieves state-of-the-art (SOTA) results on the challenging Kvasir dataset, with a remarkable Top-1 accuracy of 93.94%.
- Score: 8.199299549735343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Early detection and treatment of polyps in the gastrointestinal tract helps prevent colorectal cancer. However, few studies to date have designed polyp image classification networks that balance efficiency and accuracy. This challenge mainly stems from the fact that polyps resemble other pathologies and have complex features influenced by texture, color, and morphology. In this paper, we propose DFE-IANet, a novel network based on both spectral transformation and feature interaction. Firstly, to capture detailed and multi-scale features, the multi-scale frequency domain feature extraction (MSFD) block transforms features into the frequency domain and extracts texture details at a fine-grained level. Secondly, the multi-scale interaction attention (MSIA) block is designed to enhance the network's capability of extracting critical features: it introduces multi-scale features into self-attention, adaptively guiding the network to concentrate on vital regions. Finally, with only 4M parameters, DFE-IANet outperforms recent and classical networks in efficiency. Furthermore, DFE-IANet achieves state-of-the-art (SOTA) results on the challenging Kvasir dataset, with a remarkable Top-1 accuracy of 93.94%. This accuracy surpasses ViT by 8.94%, ResNet50 by 1.69%, and VMamba by 1.88%. Our code is publicly available at https://github.com/PURSUETHESUN/DFE-IANet.
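The frequency-domain texture extraction described above can be illustrated with a minimal sketch: pool a feature map at several scales and take the 2D FFT magnitude of each pooled map. The function name, scales, and pooling choice are illustrative assumptions, not the paper's exact MSFD block.

```python
import numpy as np

def msfd_sketch(feat, scales=(1, 2, 4)):
    """Illustrative multi-scale frequency-domain feature extraction.

    For each scale, average-pool the 2D feature map by that factor,
    then take the magnitude of its 2D FFT as a texture descriptor.
    This is a sketch of the general idea, not the published block.
    """
    outputs = []
    for s in scales:
        h = feat.shape[0] // s * s
        w = feat.shape[1] // s * s
        # average-pool by factor s
        pooled = feat[:h, :w].reshape(h // s, s, w // s, s).mean(axis=(1, 3))
        # FFT magnitude captures texture energy per spatial frequency
        outputs.append(np.abs(np.fft.fft2(pooled)))
    return outputs

# a 32x32 feature map yields spectra of 32x32, 16x16, and 8x8
spectra = msfd_sketch(np.random.rand(32, 32))
```

High-frequency bins of these spectra respond to fine texture, which is the kind of fine-grained detail the abstract attributes to the frequency-domain branch.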
Related papers
- Spatial-Frequency Dual Progressive Attention Network For Medical Image Segmentation [11.60636221012585]
In medical images, various types of lesions often manifest significant differences in their shape and texture.
Accurate medical image segmentation demands deep learning models with robust capabilities in multi-scale and boundary feature learning.
We introduce SF-UNet, a spatial-frequency dual-domain attention network.
It comprises two main components: the Multi-scale Progressive Channel Attention (MPCA) block, which progressively extracts multi-scale features across adjacent encoder layers, and the lightweight Frequency-Spatial Attention (FSA) block, which has only 0.05M parameters.
arXiv Detail & Related papers (2024-06-12T07:22:05Z)
- Multi-scale Quaternion CNN and BiGRU with Cross Self-attention Feature Fusion for Fault Diagnosis of Bearing [5.3598912592106345]
Deep learning has led to significant advances in bearing fault diagnosis (FD).
We propose a novel FD model integrating a multiscale quaternion convolutional neural network (MQCNN), a bidirectional gated recurrent unit (BiGRU), and cross self-attention feature fusion (CSAFF).
arXiv Detail & Related papers (2024-05-25T07:55:02Z)
- A Lightweight Attention-based Deep Network via Multi-Scale Feature Fusion for Multi-View Facial Expression Recognition [2.9581436761331017]
We introduce a lightweight attentional network incorporating multi-scale feature fusion (LANMSFF) to tackle these issues.
We present two novel components, namely mass attention (MassAtt) and point wise feature selection (PWFS) blocks.
Our proposed approach achieved results comparable to state-of-the-art methods in terms of parameter counts and robustness to pose variation.
arXiv Detail & Related papers (2024-03-21T11:40:51Z)
- ECC-PolypDet: Enhanced CenterNet with Contrastive Learning for Automatic Polyp Detection [88.4359020192429]
Existing methods either involve computationally expensive context aggregation or lack prior modeling of polyps, resulting in poor performance in challenging cases.
In this paper, we propose the Enhanced CenterNet with Contrastive Learning (ECC-PolypDet), a two-stage training & end-to-end inference framework.
We apply Box-assisted Contrastive Learning (BCL) during training to minimize the intra-class difference and maximize the inter-class difference between foreground polyps and backgrounds, enabling our model to capture concealed polyps.
In the fine-tuning stage, we introduce IoU-guided Sample Re-weighting.
arXiv Detail & Related papers (2024-01-10T07:03:41Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- M3FPolypSegNet: Segmentation Network with Multi-frequency Feature Fusion for Polyp Localization in Colonoscopy Images [1.389360509566256]
Multi-Frequency Feature Fusion Polyp Network (M3FPolypSegNet) was proposed to decompose the input image into low/high/full-frequency components.
We used three independent multi-frequency encoders to map multiple input images into a high-dimensional feature space.
We designed multi-task learning with three tasks (i.e., region, edge, and distance) in four decoder blocks to learn the structural characteristics of the region.
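The low/high/full-frequency decomposition described above can be sketched with a centered radial mask in the 2D FFT domain. The cutoff radius and mask shape here are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def frequency_decompose(img, cutoff=0.25):
    """Split a 2D image into low-, high-, and full-frequency components.

    A centered circular mask in the shifted FFT domain keeps frequencies
    below the cutoff (low) or above it (high); the full component is the
    original image. The radial cutoff is an illustrative assumption.
    """
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    low_mask = radius <= cutoff * min(h, w)
    low = np.fft.ifft2(np.fft.ifftshift(f * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(f * ~low_mask)).real
    return low, high, img

low, high, full = frequency_decompose(np.random.rand(64, 64))
```

Because the two masks partition the spectrum and the inverse FFT is linear, the low and high components sum back to the original image, so no information is lost by feeding each component to a separate encoder.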
arXiv Detail & Related papers (2023-10-09T09:01:53Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- Lesion-aware Dynamic Kernel for Polyp Segmentation [49.63274623103663]
We propose a lesion-aware dynamic network (LDNet) for polyp segmentation.
It adopts a traditional U-shaped encoder-decoder structure combined with a dynamic kernel generation and updating scheme.
This simple but effective scheme endows our model with powerful segmentation performance and generalization capability.
arXiv Detail & Related papers (2023-01-12T09:53:57Z)
- Affinity Feature Strengthening for Accurate, Complete and Robust Vessel Segmentation [48.638327652506284]
Vessel segmentation is crucial in many medical image applications, such as detecting coronary stenoses, retinal vessel diseases and brain aneurysms.
We present a novel approach, the affinity feature strengthening network (AFN), which jointly models geometry and refines pixel-wise segmentation features using a contrast-insensitive, multiscale affinity approach.
arXiv Detail & Related papers (2022-11-12T05:39:17Z)
- Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers [124.01928050651466]
We propose a new type of polyp segmentation method, named Polyp-PVT.
The proposed model effectively suppresses noise in the features and significantly improves their expressive capabilities.
arXiv Detail & Related papers (2021-08-16T07:09:06Z)
- FCCDN: Feature Constraint Network for VHR Image Change Detection [12.670734830806591]
We propose a feature constraint change detection network (FCCDN) for change detection.
We constrain features both on bi-temporal feature extraction and feature fusion.
We achieve state-of-the-art performance on two building change detection datasets.
arXiv Detail & Related papers (2021-05-23T06:13:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.