Related papers: DFEN: Dual Feature Equalization Network for Medical Image Segmentation

DFEN: Dual Feature Equalization Network for Medical Image Segmentation

URL: http://arxiv.org/abs/2505.05913v1
Date: Fri, 09 May 2025 09:38:43 GMT
Title: DFEN: Dual Feature Equalization Network for Medical Image Segmentation
Authors: Jianjian Yin, Yi Chen, Chengyu Li, Zhichao Zheng, Yanhui Gu, Junsheng Zhou,
Abstract summary: We propose a dual feature equalization network based on the hybrid architecture of Swin Transformer and Convolutional Neural Network.<n>Swin Transformer is utilized as both the encoder and decoder, thereby bolstering the ability of the model to capture long-range dependencies and spatial correlations.
Score: 9.091452460153672
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Current methods for medical image segmentation primarily focus on extracting contextual feature information from the perspective of the whole image. While these methods have shown effective performance, none of them take into account the fact that pixels at the boundary and regions with a low number of class pixels capture more contextual feature information from other classes, leading to misclassification of pixels by unequal contextual feature information. In this paper, we propose a dual feature equalization network based on the hybrid architecture of Swin Transformer and Convolutional Neural Network, aiming to augment the pixel feature representations by image-level equalization feature information and class-level equalization feature information. Firstly, the image-level feature equalization module is designed to equalize the contextual information of pixels within the image. Secondly, we aggregate regions of the same class to equalize the pixel feature representations of the corresponding class by class-level feature equalization module. Finally, the pixel feature representations are enhanced by learning weights for image-level equalization feature information and class-level equalization feature information. In addition, Swin Transformer is utilized as both the encoder and decoder, thereby bolstering the ability of the model to capture long-range dependencies and spatial correlations. We conducted extensive experiments on Breast Ultrasound Images (BUSI), International Skin Imaging Collaboration (ISIC2017), Automated Cardiac Diagnosis Challenge (ACDC) and PH$^2$ datasets. The experimental results demonstrate that our method have achieved state-of-the-art performance. Our code is publicly available at https://github.com/JianJianYin/DFEN.

Related papers

Learning Invariant Inter-pixel Correlations for Superpixel Generation [12.605604620139497]
Learnable features exhibit constrained discriminative capability, resulting in unsatisfactory pixel grouping performance. We propose the Content Disentangle Superpixel algorithm to selectively separate the invariant inter-pixel correlations and statistical properties. The experimental results on four benchmark datasets demonstrate the superiority of our approach to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-02-28T09:46:56Z)
Using a Conditional Generative Adversarial Network to Control the Statistical Characteristics of Generated Images for IACT Data Analysis [55.41644538483948]
We divide images into several classes according to the value of some property of the image, and then specify the required class when generating new images. In the case of images from Imaging Atmospheric Cherenkov Telescopes (IACTs), an important property is the total brightness of all image pixels (image size) We used a cGAN technique to generate images similar to whose obtained in the TAIGA-IACT experiment.
arXiv Detail & Related papers (2022-11-28T22:30:33Z)
Class Balanced PixelNet for Neurological Image Segmentation [20.56747443955369]
We propose an automatic brain tumor segmentation approach (e.g., PixelNet) using a pixel-level convolutional neural network (CNN) The proposed model has achieved promising results in brain tumor and ischemic stroke segmentation datasets.
arXiv Detail & Related papers (2022-04-23T10:57:54Z)
Multi-level Second-order Few-shot Learning [111.0648869396828]
We propose a Multi-level Second-order (MlSo) few-shot learning network for supervised or unsupervised few-shot image classification and few-shot action recognition. We leverage so-called power-normalized second-order base learner streams combined with features that express multiple levels of visual abstraction. We demonstrate respectable results on standard datasets such as Omniglot, mini-ImageNet, tiered-ImageNet, Open MIC, fine-grained datasets such as CUB Birds, Stanford Dogs and Cars, and action recognition datasets such as HMDB51, UCF101, and mini-MIT.
arXiv Detail & Related papers (2022-01-15T19:49:00Z)
CRIS: CLIP-Driven Referring Image Segmentation [71.56466057776086]
We propose an end-to-end CLIP-Driven Referring Image framework (CRIS) CRIS resorts to vision-language decoding and contrastive learning for achieving the text-to-pixel alignment. Our proposed framework significantly outperforms the state-of-the-art performance without any post-processing.
arXiv Detail & Related papers (2021-11-30T07:29:08Z)
InfoSeg: Unsupervised Semantic Image Segmentation with Mutual Information Maximization [0.0]
We propose a novel method for unsupervised image representation based on mutual information between local and global high-level image features. In the first step, we segment images based on local and global features. In the second step, we maximize the mutual information between local features and high-level features of their respective class.
arXiv Detail & Related papers (2021-10-07T14:01:42Z)
ISNet: Integrate Image-Level and Semantic-Level Context for Semantic Segmentation [64.56511597220837]
Co-occurrent visual pattern makes aggregating contextual information a common paradigm to enhance the pixel representation for semantic image segmentation. Existing approaches focus on modeling the context from the perspective of the whole image, i.e., aggregating the image-level contextual information. This paper proposes to augment the pixel representations by aggregating the image-level and semantic-level contextual information.
arXiv Detail & Related papers (2021-08-27T16:38:22Z)
Mining Contextual Information Beyond Image for Semantic Segmentation [37.783233906684444]
The paper studies the context aggregation problem in semantic image segmentation. It proposes to mine the contextual information beyond individual images to further augment the pixel representations. The proposed method could be effortlessly incorporated into existing segmentation frameworks.
arXiv Detail & Related papers (2021-08-26T14:34:23Z)
Inter-Image Communication for Weakly Supervised Localization [77.2171924626778]
Weakly supervised localization aims at finding target object regions using only image-level supervision. We propose to leverage pixel-level similarities across different objects for learning more accurate object locations. Our method achieves the Top-1 localization error rate of 45.17% on the ILSVRC validation set.
arXiv Detail & Related papers (2020-08-12T04:14:11Z)
Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics [60.92229707497999]
We introduce a novel principle for self-supervised feature learning based on the discrimination of specific transformations of an image. We demonstrate experimentally that learning to discriminate transformations such as LCI, image warping and rotations, yields features with state of the art generalization capabilities.
arXiv Detail & Related papers (2020-04-05T22:09:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.