XBound-Former: Toward Cross-scale Boundary Modeling in Transformers
- URL: http://arxiv.org/abs/2206.00806v1
- Date: Thu, 2 Jun 2022 00:24:52 GMT
- Title: XBound-Former: Toward Cross-scale Boundary Modeling in Transformers
- Authors: Jiacheng Wang, Fei Chen, Yuxi Ma, Liansheng Wang, Zhaodong Fei,
Jianwei Shuai, Xiangdong Tang, Qichao Zhou, Jing Qin
- Abstract summary: We propose a novel cross-scale boundary-aware transformer, XBound-Former, to address the variation and boundary problems of skin lesion segmentation.
XBound-Former is a purely attention-based network and catches boundary knowledge via three specially designed learners.
We evaluate the model on two skin lesion datasets, ISIC-2016&PH$^2$ and ISIC-2018, where our model consistently outperforms other convolution- and transformer-based models.
- Score: 11.979700468758313
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Skin lesion segmentation from dermoscopy images is of great significance in
the quantitative analysis of skin cancers, which is yet challenging even for
dermatologists due to the inherent issues, i.e., considerable size, shape and
color variation, and ambiguous boundaries. Recent vision transformers have
shown promising performance in handling the variation through global context
modeling. Still, they have not thoroughly solved the problem of ambiguous
boundaries as they ignore the complementary usage of the boundary knowledge and
global contexts. In this paper, we propose a novel cross-scale boundary-aware
transformer, XBound-Former, to simultaneously address the variation
and boundary problems of skin lesion segmentation. XBound-Former is a purely
attention-based network and catches boundary knowledge via three specially
designed learners. We evaluate the model on two skin lesion datasets,
ISIC-2016&PH$^2$ and ISIC-2018, where our model consistently outperforms other
convolution- and transformer-based models, especially on the boundary-wise
metrics. We also extensively verify the model's generalization ability on polyp
lesion segmentation, a task with similar characteristics, where it again yields
significant improvement compared to the latest models.
Related papers
- SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary Guidance [0.559239450391449]
Skin lesion segmentation is a crucial method for identifying early skin cancer.
We propose a hybrid architecture based on Mamba and CNN, called SkinMamba.
It maintains linear complexity while offering powerful long-range dependency modeling and local feature extraction capabilities.
arXiv Detail & Related papers (2024-09-17T05:02:38Z)
- LSSF-Net: Lightweight Segmentation with Self-Awareness, Spatial Attention, and Focal Modulation [8.566930077350184]
We propose a novel lightweight network specifically designed for skin lesion segmentation utilizing mobile devices.
Our network comprises an encoder-decoder architecture that incorporates conformer-based focal modulation attention, self-aware local and global spatial attention, and split channel-shuffle.
Empirical findings substantiate its state-of-the-art performance, notably reflected in a high Jaccard index.
arXiv Detail & Related papers (2024-09-03T03:06:32Z)
- Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports [51.45762396192655]
Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence for computer vision.
This study exhaustively evaluated Gemini, GPT-4, and four other popular large models across 14 medical imaging datasets.
arXiv Detail & Related papers (2024-07-08T09:08:42Z)
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
- Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models [49.95603725998561]
We propose a new paradigm to build robust and interpretable medical image classifiers with natural language concepts.
Specifically, we first query clinical concepts from GPT-4, then transform latent image features into explicit concepts with a vision-language model.
arXiv Detail & Related papers (2023-10-04T21:57:09Z)
- LCAUnet: A skin lesion segmentation network with enhanced edge and body fusion [4.819821513256158]
LCAUnet is proposed to improve the ability of complementary representation with fusion of edge and body features.
Experiments on the publicly available datasets ISIC 2017, ISIC 2018, and PH2 demonstrate that LCAUnet outperforms most state-of-the-art methods.
arXiv Detail & Related papers (2023-05-01T14:05:53Z)
- Masked Pre-Training of Transformers for Histology Image Analysis [4.710921988115685]
In digital pathology, whole slide images (WSIs) are widely used for applications such as cancer diagnosis and prognosis prediction.
Visual transformer models have emerged as a promising method for encoding large regions of WSIs while preserving spatial relationships among patches.
We propose a pretext task for training the transformer model without labeled data to address this problem.
Our model, MaskHIT, uses the transformer output to reconstruct masked patches and learn representative histological features based on their positions and visual features.
arXiv Detail & Related papers (2023-04-14T23:56:49Z)
- Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z)
- Improving Deep Facial Phenotyping for Ultra-rare Disorder Verification Using Model Ensembles [52.77024349608834]
We analyze the influence of replacing a DCNN with a state-of-the-art face recognition approach, iResNet with ArcFace.
Our proposed ensemble model achieves state-of-the-art performance on both seen and unseen disorders.
arXiv Detail & Related papers (2022-11-12T23:28:54Z)
- Boundary-aware Transformers for Skin Lesion Segmentation [19.284634561363184]
We propose a novel boundary-aware transformer (BAT) to address the challenges of automatic skin lesion segmentation.
Specifically, we integrate a new boundary-wise attention gate (BAG) into transformers to enable the whole network to not only effectively model global long-range dependencies via transformers but also, simultaneously, capture more local details by making full use of boundary-wise prior knowledge.
arXiv Detail & Related papers (2021-10-08T02:43:34Z)
- DONet: Dual Objective Networks for Skin Lesion Segmentation [77.9806410198298]
We propose a simple yet effective framework, named Dual Objective Networks (DONet), to improve the skin lesion segmentation.
Our DONet adopts two symmetric decoders to produce different predictions for approaching different objectives.
To address the challenge of the large variety of lesion scales and shapes in dermoscopic images, we additionally propose a recurrent context encoding module (RCEM).
arXiv Detail & Related papers (2020-08-19T06:02:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.