ColonFormer: An Efficient Transformer based Method for Colon Polyp
Segmentation
- URL: http://arxiv.org/abs/2205.08473v1
- Date: Tue, 17 May 2022 16:34:04 GMT
- Title: ColonFormer: An Efficient Transformer based Method for Colon Polyp
Segmentation
- Authors: Nguyen Thanh Duc, Nguyen Thi Oanh, Nguyen Thi Thuy, Tran Minh Triet,
Dinh Viet Sang
- Abstract summary: ColonFormer is an encoder-decoder architecture with the capability of modeling long-range semantic information.
Our ColonFormer achieves state-of-the-art performance on all benchmark datasets.
- Score: 1.181206257787103
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Identifying polyps is a challenging problem for automatic analysis of
endoscopic images in computer-aided clinical support systems. Models based on
convolutional networks (CNN), transformers, and combinations of them have been
proposed to segment polyps with promising results. However, those approaches
either model only the local appearance of the polyps or lack multi-level
features for spatial dependency in the decoding process.
This paper proposes a novel network, namely ColonFormer, to address these
limitations. ColonFormer is an encoder-decoder architecture with the capability
of modeling long-range semantic information at both encoder and decoder
branches. The encoder is a lightweight architecture based on transformers for
modeling global semantic relations at multi scales. The decoder is a
hierarchical network structure designed for learning multi-level features to
enrich feature representation. Besides, a refinement module is added with a new
skip connection technique to refine the boundary of polyp objects in the global
map for accurate segmentation. Extensive experiments have been conducted on
five popular benchmark datasets for polyp segmentation, including Kvasir,
CVC-ClinicDB, CVC-ColonDB, EndoScene, and ETIS. Experimental results show that
our ColonFormer achieves state-of-the-art performance on all benchmark datasets.
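The abstract describes an encoder that produces features at multiple scales and a decoder that combines multi-level features. The general idea can be sketched as follows. This is only an illustration in NumPy and not ColonFormer's actual modules: average pooling stands in for the transformer encoder stages, the strides (4, 8, 16, 32) are an assumption, and the function names are hypothetical.

```python
import numpy as np

def avg_pool(x, k):
    """Downsample a (H, W, C) map by non-overlapping k x k average pooling."""
    h, w, c = x.shape
    return x.reshape(h // k, k, w // k, k, c).mean(axis=(1, 3))

def upsample_nearest(x, k):
    """Upsample a (H, W, C) map by an integer factor k (nearest neighbour)."""
    return x.repeat(k, axis=0).repeat(k, axis=1)

def fuse_multi_level(image, strides=(4, 8, 16, 32)):
    """Build one map per stride, align all to the finest stride, concatenate."""
    # Stand-in for encoder stages (assumed strides, not from the paper).
    levels = [avg_pool(image, s) for s in strides]
    base = strides[0]
    # Bring every level back to the resolution of the finest level.
    aligned = [upsample_nearest(f, s // base) for f, s in zip(levels, strides)]
    # Concatenate along the channel axis so each location sees all levels.
    return np.concatenate(aligned, axis=-1)

image = np.random.default_rng(0).random((64, 64, 3))
fused = fuse_multi_level(image)
print(fused.shape)  # (16, 16, 12)
```

In a real network each level would come from learned layers and the fusion would be learned as well. The sketch only shows how coarse and fine maps can be aligned and stacked.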
Related papers
- CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection [1.837431956557716]
Feature pyramids have been widely adopted in convolutional neural networks (CNNs) and transformers for tasks like medical image segmentation and object detection.
We propose a novel decoder block that integrates feature pyramids and transformers.
Our model achieves superior performance in detecting small objects compared to existing methods.
arXiv Detail & Related papers (2024-04-23T18:46:07Z)
- Multi-Layer Dense Attention Decoder for Polyp Segmentation [10.141956829529859]
We propose a novel decoder architecture aimed at hierarchically aggregating locally enhanced multi-level dense features.
Specifically, we introduce a novel module named Dense Attention Gate (DAG), which adaptively fuses all previous layers' features to establish local feature relations among all layers.
Our experiments and comparisons with nine competing segmentation models demonstrate that the proposed architecture achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-03-27T01:15:05Z)
- Using DUCK-Net for Polyp Image Segmentation [0.0]
"DUCK-Net" is capable of effectively learning and generalizing from small amounts of medical images to perform accurate segmentation tasks.
We demonstrate its capabilities specifically for polyp segmentation in colonoscopy images.
arXiv Detail & Related papers (2023-11-03T20:58:44Z)
- RaBiT: An Efficient Transformer using Bidirectional Feature Pyramid
Network with Reverse Attention for Colon Polyp Segmentation [0.0]
This paper introduces RaBiT, an encoder-decoder model that incorporates a lightweight Transformer-based architecture in the encoder.
Our method demonstrates high generalization capability in cross-dataset experiments, even when the training and test sets have different characteristics.
arXiv Detail & Related papers (2023-07-12T19:25:10Z)
- Lesion-aware Dynamic Kernel for Polyp Segmentation [49.63274623103663]
We propose a lesion-aware dynamic network (LDNet) for polyp segmentation.
It is a traditional U-shaped encoder-decoder structure that incorporates a dynamic kernel generation and updating scheme.
This simple but effective scheme endows our model with powerful segmentation performance and generalization capability.
arXiv Detail & Related papers (2023-01-12T09:53:57Z)
- LAPFormer: A Light and Accurate Polyp Segmentation Transformer [6.352264764099531]
We propose a new model with an encoder-decoder architecture named LAPFormer, which uses a hierarchical Transformer encoder to better extract global features.
Our proposed decoder contains a progressive feature fusion module designed for fusing features from upper and lower scales.
We test our model on five popular benchmark datasets for polyp segmentation.
arXiv Detail & Related papers (2022-10-10T01:52:30Z)
- Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers [124.01928050651466]
We propose a new type of polyp segmentation method, named Polyp-PVT.
The proposed model effectively suppresses noise in the features and significantly improves their expressive capabilities.
arXiv Detail & Related papers (2021-08-16T07:09:06Z)
- Automatic Polyp Segmentation via Multi-scale Subtraction Network [100.94922587360871]
In clinical practice, precise polyp segmentation provides important information in the early detection of colorectal cancer.
Most existing methods are based on a U-shaped structure and use element-wise addition or concatenation to progressively fuse features from different levels in the decoder.
We propose a multi-scale subtraction network (MSNet) to segment polyps from colonoscopy images.
arXiv Detail & Related papers (2021-08-11T07:54:07Z)
- Deep ensembles based on Stochastic Activation Selection for Polyp
Segmentation [82.61182037130406]
This work deals with medical image segmentation and in particular with accurate polyp detection and segmentation during colonoscopy examinations.
The basic architecture in image segmentation consists of an encoder and a decoder.
We compare several variants of the DeepLab architecture obtained by varying the decoder backbone.
arXiv Detail & Related papers (2021-04-02T02:07:37Z)
- Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective
with Transformers [149.78470371525754]
We treat semantic segmentation as a sequence-to-sequence prediction task. Specifically, we deploy a pure transformer to encode an image as a sequence of patches.
With the global context modeled in every layer of the transformer, this encoder can be combined with a simple decoder to provide a powerful segmentation model, termed SEgmentation TRansformer (SETR).
SETR achieves new state of the art on ADE20K (50.28% mIoU), Pascal Context (55.83% mIoU) and competitive results on Cityscapes.
arXiv Detail & Related papers (2020-12-31T18:55:57Z)
- A Holistically-Guided Decoder for Deep Representation Learning with
Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose a novel holistically-guided decoder to obtain high-resolution, semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)
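The SETR entry in the list above describes encoding an image as a sequence of patches before it is passed to a transformer. A minimal NumPy sketch of that patch-to-sequence step is given here. The function name and the patch size are illustrative assumptions, and the learned linear projection and position embeddings that a transformer would apply afterwards are left out.

```python
import numpy as np

def image_to_patch_sequence(image, patch):
    """Split a (H, W, C) image into non-overlapping patch x patch tiles.

    Returns a (num_patches, patch * patch * C) array whose rows follow
    row-major order over the patch grid.
    """
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("image height and width must be divisible by the patch size")
    # Separate the grid index from the within-patch index on both axes.
    tiles = image.reshape(h // patch, patch, w // patch, patch, c)
    # Reorder to (grid_row, grid_col, patch_row, patch_col, channel).
    tiles = tiles.transpose(0, 2, 1, 3, 4)
    # Flatten each patch into one vector; the grid becomes the sequence axis.
    return tiles.reshape(-1, patch * patch * c)

image = np.arange(4 * 4 * 1).reshape(4, 4, 1)
seq = image_to_patch_sequence(image, patch=2)
print(seq.shape)  # (4, 4)
print(seq[0])     # [0 1 4 5]
```

Each row of `seq` is one flattened patch, so a 4 x 4 single-channel image with 2 x 2 patches becomes a sequence of four tokens of length four.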