RaBiT: An Efficient Transformer using Bidirectional Feature Pyramid
Network with Reverse Attention for Colon Polyp Segmentation
- URL: http://arxiv.org/abs/2307.06420v1
- Date: Wed, 12 Jul 2023 19:25:10 GMT
- Title: RaBiT: An Efficient Transformer using Bidirectional Feature Pyramid
Network with Reverse Attention for Colon Polyp Segmentation
- Authors: Nguyen Hoang Thuan, Nguyen Thi Oanh, Nguyen Thi Thuy, Stuart Perry,
Dinh Viet Sang
- Abstract summary: This paper introduces RaBiT, an encoder-decoder model that incorporates a lightweight Transformer-based architecture in the encoder.
Our method demonstrates high generalization capability in cross-dataset experiments, even when the training and test sets have different characteristics.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automatic and accurate segmentation of colon polyps is essential for early
diagnosis of colorectal cancer. Advanced deep learning models have shown
promising results in polyp segmentation. However, they still have limitations
in representing multi-scale features and generalization capability. To address
these issues, this paper introduces RaBiT, an encoder-decoder model that
incorporates a lightweight Transformer-based architecture in the encoder to
model multiple-level global semantic relationships. The decoder consists of
several bidirectional feature pyramid layers with reverse attention modules to
better fuse feature maps at various levels and incrementally refine polyp
boundaries. We also propose ideas to lighten the reverse attention module and
make it more suitable for multi-class segmentation. Extensive experiments on
several benchmark datasets show that our method outperforms existing methods
across all datasets while maintaining low computational complexity. Moreover,
our method demonstrates high generalization capability in cross-dataset
experiments, even when the training and test sets have different
characteristics.
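As a rough illustration of the reverse attention idea named in the abstract, the sketch below follows the commonly used formulation: a coarser prediction is passed through a sigmoid, inverted, and multiplied into the feature maps so that later layers focus on regions not yet marked as polyp (typically boundaries). It is not RaBiT's lightened, multi-class variant, whose details the abstract does not give. The function names and the plain list-of-lists layout are assumptions chosen for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def reverse_attention(features, coarse_logits):
    """Weight feature maps by the inverse of a coarse foreground prediction.

    features:      list of C channels, each an H x W list of floats.
    coarse_logits: H x W list of logits from a deeper, coarser prediction,
                   assumed to be already upsampled to H x W (hypothetical input).
    Returns a new list of C channels with the same shape as `features`.
    """
    # Reverse weights are close to 1 where the coarse map predicts background
    # and close to 0 where it is already confident about foreground.
    weights = [[1.0 - sigmoid(v) for v in row] for row in coarse_logits]
    return [
        [
            [f * w for f, w in zip(f_row, w_row)]
            for f_row, w_row in zip(channel, weights)
        ]
        for channel in features
    ]


# One channel, one row, two pixels: an uncertain pixel and a confident one.
example_features = [[[2.0, 2.0]]]
example_logits = [[0.0, 10.0]]
refined = reverse_attention(example_features, example_logits)
# The uncertain pixel keeps half of its activation; the confident
# foreground pixel is suppressed to nearly zero.
print(refined)
```

In a decoder of the kind the abstract describes, a step like this would be applied at each pyramid level, with the output of one level supplying the coarse prediction for the next.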
Related papers
- ASPS: Augmented Segment Anything Model for Polyp Segmentation [77.25557224490075]
The Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation.
SAM's Transformer-based structure prioritizes global and low-frequency information.
CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the integration of domain-specific knowledge.
arXiv Detail & Related papers (2024-06-30T14:55:32Z)
- Adaptation of Distinct Semantics for Uncertain Areas in Polyp Segmentation [11.646574658785362]
This work presents a novel architecture, Adaptation of Distinct Semantics for Uncertain Areas in Polyp Segmentation (ADSNet).
ADSNet corrects misclassified details and recovers weak features that would otherwise vanish and go undetected at the final stage.
Experimental results demonstrate strong correction and recovery ability, leading to better segmentation performance than other state-of-the-art methods on the polyp image segmentation task.
arXiv Detail & Related papers (2024-05-13T07:41:28Z)
- Edge-aware Feature Aggregation Network for Polyp Segmentation [40.3881565207086]
In this study, we present a novel Edge-aware Feature Aggregation Network (EFA-Net) for polyp segmentation.
EFA-Net can fully make use of cross-level and multi-scale features to enhance the performance of polyp segmentation.
Experimental results on five widely adopted colonoscopy datasets show that our EFA-Net outperforms state-of-the-art polyp segmentation methods in terms of generalization and effectiveness.
arXiv Detail & Related papers (2023-09-19T11:09:38Z)
- SegT: A Novel Separated Edge-guidance Transformer Network for Polyp
Segmentation [10.144870911523622]
We propose a novel separated edge-guidance transformer (SegT) network that aims to build an effective polyp segmentation model.
A transformer encoder that learns a more robust representation than existing CNN-based approaches was specifically applied.
To evaluate the effectiveness of SegT, we conducted experiments with five challenging public datasets.
arXiv Detail & Related papers (2023-06-19T08:32:05Z)
- Lesion-aware Dynamic Kernel for Polyp Segmentation [49.63274623103663]
We propose a lesion-aware dynamic network (LDNet) for polyp segmentation.
It is a traditional U-shaped encoder-decoder structure that incorporates a dynamic kernel generation and updating scheme.
This simple but effective scheme endows our model with powerful segmentation performance and generalization capability.
arXiv Detail & Related papers (2023-01-12T09:53:57Z)
- LAPFormer: A Light and Accurate Polyp Segmentation Transformer [6.352264764099531]
We propose a new model with an encoder-decoder architecture, named LAPFormer, which uses a hierarchical Transformer encoder to better extract global features.
Our proposed decoder contains a progressive feature fusion module designed to fuse features from upper and lower scales.
We test our model on five popular benchmark datasets for polyp segmentation.
arXiv Detail & Related papers (2022-10-10T01:52:30Z)
- ColonFormer: An Efficient Transformer based Method for Colon Polyp
Segmentation [1.181206257787103]
ColonFormer is an encoder-decoder architecture with the capability of modeling long-range semantic information.
Our ColonFormer achieves state-of-the-art performance on all benchmark datasets.
arXiv Detail & Related papers (2022-05-17T16:34:04Z)
- Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers [124.01928050651466]
We propose a new type of polyp segmentation method, named Polyp-PVT.
The proposed model effectively suppresses noise in the features and significantly improves their expressive capabilities.
arXiv Detail & Related papers (2021-08-16T07:09:06Z)
- Automatic Polyp Segmentation via Multi-scale Subtraction Network [100.94922587360871]
In clinical practice, precise polyp segmentation provides important information in the early detection of colorectal cancer.
Most existing methods are based on a U-shaped structure and use element-wise addition or concatenation to progressively fuse features from different levels in the decoder.
We propose a multi-scale subtraction network (MSNet) to segment polyps from colonoscopy images.
arXiv Detail & Related papers (2021-08-11T07:54:07Z)
- Improving Video Instance Segmentation via Temporal Pyramid Routing [61.10753640148878]
Video Instance Segmentation (VIS) is a new and inherently multi-task problem, which aims to detect, segment, and track each instance in a video sequence.
We propose a Temporal Pyramid Routing (TPR) strategy to conditionally align and conduct pixel-level aggregation from a feature pyramid pair of two adjacent frames.
Our approach is a plug-and-play module and can be easily applied to existing instance segmentation methods.
arXiv Detail & Related papers (2021-07-28T03:57:12Z)
- A Holistically-Guided Decoder for Deep Representation Learning with
Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose a novel holistically-guided decoder to obtain high-resolution, semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)