FCB-SwinV2 Transformer for Polyp Segmentation
- URL: http://arxiv.org/abs/2302.01027v1
- Date: Thu, 2 Feb 2023 11:42:26 GMT
- Title: FCB-SwinV2 Transformer for Polyp Segmentation
- Authors: Kerr Fitzgerald, Bogdan Matuszewski
- Abstract summary: Polyp segmentation within colonoscopy video frames using deep learning models has the potential to automate the workflow of clinicians.
Recent state-of-the-art deep learning polyp segmentation models have combined the outputs of Fully Convolutional Network architectures and Transformer Network architectures which work in parallel.
In this paper we propose modifications to the current state-of-the-art polyp segmentation model FCBFormer.
The performance of the FCB-SwinV2 Transformer is evaluated on the popular colonoscopy segmentation benchmarking datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Polyp segmentation within colonoscopy video frames using deep learning models
has the potential to automate the workflow of clinicians. This could help
improve the early detection rate and characterization of polyps which could
progress to colorectal cancer. Recent state-of-the-art deep learning polyp
segmentation models have combined the outputs of Fully Convolutional Network
architectures and Transformer Network architectures which work in parallel. In
this paper we propose modifications to the current state-of-the-art polyp
segmentation model FCBFormer. The transformer architecture of the FCBFormer is
replaced with a SwinV2 Transformer-UNET and minor changes to the Fully
Convolutional Network architecture are made to create the FCB-SwinV2
Transformer. The performance of the FCB-SwinV2 Transformer is evaluated on the
popular colonoscopy segmentation benchmarking datasets Kvasir-SEG and
CVC-ClinicDB. Generalizability tests are also conducted. The FCB-SwinV2
Transformer is able to consistently achieve higher mDice scores across all
tests conducted and therefore represents new state-of-the-art performance.
Issues found with how colonoscopy segmentation model performance is evaluated
in the literature are also reported and discussed. One of the most important
issues identified is that when evaluating performance on the CVC-ClinicDB
dataset it would be preferable to ensure no data leakage from video sequences
occurs during the training/validation/test data partition.
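The data-leakage issue raised above can be sketched concretely: CVC-ClinicDB frames come from a small number of video sequences, so near-duplicate frames from one sequence can straddle a naive frame-level split. A sequence-aware split assigns every sequence wholly to one partition. The function and sequence IDs below are illustrative assumptions, not part of the paper or the dataset's official annotations.

```python
# Illustrative sketch (not the paper's code): split frames so that every
# video sequence lands entirely in either the train or the test partition,
# preventing near-duplicate frames from leaking across the split.
import random

def group_split(frame_ids, sequence_of, test_fraction=0.2, seed=0):
    """Sequence-level train/test split.

    frame_ids:   list of frame identifiers
    sequence_of: dict mapping each frame id to its source video sequence
    """
    sequences = sorted({sequence_of[f] for f in frame_ids})
    rng = random.Random(seed)
    rng.shuffle(sequences)
    n_test = max(1, round(test_fraction * len(sequences)))
    test_seqs = set(sequences[:n_test])
    train = [f for f in frame_ids if sequence_of[f] not in test_seqs]
    test = [f for f in frame_ids if sequence_of[f] in test_seqs]
    return train, test
```

With this grouping, the train and test partitions share no source sequence by construction, which is the property the abstract argues evaluations on CVC-ClinicDB should guarantee.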
Related papers
- ASPS: Augmented Segment Anything Model for Polyp Segmentation [77.25557224490075]
The Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation.
SAM's Transformer-based structure prioritizes global and low-frequency information.
CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the integration of domain-specific knowledge.
arXiv Detail & Related papers (2024-06-30T14:55:32Z)
- RTA-Former: Reverse Transformer Attention for Polyp Segmentation [1.383118997843137]
We introduce a novel network, namely RTA-Former, that employs a transformer model as the encoder backbone and innovatively adapts Reverse Attention (RA) with a transformer stage in the decoder for enhanced edge segmentation.
The results of the experiments illustrate that RTA-Former achieves state-of-the-art (SOTA) performance in five polyp segmentation datasets.
arXiv Detail & Related papers (2024-01-22T03:09:00Z)
- ConvFormer: Plug-and-Play CNN-Style Transformers for Improving Medical Image Segmentation [10.727162449071155]
We build CNN-style Transformers (ConvFormer) to promote better attention convergence and thus better segmentation performance.
In contrast to positional embedding and tokenization, ConvFormer adopts 2D convolution and max-pooling for both position information preservation and feature size reduction.
arXiv Detail & Related papers (2023-09-09T02:18:17Z)
- LAPFormer: A Light and Accurate Polyp Segmentation Transformer [6.352264764099531]
We propose a new model with an encoder-decoder architecture named LAPFormer, which uses a hierarchical Transformer encoder to better extract global features.
Our proposed decoder contains a progressive feature fusion module designed for fusing features from upper and lower scales.
We test our model on five popular benchmark datasets for polyp segmentation.
arXiv Detail & Related papers (2022-10-10T01:52:30Z)
- FCN-Transformer Feature Fusion for Polyp Segmentation [12.62213319797323]
Colonoscopy is widely recognised as the gold standard procedure for the early detection of colorectal cancer.
The manual segmentation of polyps in colonoscopy images is time-consuming.
The use of deep learning for automation of polyp segmentation has become important.
arXiv Detail & Related papers (2022-08-17T15:31:06Z)
- Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation [58.4650849317274]
Volumetric Aggregation with Transformers (VAT) is a cost aggregation network for few-shot segmentation.
VAT attains state-of-the-art performance for semantic correspondence as well, where cost aggregation also plays a central role.
arXiv Detail & Related papers (2022-07-22T04:10:30Z)
- ColonFormer: An Efficient Transformer based Method for Colon Polyp Segmentation [1.181206257787103]
ColonFormer is an encoder-decoder architecture with the capability of modeling long-range semantic information.
Our ColonFormer achieves state-of-the-art performance on all benchmark datasets.
arXiv Detail & Related papers (2022-05-17T16:34:04Z)
- nnFormer: Interleaved Transformer for Volumetric Segmentation [50.10441845967601]
We introduce nnFormer, a powerful segmentation model with an interleaved architecture based on empirical combination of self-attention and convolution.
nnFormer achieves tremendous improvements over previous transformer-based methods on two commonly used datasets, Synapse and ACDC.
arXiv Detail & Related papers (2021-09-07T17:08:24Z)
- Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD).
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
arXiv Detail & Related papers (2021-04-25T08:24:06Z)
- Deep ensembles based on Stochastic Activation Selection for Polyp Segmentation [82.61182037130406]
This work deals with medical image segmentation and in particular with accurate polyp detection and segmentation during colonoscopy examinations.
The basic architecture in image segmentation consists of an encoder and a decoder.
We compare variants of the DeepLab architecture obtained by varying the decoder backbone.
arXiv Detail & Related papers (2021-04-02T02:07:37Z)
- Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [149.78470371525754]
We treat semantic segmentation as a sequence-to-sequence prediction task. Specifically, we deploy a pure transformer to encode an image as a sequence of patches.
With the global context modeled in every layer of the transformer, this encoder can be combined with a simple decoder to provide a powerful segmentation model, termed SEgmentation TRansformer (SETR).
SETR achieves new state of the art on ADE20K (50.28% mIoU), Pascal Context (55.83% mIoU) and competitive results on Cityscapes.
arXiv Detail & Related papers (2020-12-31T18:55:57Z)