FCB-SwinV2 Transformer for Polyp Segmentation
- URL: http://arxiv.org/abs/2302.01027v1
- Date: Thu, 2 Feb 2023 11:42:26 GMT
- Title: FCB-SwinV2 Transformer for Polyp Segmentation
- Authors: Kerr Fitzgerald, Bogdan Matuszewski
- Abstract summary: Polyp segmentation within colonoscopy video frames using deep learning models has the potential to automate the workflow of clinicians.
Recent state-of-the-art deep learning polyp segmentation models have combined the outputs of Fully Convolutional Network architectures and Transformer Network architectures which work in parallel.
In this paper we propose modifications to the current state-of-the-art polyp segmentation model FCBFormer.
The performance of the FCB-SwinV2 Transformer is evaluated on the popular colonoscopy segmentation benchmarking datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Polyp segmentation within colonoscopy video frames using deep learning models
has the potential to automate the workflow of clinicians. This could help
improve the early detection rate and characterization of polyps which could
progress to colorectal cancer. Recent state-of-the-art deep learning polyp
segmentation models have combined the outputs of Fully Convolutional Network
architectures and Transformer Network architectures which work in parallel. In
this paper we propose modifications to the current state-of-the-art polyp
segmentation model FCBFormer. The transformer architecture of the FCBFormer is
replaced with a SwinV2 Transformer-UNET and minor changes to the Fully
Convolutional Network architecture are made to create the FCB-SwinV2
Transformer. The performance of the FCB-SwinV2 Transformer is evaluated on the
popular colonoscopy segmentation benchmarking datasets Kvasir-SEG and
CVC-ClinicDB. Generalizability tests are also conducted. The FCB-SwinV2
Transformer is able to consistently achieve higher mDice scores across all
tests conducted and therefore represents new state-of-the-art performance.
Issues found with how colonoscopy segmentation model performance is evaluated
in the literature are also reported and discussed. One of the most important
issues identified is that when evaluating performance on the CVC-ClinicDB
dataset it would be preferable to ensure no data leakage from video sequences
occurs during the training/validation/test data partition.
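The data-leakage issue raised above can be sketched concretely: CVC-ClinicDB frames come from a small number of video sequences, so near-duplicate frames from one sequence can straddle a naive frame-level split. A sequence-aware split assigns every sequence wholly to one partition. The function and sequence IDs below are illustrative assumptions, not part of the paper or the dataset's official annotations.

```python
# Illustrative sketch (not the paper's code): split frames so that every
# video sequence lands entirely in either the train or the test partition,
# preventing near-duplicate frames from leaking across the split.
import random

def group_split(frame_ids, sequence_of, test_fraction=0.2, seed=0):
    """Sequence-level train/test split.

    frame_ids:   list of frame identifiers
    sequence_of: dict mapping each frame id to its source video sequence
    """
    sequences = sorted({sequence_of[f] for f in frame_ids})
    rng = random.Random(seed)
    rng.shuffle(sequences)
    n_test = max(1, round(test_fraction * len(sequences)))
    test_seqs = set(sequences[:n_test])
    train = [f for f in frame_ids if sequence_of[f] not in test_seqs]
    test = [f for f in frame_ids if sequence_of[f] in test_seqs]
    return train, test
```

With this grouping, the train and test partitions share no source sequence by construction, which is the property the abstract argues evaluations on CVC-ClinicDB should guarantee.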
Related papers
- ASPS: Augmented Segment Anything Model for Polyp Segmentation [77.25557224490075]
The Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation.
SAM's Transformer-based structure prioritizes global and low-frequency information.
CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the integration of domain-specific knowledge.
arXiv Detail & Related papers (2024-06-30T14:55:32Z)
- RTA-Former: Reverse Transformer Attention for Polyp Segmentation [1.383118997843137]
We introduce a novel network, namely RTA-Former, that employs a transformer model as the encoder backbone and innovatively adapts Reverse Attention (RA) with a transformer stage in the decoder for enhanced edge segmentation.
The results of the experiments illustrate that RTA-Former achieves state-of-the-art (SOTA) performance in five polyp segmentation datasets.
arXiv Detail & Related papers (2024-01-22T03:09:00Z)
- ConvFormer: Plug-and-Play CNN-Style Transformers for Improving Medical Image Segmentation [10.727162449071155]
We build CNN-style Transformers (ConvFormer) to promote better attention convergence and thus better segmentation performance.
In contrast to positional embedding and tokenization, ConvFormer adopts 2D convolution and max-pooling for both position information preservation and feature size reduction.
arXiv Detail & Related papers (2023-09-09T02:18:17Z)
- LAPFormer: A Light and Accurate Polyp Segmentation Transformer [6.352264764099531]
We propose a new model with an encoder-decoder architecture named LAPFormer, which uses a hierarchical Transformer encoder to better extract global features.
Our proposed decoder contains a progressive feature fusion module designed for fusing features from upper and lower scales.
We test our model on five popular benchmark datasets for polyp segmentation.
arXiv Detail & Related papers (2022-10-10T01:52:30Z)
- FCN-Transformer Feature Fusion for Polyp Segmentation [12.62213319797323]
Colonoscopy is widely recognised as the gold standard procedure for the early detection of colorectal cancer.
The manual segmentation of polyps in colonoscopy images is time-consuming.
The use of deep learning for automation of polyp segmentation has become important.
arXiv Detail & Related papers (2022-08-17T15:31:06Z)
- Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation [58.4650849317274]
Volumetric Aggregation with Transformers (VAT) is a cost aggregation network for few-shot segmentation.
VAT attains state-of-the-art performance for semantic correspondence as well, where cost aggregation also plays a central role.
arXiv Detail & Related papers (2022-07-22T04:10:30Z)
- ColonFormer: An Efficient Transformer based Method for Colon Polyp Segmentation [1.181206257787103]
ColonFormer is an encoder-decoder architecture with the capability of modeling long-range semantic information.
Our ColonFormer achieves state-of-the-art performance on all benchmark datasets.
arXiv Detail & Related papers (2022-05-17T16:34:04Z)
- nnFormer: Interleaved Transformer for Volumetric Segmentation [50.10441845967601]
We introduce nnFormer, a powerful segmentation model with an interleaved architecture based on empirical combination of self-attention and convolution.
nnFormer achieves tremendous improvements over previous transformer-based methods on two commonly used datasets, Synapse and ACDC.
arXiv Detail & Related papers (2021-09-07T17:08:24Z)
- Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD).
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
arXiv Detail & Related papers (2021-04-25T08:24:06Z)
- Deep ensembles based on Stochastic Activation Selection for Polyp Segmentation [82.61182037130406]
This work deals with medical image segmentation and in particular with accurate polyp detection and segmentation during colonoscopy examinations.
The basic architecture in image segmentation consists of an encoder and a decoder.
We compare variants of the DeepLab architecture obtained by varying the decoder backbone.
arXiv Detail & Related papers (2021-04-02T02:07:37Z)
- Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [149.78470371525754]
We treat semantic segmentation as a sequence-to-sequence prediction task. Specifically, we deploy a pure transformer to encode an image as a sequence of patches.
With the global context modeled in every layer of the transformer, this encoder can be combined with a simple decoder to provide a powerful segmentation model, termed SEgmentation TRansformer (SETR).
SETR achieves new state of the art on ADE20K (50.28% mIoU), Pascal Context (55.83% mIoU) and competitive results on Cityscapes.
arXiv Detail & Related papers (2020-12-31T18:55:57Z)