RetSeg: Retention-based Colorectal Polyps Segmentation Network
- URL: http://arxiv.org/abs/2310.05446v5
- Date: Fri, 8 Mar 2024 11:31:28 GMT
- Title: RetSeg: Retention-based Colorectal Polyps Segmentation Network
- Authors: Khaled ELKarazle, Valliappan Raman, Caslon Chua and Patrick Then
- Abstract summary: Vision Transformers (ViTs) have revolutionized medical imaging analysis.
ViTs exhibit contextual awareness in processing visual data, culminating in robust and precise predictions.
We introduce RetSeg, an encoder-decoder network featuring multi-head retention blocks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision Transformers (ViTs) have revolutionized medical imaging analysis,
showcasing superior efficacy compared to conventional Convolutional Neural
Networks (CNNs) in vital tasks such as polyp classification, detection, and
segmentation. Leveraging attention mechanisms to focus on specific image
regions, ViTs exhibit contextual awareness in processing visual data,
culminating in robust and precise predictions, even for intricate medical
images. Moreover, the inherent self-attention mechanism in Transformers
accommodates varying input sizes and resolutions, granting an unprecedented
flexibility absent in traditional CNNs. However, Transformers grapple with
challenges like excessive memory usage and limited training parallelism due to
self-attention, rendering them impractical for real-time disease detection on
resource-constrained devices. In this study, we address these hurdles by
investigating the integration of the recently introduced retention mechanism
into polyp segmentation, introducing RetSeg, an encoder-decoder network
featuring multi-head retention blocks. Drawing inspiration from Retentive
Networks (RetNet), RetSeg is designed to bridge the gap between precise polyp
segmentation and resource utilization, particularly tailored for colonoscopy
images. We train and validate RetSeg for polyp segmentation employing two
publicly available datasets: Kvasir-SEG and CVC-ClinicDB. Additionally, we
showcase RetSeg's promising performance across diverse public datasets,
including CVC-ColonDB, ETIS-LaribPolypDB, CVC-300, and BKAI-IGH NeoPolyp. While
our work represents an early-stage exploration, further in-depth studies are
imperative to advance these promising findings.
Related papers
- TransResNet: Integrating the Strengths of ViTs and CNNs for High Resolution Medical Image Segmentation via Feature Grafting [6.987177704136503]
High-resolution images are preferable in medical imaging domain as they significantly improve the diagnostic capability of the underlying method.
Most of the existing deep learning-based techniques for medical image segmentation are optimized for input images having small spatial dimensions and perform poorly on high-resolution images.
We propose a parallel-in-branch architecture called TransResNet, which incorporates Transformer and CNN in a parallel manner to extract features from multi-resolution images independently.
arXiv Detail & Related papers (2024-10-01T18:22:34Z) - ASPS: Augmented Segment Anything Model for Polyp Segmentation [77.25557224490075]
The Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation.
SAM's Transformer-based structure prioritizes global and low-frequency information.
CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the integration of domain-specific knowledge.
arXiv Detail & Related papers (2024-06-30T14:55:32Z) - Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z) - SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical
Image Segmentation [0.0]
We propose a simple yet effective UNet-Transformer (seUNet-Trans) model for medical image segmentation.
In our approach, the UNet model is designed as a feature extractor to generate multiple feature maps from the input images.
By leveraging the UNet architecture and the self-attention mechanism, our model not only retains the preservation of both local and global context information but also is capable of capturing long-range dependencies between input elements.
arXiv Detail & Related papers (2023-10-16T01:13:38Z) - Self-supervised Semantic Segmentation: Consistency over Transformation [3.485615723221064]
We propose a novel self-supervised algorithm, textbfS$3$-Net, which integrates a robust framework based on the proposed Inception Large Kernel Attention (I-LKA) modules.
We leverage deformable convolution as an integral component to effectively capture and delineate lesion deformations for superior object boundary definition.
Our experimental results on skin lesion and lung organ segmentation tasks show the superior performance of our method compared to the SOTA approaches.
arXiv Detail & Related papers (2023-08-31T21:28:46Z) - Lesion-aware Dynamic Kernel for Polyp Segmentation [49.63274623103663]
We propose a lesion-aware dynamic network (LDNet) for polyp segmentation.
It is a traditional u-shape encoder-decoder structure incorporated with a dynamic kernel generation and updating scheme.
This simple but effective scheme endows our model with powerful segmentation performance and generalization capability.
arXiv Detail & Related papers (2023-01-12T09:53:57Z) - Video-TransUNet: Temporally Blended Vision Transformer for CT VFSS
Instance Segmentation [11.575821326313607]
We propose Video-TransUNet, a deep architecture for segmentation in medical CT videos constructed by integrating temporal feature blending into the TransUNet deep learning framework.
In particular, our approach amalgamates strong frame representation via a ResNet CNN backbone, multi-frame feature blending via a Temporal Context Module, and reconstructive capabilities for multiple targets via a UNet-based convolutional-deconal architecture with multiple heads.
arXiv Detail & Related papers (2022-08-17T14:28:58Z) - TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z) - Towards a Computed-Aided Diagnosis System in Colonoscopy: Automatic
Polyp Segmentation Using Convolution Neural Networks [10.930181796935734]
We present a deep learning framework for recognizing lesions in colonoscopy and capsule endoscopy images.
To our knowledge, we present the first work to use FCNs for polyp segmentation in addition to proposing a novel combination of SfS and RGB that boosts performance.
arXiv Detail & Related papers (2021-01-15T10:08:53Z) - PraNet: Parallel Reverse Attention Network for Polyp Segmentation [155.93344756264824]
We propose a parallel reverse attention network (PraNet) for accurate polyp segmentation in colonoscopy images.
We first aggregate the features in high-level layers using a parallel partial decoder (PPD)
In addition, we mine the boundary cues using a reverse attention (RA) module, which is able to establish the relationship between areas and boundary cues.
arXiv Detail & Related papers (2020-06-13T08:13:43Z) - Pathological Retinal Region Segmentation From OCT Images Using Geometric
Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.