CovSegNet: A Multi Encoder-Decoder Architecture for Improved Lesion
Segmentation of COVID-19 Chest CT Scans
- URL: http://arxiv.org/abs/2012.01473v1
- Date: Wed, 2 Dec 2020 19:26:35 GMT
- Authors: Tanvir Mahmud, Md Awsafur Rahman, Shaikh Anowarul Fattah, Sun-Yuan
Kung
- Abstract summary: An automated COVID-19 lesion segmentation scheme is proposed utilizing a highly efficient neural network architecture, namely CovSegNet.
Outstanding performance has been achieved on three publicly available datasets, largely outperforming other state-of-the-art approaches.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic lung lesions segmentation of chest CT scans is considered a pivotal
stage towards accurate diagnosis and severity measurement of COVID-19.
The traditional U-shaped encoder-decoder architecture and its variants suffer from
loss of contextual information in pooling/upsampling operations, increased
semantic gaps between encoded and decoded feature maps, and vanishing-gradient
problems caused by sequential gradient propagation, all of which result in
sub-optimal performance. Moreover, operating with 3D CT-volume
poses further limitations due to the exponential increase of computational
complexity making the optimization difficult. In this paper, an automated
COVID-19 lesion segmentation scheme is proposed utilizing a highly efficient
neural network architecture, namely CovSegNet, to overcome these limitations.
Additionally, a two-phase training scheme is introduced where a deeper
2D-network is employed for generating ROI-enhanced CT-volume followed by a
shallower 3D-network for further enhancement with more contextual information
without increasing computational burden. In addition to the traditional vertical
expansion of U-Net, we introduce horizontal expansion with multi-stage
encoder-decoder modules to achieve optimum performance. Additionally,
multi-scale feature maps are integrated into the scale transition process to
overcome the loss of contextual information. Moreover, a multi-scale fusion
module is introduced with a pyramid fusion scheme to reduce the semantic gaps
between subsequent encoder/decoder modules while facilitating the parallel
optimization for efficient gradient propagation. Outstanding performance has
been achieved on three publicly available datasets, largely outperforming
other state-of-the-art approaches. The proposed scheme can be easily extended
to achieve optimum segmentation performance in a wide variety of
applications.
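The multi-scale pyramid-fusion idea described in the abstract, integrating feature maps from several scales to reduce the loss of contextual information, can be illustrated with a minimal NumPy sketch. This is a generic interpretation, not CovSegNet's actual module: the function names, the nearest-neighbor/average-pool resampling, and channel-wise concatenation as the fusion operator are all assumptions for illustration.

```python
import numpy as np

def upsample2x(fmap):
    # Nearest-neighbor 2x upsampling along the spatial axes of (C, H, W).
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def downsample2x(fmap):
    # 2x2 average pooling along the spatial axes of (C, H, W).
    c, h, w = fmap.shape
    return fmap.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def pyramid_fuse(feats, target_level):
    """Fuse feature maps from multiple pyramid levels at target_level.

    feats[i] is assumed to have its spatial size halved i times
    relative to feats[0]. Every map is resampled to the target level
    and the results are concatenated channel-wise.
    """
    fused = []
    for level, f in enumerate(feats):
        while level < target_level:   # finer than target: pool down
            f = downsample2x(f)
            level += 1
        while level > target_level:   # coarser than target: upsample
            f = upsample2x(f)
            level -= 1
        fused.append(f)
    return np.concatenate(fused, axis=0)
```

In a real network the concatenation would typically be followed by a learned convolution that mixes the fused channels; this sketch only shows the scale-alignment step.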
Related papers
- Enhancing Retinal Vascular Structure Segmentation in Images With a Novel
Design Two-Path Interactive Fusion Module Model [6.392575673488379]
We introduce Swin-Res-Net, a specialized module designed to enhance the precision of retinal vessel segmentation.
Swin-Res-Net utilizes the Swin transformer, which partitions feature maps using shifted windows.
Our proposed architecture produces outstanding results, either meeting or surpassing those of other published models.
arXiv Detail & Related papers (2024-03-03T01:36:11Z) - E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D
Medical Image Segmentation [36.367368163120794]
We propose a 3D medical image segmentation model, named Efficient to Efficient Network (E2ENet)
It incorporates two parametrically and computationally efficient designs.
It consistently achieves a superior trade-off between accuracy and efficiency across various resource constraints.
arXiv Detail & Related papers (2023-12-07T22:13:37Z) - Mutual Information-driven Triple Interaction Network for Efficient Image
Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, recovers the amplitude spectrum of hazy images for haze removal.
The second stage, named phase-guided structure refinement, is devoted to learning the transformation and refinement of the phase spectrum.
arXiv Detail & Related papers (2023-08-14T08:23:58Z) - Optimization-Inspired Cross-Attention Transformer for Compressive
Sensing [45.672646799969215]
Deep unfolding network (DUN) with good interpretability and high performance has attracted growing attention in compressive sensing.
Existing DUNs often improve the visual quality at the price of a large number of parameters and have the problem of feature information loss during iteration.
We propose an Optimization-inspired Cross-attention Transformer (OCT) module as an iterative process, leading to a lightweight OCT-based Unfolding Framework (OCTUF) for image CS.
arXiv Detail & Related papers (2023-04-27T07:21:30Z) - Denoising Diffusion Error Correction Codes [92.10654749898927]
Recently, neural decoders have demonstrated their advantage over classical decoding techniques.
Recent state-of-the-art neural decoders suffer from high complexity and lack the important iterative scheme characteristic of many legacy decoders.
We propose to employ denoising diffusion models for the soft decoding of linear codes at arbitrary block lengths.
arXiv Detail & Related papers (2022-09-16T11:00:50Z) - Memory-efficient Segmentation of High-resolution Volumetric MicroCT
Images [11.723370840090453]
We propose a memory-efficient network architecture for 3D high-resolution image segmentation.
The network incorporates both global and local features via a two-stage U-net-based cascaded framework.
Experiments show that it outperforms state-of-the-art 3D segmentation methods in terms of both segmentation accuracy and memory efficiency.
arXiv Detail & Related papers (2022-05-31T16:42:48Z) - InverseForm: A Loss Function for Structured Boundary-Aware Segmentation [80.39674800972182]
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network.
This plug-in loss term complements the cross-entropy loss in capturing boundary transformations.
We analyze the quantitative and qualitative effects of our loss function on three indoor and outdoor segmentation benchmarks.
arXiv Detail & Related papers (2021-04-06T18:52:45Z) - InversionNet3D: Efficient and Scalable Learning for 3D Full Waveform
Inversion [14.574636791985968]
In this paper, we present InversionNet3D, an efficient and scalable encoder-decoder network for 3D FWI.
The proposed method employs group convolution in the encoder to establish an effective hierarchy for learning information from multiple sources.
Experiments on the 3D Kimberlina dataset demonstrate that InversionNet3D achieves lower computational cost and lower memory footprint compared to the baseline.
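The group-convolution idea mentioned in the summary above can be sketched as follows. This is a generic illustration of grouped convolution, not InversionNet3D's actual encoder: the channel split, group count, and naive valid "convolution" (cross-correlation, as in deep-learning usage) are assumptions for illustration. Splitting Cin input and Cout output channels into G groups cuts the weight count from Cout*Cin*k*k to G*(Cout/G)*(Cin/G)*k*k, i.e. by a factor of G.

```python
import numpy as np

def conv2d(x, w):
    # Naive valid 2D cross-correlation: x is (Cin, H, W),
    # w is (Cout, Cin, kH, kW); returns (Cout, H-kH+1, W-kW+1).
    cout, cin, kh, kw = w.shape
    _, h, wd = x.shape
    out = np.zeros((cout, h - kh + 1, wd - kw + 1))
    for o in range(cout):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[o, i, j] = np.sum(x[:, i:i + kh, j:j + kw] * w[o])
    return out

def group_conv2d(x, weights):
    # weights: list of G kernels, each (Cout_g, Cin_g, kH, kW).
    # Input channels are split evenly across the G groups; each group
    # is convolved independently and the outputs are concatenated.
    g = len(weights)
    cin_g = x.shape[0] // g
    outs = [conv2d(x[k * cin_g:(k + 1) * cin_g], weights[k])
            for k in range(g)]
    return np.concatenate(outs, axis=0)
```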
arXiv Detail & Related papers (2021-03-25T22:24:57Z) - A Holistically-Guided Decoder for Deep Representation Learning with
Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose one novel holistically-guided decoder which is introduced to obtain the high-resolution semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z) - EfficientFCN: Holistically-guided Decoding for Semantic Segmentation [49.27021844132522]
State-of-the-art semantic segmentation algorithms are mostly based on dilated Fully Convolutional Networks (dilatedFCN)
We propose the EfficientFCN, whose backbone is a common ImageNet pre-trained network without any dilated convolution.
Such a framework achieves comparable or even better performance than state-of-the-art methods with only 1/3 of the computational cost.
arXiv Detail & Related papers (2020-08-24T14:48:23Z) - Spatial-Spectral Residual Network for Hyperspectral Image
Super-Resolution [82.1739023587565]
We propose a novel spectral-spatial residual network for hyperspectral image super-resolution (SSRNet)
Our method can effectively explore spatial-spectral information by using 3D convolution instead of 2D convolution, which enables the network to better extract potential information.
In each unit, we employ spatial and spectral separable 3D convolution to extract spatial and spectral information, which not only reduces unaffordable memory usage and high computational cost, but also makes the network easier to train.
arXiv Detail & Related papers (2020-01-14T03:34:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.