CovSegNet: A Multi Encoder-Decoder Architecture for Improved Lesion
Segmentation of COVID-19 Chest CT Scans
- URL: http://arxiv.org/abs/2012.01473v1
- Date: Wed, 2 Dec 2020 19:26:35 GMT
- Title: CovSegNet: A Multi Encoder-Decoder Architecture for Improved Lesion
Segmentation of COVID-19 Chest CT Scans
- Authors: Tanvir Mahmud, Md Awsafur Rahman, Shaikh Anowarul Fattah, Sun-Yuan
Kung
- Abstract summary: An automated COVID-19 lesion segmentation scheme is proposed utilizing a highly efficient neural network architecture, namely CovSegNet.
Outstanding performance has been achieved on three publicly available datasets, largely outperforming other state-of-the-art approaches.
- Score: 11.946078871080836
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic lung lesion segmentation of chest CT scans is considered a pivotal
stage towards accurate diagnosis and severity measurement of COVID-19.
The traditional U-shaped encoder-decoder architecture and its variants suffer from
loss of contextual information in pooling/upsampling operations, increased
semantic gaps between encoded and decoded feature maps, and vanishing-gradient
problems caused by sequential gradient propagation, all of which result in
sub-optimal performance. Moreover, operating on 3D CT volumes poses further
limitations due to the exponential increase in computational complexity, which
makes optimization difficult. In this paper, an automated COVID-19 lesion
segmentation scheme is proposed that utilizes a highly efficient neural network
architecture, namely CovSegNet, to overcome these limitations. Additionally, a
two-phase training scheme is introduced in which a deeper 2D network generates
an ROI-enhanced CT volume that a shallower 3D network then refines with
additional contextual information, without increasing the computational burden.
Along with the traditional vertical expansion of U-Net, we introduce horizontal
expansion with multi-stage encoder-decoder modules to achieve optimum
performance. Additionally, multi-scale feature maps are integrated into the
scale-transition process to overcome the loss of contextual information.
Moreover, a multi-scale fusion module with a pyramid fusion scheme is
introduced to reduce the semantic gaps between subsequent encoder/decoder
modules while facilitating parallel optimization for efficient gradient
propagation. Outstanding performance has been achieved on three publicly
available datasets, largely outperforming other state-of-the-art approaches.
The proposed scheme can be easily extended to achieve optimum segmentation
performance in a wide variety of applications.
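The two-phase scheme described above can be illustrated with a minimal NumPy sketch. This is a hypothetical simplification, not the authors' implementation: the stand-in "networks" (`fake_2d`, `fake_3d`) and the multiplicative ROI-enhancement rule are assumptions made only to show the data flow from a deeper 2D model to a shallower 3D model.

```python
import numpy as np

def roi_enhance_2d(volume, predict_slice):
    """Phase 1: run a (hypothetical) deeper 2D network slice by slice and
    use its lesion-probability map to highlight regions of interest."""
    return np.stack(
        [slice_ * (1.0 + predict_slice(slice_)) for slice_ in volume]
    )

def refine_3d(enhanced_volume, predict_volume):
    """Phase 2: a shallower 3D network adds inter-slice context on the
    already ROI-enhanced volume."""
    return predict_volume(enhanced_volume)

# Stand-in "networks": any callables mapping arrays to [0, 1] maps.
fake_2d = lambda s: 1.0 / (1.0 + np.exp(-s))           # per-slice sigmoid
fake_3d = lambda v: (v > v.mean()).astype(np.float32)  # crude 3D threshold

ct = np.random.randn(4, 16, 16)  # (slices, H, W) toy CT volume
mask = refine_3d(roi_enhance_2d(ct, fake_2d), fake_3d)
print(mask.shape)  # (4, 16, 16)
```

In the paper, both phases are trained separately, which keeps the 3D network shallow enough to avoid the computational blow-up the abstract warns about.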
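The pyramid fusion idea can likewise be sketched in NumPy. This is a conceptual illustration under stated assumptions (nearest-neighbor upsampling, average pooling, and simple averaging as the fusion operator), not CovSegNet's actual module: each decoder level receives feature maps from all pyramid scales, resampled to its own resolution and fused, which narrows the semantic gap between levels.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbor 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def downsample2x(x):
    """2x2 average pooling of a (C, H, W) feature map."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def pyramid_fuse(features, target_level):
    """Resample every pyramid level to `target_level`'s resolution and
    average them, giving that level access to multi-scale context."""
    target_h = features[target_level].shape[1]
    aligned = []
    for f in features:
        while f.shape[1] > target_h:
            f = downsample2x(f)
        while f.shape[1] < target_h:
            f = upsample2x(f)
        aligned.append(f)
    return np.mean(aligned, axis=0)

# Three pyramid levels: same channel count, halved resolutions.
feats = [np.random.rand(8, 32, 32),
         np.random.rand(8, 16, 16),
         np.random.rand(8, 8, 8)]
fused = pyramid_fuse(feats, target_level=1)
print(fused.shape)  # (8, 16, 16)
```

A real implementation would fuse with learned convolutions rather than a plain mean, but the resample-then-combine structure is the same.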
Related papers
- Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification [69.87877580725768]
Multimodal Visual Surrogate Compression (MVSC) learns to compress and adapt large 3D sMRI volumes into compact 2D features.
MVSC has two key components: a Volume Context that captures global cross-slice context under textual guidance, and an Adaptive Slice Fusion module that aggregates slice-level information in a text-enhanced, patch-wise manner.
arXiv Detail & Related papers (2026-01-29T13:05:46Z) - CSGaussian: Progressive Rate-Distortion Compression and Segmentation for 3D Gaussian Splatting [57.73006852239138]
We present the first unified framework for rate-distortion-optimized compression and segmentation of 3D Gaussian Splatting (3DGS).
Inspired by recent advances in rate-distortion-optimized 3DGS compression, this work integrates semantic learning into the compression pipeline to support decoder-side applications.
Our scheme features a lightweight implicit neural representation-based hyperprior, enabling efficient entropy coding of both color and semantic attributes.
arXiv Detail & Related papers (2026-01-19T08:21:45Z) - Decoding with Structured Awareness: Integrating Directional, Frequency-Spatial, and Structural Attention for Medical Image Segmentation [6.82200201381917]
This paper proposes a novel decoder framework specifically designed for medical image segmentation, comprising three core modules.
First, the Adaptive Cross-Fusion Attention (ACFA) module integrates channel feature enhancement with spatial attention mechanisms to enhance responsiveness to key regions and structural orientations.
Second, the Triple Feature Fusion Attention (TFFA) module fuses features from the spatial, Fourier, and wavelet domains, achieving a joint frequency-spatial representation while preserving local information such as edges and textures.
Third, the Structural-aware Multi-scale Masking Module (SMMM) optimizes the skip connections between encoder and decoder by leveraging multi-scale context and structural s
arXiv Detail & Related papers (2025-12-05T07:39:14Z) - Binary-Gaussian: Compact and Progressive Representation for 3D Gaussian Segmentation [83.90109373769614]
3D Gaussian Splatting (3D-GS) has emerged as an efficient 3D representation and a promising foundation for semantic tasks like segmentation.
We propose a coarse-to-fine binary encoding scheme for per-Gaussian category representation, which compresses each feature into a single integer via the binary-to-decimal mapping.
We further design a progressive training strategy that decomposes panoptic segmentation into a series of independent sub-tasks, reducing inter-class conflicts and thereby enhancing fine-grained segmentation capability.
arXiv Detail & Related papers (2025-11-30T15:51:30Z) - FusionSort: Enhanced Cluttered Waste Segmentation with Advanced Decoding and Comprehensive Modality Optimization [0.17582178425580988]
We introduce an enhanced neural architecture that builds upon an existing encoder-decoder structure to improve the accuracy and efficiency of waste sorting systems.
Our model integrates several key innovations: a Comprehensive Attention Block within the decoder, which refines feature representations by combining convolutional and upsampling operations.
We also introduce a Data Fusion Block that fuses images with more than three channels.
arXiv Detail & Related papers (2025-08-27T11:32:59Z) - Enhancing Retinal Vascular Structure Segmentation in Images With a Novel
Design Two-Path Interactive Fusion Module Model [6.392575673488379]
We introduce Swin-Res-Net, a specialized module designed to enhance the precision of retinal vessel segmentation.
Swin-Res-Net utilizes the Swin transformer which uses shifted windows with displacement for partitioning.
Our proposed architecture produces outstanding results, either meeting or surpassing those of other published models.
arXiv Detail & Related papers (2024-03-03T01:36:11Z) - E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D
Medical Image Segmentation [36.367368163120794]
We propose a 3D medical image segmentation model, named Efficient to Efficient Network (E2ENet).
It incorporates two parametrically and computationally efficient designs.
It consistently achieves a superior trade-off between accuracy and efficiency across various resource constraints.
arXiv Detail & Related papers (2023-12-07T22:13:37Z) - Mutual Information-driven Triple Interaction Network for Efficient Image
Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refined, devotes to learning the transformation and refinement of the phase spectrum.
arXiv Detail & Related papers (2023-08-14T08:23:58Z) - Optimization-Inspired Cross-Attention Transformer for Compressive
Sensing [45.672646799969215]
Deep unfolding network (DUN) with good interpretability and high performance has attracted growing attention in compressive sensing.
Existing DUNs often improve the visual quality at the price of a large number of parameters and have the problem of feature information loss during iteration.
We propose an Optimization-inspired Cross-attention Transformer (OCT) module as an iterative process, leading to a lightweight OCT-based Unfolding Framework (OCTUF) for image CS.
arXiv Detail & Related papers (2023-04-27T07:21:30Z) - Denoising Diffusion Error Correction Codes [92.10654749898927]
Recently, neural decoders have demonstrated their advantage over classical decoding techniques.
Recent state-of-the-art neural decoders suffer from high complexity and lack the important iterative scheme characteristic of many legacy decoders.
We propose to employ denoising diffusion models for the soft decoding of linear codes at arbitrary block lengths.
arXiv Detail & Related papers (2022-09-16T11:00:50Z) - Memory-efficient Segmentation of High-resolution Volumetric MicroCT
Images [11.723370840090453]
We propose a memory-efficient network architecture for 3D high-resolution image segmentation.
The network incorporates both global and local features via a two-stage U-net-based cascaded framework.
Experiments show that it outperforms state-of-the-art 3D segmentation methods in terms of both segmentation accuracy and memory efficiency.
arXiv Detail & Related papers (2022-05-31T16:42:48Z) - InverseForm: A Loss Function for Structured Boundary-Aware Segmentation [80.39674800972182]
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network.
This plug-in loss term complements the cross-entropy loss in capturing boundary transformations.
We analyze the quantitative and qualitative effects of our loss function on three indoor and outdoor segmentation benchmarks.
arXiv Detail & Related papers (2021-04-06T18:52:45Z) - InversionNet3D: Efficient and Scalable Learning for 3D Full Waveform
Inversion [14.574636791985968]
In this paper, we present InversionNet3D, an efficient and scalable encoder-decoder network for 3D FWI.
The proposed method employs group convolution in the encoder to establish an effective hierarchy for learning information from multiple sources.
Experiments on the 3D Kimberlina dataset demonstrate that InversionNet3D achieves lower computational cost and lower memory footprint compared to the baseline.
arXiv Detail & Related papers (2021-03-25T22:24:57Z) - A Holistically-Guided Decoder for Deep Representation Learning with
Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose one novel holistically-guided decoder which is introduced to obtain the high-resolution semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z) - EfficientFCN: Holistically-guided Decoding for Semantic Segmentation [49.27021844132522]
State-of-the-art semantic segmentation algorithms are mostly based on dilated Fully Convolutional Networks (dilatedFCN).
We propose the EfficientFCN, whose backbone is a common ImageNet pre-trained network without any dilated convolution.
Such a framework achieves comparable or even better performance than state-of-the-art methods with only 1/3 of the computational cost.
arXiv Detail & Related papers (2020-08-24T14:48:23Z) - Spatial-Spectral Residual Network for Hyperspectral Image
Super-Resolution [82.1739023587565]
We propose a novel spectral-spatial residual network for hyperspectral image super-resolution (SSRNet).
Our method can effectively explore spatial-spectral information by using 3D convolution instead of 2D convolution, which enables the network to better extract potential information.
In each unit, we employ spatial and temporal separable 3D convolution to extract spatial and spectral information, which not only reduces unaffordable memory usage and high computational cost, but also makes the network easier to train.
arXiv Detail & Related papers (2020-01-14T03:34:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.