CSC-Unet: A Novel Convolutional Sparse Coding Strategy Based Neural
Network for Semantic Segmentation
- URL: http://arxiv.org/abs/2108.00408v2
- Date: Mon, 11 Mar 2024 23:20:07 GMT
- Title: CSC-Unet: A Novel Convolutional Sparse Coding Strategy Based Neural
Network for Semantic Segmentation
- Authors: Haitong Tang, Shuang He, Mengduo Yang, Xia Lu, Qin Yu, Kaiyue Liu,
Hongjie Yan and Nizhuan Wang
- Abstract summary: We propose a novel strategy that reformulates the widely used convolution operation as a multi-layer convolutional sparse coding block.
We show that the multi-layer convolutional sparse coding block enables a semantic segmentation model to converge faster, extract finer semantic and appearance information from images, and better recover spatial detail.
- Score: 0.44289311505645573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate semantic segmentation is challenging because of the
complexity of real-world scenes. Many semantic segmentation methods based on
traditional deep learning insufficiently capture the semantic and appearance
information of images, which limits their generality and robustness across
application scenes. In this paper, we propose a novel strategy that
reformulates the widely used convolution operation as a multi-layer
convolutional sparse coding block to address this deficiency. This strategy
can potentially be used to significantly improve the segmentation performance
of any semantic segmentation model that involves convolutional operations. To
demonstrate the idea, we chose the widely used U-Net model and designed a
series of CSC-Unet models based on it. Through extensive analysis and
experiments, we provide evidence that the multi-layer convolutional sparse
coding block enables a semantic segmentation model to converge faster, extract
finer semantic and appearance information from images, and better recover
spatial detail. The best CSC-Unet model significantly outperforms the original
U-Net on three public datasets with different scenarios: 87.14% vs. 84.71% on
the DeepCrack dataset, 68.91% vs. 67.09% on the Nuclei dataset, and 53.68% vs.
48.82% on the CamVid dataset.
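The abstract does not say how the sparse coding block is computed. One common way to turn convolutional sparse coding into a network layer is to unroll a few iterations of the iterative shrinkage-thresholding algorithm (ISTA), and the sketch below assumes that approach. It is a toy 1-D, single-filter, single-layer version in pure Python. The function names, the filter, `lam`, and the iteration count are all illustrative and not taken from the paper. Starting from an all-zero code, the first iteration reduces to a correlation with the filter followed by a shrinkage nonlinearity, which resembles an ordinary convolution plus activation. Later iterations refine that code.

```python
"""Toy 1-D convolutional sparse coding via ISTA (pure Python, illustrative only)."""


def conv_full(z, d):
    """Synthesis operator D z: full convolution of code z with filter d."""
    out = [0.0] * (len(z) + len(d) - 1)
    for i, zi in enumerate(z):
        for j, dj in enumerate(d):
            out[i + j] += zi * dj
    return out


def corr_valid(r, d):
    """Adjoint operator D^T r: valid cross-correlation of r with filter d."""
    return [sum(r[i + j] * dj for j, dj in enumerate(d))
            for i in range(len(r) - len(d) + 1)]


def soft_threshold(v, t):
    """Shrink every entry toward zero by t; entries within [-t, t] become 0."""
    return [max(abs(a) - t, 0.0) * (1.0 if a > 0 else -1.0) for a in v]


def objective(x, z, d, lam):
    """Sparse coding objective: 0.5 * ||D z - x||^2 + lam * ||z||_1."""
    residual = [a - b for a, b in zip(conv_full(z, d), x)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(a) for a in z)


def ista_csc(x, d, lam, n_iter):
    """Run n_iter ISTA steps starting from an all-zero code."""
    # Step size 1/L, using the bound L <= (sum |d_j|)^2 on the Lipschitz constant.
    eta = 1.0 / sum(abs(dj) for dj in d) ** 2
    z = [0.0] * (len(x) - len(d) + 1)
    for _ in range(n_iter):
        residual = [a - b for a, b in zip(conv_full(z, d), x)]
        grad = corr_valid(residual, d)  # gradient of the data-fit term
        z = soft_threshold([zi - eta * g for zi, g in zip(z, grad)], eta * lam)
    return z


if __name__ == "__main__":
    d = [1.0, 0.5]  # hypothetical filter
    x = conv_full([0.0, 0.0, 1.0, 0.0, 0.0, -0.5, 0.0, 0.0], d)  # synthetic signal
    z = ista_csc(x, d, lam=0.01, n_iter=200)
    print([round(a, 3) for a in z])
```

A multi-layer variant would chain such blocks so that the code from one layer becomes the signal for the next. How CSC-Unet actually does this is not described in the abstract.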
Related papers
- UnSeg: One Universal Unlearnable Example Generator is Enough against All Image Segmentation [64.01742988773745]
An increasing privacy concern exists regarding training large-scale image segmentation models on unauthorized private data.
We exploit the concept of unlearnable examples to make images unusable to model training by generating and adding unlearnable noise into the original images.
We empirically verify the effectiveness of UnSeg across 6 mainstream image segmentation tasks, 10 widely used datasets, and 7 different network architectures.
arXiv Detail & Related papers (2024-10-13T16:34:46Z)
- Early Fusion of Features for Semantic Segmentation [10.362589129094975]
This paper introduces a novel segmentation framework that integrates a classifier network with a reverse HRNet architecture for efficient image segmentation.
Our methodology is rigorously tested across several benchmark datasets including Mapillary Vistas, Cityscapes, CamVid, COCO, and PASCAL-VOC2012.
The results demonstrate the effectiveness of our proposed model in achieving high segmentation accuracy, indicating its potential for various applications in image analysis.
arXiv Detail & Related papers (2024-02-08T22:58:06Z)
- Using DUCK-Net for Polyp Image Segmentation [0.0]
"DUCK-Net" is capable of effectively learning and generalizing from small amounts of medical images to perform accurate segmentation tasks.
We demonstrate its capabilities specifically for polyp segmentation in colonoscopy images.
arXiv Detail & Related papers (2023-11-03T20:58:44Z)
- Distilling Ensemble of Explanations for Weakly-Supervised Pre-Training of Image Segmentation Models [54.49581189337848]
We propose a method to enable the end-to-end pre-training for image segmentation models based on classification datasets.
The proposed method leverages a weighted segmentation learning procedure to pre-train the segmentation network en masse.
Experiment results show that, with ImageNet accompanied by PSSL as the source dataset, the proposed end-to-end pre-training strategy successfully boosts the performance of various segmentation models.
arXiv Detail & Related papers (2022-07-04T13:02:32Z)
- CEU-Net: Ensemble Semantic Segmentation of Hyperspectral Images Using Clustering [2.741266294612776]
Clustering Ensemble U-Net (CEU-Net) is a novel semantic segmentation model for hyperspectral images (HSIs).
CEU-Net combines spectral information extracted from convolutional neural network (CNN) training on a cluster of landscape pixels.
Our model outperforms existing state-of-the-art HSI semantic segmentation methods and gets competitive performance with and without patching.
arXiv Detail & Related papers (2022-03-09T16:51:15Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed as Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
- SCG-Net: Self-Constructing Graph Neural Networks for Semantic Segmentation [23.623276007011373]
We propose a module that learns a long-range dependency graph directly from the image and uses it to propagate contextual information efficiently.
The module is optimised via a novel adaptive diagonal enhancement method and a variational lower bound.
When the module is incorporated into a neural network (SCG-Net), semantic segmentation is performed end to end and competitive performance is achieved.
arXiv Detail & Related papers (2020-09-03T12:13:09Z)
- Semantic Segmentation With Multi Scale Spatial Attention For Self Driving Cars [2.7317088388886384]
We present a novel neural network that uses multi-scale feature fusion for accurate and efficient semantic image segmentation.
We use a ResNet-based feature extractor, dilated convolutional layers in the downsampling part, and atrous convolutional layers in the upsampling part, and merge them with a concat operation.
A new attention module is proposed to encode more contextual information and enhance the receptive field of the network.
arXiv Detail & Related papers (2020-06-30T20:19:09Z)
- CRNet: Cross-Reference Networks for Few-Shot Segmentation [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.
With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images.
Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-24T04:55:43Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.