Related papers: CSC-Unet: A Novel Convolutional Sparse Coding Strategy Based Neural Network for Semantic Segmentation

URL: http://arxiv.org/abs/2108.00408v2
Date: Mon, 11 Mar 2024 23:20:07 GMT
Title: CSC-Unet: A Novel Convolutional Sparse Coding Strategy Based Neural Network for Semantic Segmentation
Authors: Haitong Tang, Shuang He, Mengduo Yang, Xia Lu, Qin Yu, Kaiyue Liu, Hongjie Yan and Nizhuan Wang
Abstract summary: We propose a novel strategy that reformulates the commonly used convolution operation as a multi-layer convolutional sparse coding block. We show that this block enables a semantic segmentation model to converge faster, extract finer semantic and appearance information from images, and better recover spatial detail.
Score: 0.44289311505645573
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accurate semantic segmentation is challenging because of the complexity of real-world scenes. Many semantic segmentation methods based on traditional deep learning insufficiently capture the semantic and appearance information of images, which limits their generality and robustness across application scenes. In this paper, we propose a novel strategy that reformulates the commonly used convolution operation as a multi-layer convolutional sparse coding block to ease this deficiency. This strategy can potentially be used to significantly improve the segmentation performance of any semantic segmentation model that involves convolutional operations. To demonstrate the effectiveness of our idea, we chose the widely used U-Net model and designed the CSC-Unet model series based on it. Through extensive analysis and experiments, we provide credible evidence that the multi-layer convolutional sparse coding block enables a semantic segmentation model to converge faster, extract finer semantic and appearance information from images, and improve its ability to recover spatial detail. The best CSC-Unet model significantly outperforms the original U-Net on three public datasets covering different scenarios: 87.14% vs. 84.71% on the DeepCrack dataset, 68.91% vs. 67.09% on the Nuclei dataset, and 53.68% vs. 48.82% on the CamVid dataset.
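The abstract does not spell out how a convolutional sparse coding block is computed. One common way to realise such a block is to unroll a few iterations of ISTA (iterative shrinkage-thresholding) on a convolutional dictionary, and the sketch below illustrates that idea in one dimension. It is an assumption-laden illustration, not the authors' implementation: the filters are fixed rather than learned, there is a single layer rather than several, and every name (`correlate_same`, `csc_ista`, `lam`, `n_iters`) is hypothetical.

```python
# Illustrative sketch only: 1-D, fixed filters, single layer.
# Not the CSC-Unet implementation; all names here are hypothetical.

def correlate_same(signal, kernel):
    """Cross-correlate a 1-D signal with an odd-length kernel (zero padding)."""
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - r
            if 0 <= idx < n:
                acc += w * signal[idx]
        out.append(acc)
    return out

def soft_threshold(v, t):
    """Shrink v toward zero by t; this is what produces sparse codes."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def synthesize(codes, filters):
    """Reconstruct the signal as the sum of each code map filtered by its kernel."""
    out = [0.0] * len(codes[0])
    for z, d in zip(codes, filters):
        for i, v in enumerate(correlate_same(z, d)):
            out[i] += v
    return out

def objective(x, codes, filters, lam):
    """0.5 * ||x - D z||^2 + lam * ||z||_1."""
    recon = synthesize(codes, filters)
    fit = 0.5 * sum((a - b) ** 2 for a, b in zip(x, recon))
    return fit + lam * sum(abs(v) for z in codes for v in z)

def csc_ista(x, filters, lam=0.1, n_iters=10):
    """Run n_iters ISTA steps starting from all-zero codes."""
    # Conservative step size: 1 / (upper bound on the largest eigenvalue of D^T D).
    step = 1.0 / sum(sum(abs(w) for w in d) ** 2 for d in filters)
    codes = [[0.0] * len(x) for _ in filters]
    for _ in range(n_iters):
        recon = synthesize(codes, filters)
        residual = [a - b for a, b in zip(x, recon)]
        new_codes = []
        for z, d in zip(codes, filters):
            # Adjoint of zero-padded correlation: correlate with the flipped kernel.
            grad = correlate_same(residual, d[::-1])
            new_codes.append([soft_threshold(zi + step * gi, step * lam)
                              for zi, gi in zip(z, grad)])
        codes = new_codes
    return codes

x = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, -1.0, 0.0, 0.0]
filters = [[0.5, 1.0, 0.5], [-1.0, 0.0, 1.0]]
codes = csc_ista(x, filters, lam=0.1, n_iters=20)
print(objective(x, codes, filters, 0.1))
```

In a network, each unrolled iteration would correspond to a pair of convolution layers with a thresholding nonlinearity, and the filters would be trained end to end; how CSC-Unet arranges and stacks these steps is described in the paper itself.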

Related papers

UnSeg: One Universal Unlearnable Example Generator is Enough against All Image Segmentation [64.01742988773745]
There is growing privacy concern about training large-scale image segmentation models on unauthorized private data. We exploit the concept of unlearnable examples to make images unusable for model training by generating unlearnable noise and adding it to the original images. We empirically verify the effectiveness of UnSeg across 6 mainstream image segmentation tasks, 10 widely used datasets, and 7 different network architectures.
arXiv Detail & Related papers (2024-10-13T16:34:46Z)
Early Fusion of Features for Semantic Segmentation [10.362589129094975]
This paper introduces a novel segmentation framework that integrates a classifier network with a reverse HRNet architecture for efficient image segmentation. Our methodology is rigorously tested across several benchmark datasets including Mapillary Vistas, Cityscapes, CamVid, COCO, and PASCAL-VOC2012. The results demonstrate the effectiveness of our proposed model in achieving high segmentation accuracy, indicating its potential for various applications in image analysis.
arXiv Detail & Related papers (2024-02-08T22:58:06Z)
Using DUCK-Net for Polyp Image Segmentation [0.0]
"DUCK-Net" is capable of effectively learning and generalizing from small amounts of medical images to perform accurate segmentation tasks. We demonstrate its capabilities specifically for polyp segmentation in colonoscopy images.
arXiv Detail & Related papers (2023-11-03T20:58:44Z)
Distilling Ensemble of Explanations for Weakly-Supervised Pre-Training of Image Segmentation Models [54.49581189337848]
We propose a method that enables end-to-end pre-training of image segmentation models on classification datasets. The proposed method leverages a weighted segmentation learning procedure to pre-train the segmentation network en masse. Experimental results show that, with ImageNet accompanied by PSSL as the source dataset, the proposed end-to-end pre-training strategy successfully boosts the performance of various segmentation models.
arXiv Detail & Related papers (2022-07-04T13:02:32Z)
CEU-Net: Ensemble Semantic Segmentation of Hyperspectral Images Using Clustering [2.741266294612776]
Clustering Ensemble U-Net (CEU-Net) is a novel semantic segmentation model for hyperspectral images (HSIs). CEU-Net combines spectral information extracted from convolutional neural network (CNN) training on a cluster of landscape pixels. Our model outperforms existing state-of-the-art HSI semantic segmentation methods and achieves competitive performance with and without patching.
arXiv Detail & Related papers (2022-03-09T16:51:15Z)
Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed as Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets. This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets. In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation. We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths. In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
SCG-Net: Self-Constructing Graph Neural Networks for Semantic Segmentation [23.623276007011373]
We propose a module that learns a long-range dependency graph directly from the image and uses it to propagate contextual information efficiently. The module is optimised via a novel adaptive diagonal enhancement method and a variational lower bound. When incorporated into a neural network (SCG-Net), semantic segmentation is performed in an end-to-end manner with competitive performance.
arXiv Detail & Related papers (2020-09-03T12:13:09Z)
Semantic Segmentation With Multi Scale Spatial Attention For Self Driving Cars [2.7317088388886384]
We present a novel neural network that uses multi-scale feature fusion for accurate and efficient semantic image segmentation. We use a ResNet-based feature extractor, dilated convolutional layers in the downsampling part, and atrous convolutional layers in the upsampling part, and merge them with a concat operation. A new attention module is proposed to encode more contextual information and enhance the receptive field of the network.
arXiv Detail & Related papers (2020-06-30T20:19:09Z)
CRNet: Cross-Reference Networks for Few-Shot Segmentation [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images. With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images. Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-24T04:55:43Z)
Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time, high-performance DCNN-based method for robust semantic segmentation of urban street scenes. The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
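The last entry above reports results as mean Intersection over Union (mIoU). As a rough illustration of how that metric is usually computed, the sketch below averages per-class IoU over flattened label lists. The function name `mean_iou` and the choice to skip classes that appear in neither prediction nor ground truth are assumptions made here; individual papers and benchmarks may handle absent classes or ignore labels differently.

```python
# Illustrative sketch only: mean_iou and its conventions are assumptions,
# not the evaluation code of any paper listed here.

def mean_iou(pred, target, num_classes):
    """Average per-class IoU over flattened integer label sequences."""
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both (assumed convention)
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)
print(miou)  # per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5 up to rounding
```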
