Related papers: Evidential fully convolutional network for semantic segmentation

URL: http://arxiv.org/abs/2103.13544v1
Date: Thu, 25 Mar 2021 01:21:22 GMT
Title: Evidential fully convolutional network for semantic segmentation
Authors: Zheng Tong, Philippe Xu, Thierry Denœux
Abstract summary: We propose a hybrid architecture composed of a fully convolutional network (FCN) and a Dempster-Shafer layer for image semantic segmentation. Experiments show that the proposed combination improves the accuracy and calibration of semantic segmentation by assigning confusing pixels to multi-class sets.
Score: 6.230751621285322
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a hybrid architecture composed of a fully convolutional network (FCN) and a Dempster-Shafer layer for image semantic segmentation. In the so-called evidential FCN (E-FCN), an encoder-decoder architecture first extracts pixel-wise feature maps from an input image. A Dempster-Shafer layer then computes mass functions at each pixel location based on distances to prototypes. Finally, a utility layer performs semantic segmentation from mass functions and allows for imprecise classification of ambiguous pixels and outliers. We propose an end-to-end learning strategy for jointly updating the network parameters, which can make use of soft (imprecise) labels. Experiments using three databases (Pascal VOC 2011, MIT-scene Parsing and SIFT Flow) show that the proposed combination improves the accuracy and calibration of semantic segmentation by assigning confusing pixels to multi-class sets.
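
The abstract does not give the layer's formulas, so the following is only a minimal sketch of how a prototype-based Dempster-Shafer layer could compute a mass function for one pixel's feature vector. It assumes each prototype supports the classes in proportion to a membership vector, with strength decaying with squared distance, and that the remaining mass goes to the whole set of classes (ignorance). The per-prototype masses are then merged with Dempster's rule. The names `alpha`, `gamma`, `memberships`, and all function names are hypothetical, and the utility layer is omitted.

```python
import numpy as np


def prototype_masses(x, prototypes, memberships, alpha, gamma):
    """Return one mass function per prototype for feature vector x.

    Each row has K singleton masses followed by the mass on the whole
    frame (ignorance). All parameter names are illustrative assumptions.
    """
    d2 = np.sum((prototypes - x) ** 2, axis=1)   # squared distances, shape (n,)
    support = alpha * np.exp(-gamma * d2)        # evidence strength in (0, 1)
    singletons = memberships * support[:, None]  # shape (n, K)
    ignorance = 1.0 - support                    # mass left on the whole frame
    return np.hstack([singletons, ignorance[:, None]])


def dempster_combine(m1, m2):
    """Dempster's rule for masses on singletons plus the whole frame."""
    s1, o1 = m1[:-1], m1[-1]
    s2, o2 = m2[:-1], m2[-1]
    singles = s1 * s2 + s1 * o2 + o1 * s2  # intersections equal to one class
    whole = o1 * o2                        # both sources are ignorant
    total = singles.sum() + whole          # equals 1 minus the conflict
    return np.append(singles, whole) / total


def pixel_mass(x, prototypes, memberships, alpha, gamma):
    """Combine the evidence of all prototypes for a single pixel."""
    masses = prototype_masses(x, prototypes, memberships, alpha, gamma)
    combined = masses[0]
    for m in masses[1:]:
        combined = dempster_combine(combined, m)
    return combined


# Toy setup: two classes, one prototype per class, 2-D features.
prototypes = np.array([[0.0, 0.0], [4.0, 4.0]])
memberships = np.array([[1.0, 0.0], [0.0, 1.0]])
alpha = np.array([0.9, 0.9])
gamma = np.array([0.5, 0.5])

near_class0 = pixel_mass(np.array([0.1, 0.0]), prototypes, memberships, alpha, gamma)
ambiguous = pixel_mass(np.array([2.0, 2.0]), prototypes, memberships, alpha, gamma)
```

In this toy setup, a feature vector close to the first prototype puts most of its mass on the first class, while a vector far from both prototypes keeps most of its mass on the whole frame. A decision layer can use that signal to assign such a pixel to a set of classes instead of forcing a single label.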

Related papers

SQ-GAN: Semantic Image Communications Using Masked Vector Quantization [54.35918290143049]
This work introduces the Semantically Masked Vector Quantized Generative Adversarial Network (SQ-GAN). It is a novel approach integrating semantically driven image coding and vector quantization to optimize image compression for semantic/task-oriented communications. SQ-GAN outperforms state-of-the-art image compression schemes such as JPEG2000, BPG, and deep-learning based methods across multiple metrics.
arXiv Detail & Related papers (2025-02-13T17:35:57Z)
Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels [53.8817160001038]
We propose a novel method, PixelCLIP, to adapt the CLIP image encoder for pixel-level understanding. To address the challenges of leveraging masks without semantic labels, we devise an online clustering algorithm. PixelCLIP shows significant performance improvements over CLIP and competitive results compared to caption-supervised methods.
arXiv Detail & Related papers (2024-09-30T01:13:03Z)
MacFormer: Semantic Segmentation with Fine Object Boundaries [38.430631361558426]
We introduce a new semantic segmentation architecture, MacFormer, which features two key components. First, using learnable agent tokens, a Mutual Agent Cross-Attention (MACA) mechanism facilitates the bidirectional integration of features across encoder and decoder layers. Second, a Frequency Enhancement Module (FEM) in the decoder leverages high-frequency and low-frequency components to boost features in the frequency domain. MacFormer is demonstrated to be compatible with various network architectures and outperforms existing methods in both accuracy and efficiency on the benchmark datasets ADE20K and Cityscapes.
arXiv Detail & Related papers (2024-08-11T05:36:10Z)
Semi-supervised segmentation of land cover images using nonlinear canonical correlation analysis with multiple features and t-SNE [1.7000283696243563]
Image segmentation is a clustering task whereby each pixel is assigned a cluster label. In this work, a new semi-supervised segmentation approach is proposed that requires labeling only a small number of pixels. The proposed semi-supervised RBF-CCA algorithm has been implemented on several remotely sensed multispectral images.
arXiv Detail & Related papers (2024-01-22T17:56:07Z)
FuseNet: Self-Supervised Dual-Path Network for Medical Image Segmentation [3.485615723221064]
FuseNet is a dual-stream framework for self-supervised semantic segmentation. A cross-modal fusion technique extends the principles of CLIP by replacing textual data with augmented images. Experiments on skin lesion and lung segmentation datasets demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-11-22T00:03:16Z)
CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation [104.5033800500497]
Referring image segmentation aims at localizing all pixels of the visual objects described by a natural language sentence. Previous works learn to straightforwardly align the sentence embedding and pixel-level embedding for highlighting the referred objects. We propose CoupAlign, a simple yet effective multi-level visual-semantic alignment method.
arXiv Detail & Related papers (2022-12-04T08:53:42Z)
Distilling Ensemble of Explanations for Weakly-Supervised Pre-Training of Image Segmentation Models [54.49581189337848]
We propose a method to enable end-to-end pre-training for image segmentation models based on classification datasets. The proposed method leverages a weighted segmentation learning procedure to pre-train the segmentation network en masse. Experimental results show that, with ImageNet accompanied by PSSL as the source dataset, the proposed end-to-end pre-training strategy successfully boosts the performance of various segmentation models.
arXiv Detail & Related papers (2022-07-04T13:02:32Z)
Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes. Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
Language-driven Semantic Segmentation [88.21498323896475]
We present LSeg, a novel model for language-driven semantic image segmentation. We use a text encoder to compute embeddings of descriptive input labels. The encoder is trained with a contrastive objective to align pixel embeddings to the text embedding of the corresponding semantic class.
arXiv Detail & Related papers (2022-01-10T18:59:10Z)
Maximize the Exploration of Congeneric Semantics for Weakly Supervised Semantic Segmentation [27.155133686127474]
We construct a graph neural network (P-GNN) based on the self-detected patches from different images that contain the same class labels. We conduct experiments on the popular PASCAL VOC 2012 benchmarks, and our model yields state-of-the-art performance.
arXiv Detail & Related papers (2021-10-08T08:59:16Z)
FPS-Net: A Convolutional Fusion Network for Large-Scale LiDAR Point Cloud Segmentation [30.736361776703568]
Scene understanding based on LiDAR point cloud is an essential task for autonomous cars to drive safely. Most existing methods simply stack different point attributes/modalities as image channels to increase information capacity. We design FPS-Net, a convolutional fusion network that exploits the uniqueness and discrepancy among the projected image channels for optimal point cloud segmentation.
arXiv Detail & Related papers (2021-03-01T04:08:28Z)
A Holistically-Guided Decoder for Deep Representation Learning with Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps. We propose a novel holistically-guided decoder to obtain high-resolution, semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.