Evidential fully convolutional network for semantic segmentation
- URL: http://arxiv.org/abs/2103.13544v1
- Date: Thu, 25 Mar 2021 01:21:22 GMT
- Title: Evidential fully convolutional network for semantic segmentation
- Authors: Zheng Tong, Philippe Xu, Thierry Denœux
- Abstract summary: We propose a hybrid architecture composed of a fully convolutional network (FCN) and a Dempster-Shafer layer for image semantic segmentation.
Experiments show that the proposed combination improves the accuracy and calibration of semantic segmentation by assigning confusing pixels to multi-class sets.
- Score: 6.230751621285322
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a hybrid architecture composed of a fully convolutional network
(FCN) and a Dempster-Shafer layer for image semantic segmentation. In the
so-called evidential FCN (E-FCN), an encoder-decoder architecture first
extracts pixel-wise feature maps from an input image. A Dempster-Shafer layer
then computes mass functions at each pixel location based on distances to
prototypes. Finally, a utility layer performs semantic segmentation from mass
functions and allows for imprecise classification of ambiguous pixels and
outliers. We propose an end-to-end learning strategy for jointly updating the
network parameters, which can make use of soft (imprecise) labels. Experiments
using three databases (Pascal VOC 2011, MIT-scene Parsing and SIFT Flow) show
that the proposed combination improves the accuracy and calibration of semantic
segmentation by assigning confusing pixels to multi-class sets.
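The abstract does not give the equations of the Dempster-Shafer layer. As a rough illustration only, the sketch below follows the distance-based evidential classifier formulation on which such layers are commonly built: each prototype supplies a mass function whose strength decays with the squared distance to the pixel's feature vector, and the per-prototype mass functions are pooled with Dempster's rule. The function names, the parameters `alpha`, `gamma`, and `u`, and the toy values are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def dempster_combine(m1, m2):
    """Dempster's rule for mass functions whose focal sets are the K
    singletons plus the whole frame Omega (stored as the last entry)."""
    K = len(m1) - 1
    # Mass on singleton k: both agree on k, or one says k and the other Omega.
    single = m1[:K] * m2[:K] + m1[:K] * m2[K] + m1[K] * m2[:K]
    omega = m1[K] * m2[K]
    total = single.sum() + omega  # equals 1 minus the degree of conflict
    return np.append(single, omega) / total

def ds_layer(x, prototypes, alpha, gamma, u):
    """Mass function for one pixel feature vector (assumed parametrization).

    x          : (d,) feature vector of the pixel
    prototypes : (n, d) prototype vectors
    alpha      : (n,) values in (0, 1)
    gamma      : (n,) positive scale factors
    u          : (n, K) class membership degrees, each row summing to 1
    Returns a (K + 1,) array: masses on the K classes, then the mass on Omega.
    """
    d2 = np.sum((prototypes - x) ** 2, axis=1)  # squared distances
    s = alpha * np.exp(-gamma * d2)             # support from each prototype
    K = u.shape[1]
    m = np.zeros(K + 1)
    m[K] = 1.0                                  # start from total ignorance
    for s_i, u_i in zip(s, u):
        m_i = np.append(s_i * u_i, 1.0 - s_i)   # evidence from one prototype
        m = dempster_combine(m, m_i)
    return m

# Toy setting: two classes, one prototype per class (hypothetical values).
prototypes = np.array([[0.0, 0.0], [4.0, 4.0]])
alpha = np.array([0.9, 0.9])
gamma = np.array([1.0, 1.0])
u = np.eye(2)

m_near = ds_layer(np.array([0.1, 0.0]), prototypes, alpha, gamma, u)
m_far = ds_layer(np.array([100.0, 100.0]), prototypes, alpha, gamma, u)
print("near class-0 prototype:", m_near)
print("far from all prototypes:", m_far)
```

In this toy setting, a feature vector close to a prototype puts most of its mass on that prototype's class, while one far from every prototype keeps nearly all of its mass on Omega. That residual mass on Omega is what lets a later decision step treat a pixel as ambiguous or as an outlier instead of forcing a single label.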
Related papers
- Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels [53.8817160001038]
We propose a novel method, PixelCLIP, to adapt the CLIP image encoder for pixel-level understanding.
To address the challenges of leveraging masks without semantic labels, we devise an online clustering algorithm.
PixelCLIP shows significant performance improvements over CLIP and competitive results compared to caption-supervised methods.
arXiv Detail & Related papers (2024-09-30T01:13:03Z)
- MacFormer: Semantic Segmentation with Fine Object Boundaries [38.430631361558426]
We introduce a new semantic segmentation architecture, "MacFormer", which features two key components.
Firstly, using learnable agent tokens, a Mutual Agent Cross-Attention (MACA) mechanism effectively facilitates the bidirectional integration of features across encoder and decoder layers.
Secondly, a Frequency Enhancement Module (FEM) in the decoder leverages high-frequency and low-frequency components to boost features in the frequency domain.
MacFormer is shown to be compatible with various network architectures and outperforms existing methods in both accuracy and efficiency on the benchmark datasets ADE20K and Cityscapes.
arXiv Detail & Related papers (2024-08-11T05:36:10Z)
- Semi-supervised segmentation of land cover images using nonlinear canonical correlation analysis with multiple features and t-SNE [1.7000283696243563]
Image segmentation is a clustering task whereby each pixel is assigned a cluster label.
In this work, a new semi-supervised segmentation approach is proposed that requires labeling only a small number of pixels.
The proposed semi-supervised RBF-CCA algorithm has been implemented on several remotely sensed multispectral images.
arXiv Detail & Related papers (2024-01-22T17:56:07Z)
- FuseNet: Self-Supervised Dual-Path Network for Medical Image Segmentation [3.485615723221064]
FuseNet is a dual-stream framework for self-supervised semantic segmentation.
A cross-modal fusion technique extends the principles of CLIP by replacing textual data with augmented images.
Experiments on skin lesion and lung segmentation datasets demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-11-22T00:03:16Z)
- CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation [104.5033800500497]
Referring image segmentation aims at localizing all pixels of the visual objects described by a natural language sentence.
Previous works learn to straightforwardly align the sentence embedding and pixel-level embedding for highlighting the referred objects.
We propose CoupAlign, a simple yet effective multi-level visual-semantic alignment method.
arXiv Detail & Related papers (2022-12-04T08:53:42Z)
- Distilling Ensemble of Explanations for Weakly-Supervised Pre-Training of Image Segmentation Models [54.49581189337848]
We propose a method to enable the end-to-end pre-training for image segmentation models based on classification datasets.
The proposed method leverages a weighted segmentation learning procedure to pre-train the segmentation network en masse.
Experimental results show that, with ImageNet accompanied by PSSL as the source dataset, the proposed end-to-end pre-training strategy successfully boosts the performance of various segmentation models.
arXiv Detail & Related papers (2022-07-04T13:02:32Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- Language-driven Semantic Segmentation [88.21498323896475]
We present LSeg, a novel model for language-driven semantic image segmentation.
We use a text encoder to compute embeddings of descriptive input labels.
The encoder is trained with a contrastive objective to align pixel embeddings to the text embedding of the corresponding semantic class.
arXiv Detail & Related papers (2022-01-10T18:59:10Z)
- Maximize the Exploration of Congeneric Semantics for Weakly Supervised Semantic Segmentation [27.155133686127474]
We construct a graph neural network (P-GNN) based on the self-detected patches from different images that contain the same class labels.
We conduct experiments on the popular PASCAL VOC 2012 benchmarks, and our model yields state-of-the-art performance.
arXiv Detail & Related papers (2021-10-08T08:59:16Z)
- FPS-Net: A Convolutional Fusion Network for Large-Scale LiDAR Point Cloud Segmentation [30.736361776703568]
Scene understanding based on LiDAR point clouds is an essential task for autonomous cars to drive safely.
Most existing methods simply stack different point attributes/modalities as image channels to increase information capacity.
We design FPS-Net, a convolutional fusion network that exploits the uniqueness and discrepancy among the projected image channels for optimal point cloud segmentation.
arXiv Detail & Related papers (2021-03-01T04:08:28Z)
- A Holistically-Guided Decoder for Deep Representation Learning with Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose a novel holistically-guided decoder to obtain high-resolution, semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.