Increasing the Robustness of Semantic Segmentation Models with
Painting-by-Numbers
- URL: http://arxiv.org/abs/2010.05495v1
- Date: Mon, 12 Oct 2020 07:42:39 GMT
- Title: Increasing the Robustness of Semantic Segmentation Models with
Painting-by-Numbers
- Authors: Christoph Kamann, Burkhard Güssefeld, Robin Hutmacher, Jan Hendrik
Metzen, Carsten Rother
- Abstract summary: We build upon an insight from image classification that output robustness can be improved by increasing the network-bias towards object shapes.
Our basic idea is to alpha-blend a portion of the RGB training images with faked images, where each class-label is given a fixed, randomly chosen color.
We demonstrate the effectiveness of our training schema for DeepLabv3+ with various network backbones, MobileNet-V2, ResNets, and Xception, and evaluate it on the Cityscapes dataset.
- Score: 39.95214171175713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For safety-critical applications such as autonomous driving, CNNs have to be
robust with respect to unavoidable image corruptions, such as image noise.
While previous works addressed the task of robust prediction in the context of
full-image classification, we consider it for dense semantic segmentation. We
build upon an insight from image classification that output robustness can be
improved by increasing the network-bias towards object shapes. We present a new
training schema that increases this shape bias. Our basic idea is to
alpha-blend a portion of the RGB training images with faked images, where each
class-label is given a fixed, randomly chosen color that is not likely to
appear in real imagery. This forces the network to rely more strongly on shape
cues. We call this data augmentation technique "Painting-by-Numbers". We
demonstrate the effectiveness of our training schema for DeepLabv3+ with
various network backbones, MobileNet-V2, ResNets, and Xception, and evaluate it
on the Cityscapes dataset. With respect to our 16 different types of image
corruptions and 5 different network backbones, we are in 74% of the cases better than
training with clean data. For cases where we are worse than a model trained
without our training schema, it is mostly only marginally worse. However, for
some image corruptions such as images with noise, we see a considerable
performance gain of up to 25%.
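The augmentation described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' released code; the function name, argument layout, and the use of NumPy are assumptions, but the core operation (alpha-blending the RGB image with a flat-colored image in which every class label gets one fixed, randomly chosen color) follows the abstract directly.

```python
import numpy as np

def painting_by_numbers(image, label_map, num_classes, alpha, palette=None, rng=None):
    """Alpha-blend an RGB image with a flat-colored 'fake' image in which
    every class label is painted one fixed, randomly chosen color.

    image:     float array, shape (H, W, 3), values in [0, 1]
    label_map: int array, shape (H, W), per-pixel class ids
    alpha:     blend weight of the fake image (0 = original, 1 = fully fake)
    palette:   optional (num_classes, 3) array of per-class colors
    """
    rng = rng or np.random.default_rng()
    if palette is None:
        # One fixed random color per class; per the abstract, colors that
        # are unlikely to occur in real imagery force reliance on shape cues.
        palette = rng.uniform(0.0, 1.0, size=(num_classes, 3))
    fake = palette[label_map]                    # (H, W, 3) flat-colored image
    return (1.0 - alpha) * image + alpha * fake  # convex blend per pixel
```

In practice only a portion of the training batch would be blended this way, with the remainder left clean, as the abstract describes.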
Related papers
- Data Attribution for Text-to-Image Models by Unlearning Synthesized Images [71.23012718682634]
The goal of data attribution for text-to-image models is to identify the training images that most influence the generation of a new image.
We propose a new approach that efficiently identifies highly-influential images.
arXiv Detail & Related papers (2024-06-13T17:59:44Z)
- Background Invariant Classification on Infrared Imagery by Data Efficient Training and Reducing Bias in CNNs [1.2891210250935146]
Convolutional neural networks can classify objects in images very accurately.
It is well known that the attention of the network may not always be on the semantically important regions of the scene.
We propose a new two-step training procedure called "split training" to reduce this bias in CNNs on both infrared imagery and RGB data.
arXiv Detail & Related papers (2022-01-22T23:29:42Z)
- Enhanced Performance of Pre-Trained Networks by Matched Augmentation Distributions [10.74023489125222]
We propose a simple solution to address the train-test distributional shift.
We combine results for multiple random crops for a test image.
This not only matches the train time augmentation but also provides the full coverage of the input image.
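The multi-crop evaluation this summary describes can be sketched as follows. This is an illustrative reconstruction, not the paper's code; the function name, the random-crop sampling scheme, and simple probability averaging are all assumptions about how "combining results for multiple random crops" might look.

```python
import numpy as np

def multicrop_predict(image, predict_fn, crop_size, num_crops=10, rng=None):
    """Average class probabilities over random crops of a test image,
    roughly matching random-crop augmentation used at train time.

    image:      float array, shape (H, W, C)
    predict_fn: maps a crop (ch, cw, C) -> probability vector (num_classes,)
    crop_size:  (ch, cw) crop height and width
    """
    rng = rng or np.random.default_rng()
    H, W, _ = image.shape
    ch, cw = crop_size
    probs = []
    for _ in range(num_crops):
        # Sample a random top-left corner so the crop fits inside the image.
        top = rng.integers(0, H - ch + 1)
        left = rng.integers(0, W - cw + 1)
        crop = image[top:top + ch, left:left + cw]
        probs.append(predict_fn(crop))
    return np.mean(probs, axis=0)  # averaged class probabilities
```

With enough crops, the sampled regions jointly cover the full input image, which is the coverage property the summary highlights.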
arXiv Detail & Related papers (2022-01-19T22:33:00Z)
- Automated Cleanup of the ImageNet Dataset by Model Consensus, Explainability and Confident Learning [0.0]
ImageNet has been the backbone of various convolutional neural networks (CNNs) trained on ILSVRC12.
This paper describes automated applications based on model consensus, explainability and confident learning to correct labeling mistakes.
The resulting ImageNet-Clean improves model performance by 2-2.4% for SqueezeNet and EfficientNet-B0 models.
arXiv Detail & Related papers (2021-03-30T13:16:35Z)
- Image Restoration by Deep Projected GSURE [115.57142046076164]
Ill-posed inverse problems appear in many image processing applications, such as deblurring and super-resolution.
We propose a new image restoration framework that is based on minimizing a loss function that includes a "projected version" of the Generalized Stein Unbiased Risk Estimator (GSURE) and a parameterization of the latent image by a CNN.
arXiv Detail & Related papers (2021-02-04T08:52:46Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
- An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation [80.02124918255059]
Semi-supervised learning aims to boost the accuracy of a model by exploring unlabeled images.
We learn two networks to mutually teach each other.
The more reliable predictions on easy images in each network are used to teach the other network to learn about the corresponding hard images.
arXiv Detail & Related papers (2020-11-25T03:29:52Z)
- Shape-Texture Debiased Neural Network Training [50.6178024087048]
Convolutional Neural Networks are often biased towards either texture or shape, depending on the training dataset.
We develop an algorithm for shape-texture debiased learning.
Experiments show that our method successfully improves model performance on several image recognition benchmarks.
arXiv Detail & Related papers (2020-10-12T19:16:12Z)
- FU-net: Multi-class Image Segmentation Using Feedback Weighted U-net [5.193724835939252]
We present a generic deep convolutional neural network (DCNN) for multi-class image segmentation.
It is based on a well-established supervised end-to-end DCNN model, known as U-net.
arXiv Detail & Related papers (2020-04-28T13:08:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.