Increasing the Robustness of Semantic Segmentation Models with
Painting-by-Numbers
- URL: http://arxiv.org/abs/2010.05495v1
- Date: Mon, 12 Oct 2020 07:42:39 GMT
- Title: Increasing the Robustness of Semantic Segmentation Models with
Painting-by-Numbers
- Authors: Christoph Kamann, Burkhard Güssefeld, Robin Hutmacher, Jan Hendrik
Metzen, Carsten Rother
- Abstract summary: We build upon an insight from image classification that output robustness can be improved by increasing the network-bias towards object shapes.
Our basic idea is to alpha-blend a portion of the RGB training images with faked images, where each class-label is given a fixed, randomly chosen color.
We demonstrate the effectiveness of our training schema for DeepLabv3+ with various network backbones, MobileNet-V2, ResNets, and Xception, and evaluate it on the Cityscapes dataset.
- Score: 39.95214171175713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For safety-critical applications such as autonomous driving, CNNs have to be
robust with respect to unavoidable image corruptions, such as image noise.
While previous works addressed the task of robust prediction in the context of
full-image classification, we consider it for dense semantic segmentation. We
build upon an insight from image classification that output robustness can be
improved by increasing the network-bias towards object shapes. We present a new
training schema that increases this shape bias. Our basic idea is to
alpha-blend a portion of the RGB training images with faked images, where each
class-label is given a fixed, randomly chosen color that is not likely to
appear in real imagery. This forces the network to rely more strongly on shape
cues. We call this data augmentation technique "Painting-by-Numbers". We
demonstrate the effectiveness of our training schema for DeepLabv3+ with
various network backbones, MobileNet-V2, ResNets, and Xception, and evaluate it
on the Cityscapes dataset. With respect to our 16 different types of image
corruptions and 5 different network backbones, we are in 74% of the cases better than
training with clean data. For cases where we are worse than a model trained
without our training schema, it is mostly only marginally worse. However, for
some image corruptions such as images with noise, we see a considerable
performance gain of up to 25%.
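The augmentation described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' released code; the function name, argument layout, and the use of NumPy are assumptions, but the core operation (alpha-blending the RGB image with a flat-colored image in which every class label gets one fixed, randomly chosen color) follows the abstract directly.

```python
import numpy as np

def painting_by_numbers(image, label_map, num_classes, alpha, palette=None, rng=None):
    """Alpha-blend an RGB image with a flat-colored 'fake' image in which
    every class label is painted one fixed, randomly chosen color.

    image:     float array, shape (H, W, 3), values in [0, 1]
    label_map: int array, shape (H, W), per-pixel class ids
    alpha:     blend weight of the fake image (0 = original, 1 = fully fake)
    palette:   optional (num_classes, 3) array of per-class colors
    """
    rng = rng or np.random.default_rng()
    if palette is None:
        # One fixed random color per class; per the abstract, colors that
        # are unlikely to occur in real imagery force reliance on shape cues.
        palette = rng.uniform(0.0, 1.0, size=(num_classes, 3))
    fake = palette[label_map]                    # (H, W, 3) flat-colored image
    return (1.0 - alpha) * image + alpha * fake  # convex blend per pixel
```

In practice only a portion of the training batch would be blended this way, with the remainder left clean, as the abstract describes.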
Related papers
- Data Attribution for Text-to-Image Models by Unlearning Synthesized Images [71.23012718682634]
The goal of data attribution for text-to-image models is to identify the training images that most influence the generation of a new image.
We propose a new approach that efficiently identifies highly-influential images.
arXiv Detail & Related papers (2024-06-13T17:59:44Z)
- Background Invariant Classification on Infrared Imagery by Data Efficient Training and Reducing Bias in CNNs [1.2891210250935146]
Convolutional neural networks can classify objects in images very accurately.
It is well known that the attention of the network may not always be on the semantically important regions of the scene.
We propose a new two-step training procedure called "split training" to reduce this bias in CNNs on both infrared imagery and RGB data.
arXiv Detail & Related papers (2022-01-22T23:29:42Z)
- Enhanced Performance of Pre-Trained Networks by Matched Augmentation Distributions [10.74023489125222]
We propose a simple solution to address the train-test distributional shift.
We combine results for multiple random crops for a test image.
This not only matches the train time augmentation but also provides the full coverage of the input image.
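The multi-crop evaluation this summary describes can be sketched as follows. This is an illustrative reconstruction, not the paper's code; the function name, the random-crop sampling scheme, and simple probability averaging are all assumptions about how "combining results for multiple random crops" might look.

```python
import numpy as np

def multicrop_predict(image, predict_fn, crop_size, num_crops=10, rng=None):
    """Average class probabilities over random crops of a test image,
    roughly matching random-crop augmentation used at train time.

    image:      float array, shape (H, W, C)
    predict_fn: maps a crop (ch, cw, C) -> probability vector (num_classes,)
    crop_size:  (ch, cw) crop height and width
    """
    rng = rng or np.random.default_rng()
    H, W, _ = image.shape
    ch, cw = crop_size
    probs = []
    for _ in range(num_crops):
        # Sample a random top-left corner so the crop fits inside the image.
        top = rng.integers(0, H - ch + 1)
        left = rng.integers(0, W - cw + 1)
        crop = image[top:top + ch, left:left + cw]
        probs.append(predict_fn(crop))
    return np.mean(probs, axis=0)  # averaged class probabilities
```

With enough crops, the sampled regions jointly cover the full input image, which is the coverage property the summary highlights.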
arXiv Detail & Related papers (2022-01-19T22:33:00Z)
- Automated Cleanup of the ImageNet Dataset by Model Consensus, Explainability and Confident Learning [0.0]
ImageNet has been the backbone of various convolutional neural networks (CNNs) trained on ILSVRC12.
This paper describes automated applications based on model consensus, explainability and confident learning to correct labeling mistakes.
The resulting ImageNet-Clean improves model performance by 2-2.4% for SqueezeNet and EfficientNet-B0 models.
arXiv Detail & Related papers (2021-03-30T13:16:35Z)
- Image Restoration by Deep Projected GSURE [115.57142046076164]
Ill-posed inverse problems appear in many image processing applications, such as deblurring and super-resolution.
We propose a new image restoration framework that is based on minimizing a loss function that includes a "projected version" of the Generalized Stein Unbiased Risk Estimator (GSURE) and a parameterization of the latent image by a CNN.
arXiv Detail & Related papers (2021-02-04T08:52:46Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
- An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation [80.02124918255059]
Semi-supervised learning aims to boost the accuracy of a model by exploring unlabeled images.
We learn two networks to mutually teach each other.
The more reliable predictions on easy images in each network are used to teach the other network to learn about the corresponding hard images.
arXiv Detail & Related papers (2020-11-25T03:29:52Z)
- Shape-Texture Debiased Neural Network Training [50.6178024087048]
Convolutional Neural Networks are often biased towards either texture or shape, depending on the training dataset.
We develop an algorithm for shape-texture debiased learning.
Experiments show that our method successfully improves model performance on several image recognition benchmarks.
arXiv Detail & Related papers (2020-10-12T19:16:12Z)
- FU-net: Multi-class Image Segmentation Using Feedback Weighted U-net [5.193724835939252]
We present a generic deep convolutional neural network (DCNN) for multi-class image segmentation.
It is based on a well-established supervised end-to-end DCNN model, known as U-net.
arXiv Detail & Related papers (2020-04-28T13:08:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.