Background Invariant Classification on Infrared Imagery by Data
Efficient Training and Reducing Bias in CNNs
- URL: http://arxiv.org/abs/2201.09144v1
- Date: Sat, 22 Jan 2022 23:29:42 GMT
- Title: Background Invariant Classification on Infrared Imagery by Data
Efficient Training and Reducing Bias in CNNs
- Authors: Maliha Arif, Calvin Yong, Abhijit Mahalanobis
- Abstract summary: Convolutional neural networks can classify objects in images very accurately.
It is well known that the attention of the network may not always be on the semantically important regions of the scene.
We propose a new two-step training procedure called split training to reduce this bias in CNNs on both infrared imagery and RGB data.
- Score: 1.2891210250935146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Even though convolutional neural networks can classify objects in images very
accurately, it is well known that the attention of the network may not always
be on the semantically important regions of the scene. It has been observed
that networks often learn background textures which are not relevant to the
object of interest. In turn this makes the networks susceptible to variations
and changes in the background which negatively affect their performance. We
propose a new two-step training procedure called \textit{split training} to
reduce this bias in CNNs on both Infrared imagery and RGB data. Our split
training procedure has two steps: first, using an MSE loss, train the early
layers of the network on images with background to match the activations of the
same network trained on images without background; then, with these layers
frozen, train the rest of the network with a cross-entropy loss to classify the
objects. Our training method outperforms the traditional training procedure on
both a simple CNN architecture and resource-intensive deep CNNs such as VGG and
DenseNet, achieves higher accuracy, and learns to mimic human vision, which
focuses more on shape and structure than on background.
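As a concrete illustration of the two-step procedure, the following is a minimal PyTorch sketch, not the authors' implementation: the toy architecture, the split point between `features` and `classifier`, the optimizer settings, and the paired background/background-free loaders are all assumptions made for the example.

```python
import torch
import torch.nn as nn

# Toy CNN split into "early" feature layers and a classification head.
class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 4 * 4, num_classes)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

def split_train(teacher, student, loader_bg, loader_nobg, epochs=5):
    """Step 1: match the activations of a teacher trained on background-free
    images; step 2: freeze those layers and train the classifier head."""
    mse, ce = nn.MSELoss(), nn.CrossEntropyLoss()

    # Step 1: MSE between student features (images with background) and
    # teacher features (images without background) for the same objects,
    # assuming the two loaders are aligned sample-by-sample.
    opt = torch.optim.Adam(student.features.parameters(), lr=1e-3)
    teacher.eval()
    for _ in range(epochs):
        for (x_bg, _), (x_clean, _) in zip(loader_bg, loader_nobg):
            with torch.no_grad():
                target = teacher.features(x_clean)
            loss = mse(student.features(x_bg), target)
            opt.zero_grad(); loss.backward(); opt.step()

    # Step 2: freeze the matched layers, train the rest with cross-entropy.
    for p in student.features.parameters():
        p.requires_grad = False
    opt = torch.optim.Adam(student.classifier.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x_bg, y in loader_bg:
            loss = ce(student(x_bg), y)
            opt.zero_grad(); loss.backward(); opt.step()
    return student
```

The sketch assumes the teacher has already been trained on background-free crops; only its feature activations are used, as targets for the student's early layers.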
Related papers
- Neural Maximum A Posteriori Estimation on Unpaired Data for Motion
Deblurring [87.97330195531029]
We propose a Neural Maximum A Posteriori (NeurMAP) estimation framework for training neural networks to recover blind motion information and sharp content from unpaired data.
The proposed NeurMAP applies to existing deblurring neural networks, and is the first framework that enables training image deblurring networks on unpaired datasets.
arXiv Detail & Related papers (2022-04-26T08:09:47Z) - Is Deep Image Prior in Need of a Good Education? [57.3399060347311]
Deep image prior was introduced as an effective prior for image reconstruction.
Despite its impressive reconstructive properties, the approach is slow when compared to learned or traditional reconstruction techniques.
We develop a two-stage learning paradigm to address the computational challenge.
arXiv Detail & Related papers (2021-11-23T15:08:26Z) - Weakly-supervised fire segmentation by visualizing intermediate CNN
layers [82.75113406937194]
Fire localization in images and videos is an important step for an autonomous system to combat fire incidents.
We consider weakly supervised segmentation of fire in images, in which only image labels are used to train the network.
We show that in the case of fire segmentation, which is a binary segmentation problem, the mean value of the features in a mid-layer of a classification CNN can perform better than the conventional Class Activation Mapping (CAM) method.
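A rough sketch of the comparison follows; the untrained ResNet-18 backbone, the choice of `layer2` as the mid-layer, and the mean-value threshold are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(num_classes=2)  # fire / no-fire; untrained here, shapes only
model.eval()

feats = {}
model.layer2.register_forward_hook(lambda m, i, o: feats.update(mid=o))
model.layer4.register_forward_hook(lambda m, i, o: feats.update(last=o))

x = torch.randn(1, 3, 224, 224)   # stand-in for a fire image
_ = model(x)

# (a) Mean of mid-layer features: average over channels, upsample, threshold.
mid_map = feats["mid"].mean(dim=1, keepdim=True)                  # (1,1,H,W)
mid_map = F.interpolate(mid_map, size=x.shape[-2:],
                        mode="bilinear", align_corners=False)
mid_mask = mid_map > mid_map.mean()                               # crude binary mask

# (b) Conventional CAM: weight last-layer channels by the fire-class weights.
w = model.fc.weight[1].view(1, -1, 1, 1)                          # class index 1 = "fire"
cam = (feats["last"] * w).sum(dim=1, keepdim=True)
cam = F.interpolate(cam, size=x.shape[-2:],
                    mode="bilinear", align_corners=False)
cam_mask = cam > cam.mean()
```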
arXiv Detail & Related papers (2021-11-16T11:56:28Z) - ResMLP: Feedforward networks for image classification with
data-efficient training [73.26364887378597]
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification.
We will share our code based on the Timm library and pre-trained models.
arXiv Detail & Related papers (2021-05-07T17:31:44Z) - Image Restoration by Deep Projected GSURE [115.57142046076164]
Ill-posed inverse problems appear in many image processing applications, such as deblurring and super-resolution.
We propose a new image restoration framework that is based on minimizing a loss function that includes a "projected-version" of the Generalized Stein Unbiased Risk Estimator (GSURE) and parameterization of the latent image by a CNN.
arXiv Detail & Related papers (2021-02-04T08:52:46Z) - Increasing the Robustness of Semantic Segmentation Models with
Painting-by-Numbers [39.95214171175713]
We build upon an insight from image classification that the output can be improved by increasing the network's bias towards object shapes.
Our basic idea is to alpha-blend a portion of the RGB training images with faked images, where each class-label is given a fixed, randomly chosen color.
We demonstrate the effectiveness of our training schema for DeepLabv3+ with various network backbones, MobileNet-V2, ResNets, and Xception, and evaluate it on the Cityscapes dataset.
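The augmentation itself is easy to sketch; the function below is an illustrative PyTorch version with an assumed alpha range and a random palette, not the authors' code.

```python
import torch

def painting_by_numbers(image, label_map, num_classes, alpha_max=0.5,
                        palette=None):
    """Alpha-blend an RGB image with a 'fake' image in which every class
    label is painted a fixed, randomly chosen color (sketch only)."""
    if palette is None:
        palette = torch.rand(num_classes, 3)      # fixed random color per class
    fake = palette[label_map].permute(2, 0, 1)    # (3, H, W) color-by-label image
    alpha = torch.rand(1) * alpha_max             # blend strength for this sample
    return (1 - alpha) * image + alpha * fake

# Usage: image is a (3, H, W) float tensor in [0, 1], label_map is (H, W) long.
img = torch.rand(3, 64, 64)
lbl = torch.randint(0, 19, (64, 64))              # e.g. 19 Cityscapes classes
augmented = painting_by_numbers(img, lbl, num_classes=19)
```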
arXiv Detail & Related papers (2020-10-12T07:42:39Z) - Deep Artifact-Free Residual Network for Single Image Super-Resolution [0.2399911126932526]
We propose Deep Artifact-Free Residual (DAFR) network which uses the merits of both residual learning and usage of ground-truth image as target.
Our framework uses a deep model to extract the high-frequency information which is necessary for high-quality image reconstruction.
Our experimental results show that the proposed method achieves better quantitative and qualitative image quality compared to the existing methods.
arXiv Detail & Related papers (2020-09-25T20:53:55Z) - The Neural Tangent Link Between CNN Denoisers and Non-Local Filters [4.254099382808598]
Convolutional Neural Networks (CNNs) are now a well-established tool for solving computational imaging problems.
We introduce a formal link between such networks, through their neural tangent kernel (NTK), and well-known non-local filtering techniques.
We evaluate our findings via extensive image denoising experiments.
arXiv Detail & Related papers (2020-06-03T16:50:54Z) - Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embeddings of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
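One way to picture the scheme is a Gaussian low-pass filter applied after a convolution, with the smoothing annealed towards the identity as training proceeds; the kernel size and decay schedule below are assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF

class SmoothedConv(nn.Module):
    """Conv layer whose output is blurred by a Gaussian low-pass filter;
    the blur is annealed towards the identity as training progresses."""
    def __init__(self, in_ch, out_ch, sigma_start=2.0):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.sigma = sigma_start

    def anneal(self, decay=0.9):
        self.sigma = max(self.sigma * decay, 1e-3)  # curriculum: reduce smoothing

    def forward(self, x):
        out = self.conv(x)
        if self.sigma > 1e-2:                       # near-zero sigma: skip the blur
            out = TF.gaussian_blur(out, kernel_size=5, sigma=self.sigma)
        return out

# Usage: call layer.anneal() once per epoch so early training sees smoothed
# (low-frequency) features and later training sees the full feature maps.
layer = SmoothedConv(3, 16)
y = layer(torch.randn(8, 3, 32, 32))
```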
arXiv Detail & Related papers (2020-03-03T07:27:44Z) - Retrain or not retrain? -- efficient pruning methods of deep CNN
networks [0.30458514384586394]
Convolutional neural networks (CNNs) play a major role in image processing tasks such as image classification, object detection, and semantic segmentation.
Very often, CNNs have from several to hundreds of stacked layers and several megabytes of weights.
One of the possible methods to reduce complexity and memory footprint is pruning.
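For illustration only, here is a generic magnitude-based pruning pass using PyTorch's built-in utilities (this is not the specific pruning method the paper evaluates), with the optional retraining step marked.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(32 * 32 * 32, 10),
)

# Remove the 60% smallest-magnitude weights in every conv/linear layer.
for module in model.modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        prune.l1_unstructured(module, name="weight", amount=0.6)

# Optional: a short retraining (fine-tuning) pass would go here to recover
# accuracy lost to pruning; the "retrain or not" trade-off is what the
# paper studies.

# Make the pruning permanent and check how many weights were zeroed.
for module in model.modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        prune.remove(module, "weight")
zeroed = sum((m.weight == 0).sum().item() for m in model.modules()
             if isinstance(m, (nn.Conv2d, nn.Linear)))
print("zeroed weights:", zeroed)
```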
arXiv Detail & Related papers (2020-02-12T23:24:28Z)