Input Similarity from the Neural Network Perspective
- URL: http://arxiv.org/abs/2102.05262v1
- Date: Wed, 10 Feb 2021 04:57:30 GMT
- Title: Input Similarity from the Neural Network Perspective
- Authors: Guillaume Charpiat, Nicolas Girard, Loris Felardos, Yuliya Tarabalka
- Abstract summary: A neural network trained on a dataset with noisy labels reaches almost perfect accuracy.
We show how to use a similarity measure to estimate sample density.
We also propose to enforce that examples known to be similar should also be seen as similar by the network.
- Score: 7.799648230758492
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We first exhibit a multimodal image registration task, for which a neural
network trained on a dataset with noisy labels reaches almost perfect accuracy,
far beyond noise variance. This surprising auto-denoising phenomenon can be
explained as a noise averaging effect over the labels of similar input
examples. This effect theoretically grows with the number of similar examples;
the question is then to define and estimate the similarity of examples.
We express a proper definition of similarity, from the neural network
perspective, i.e. we quantify how undissociable two inputs $A$ and $B$ are,
taking a machine learning viewpoint: how much a parameter variation designed to
change the output for $A$ would impact the output for $B$ as well?
We study the mathematical properties of this similarity measure, and show how
to use it on a trained network to estimate sample density, in low complexity,
enabling new types of statistical analysis for neural networks. We analyze data
by retrieving samples perceived as similar by the network, and are able to
quantify the denoising effect without requiring true labels. We also propose,
during training, to enforce that examples known to be similar should also be
seen as similar by the network, and notice speed-up training effects for
certain datasets.
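The abstract describes the similarity only informally, as the extent to which a parameter variation that changes the output for $A$ also changes the output for $B$. To first order, that coupling is governed by the inner product of the parameter gradients of the output at the two inputs, and averaging such similarities over a dataset gives a crude density proxy. The PyTorch sketch below illustrates this reading under explicit assumptions (a scalar-output `model`; the cosine normalization and the helper names `param_gradient`, `similarity`, `density_estimate` are ours); it is a minimal sketch, not the authors' implementation.

```python
import torch

def param_gradient(model, x):
    """Flattened gradient of the (scalar) network output at input x
    with respect to all trainable parameters."""
    out = model(x.unsqueeze(0)).squeeze()  # assumes a scalar-output network
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(out, params)
    return torch.cat([g.reshape(-1) for g in grads])

def similarity(model, x_a, x_b, eps=1e-12):
    """Cosine of the parameter gradients at two inputs: how much a parameter
    step that changes the output for A also moves the output for B."""
    g_a = param_gradient(model, x_a)
    g_b = param_gradient(model, x_b)
    return torch.dot(g_a, g_b) / (g_a.norm() * g_b.norm() + eps)

def density_estimate(model, x, dataset):
    """Crude density proxy: mean similarity between x and the dataset samples."""
    sims = [similarity(model, x, x_i) for x_i in dataset]
    return torch.stack(sims).mean()
```

For a multi-output network one gradient per output coordinate (or per chosen loss) would be needed, and the paper develops low-complexity estimators rather than the naive per-pair loop used here; the sketch only makes the first-order intuition concrete.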
Related papers
- A Novel Explainable Out-of-Distribution Detection Approach for Spiking Neural Networks [6.100274095771616]
This work presents a novel OoD detector that can identify whether test examples input to a Spiking Neural Network belong to the distribution of the data over which it was trained.
We characterize the internal activations of the hidden layers of the network in the form of spike count patterns.
A local explanation method is devised to produce attribution maps revealing which parts of the input instance push most towards the detection of an example as an OoD sample.
arXiv Detail & Related papers (2022-09-30T11:16:35Z) - Understanding Weight Similarity of Neural Networks via Chain Normalization Rule and Hypothesis-Training-Testing [58.401504709365284]
We present a weight similarity measure that can quantify the weight similarity of non-convolutional neural networks.
We first normalize the weights of neural networks by a chain normalization rule, which is used to introduce weight-training representation learning.
We extend the traditional hypothesis-testing method to validate the hypothesis on the weight similarity of neural networks.
arXiv Detail & Related papers (2022-08-08T19:11:03Z) - Learning from Data with Noisy Labels Using Temporal Self-Ensemble [11.245833546360386]
Deep neural networks (DNNs) have an enormous capacity to memorize noisy labels.
Current state-of-the-art methods present a co-training scheme that trains dual networks using samples associated with small losses.
We propose a simple yet effective robust training scheme that operates by training only a single network.
arXiv Detail & Related papers (2022-07-21T08:16:31Z) - Regularization by Misclassification in ReLU Neural Networks [3.288086999241324]
We study the implicit bias of ReLU neural networks trained by a variant of SGD where, at each step, the label is changed with probability $p$ to a random label (a minimal sketch of this label-flipping step appears after this list).
We show that label noise propels the network to a sparse solution in the following sense: for a typical input, a small fraction of neurons are active, and the firing pattern of the hidden layers is sparser.
arXiv Detail & Related papers (2021-11-03T11:42:38Z) - Slope and generalization properties of neural networks [0.0]
We show that the distribution of the slope of a well-trained neural network classifier is generally independent of the width of the layers in a fully connected network.
The slope is of similar size throughout the relevant volume and varies smoothly; it also behaves as predicted when the examples are rescaled.
We discuss possible applications of the slope concept, such as using it as a part of the loss function or stopping criterion during network training, or ranking data sets in terms of their complexity.
arXiv Detail & Related papers (2021-07-03T17:54:27Z) - Adversarial Examples Detection with Bayesian Neural Network [57.185482121807716]
We propose a new framework to detect adversarial examples motivated by the observations that random components can improve the smoothness of predictors.
We propose a novel Bayesian adversarial example detector, short for BATer, to improve the performance of adversarial example detection.
arXiv Detail & Related papers (2021-05-18T15:51:24Z) - Toward Scalable and Unified Example-based Explanation and Outlier Detection [128.23117182137418]
We argue for a broader adoption of prototype-based student networks capable of providing an example-based explanation for their prediction.
We show that our prototype-based networks, which go beyond similarity kernels, deliver meaningful explanations and promising outlier detection results without compromising classification accuracy.
arXiv Detail & Related papers (2020-11-11T05:58:17Z) - Category-Learning with Context-Augmented Autoencoder [63.05016513788047]
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way that the outcome of the transformation is predictable by an auxiliary network.
arXiv Detail & Related papers (2020-10-10T14:04:44Z) - What Do Neural Networks Learn When Trained With Random Labels? [20.54410239839646]
We study deep neural networks (DNNs) trained on natural image data with entirely random labels.
We show analytically for convolutional and fully connected networks that an alignment between the principal components of network parameters and data takes place when training with random labels.
We show how this alignment produces a positive transfer: networks pre-trained with random labels train faster downstream compared to training from scratch.
arXiv Detail & Related papers (2020-06-18T12:07:22Z) - Robust and On-the-fly Dataset Denoising for Image Classification [72.10311040730815]
On-the-fly Data Denoising (ODD) is robust to mislabeled examples, while introducing almost zero computational overhead compared to standard training.
ODD is able to achieve state-of-the-art results on a wide range of datasets including real-world ones such as WebVision and Clothing1M.
arXiv Detail & Related papers (2020-03-24T03:59:26Z) - Learning with Out-of-Distribution Data for Audio Classification [60.48251022280506]
We show that detecting and relabelling certain OOD instances, rather than discarding them, can have a positive effect on learning.
The proposed method is shown to improve the performance of convolutional neural networks by a significant margin.
arXiv Detail & Related papers (2020-02-11T21:08:06Z)
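To make the label-noise training variant from the "Regularization by Misclassification in ReLU Neural Networks" entry above concrete, the sketch below replaces each label, with probability $p$, by a class drawn uniformly at random before a standard SGD step. It is a minimal illustration under our own assumptions (uniform flipping over all classes, cross-entropy loss, the helper name `noisy_sgd_step`), not that paper's code.

```python
import torch
import torch.nn.functional as F

def noisy_sgd_step(model, optimizer, x, y, num_classes, p=0.1):
    """One training step in which each label is replaced, with probability p,
    by a uniformly random class, before the usual forward/backward pass."""
    flip = torch.rand(y.shape[0], device=y.device) < p
    random_labels = torch.randint(0, num_classes, y.shape, device=y.device)
    noisy_y = torch.where(flip, random_labels, y)

    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), noisy_y)
    loss.backward()
    optimizer.step()
    return loss.item()
```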