Generalizing Neural Networks by Reflecting Deviating Data in Production
- URL: http://arxiv.org/abs/2110.02718v1
- Date: Wed, 6 Oct 2021 13:05:45 GMT
- Title: Generalizing Neural Networks by Reflecting Deviating Data in Production
- Authors: Yan Xiao and Yun Lin and Ivan Beschastnikh and Changsheng Sun and
David S. Rosenblum and Jin Song Dong
- Abstract summary: We present a runtime approach that mitigates DNN mis-predictions caused by unexpected runtime inputs to the DNN.
We use a distribution analyzer based on the distance metric learned by a Siamese network to identify "unseen" semantically-preserving inputs.
Our approach transforms those unexpected inputs into inputs from the training set that are identified as having similar semantics.
- Score: 15.498447555957773
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Trained with sufficiently large training and testing datasets, Deep Neural
Networks (DNNs) are expected to generalize. However, inputs may deviate from
the training dataset distribution in real deployments. This is a fundamental
issue with using a finite dataset. Even worse, real inputs may change over time
from the expected distribution. Taken together, these issues may lead deployed
DNNs to mis-predict in production.
In this work, we present a runtime approach that mitigates DNN
mis-predictions caused by unexpected runtime inputs to the DNN. In contrast
to previous work that considers the structure and parameters of the DNN itself,
our approach treats the DNN as a blackbox and focuses on the inputs to the DNN.
Our approach has two steps. First, it recognizes and distinguishes "unseen"
semantically-preserving inputs. For this we use a distribution analyzer based
on the distance metric learned by a Siamese network. Second, our approach
transforms those unexpected inputs into inputs from the training set that are
identified as having similar semantics. We call this process input reflection
and formulate it as a search problem over the embedding space on the training
set. This embedding space is learned by a Quadruplet network as an auxiliary
model for the subject model to improve the generalization.
We implemented a tool called InputReflector based on the above two-step
approach and evaluated it with experiments on three DNN models trained on
CIFAR-10, MNIST, and FMNIST image datasets. The results show that
InputReflector can effectively distinguish inputs that retain semantics of the
distribution (e.g., blurred, brightened, contrasted, and zoomed images) and
out-of-distribution inputs from normal inputs.
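To make the two-step approach concrete, the snippet below is a minimal sketch, not the authors' InputReflector implementation: the `embed` function stands in for the auxiliary Siamese/Quadruplet network, the distribution analyzer is reduced to distance thresholds on the nearest training embedding, and input reflection is simplified to a nearest-neighbor lookup; the names `embed`, `reflect`, `tau_seen`, `tau_ood`, and the toy data are illustrative assumptions.

```python
# Minimal sketch of the two-step idea (distribution analysis + input reflection).
# NOT the authors' InputReflector code; `embed`, both thresholds, and the toy
# data are assumptions made purely for illustration.
import numpy as np

def embed(x: np.ndarray) -> np.ndarray:
    """Stand-in for the embedding produced by the auxiliary network
    (a Siamese/Quadruplet model in the paper); here it just flattens the input."""
    return x.reshape(-1)

def nearest_training_neighbor(x, train_embeddings):
    """Distance from x's embedding to its closest training embedding."""
    dists = np.linalg.norm(train_embeddings - embed(x), axis=1)
    i = int(dists.argmin())
    return dists[i], i

def reflect(x, train_inputs, train_embeddings, tau_seen, tau_ood):
    """Step 1: classify x as in-distribution, deviating-but-similar, or OOD.
    Step 2: if deviating, return the semantically closest training input."""
    dist, i = nearest_training_neighbor(x, train_embeddings)
    if dist <= tau_seen:            # close enough to the training distribution
        return x, "in-distribution"
    if dist > tau_ood:              # too far away: flag instead of guessing
        return None, "out-of-distribution"
    return train_inputs[i], "reflected"

# Toy usage: random arrays stand in for images; the query mimics a perturbed
# (e.g. blurred or brightened) version of a training image.
rng = np.random.default_rng(0)
train_inputs = rng.normal(size=(100, 28, 28))
train_embeddings = np.stack([embed(x) for x in train_inputs])
query = train_inputs[3] + 0.3 * rng.normal(size=(28, 28))
substitute, verdict = reflect(query, train_inputs, train_embeddings,
                              tau_seen=1.0, tau_ood=50.0)
print(verdict)
```

In this sketch the two thresholds play the role of the distribution analyzer: one separates inputs the subject model can already handle from deviating-but-semantics-preserving ones, the other separates those from genuinely out-of-distribution inputs that should be rejected rather than reflected.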
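The abstract attributes the embedding space to a Quadruplet network trained as an auxiliary model but does not spell out its objective here. The snippet below shows one standard quadruplet-loss formulation (anchor and positive from the same class, two negatives from different classes, two margins) only as a plausible form of that auxiliary objective; the paper's exact variant may differ.

```python
# One common quadruplet-loss formulation (two hinge terms, two negatives).
# Illustrative assumption only; not necessarily the variant used in the paper.
import numpy as np

def quadruplet_loss(anchor, positive, negative1, negative2,
                    margin1: float = 1.0, margin2: float = 0.5) -> float:
    """anchor/positive share a label; negative1 and negative2 come from two other
    labels. Both hinge terms push negative pairs farther apart than positive pairs."""
    d_ap = np.linalg.norm(anchor - positive)      # same-class distance
    d_an = np.linalg.norm(anchor - negative1)     # anchor vs. a different class
    d_nn = np.linalg.norm(negative1 - negative2)  # two different classes
    return max(0.0, d_ap - d_an + margin1) + max(0.0, d_ap - d_nn + margin2)

# Toy embeddings: the loss is zero once negatives are pushed sufficiently far apart.
a, p = np.array([0.0, 0.0]), np.array([0.1, 0.0])
n1, n2 = np.array([3.0, 0.0]), np.array([0.0, 3.0])
print(quadruplet_loss(a, p, n1, n2))  # -> 0.0
```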
Related papers
- Use of Parallel Explanatory Models to Enhance Transparency of Neural Network Configurations for Cell Degradation Detection [18.214293024118145]
We build a parallel model to illuminate and understand the internal operation of neural networks.
We show how each layer of the RNN transforms the input distributions to increase detection accuracy.
At the same time we also discover a side effect acting to limit the improvement in accuracy.
arXiv Detail & Related papers (2024-04-17T12:22:54Z) - Deep Networks Always Grok and Here is Why [15.327649172531606]
Grokking, or delayed generalization, is a phenomenon where generalization in a deep neural network (DNN) occurs long after achieving near zero training error.
We demonstrate that grokking is actually much more widespread and materializes in a wide range of practical settings.
arXiv Detail & Related papers (2024-02-23T18:59:31Z) - Inferring Data Preconditions from Deep Learning Models for Trustworthy
Prediction in Deployment [25.527665632625627]
It is important to reason about the trustworthiness of the model's predictions with unseen data during deployment.
Existing methods for specifying and verifying traditional software are insufficient for this task.
We propose a novel technique that uses rules derived from neural network computations to infer data preconditions.
arXiv Detail & Related papers (2024-01-26T03:47:18Z) - Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z) - Boosted Dynamic Neural Networks [53.559833501288146]
A typical EDNN has multiple prediction heads at different layers of the network backbone.
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
Treating training and testing inputs differently in the two phases causes a mismatch between the training and testing data distributions.
We formulate an EDNN as an additive model inspired by gradient boosting, and propose multiple training techniques to optimize the model effectively.
arXiv Detail & Related papers (2022-11-30T04:23:12Z) - Meta Input: How to Leverage Off-the-Shelf Deep Neural Networks [29.975937981538664]
We introduce a novel approach that allows end-users to exploit pretrained DNN models in their own testing environment without modifying the models.
We present a meta input, an additional input that transforms the distribution of testing data to align with that of the training data.
As a result, end-users can exploit well-trained models in their own testing environment which can differ from the training environment.
arXiv Detail & Related papers (2022-10-21T02:11:38Z) - Invertible Neural Networks for Graph Prediction [22.140275054568985]
In this work, we address conditional generation using deep invertible neural networks.
We adopt an end-to-end training approach since our objective is to address prediction and generation in the forward and backward processes at once.
arXiv Detail & Related papers (2022-06-02T17:28:33Z) - DAAIN: Detection of Anomalous and Adversarial Input using Normalizing
Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU making it compute efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z) - A Survey on Assessing the Generalization Envelope of Deep Neural
Networks: Predictive Uncertainty, Out-of-distribution and Adversarial Samples [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art performance on numerous applications.
It is difficult to tell beforehand whether a DNN receiving an input will deliver the correct output, since its decision criteria are usually nontransparent.
This survey connects the three fields within the larger framework of investigating the generalization performance of machine learning methods and in particular DNNs.
arXiv Detail & Related papers (2020-08-21T09:12:52Z) - Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z) - Architecture Disentanglement for Deep Neural Networks [174.16176919145377]
We introduce neural architecture disentanglement (NAD) to explain the inner workings of deep neural networks (DNNs).
NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes.
Results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones.
arXiv Detail & Related papers (2020-03-30T08:34:33Z)