Distribution Mismatch Correction for Improved Robustness in Deep Neural
Networks
- URL: http://arxiv.org/abs/2110.01955v1
- Date: Tue, 5 Oct 2021 11:36:25 GMT
- Title: Distribution Mismatch Correction for Improved Robustness in Deep Neural
Networks
- Authors: Alexander Fuchs, Christian Knoll, Franz Pernkopf
- Abstract summary: Normalization methods increase a network's vulnerability to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
- Score: 86.42889611784855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks rely heavily on normalization methods to improve their
performance and learning behavior. Although normalization methods spurred the
development of increasingly deep and efficient architectures, they also
increase the vulnerability to noise and input corruptions. In most
applications, however, noise is ubiquitous and diverse; this can lead to
complete failure of machine learning systems, as they cannot cope with
mismatches between the training-time and test-time input distributions. The
most common normalization method, batch normalization, reduces the distribution
shift during training but is agnostic to changes in the input distribution
during test time. This makes batch normalization prone to performance
degradation whenever noise is present during test-time. Sample-based
normalization methods can correct linear transformations of the activation
distribution but cannot mitigate changes in the distribution shape; this makes
the network vulnerable to distribution changes that cannot be reflected in the
normalization parameters. We propose an unsupervised non-parametric
distribution correction method that adapts the activation distribution of each
layer. This reduces the mismatch between the training and test-time
distribution by minimizing the 1-D Wasserstein distance. In our experiments, we
empirically show that the proposed method effectively reduces the impact of
intense image corruptions and thus improves the classification performance
without the need for retraining or fine-tuning the model.
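For intuition, here is a minimal sketch of the core correction step: per-channel quantile matching, which for one-dimensional distributions is exactly the monotone transport map that minimizes the 1-D Wasserstein distance. All function names are illustrative; the paper's layer-wise procedure contains more detail than this sketch.

    import numpy as np

    def fit_reference(train_activations, n_quantiles=256):
        # Store empirical quantiles of one channel's activations on clean
        # training data; this is the non-parametric reference distribution.
        qs = np.linspace(0.0, 1.0, n_quantiles)
        return np.quantile(train_activations, qs)

    def correct(test_activations, ref_quantiles):
        # Map each test activation through F_train^{-1}(F_test(x)).
        # For 1-D distributions this monotone rearrangement minimizes
        # the 1-D Wasserstein distance to the training distribution.
        n = len(test_activations)
        ranks = np.argsort(np.argsort(test_activations))
        cdf = (ranks + 0.5) / n                      # empirical test CDF
        qs = np.linspace(0.0, 1.0, len(ref_quantiles))
        return np.interp(cdf, qs, ref_quantiles)     # training quantile lookup

    # Example: a shifted, heavy-tailed "corrupted" batch is pulled back
    # toward the training-time statistics without any retraining.
    rng = np.random.default_rng(0)
    ref = fit_reference(rng.normal(0.0, 1.0, 10_000))
    corrupted = rng.standard_t(df=3, size=512) + 2.0
    corrected = correct(corrupted, ref)
    print(corrupted.mean(), corrected.mean())        # mean shift largely removed

Because the map is monotone and non-parametric, it can correct changes in distribution shape that sample-based normalization (which only adjusts mean and variance) cannot.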
Related papers
- Protected Test-Time Adaptation via Online Entropy Matching: A Betting Approach [14.958884168060097]
We present a novel approach for test-time adaptation via online self-training.
Our approach combines concepts from betting martingales and online learning to form a detection tool capable of reacting to distribution shifts.
Experimental results demonstrate that our approach improves test-time accuracy under distribution shifts while maintaining accuracy and calibration in their absence.
arXiv Detail & Related papers (2024-08-14T12:40:57Z)
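As a rough illustration of the entry above, here is a hedged sketch of online self-training gated by a shift detector. The paper's detector is a betting martingale over the model's outputs; the exponential moving average of prediction entropy below is a simplified stand-in, not the paper's test, and the threshold is an assumption.

    import torch

    def entropy(p):
        return -(p * p.clamp_min(1e-8).log()).sum(dim=1)

    def online_adapt(model, stream, optimizer, ema=0.99, threshold=0.5):
        # Simplified online self-training: monitor prediction entropy on the
        # unlabeled test stream and adapt only when drift is detected.
        running = None
        for x in stream:
            with torch.no_grad():
                h = entropy(model(x).softmax(dim=1)).mean()
            running = h if running is None else ema * running + (1 - ema) * h
            if running > threshold:        # stand-in for the martingale test
                loss = entropy(model(x).softmax(dim=1)).mean()
                optimizer.zero_grad(); loss.backward(); optimizer.step()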
- DR-Tune: Improving Fine-tuning of Pretrained Visual Models by Distribution Regularization with Semantic Calibration [38.4461170690033]
We propose a novel fine-tuning framework, namely distribution regularization with semantic calibration (DR-Tune).
DR-Tune employs distribution regularization by enforcing the downstream task head to decrease its classification error on the pretrained feature distribution.
To alleviate interference from semantic drift, we develop a semantic calibration (SC) module.
arXiv Detail & Related papers (2023-08-23T10:59:20Z)
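A minimal sketch of the distribution-regularization idea in DR-Tune, read (as an assumption) as an auxiliary classification loss on frozen pretrained features; the semantic calibration (SC) module is omitted, and all names and the weighting lam are illustrative.

    import torch
    import torch.nn.functional as F

    def dr_tune_step(backbone, pretrained_backbone, head, x, y, optimizer, lam=1.0):
        feats = backbone(x)                     # features being fine-tuned
        with torch.no_grad():
            ref_feats = pretrained_backbone(x)  # frozen pretrained feature distribution
        # The task head must also stay accurate on the pretrained feature
        # distribution, which regularizes the fine-tuning.
        loss = (F.cross_entropy(head(feats), y)
                + lam * F.cross_entropy(head(ref_feats), y))
        optimizer.zero_grad(); loss.backward(); optimizer.step()
        return loss.item()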
- AltUB: Alternating Training Method to Update Base Distribution of Normalizing Flow for Anomaly Detection [1.3999481573773072]
Unsupervised anomaly detection has recently come into the spotlight in various practical domains.
One of the major approaches is the normalizing flow, which learns an invertible transformation from a complex distribution, such as images, to a simple distribution such as N(0, I).
arXiv Detail & Related papers (2022-10-26T16:31:15Z)
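A sketch of the AltUB idea described above: the base distribution of a normalizing flow is itself learnable, and its parameters are updated in alternation with the flow's. The flow(x) -> (z, log_det) interface is an assumption.

    import math
    import torch

    class LearnableBase(torch.nn.Module):
        # Base distribution N(mu, diag(sigma^2)) with trainable parameters.
        def __init__(self, dim):
            super().__init__()
            self.mu = torch.nn.Parameter(torch.zeros(dim))
            self.log_sigma = torch.nn.Parameter(torch.zeros(dim))

        def log_prob(self, z):
            var = (2 * self.log_sigma).exp()
            return -0.5 * ((z - self.mu) ** 2 / var
                           + 2 * self.log_sigma
                           + math.log(2 * math.pi)).sum(dim=1)

    def train_step(flow, base, x, opt_flow, opt_base, step):
        z, log_det = flow(x)            # assumed interface: z and log|det J|
        nll = -(base.log_prob(z) + log_det).mean()
        opt = opt_base if step % 2 else opt_flow   # alternating updates
        opt.zero_grad(); nll.backward(); opt.step()
        return nll.item()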
- Continual Test-Time Domain Adaptation [94.51284735268597]
Test-time domain adaptation aims to adapt a source pre-trained model to a target domain without using any source data.
CoTTA is easy to implement and can be readily incorporated in off-the-shelf pre-trained models.
arXiv Detail & Related papers (2022-03-25T11:42:02Z)
- Test-Time Adaptation to Distribution Shift by Confidence Maximization and Input Transformation [44.494319305269535]
Neural networks often exhibit poor performance on data that is unlikely under the train-time data distribution.
This paper focuses on the fully test-time adaptation setting, where only unlabeled data from the target distribution is required.
We propose a novel loss that improves test-time adaptation by addressing both premature convergence and instability of entropy minimization.
arXiv Detail & Related papers (2021-06-28T22:06:10Z)
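For context, the sketch below shows the plain entropy-minimization baseline (TENT-style, updating only normalization parameters) that the entry above improves upon; the paper's actual loss is designed to avoid this objective's premature convergence and instability and is not reproduced here.

    import torch

    def norm_params(model):
        # Adapt only normalization-layer affine parameters, a common
        # choice in fully test-time adaptation.
        params = []
        for m in model.modules():
            if isinstance(m, (torch.nn.BatchNorm2d, torch.nn.GroupNorm, torch.nn.LayerNorm)):
                params += [p for p in (m.weight, m.bias) if p is not None]
        return params

    def tta_step(model, x, optimizer):
        # Entropy minimization on an unlabeled test batch.
        p = model(x).softmax(dim=1)
        loss = -(p * p.clamp_min(1e-8).log()).sum(dim=1).mean()
        optimizer.zero_grad(); loss.backward(); optimizer.step()
        return loss.item()

A typical setup would be optimizer = torch.optim.SGD(norm_params(model), lr=1e-3), so that only the normalization parameters move at test time.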
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
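A simplified sketch of a minibatch KL alignment term in the spirit of the entry above. The paper estimates the KL with samples from a probabilistic representation network; fitting a diagonal Gaussian to each minibatch of embeddings, as below, is a simplifying assumption.

    import torch

    def gaussian_kl(mu1, var1, mu2, var2):
        # Closed-form KL between diagonal Gaussians N(mu1,var1) || N(mu2,var2).
        return 0.5 * (var1 / var2 + (mu2 - mu1) ** 2 / var2
                      - 1.0 + var2.log() - var1.log()).sum()

    def kl_alignment_loss(encoder, x_src, x_tgt):
        z_s, z_t = encoder(x_src), encoder(x_tgt)
        mu_s, var_s = z_s.mean(0), z_s.var(0) + 1e-5
        mu_t, var_t = z_t.mean(0), z_t.var(0) + 1e-5
        # Pull the source and target representation distributions together.
        return gaussian_kl(mu_s, var_s, mu_t, var_t)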
- Training Deep Neural Networks Without Batch Normalization [4.266320191208303]
This work studies batch normalization in detail, while comparing it with other methods such as weight normalization, gradient clipping and dropout.
The main purpose of this work is to determine whether it is possible to train networks effectively when batch normalization is removed, through adaptation of the training process.
arXiv Detail & Related papers (2020-08-18T15:04:40Z)
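A sketch of a batch-normalization-free training setup using two of the alternatives the study compares, weight normalization and gradient clipping; the architecture and hyperparameters are illustrative, not taken from the paper.

    import torch
    from torch.nn.utils import weight_norm, clip_grad_norm_

    # A small conv net with no batch normalization layers.
    model = torch.nn.Sequential(
        weight_norm(torch.nn.Conv2d(3, 32, 3, padding=1)), torch.nn.ReLU(),
        weight_norm(torch.nn.Conv2d(32, 64, 3, padding=1)), torch.nn.ReLU(),
        torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
        torch.nn.Dropout(0.3), torch.nn.Linear(64, 10),
    )
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    def train_step(x, y):
        loss = torch.nn.functional.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping
        optimizer.step()
        return loss.item()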
- Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes inconsistency between the predictive distributions of similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge about wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that this simple yet powerful method can significantly improve generalization ability.
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
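A hedged sketch of class-wise self-distillation as summarized above: the prediction on one sample is pushed toward the detached prediction on another sample of the same class. The pairing logic, temperature, and weighting are assumptions.

    import torch
    import torch.nn.functional as F

    def class_wise_self_distill(model, x, x_same_class, y, T=4.0, lam=1.0):
        logits = model(x)
        with torch.no_grad():
            target_logits = model(x_same_class)   # other samples with labels y
        log_p = F.log_softmax(logits / T, dim=1)
        q = F.softmax(target_logits / T, dim=1)
        # The KL term regularizes the 'dark knowledge' of the single network.
        kd = F.kl_div(log_p, q, reduction="batchmean") * (T ** 2)
        return F.cross_entropy(logits, y) + lam * kd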
- Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment [52.02794488304448]
We propose a new distribution alignment method based on a log-likelihood ratio statistic and normalizing flows.
We experimentally verify that minimizing the resulting objective results in domain alignment that preserves the local structure of input domains.
arXiv Detail & Related papers (2020-03-26T22:10:04Z)
- Embedding Propagation: Smoother Manifold for Few-Shot Classification [131.81692677836202]
We propose to use embedding propagation as an unsupervised non-parametric regularizer for manifold smoothing in few-shot classification.
We empirically show that embedding propagation yields a smoother embedding manifold.
We show that embedding propagation consistently improves the accuracy of the models in multiple semi-supervised learning scenarios by up to 16 percentage points.
arXiv Detail & Related papers (2020-03-09T13:51:09Z)
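A sketch of embedding propagation as a non-parametric smoothing operator: embeddings are propagated over a similarity graph so nearby points move toward a smoother common manifold. The RBF kernel and the propagator (I - alpha*A)^{-1} follow standard label-propagation practice and may differ from the paper's exact formulation.

    import torch

    def embedding_propagation(z, alpha=0.5, sigma=1.0):
        d2 = torch.cdist(z, z) ** 2
        a = torch.exp(-d2 / (2 * sigma ** 2))
        a.fill_diagonal_(0.0)                    # drop self-similarity
        deg = a.sum(dim=1).clamp_min(1e-8)
        a = a / (deg.sqrt()[:, None] * deg.sqrt()[None, :])   # symmetric norm
        n = z.size(0)
        propagator = torch.linalg.inv(torch.eye(n) - alpha * a)
        return propagator @ z                    # smoothed embeddings

    smoothed = embedding_propagation(torch.randn(32, 64))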
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.