Distribution Mismatch Correction for Improved Robustness in Deep Neural
Networks
- URL: http://arxiv.org/abs/2110.01955v1
- Date: Tue, 5 Oct 2021 11:36:25 GMT
- Title: Distribution Mismatch Correction for Improved Robustness in Deep Neural
Networks
- Authors: Alexander Fuchs, Christian Knoll, Franz Pernkopf
- Abstract summary: Normalization methods increase a network's vulnerability to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
- Score: 86.42889611784855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks rely heavily on normalization methods to improve their
performance and learning behavior. Although normalization methods spurred the
development of increasingly deep and efficient architectures, they also
increase the vulnerability to noise and input corruptions. In most
applications, however, noise is ubiquitous and diverse; this can lead to
complete failure of machine learning systems, as they cannot cope with
mismatches between the training-time and test-time input distributions. The
most common normalization method, batch normalization, reduces the distribution
shift during training but is agnostic to changes in the input distribution
during test time. This makes batch normalization prone to performance
degradation whenever noise is present during test-time. Sample-based
normalization methods can correct linear transformations of the activation
distribution but cannot mitigate changes in the distribution shape; this makes
the network vulnerable to distribution changes that cannot be reflected in the
normalization parameters. We propose an unsupervised non-parametric
distribution correction method that adapts the activation distribution of each
layer. This reduces the mismatch between the training and test-time
distribution by minimizing the 1-D Wasserstein distance. In our experiments, we
empirically show that the proposed method effectively reduces the impact of
intense image corruptions and thus improves the classification performance
without the need for retraining or fine-tuning the model.
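For intuition, here is a minimal sketch of the core correction step: per-channel quantile matching, which for one-dimensional distributions is exactly the monotone transport map that minimizes the 1-D Wasserstein distance. All function names are illustrative; the paper's layer-wise procedure contains more detail than this sketch.

    import numpy as np

    def fit_reference(train_activations, n_quantiles=256):
        # Store empirical quantiles of one channel's activations on clean
        # training data; this is the non-parametric reference distribution.
        qs = np.linspace(0.0, 1.0, n_quantiles)
        return np.quantile(train_activations, qs)

    def correct(test_activations, ref_quantiles):
        # Map each test activation through F_train^{-1}(F_test(x)).
        # For 1-D distributions this monotone rearrangement minimizes
        # the 1-D Wasserstein distance to the training distribution.
        n = len(test_activations)
        ranks = np.argsort(np.argsort(test_activations))
        cdf = (ranks + 0.5) / n                      # empirical test CDF
        qs = np.linspace(0.0, 1.0, len(ref_quantiles))
        return np.interp(cdf, qs, ref_quantiles)     # training quantile lookup

    # Example: a shifted, heavy-tailed "corrupted" batch is pulled back
    # toward the training-time statistics without any retraining.
    rng = np.random.default_rng(0)
    ref = fit_reference(rng.normal(0.0, 1.0, 10_000))
    corrupted = rng.standard_t(df=3, size=512) + 2.0
    corrected = correct(corrupted, ref)
    print(corrupted.mean(), corrected.mean())        # mean shift largely removed

Because the map is monotone and non-parametric, it can correct changes in distribution shape that sample-based normalization (which only adjusts mean and variance) cannot.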
Related papers
- Protected Test-Time Adaptation via Online Entropy Matching: A Betting Approach [14.958884168060097]
We present a novel approach for test-time adaptation via online self-training.
Our approach combines concepts from betting martingales and online learning to form a detection tool capable of reacting to distribution shifts.
Experimental results demonstrate that our approach improves test-time accuracy under distribution shifts while maintaining accuracy and calibration in their absence.
arXiv Detail & Related papers (2024-08-14T12:40:57Z)
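As a rough illustration of the entry above, here is a hedged sketch of online self-training gated by a shift detector. The paper's detector is a betting martingale over the model's outputs; the exponential moving average of prediction entropy below is a simplified stand-in, not the paper's test, and the threshold is an assumption.

    import torch

    def entropy(p):
        return -(p * p.clamp_min(1e-8).log()).sum(dim=1)

    def online_adapt(model, stream, optimizer, ema=0.99, threshold=0.5):
        # Simplified online self-training: monitor prediction entropy on the
        # unlabeled test stream and adapt only when drift is detected.
        running = None
        for x in stream:
            with torch.no_grad():
                h = entropy(model(x).softmax(dim=1)).mean()
            running = h if running is None else ema * running + (1 - ema) * h
            if running > threshold:        # stand-in for the martingale test
                loss = entropy(model(x).softmax(dim=1)).mean()
                optimizer.zero_grad(); loss.backward(); optimizer.step()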
- DR-Tune: Improving Fine-tuning of Pretrained Visual Models by Distribution Regularization with Semantic Calibration [38.4461170690033]
We propose a novel fine-tuning framework, namely distribution regularization with semantic calibration (DR-Tune).
DR-Tune employs distribution regularization by enforcing the downstream task head to decrease its classification error on the pretrained feature distribution.
To alleviate interference from semantic drift, we develop a semantic calibration (SC) module.
arXiv Detail & Related papers (2023-08-23T10:59:20Z)
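A minimal sketch of the distribution-regularization idea in DR-Tune, read (as an assumption) as an auxiliary classification loss on frozen pretrained features; the semantic calibration (SC) module is omitted, and all names and the weighting lam are illustrative.

    import torch
    import torch.nn.functional as F

    def dr_tune_step(backbone, pretrained_backbone, head, x, y, optimizer, lam=1.0):
        feats = backbone(x)                     # features being fine-tuned
        with torch.no_grad():
            ref_feats = pretrained_backbone(x)  # frozen pretrained feature distribution
        # The task head must also stay accurate on the pretrained feature
        # distribution, which regularizes the fine-tuning.
        loss = (F.cross_entropy(head(feats), y)
                + lam * F.cross_entropy(head(ref_feats), y))
        optimizer.zero_grad(); loss.backward(); optimizer.step()
        return loss.item()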
- AltUB: Alternating Training Method to Update Base Distribution of Normalizing Flow for Anomaly Detection [1.3999481573773072]
Unsupervised anomaly detection has recently come into the spotlight in various practical domains.
One of the major approaches is the normalizing flow, which learns an invertible transformation from a complex distribution, such as images, to a simple distribution such as N(0, I).
arXiv Detail & Related papers (2022-10-26T16:31:15Z)
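A sketch of the AltUB idea described above: the base distribution of a normalizing flow is itself learnable, and its parameters are updated in alternation with the flow's. The flow(x) -> (z, log_det) interface is an assumption.

    import math
    import torch

    class LearnableBase(torch.nn.Module):
        # Base distribution N(mu, diag(sigma^2)) with trainable parameters.
        def __init__(self, dim):
            super().__init__()
            self.mu = torch.nn.Parameter(torch.zeros(dim))
            self.log_sigma = torch.nn.Parameter(torch.zeros(dim))

        def log_prob(self, z):
            var = (2 * self.log_sigma).exp()
            return -0.5 * ((z - self.mu) ** 2 / var
                           + 2 * self.log_sigma
                           + math.log(2 * math.pi)).sum(dim=1)

    def train_step(flow, base, x, opt_flow, opt_base, step):
        z, log_det = flow(x)            # assumed interface: z and log|det J|
        nll = -(base.log_prob(z) + log_det).mean()
        opt = opt_base if step % 2 else opt_flow   # alternating updates
        opt.zero_grad(); nll.backward(); opt.step()
        return nll.item()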
- Continual Test-Time Domain Adaptation [94.51284735268597]
Test-time domain adaptation aims to adapt a source pre-trained model to a target domain without using any source data.
CoTTA is easy to implement and can be readily incorporated in off-the-shelf pre-trained models.
arXiv Detail & Related papers (2022-03-25T11:42:02Z)
- Test-Time Adaptation to Distribution Shift by Confidence Maximization and Input Transformation [44.494319305269535]
Neural networks often exhibit poor performance on data that is unlikely under the train-time data distribution.
This paper focuses on the fully test-time adaptation setting, where only unlabeled data from the target distribution is required.
We propose a novel loss that improves test-time adaptation by addressing both premature convergence and instability of entropy minimization.
arXiv Detail & Related papers (2021-06-28T22:06:10Z)
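For context, the sketch below shows the plain entropy-minimization baseline (TENT-style, updating only normalization parameters) that the entry above improves upon; the paper's actual loss is designed to avoid this objective's premature convergence and instability and is not reproduced here.

    import torch

    def norm_params(model):
        # Adapt only normalization-layer affine parameters, a common
        # choice in fully test-time adaptation.
        params = []
        for m in model.modules():
            if isinstance(m, (torch.nn.BatchNorm2d, torch.nn.GroupNorm, torch.nn.LayerNorm)):
                params += [p for p in (m.weight, m.bias) if p is not None]
        return params

    def tta_step(model, x, optimizer):
        # Entropy minimization on an unlabeled test batch.
        p = model(x).softmax(dim=1)
        loss = -(p * p.clamp_min(1e-8).log()).sum(dim=1).mean()
        optimizer.zero_grad(); loss.backward(); optimizer.step()
        return loss.item()

A typical setup would be optimizer = torch.optim.SGD(norm_params(model), lr=1e-3), so that only the normalization parameters move at test time.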
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
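A simplified sketch of a minibatch KL alignment term in the spirit of the entry above. The paper estimates the KL with samples from a probabilistic representation network; fitting a diagonal Gaussian to each minibatch of embeddings, as below, is a simplifying assumption.

    import torch

    def gaussian_kl(mu1, var1, mu2, var2):
        # Closed-form KL between diagonal Gaussians N(mu1,var1) || N(mu2,var2).
        return 0.5 * (var1 / var2 + (mu2 - mu1) ** 2 / var2
                      - 1.0 + var2.log() - var1.log()).sum()

    def kl_alignment_loss(encoder, x_src, x_tgt):
        z_s, z_t = encoder(x_src), encoder(x_tgt)
        mu_s, var_s = z_s.mean(0), z_s.var(0) + 1e-5
        mu_t, var_t = z_t.mean(0), z_t.var(0) + 1e-5
        # Pull the source and target representation distributions together.
        return gaussian_kl(mu_s, var_s, mu_t, var_t)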
- Training Deep Neural Networks Without Batch Normalization [4.266320191208303]
This work studies batch normalization in detail, while comparing it with other methods such as weight normalization, gradient clipping and dropout.
The main purpose of this work is to determine whether it is possible to train networks effectively when batch normalization is removed, through adaptation of the training process.
arXiv Detail & Related papers (2020-08-18T15:04:40Z)
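A sketch of a batch-normalization-free training setup using two of the alternatives the study compares, weight normalization and gradient clipping; the architecture and hyperparameters are illustrative, not taken from the paper.

    import torch
    from torch.nn.utils import weight_norm, clip_grad_norm_

    # A small conv net with no batch normalization layers.
    model = torch.nn.Sequential(
        weight_norm(torch.nn.Conv2d(3, 32, 3, padding=1)), torch.nn.ReLU(),
        weight_norm(torch.nn.Conv2d(32, 64, 3, padding=1)), torch.nn.ReLU(),
        torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
        torch.nn.Dropout(0.3), torch.nn.Linear(64, 10),
    )
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    def train_step(x, y):
        loss = torch.nn.functional.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping
        optimizer.step()
        return loss.item()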
- Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes inconsistency between the predictive distributions of similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge about wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that this simple yet powerful method can significantly improve generalization ability.
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
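A hedged sketch of class-wise self-distillation as summarized above: the prediction on one sample is pushed toward the detached prediction on another sample of the same class. The pairing logic, temperature, and weighting are assumptions.

    import torch
    import torch.nn.functional as F

    def class_wise_self_distill(model, x, x_same_class, y, T=4.0, lam=1.0):
        logits = model(x)
        with torch.no_grad():
            target_logits = model(x_same_class)   # other samples with labels y
        log_p = F.log_softmax(logits / T, dim=1)
        q = F.softmax(target_logits / T, dim=1)
        # The KL term regularizes the 'dark knowledge' of the single network.
        kd = F.kl_div(log_p, q, reduction="batchmean") * (T ** 2)
        return F.cross_entropy(logits, y) + lam * kd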
- Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment [52.02794488304448]
We propose a new distribution alignment method based on a log-likelihood ratio statistic and normalizing flows.
We experimentally verify that minimizing the resulting objective results in domain alignment that preserves the local structure of input domains.
arXiv Detail & Related papers (2020-03-26T22:10:04Z)
- Embedding Propagation: Smoother Manifold for Few-Shot Classification [131.81692677836202]
We propose to use embedding propagation as an unsupervised non-parametric regularizer for manifold smoothing in few-shot classification.
We empirically show that embedding propagation yields a smoother embedding manifold.
We show that embedding propagation consistently improves the accuracy of the models in multiple semi-supervised learning scenarios by up to 16 percentage points.
arXiv Detail & Related papers (2020-03-09T13:51:09Z)
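A sketch of embedding propagation as a non-parametric smoothing operator: embeddings are propagated over a similarity graph so nearby points move toward a smoother common manifold. The RBF kernel and the propagator (I - alpha*A)^{-1} follow standard label-propagation practice and may differ from the paper's exact formulation.

    import torch

    def embedding_propagation(z, alpha=0.5, sigma=1.0):
        d2 = torch.cdist(z, z) ** 2
        a = torch.exp(-d2 / (2 * sigma ** 2))
        a.fill_diagonal_(0.0)                    # drop self-similarity
        deg = a.sum(dim=1).clamp_min(1e-8)
        a = a / (deg.sqrt()[:, None] * deg.sqrt()[None, :])   # symmetric norm
        n = z.size(0)
        propagator = torch.linalg.inv(torch.eye(n) - alpha * a)
        return propagator @ z                    # smoothed embeddings

    smoothed = embedding_propagation(torch.randn(32, 64))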
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.