Test-Time Adaptation to Distribution Shift by Confidence Maximization
and Input Transformation
- URL: http://arxiv.org/abs/2106.14999v1
- Date: Mon, 28 Jun 2021 22:06:10 GMT
- Title: Test-Time Adaptation to Distribution Shift by Confidence Maximization
and Input Transformation
- Authors: Chaithanya Kumar Mummadi, Robin Hutmacher, Kilian Rambach, Evgeny
Levinkov, Thomas Brox, Jan Hendrik Metzen
- Abstract summary: Deep neural networks often exhibit poor performance on data that is unlikely under the train-time data distribution.
This paper focuses on the fully test-time adaptation setting, where only unlabeled data from the target distribution is required.
We propose a novel loss that improves test-time adaptation by addressing both premature convergence and instability of entropy minimization.
- Score: 44.494319305269535
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Deep neural networks often exhibit poor performance on data that is unlikely
under the train-time data distribution, for instance data affected by
corruptions. Previous works demonstrate that test-time adaptation to data
shift, for instance using entropy minimization, effectively improves
performance on such shifted distributions. This paper focuses on the fully
test-time adaptation setting, where only unlabeled data from the target
distribution is required. This allows adapting arbitrary pretrained networks.
Specifically, we propose a novel loss that improves test-time adaptation by
addressing both premature convergence and instability of entropy minimization.
This is achieved by replacing the entropy by a non-saturating surrogate and
adding a diversity regularizer based on batch-wise entropy maximization that
prevents convergence to trivial collapsed solutions. Moreover, we propose to
prepend an input transformation module to the network that can partially undo
test-time distribution shifts. Surprisingly, this preprocessing can be learned
solely using the fully test-time adaptation loss in an end-to-end fashion
without any target domain labels or source domain data. We show that our
approach outperforms previous work in improving the robustness of publicly
available pretrained image classifiers to common corruptions on such
challenging benchmarks as ImageNet-C.
Related papers
- Protected Test-Time Adaptation via Online Entropy Matching: A Betting Approach [14.958884168060097]
We present a novel approach for test-time adaptation via online self-training.
Our approach combines concepts in betting martingales and online learning to form a detection tool capable of reacting to distribution shifts.
Experimental results demonstrate that our approach improves test-time accuracy under distribution shifts while maintaining accuracy and calibration in their absence.
arXiv Detail & Related papers (2024-08-14T12:40:57Z)
- Addressing Distribution Shift at Test Time in Pre-trained Language Models [3.655021726150369]
State-of-the-art pre-trained language models (PLMs) outperform other models when applied to the majority of language processing tasks.
PLMs have been found to degrade in performance under distribution shift.
We present an approach that improves the performance of PLMs at test-time under distribution shift.
arXiv Detail & Related papers (2022-12-05T16:04:54Z)
- Improving Test-Time Adaptation via Shift-agnostic Weight Regularization and Nearest Source Prototypes [18.140619966865955]
We propose a novel test-time adaptation strategy that adjusts the model pre-trained on the source domain using only unlabeled online data from the target domain.
We show that our method exhibits state-of-the-art performance on various standard benchmarks and even outperforms its supervised counterpart.
arXiv Detail & Related papers (2022-07-24T10:17:05Z)
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address distribution shift by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
- Few-Shot Adaptation of Pre-Trained Networks for Domain Shift [17.123505029637055]
Deep networks are prone to performance degradation when there is a domain shift between the source (training) data and target (test) data.
Recent test-time adaptation methods update batch normalization layers of pre-trained source models deployed in new target environments with streaming data to mitigate such performance degradation.
We propose a framework for few-shot domain adaptation to address the practical challenges of data-efficient adaptation.
arXiv Detail & Related papers (2022-05-30T16:49:59Z)
- Continual Test-Time Domain Adaptation [94.51284735268597]
Test-time domain adaptation aims to adapt a source pre-trained model to a target domain without using any source data.
The proposed method, CoTTA, is easy to implement and can be readily incorporated into off-the-shelf pre-trained models.
arXiv Detail & Related papers (2022-03-25T11:42:02Z)
- MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test time robustification, i.e., using the test input to improve model robustness.
Recent prior works have proposed methods for test time adaptation, however, they each introduce additional assumptions.
We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
arXiv Detail & Related papers (2021-10-18T17:55:11Z)
- Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks [86.42889611784855]
Normalization methods can increase a network's vulnerability to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- SENTRY: Selective Entropy Optimization via Committee Consistency for Unsupervised Domain Adaptation [14.086066389856173]
We propose a UDA algorithm that judges the reliability of a target instance based on its predictive consistency under a committee of random image transformations.
Our algorithm then selectively minimizes predictive entropy to increase confidence on highly consistent target instances, while maximizing predictive entropy to reduce confidence on highly inconsistent ones.
In combination with pseudo-label based approximate target class balancing, our approach leads to significant improvements over the state-of-the-art on 27/31 domain shifts from standard UDA benchmarks as well as benchmarks designed to stress-test adaptation under label distribution shift.
arXiv Detail & Related papers (2020-12-21T16:24:50Z)
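The SENTRY entry above describes a selective entropy rule: minimize predictive entropy on instances whose predictions are consistent across a committee of augmentations, and maximize it on inconsistent ones. A minimal, hypothetical sketch of that idea follows; the function name, the simple majority-vote consistency test, and the signed-entropy objective are simplifications and not SENTRY's exact procedure.

```python
import numpy as np

def entropy(p):
    # Shannon entropy per instance over the class axis.
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def selective_entropy_loss(committee_probs, agree_frac=0.5):
    """committee_probs: (n_views, batch, classes) predicted class
    probabilities for each instance under random augmentations.

    Instances where more than `agree_frac` of the views vote for the
    same class count as consistent: their entropy enters with a plus
    sign (minimized by the optimizer, raising confidence). Inconsistent
    instances enter with a minus sign (entropy is thereby maximized,
    lowering confidence on unreliable predictions).
    """
    votes = committee_probs.argmax(axis=-1)          # (n_views, batch)
    mean_probs = committee_probs.mean(axis=0)        # (batch, classes)
    n_views, batch = votes.shape
    signs = np.empty(batch)
    for i in range(batch):
        counts = np.bincount(votes[:, i])
        signs[i] = 1.0 if counts.max() / n_views > agree_frac else -1.0
    return (signs * entropy(mean_probs)).mean()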
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.