Robust Mean Teacher for Continual and Gradual Test-Time Adaptation
- URL: http://arxiv.org/abs/2211.13081v2
- Date: Wed, 22 Mar 2023 18:44:42 GMT
- Title: Robust Mean Teacher for Continual and Gradual Test-Time Adaptation
- Authors: Mario Döbler, Robert A. Marsden, Bin Yang
- Abstract summary: Continual test-time adaptation (TTA) considers not only a single domain shift, but a sequence of shifts; gradual TTA further exploits the property that some shifts evolve gradually over time.
We propose and show that, in the setting of TTA, the symmetric cross-entropy is better suited than the commonly used cross-entropy as a consistency loss for mean teachers.
We demonstrate the effectiveness of our proposed method 'robust mean teacher' (RMT) on the continual and gradual corruption benchmarks.
- Score: 5.744133015573047
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since experiencing domain shifts at test time is inevitable in practice,
test-time adaptation (TTA) continues to adapt the model after deployment.
Recently, the area of continual and gradual test-time adaptation (TTA) emerged.
In contrast to standard TTA, continual TTA considers not only a single domain
shift, but a sequence of shifts. Gradual TTA further exploits the property that
some shifts evolve gradually over time. Since in both settings long test
sequences are present, error accumulation needs to be addressed for methods
relying on self-training. In this work, we propose to use the symmetric
cross-entropy as a consistency loss for mean teachers and show that, in the
setting of TTA, it is better suited than the commonly used cross-entropy. This
is justified by our analysis of the (symmetric) cross-entropy's gradient
properties. To pull the test feature space closer to the source domain, where
the pre-trained model is well-posed, contrastive learning is leveraged. Since
applications differ in their requirements, we address several settings,
including having source data available and the more challenging source-free
setting. We demonstrate the effectiveness of our proposed method 'robust mean
teacher' (RMT) on the continual and gradual corruption benchmarks CIFAR10C,
CIFAR100C, and ImageNet-C. We further consider ImageNet-R and propose a new
continual DomainNet-126 benchmark. State-of-the-art results are achieved on all
benchmarks.
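To make the consistency objective concrete, the following is a minimal sketch in PyTorch-style Python, assuming a standard classifier and optimizer; the function names and the momentum value are illustrative, and the contrastive and source-replay parts of RMT are omitted.
```python
# Minimal sketch (illustrative, not the authors' code) of the core idea above:
# a mean teacher updated by exponential moving average (EMA) and trained with
# a symmetric cross-entropy (SCE) consistency loss on unlabeled test batches.
import copy

import torch
import torch.nn.functional as F


def symmetric_cross_entropy(student_logits, teacher_logits):
    """SCE consistency: CE(teacher -> student) + CE(student -> teacher)."""
    log_p_s = F.log_softmax(student_logits, dim=1)
    log_p_t = F.log_softmax(teacher_logits, dim=1)
    loss_ts = -(log_p_t.exp() * log_p_s).sum(dim=1).mean()  # teacher as target
    loss_st = -(log_p_s.exp() * log_p_t).sum(dim=1).mean()  # student as target
    return 0.5 * (loss_ts + loss_st)


def make_teacher(student):
    """The teacher starts as a frozen copy of the source pre-trained model."""
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher


@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    """Mean-teacher update: EMA of the student weights (momentum is a placeholder)."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)


def adapt_step(student, teacher, optimizer, x_test):
    """One test-time adaptation step on an unlabeled test batch."""
    student_logits = student(x_test)
    with torch.no_grad():
        teacher_logits = teacher(x_test)
    loss = symmetric_cross_entropy(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_update(teacher, student)
    return teacher_logits  # the more stable teacher is typically used for prediction
```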
Related papers
- Active Test-Time Adaptation: Theoretical Analyses and An Algorithm [51.84691955495693]
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings.
We propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting.
arXiv Detail & Related papers (2024-04-07T22:31:34Z)
- Layerwise Early Stopping for Test Time Adaptation [0.2968738145616401]
Test Time Adaptation (TTA) addresses the problem of distribution shift by enabling pretrained models to learn new features on an unseen domain at test time.
Maintaining a balance between learning new features and retaining useful pretrained features poses a significant challenge.
We propose Layerwise EArly STopping (LEAST) for TTA to address this problem.
arXiv Detail & Related papers (2024-04-04T19:55:11Z)
- Persistent Test-time Adaptation in Recurring Testing Scenarios [12.024233973321756]
Current test-time adaptation (TTA) approaches aim to adapt a machine learning model to environments that change continuously.
Yet, it is unclear whether TTA methods can maintain their adaptability over prolonged periods.
We propose persistent TTA (PeTTA) which senses when the model is diverging towards collapse and adjusts the adaptation strategy.
arXiv Detail & Related papers (2023-11-30T02:24:44Z)
- Towards Real-World Test-Time Adaptation: Tri-Net Self-Training with Balanced Normalization [52.03927261909813]
Existing works mainly consider real-world test-time adaptation under non-i.i.d. data streams and continual domain shift.
We argue that the failure of state-of-the-art methods is first caused by indiscriminately adapting normalization layers to imbalanced testing data.
The final TTA model, termed TRIBE, is built upon a tri-net architecture with balanced batchnorm layers.
arXiv Detail & Related papers (2023-09-26T14:06:26Z)
- RDumb: A simple approach that questions our progress in continual test-time adaptation [12.374649969346441]
Test-Time Adaptation (TTA) allows updating pre-trained models to changing data distributions at deployment time.
Recent work proposed and applied methods for continual adaptation over long timescales.
We find that eventually all but one of the state-of-the-art methods collapse and perform worse than a non-adapting model.
arXiv Detail & Related papers (2023-06-08T17:52:34Z)
- On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z)
- Towards Stable Test-Time Adaptation in Dynamic Wild World [60.98073673220025]
Test-time adaptation (TTA) has been shown to be effective at tackling distribution shifts between training and testing data by adapting a given model on test samples.
Online model updating in TTA may be unstable, and this is often a key obstacle preventing existing TTA methods from being deployed in the real world.
arXiv Detail & Related papers (2023-02-24T02:03:41Z)
- A Probabilistic Framework for Lifelong Test-Time Adaptation [34.07074915005366]
Test-time adaptation (TTA) is the problem of updating a pre-trained source model at inference time given test input(s) from a different target domain.
We present PETAL (Probabilistic lifElong Test-time Adaptation with seLf-training prior), which solves lifelong TTA using a probabilistic approach.
Our method achieves better results than the current state-of-the-art for online lifelong test-time adaptation across various benchmarks.
arXiv Detail & Related papers (2022-12-19T18:42:19Z)
- TeST: Test-time Self-Training under Distribution Shift [99.68465267994783]
Test-Time Self-Training (TeST) is a technique that takes as input a model trained on some source data and a novel data distribution at test time.
We find that models adapted using TeST significantly improve over baseline test-time adaptation algorithms.
arXiv Detail & Related papers (2022-09-23T07:47:33Z)
- Robust Continual Test-time Adaptation: Instance-aware BN and Prediction-balanced Memory [58.72445309519892]
We present a new test-time adaptation scheme that is robust against non-i.i.d. test data streams.
Our novelty is mainly two-fold: (a) Instance-Aware Batch Normalization (IABN), which corrects normalization for out-of-distribution samples, and (b) Prediction-balanced Reservoir Sampling (PBRS), which simulates an i.i.d. data stream from a non-i.i.d. stream in a class-balanced manner (a rough sketch of this reservoir idea follows the list).
arXiv Detail & Related papers (2022-08-10T03:05:46Z)
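The prediction-balanced reservoir mentioned in the last entry can be pictured as per-class reservoir sampling. Below is a rough, illustrative sketch of that general idea only; the class name, method names, and buffer layout are assumptions, not the paper's implementation.
```python
# Rough sketch (illustrative) of a prediction-balanced reservoir: keep roughly
# the same number of samples per *predicted* class, so that batches drawn from
# the buffer look i.i.d. even when the incoming test stream is not.
import random
from collections import defaultdict


class BalancedReservoir:
    def __init__(self, capacity, num_classes):
        self.per_class = max(1, capacity // num_classes)  # slots per predicted class
        self.buffer = defaultdict(list)                   # predicted class -> samples
        self.seen = defaultdict(int)                      # arrivals per predicted class

    def add(self, sample, predicted_class):
        self.seen[predicted_class] += 1
        slots = self.buffer[predicted_class]
        if len(slots) < self.per_class:
            slots.append(sample)                          # room left for this class
        else:
            # classic reservoir sampling: keep the new sample with probability
            # per_class / seen, replacing a randomly chosen stored one
            j = random.randrange(self.seen[predicted_class])
            if j < self.per_class:
                slots[j] = sample

    def sample_batch(self, batch_size):
        pool = [s for slots in self.buffer.values() for s in slots]
        return random.sample(pool, min(batch_size, len(pool)))
```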