RDumb: A simple approach that questions our progress in continual test-time adaptation
- URL: http://arxiv.org/abs/2306.05401v3
- Date: Wed, 3 Apr 2024 10:16:22 GMT
- Title: RDumb: A simple approach that questions our progress in continual test-time adaptation
- Authors: Ori Press, Steffen Schneider, Matthias Kümmerer, Matthias Bethge,
- Abstract summary: Test-Time Adaptation (TTA) allows to update pre-trained models to changing data distributions at deployment time.
Recent work proposed and applied methods for continual adaptation over long timescales.
We find that eventually all but one state-of-the-art methods collapse and perform worse than a non-adapting model.
- Score: 12.374649969346441
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Test-Time Adaptation (TTA) allows to update pre-trained models to changing data distributions at deployment time. While early work tested these algorithms for individual fixed distribution shifts, recent work proposed and applied methods for continual adaptation over long timescales. To examine the reported progress in the field, we propose the Continually Changing Corruptions (CCC) benchmark to measure asymptotic performance of TTA techniques. We find that eventually all but one state-of-the-art methods collapse and perform worse than a non-adapting model, including models specifically proposed to be robust to performance collapse. In addition, we introduce a simple baseline, "RDumb", that periodically resets the model to its pretrained state. RDumb performs better or on par with the previously proposed state-of-the-art in all considered benchmarks. Our results show that previous TTA approaches are neither effective at regularizing adaptation to avoid collapse nor able to outperform a simplistic resetting strategy.
Related papers
- Test-Time Model Adaptation with Only Forward Passes [68.11784295706995]
Test-time adaptation has proven effective in adapting a given trained model to unseen test samples with potential distribution shifts.
We propose a test-time Forward-Optimization Adaptation (FOA) method.
FOA runs on quantized 8-bit ViT, outperforms gradient-based TENT on full-precision 32-bit ViT, and achieves an up to 24-fold memory reduction on ImageNet-C.
arXiv Detail & Related papers (2024-04-02T05:34:33Z) - Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - Persistent Test-time Adaptation in Episodic Testing Scenarios [13.514033978964308]
Current test-time adaptation approaches aim to adapt to environments that change continuously.
It is unclear whether the adaptability of these methods is sustained after a long run.
This study proposes a novel testing setting called episodic TTA.
arXiv Detail & Related papers (2023-11-30T02:24:44Z) - AR-TTA: A Simple Method for Real-World Continual Test-Time Adaptation [16.85284386728494]
We propose to validate test-time adaptation methods using datasets for autonomous driving, namely CLAD-C and SHIFT.
We observe that current test-time adaptation methods struggle to effectively handle varying degrees of domain shift.
The proposed method, named AR-TTA, outperforms existing approaches on both synthetic and more real-world benchmarks.
arXiv Detail & Related papers (2023-09-18T19:34:23Z) - Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z) - On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z) - Universal Test-time Adaptation through Weight Ensembling, Diversity
Weighting, and Prior Correction [3.5139431332194198]
Test-time adaptation (TTA) continues to update the model after deployment, leveraging the current test data.
We identify and highlight several challenges a self-training based method has to deal with.
To prevent the model from becoming biased, we leverage a dataset and model-agnostic certainty and diversity weighting.
arXiv Detail & Related papers (2023-06-01T13:16:10Z) - Test-Time Adaptation with Perturbation Consistency Learning [32.58879780726279]
We propose a simple test-time adaptation method to promote the model to make stable predictions for samples with distribution shifts.
Our method can achieve higher or comparable performance with less inference time over strong PLM backbones.
arXiv Detail & Related papers (2023-04-25T12:29:22Z) - TeST: Test-time Self-Training under Distribution Shift [99.68465267994783]
Test-Time Self-Training (TeST) is a technique that takes as input a model trained on some source data and a novel data distribution at test time.
We find that models adapted using TeST significantly improve over baseline test-time adaptation algorithms.
arXiv Detail & Related papers (2022-09-23T07:47:33Z) - Robust Continual Test-time Adaptation: Instance-aware BN and
Prediction-balanced Memory [58.72445309519892]
We present a new test-time adaptation scheme that is robust against non-i.i.d. test data streams.
Our novelty is mainly two-fold: (a) Instance-Aware Batch Normalization (IABN) that corrects normalization for out-of-distribution samples, and (b) Prediction-balanced Reservoir Sampling (PBRS) that simulates i.i.d. data stream from non-i.i.d. stream in a class-balanced manner.
arXiv Detail & Related papers (2022-08-10T03:05:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.