Layerwise Early Stopping for Test Time Adaptation
- URL: http://arxiv.org/abs/2404.03784v1
- Date: Thu, 4 Apr 2024 19:55:11 GMT
- Title: Layerwise Early Stopping for Test Time Adaptation
- Authors: Sabyasachi Sahoo, Mostafa ElAraby, Jonas Ngnawe, Yann Pequignot, Frederic Precioso, Christian Gagne
- Abstract summary: Test Time Adaptation (TTA) addresses the problem of distribution shift by enabling pretrained models to learn new features on an unseen domain at test time.
However, maintaining a balance between learning new features and retaining useful pretrained features poses a significant challenge.
We propose Layerwise EArly STopping (LEAST) for TTA to address this problem.
- Score: 0.2968738145616401
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Test Time Adaptation (TTA) addresses the problem of distribution shift by enabling pretrained models to learn new features on an unseen domain at test time. However, maintaining a balance between learning new features and retaining useful pretrained features poses a significant challenge. In this paper, we propose Layerwise EArly STopping (LEAST) for TTA to address this problem. The key idea is to stop adapting individual layers during TTA if the features being learned do not appear beneficial for the new domain. For that purpose, we propose a novel gradient-based metric that measures the relevance of the currently learned features to the new domain without the need for supervised labels. We then use this metric to determine dynamically when to stop updating each layer during TTA. This enables a more balanced adaptation, restricted to the layers that benefit from it, and only for a certain number of steps. Such an approach also limits the forgetting of pretrained features that remain useful on new domains. Through extensive experiments, we demonstrate that Layerwise Early Stopping improves the performance of existing TTA approaches across multiple datasets, domain shifts, model architectures, and TTA losses.
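The abstract specifies the mechanism (per-layer early stopping driven by a gradient-based relevance metric) but not the metric itself. The sketch below is a minimal PyTorch-style illustration under stated assumptions: relevance is approximated by a gradient-to-parameter norm ratio, entropy minimization stands in for the TTA loss, and the function names and threshold are hypothetical rather than the paper's actual formulation.

```python
import torch

def entropy_loss(logits):
    """Unsupervised TTA loss: mean Shannon entropy of the predictions."""
    probs = logits.softmax(dim=1)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean()

def least_style_adapt(model, test_batches, steps=10, threshold=1e-3):
    """Layerwise early stopping sketch: each parameter keeps adapting only
    while an assumed gradient-based relevance score stays above a threshold."""
    params = dict(model.named_parameters())
    active = set(params)                      # layers still adapting
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)

    for step, x in enumerate(test_batches):
        if step >= steps:
            break
        opt.zero_grad()
        entropy_loss(model(x)).backward()
        for name, p in params.items():
            if p.grad is None:
                continue
            if name in active:
                # Assumed relevance score: gradient norm relative to weight norm.
                score = p.grad.norm() / (p.norm() + 1e-8)
                if score < threshold:         # features no longer improving
                    active.discard(name)      # stop adapting this layer
            if name not in active:
                p.grad = None                 # frozen layers receive no update
        opt.step()
    return model
```

Freezing happens per parameter tensor here; grouping parameters by layer or block, as the method's name suggests, is a straightforward extension.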
Related papers
- Enhancing Test Time Adaptation with Few-shot Guidance [35.13317598777832]
Deep neural networks often encounter significant performance drops when facing domain shifts between training (source) and test (target) data.
Test Time Adaptation (TTA) methods have been proposed to adapt a pre-trained source model to handle out-of-distribution streaming target data.
We develop Few-Shot Test Time Adaptation (FS-TTA), a novel and practical setting that utilizes a few-shot support set on top of TTA.
arXiv Detail & Related papers (2024-09-02T15:50:48Z)
- Active Test-Time Adaptation: Theoretical Analyses and An Algorithm [51.84691955495693]
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings.
We propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting.
arXiv Detail & Related papers (2024-04-07T22:31:34Z)
- Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning [65.57123249246358]
We propose ExpAndable Subspace Ensemble (EASE) for PTM-based CIL.
We train a distinct lightweight adapter module for each new task, aiming to create task-specific subspaces.
Our prototype complement strategy synthesizes new features for old classes without using any old-class instances.
arXiv Detail & Related papers (2024-03-18T17:58:13Z)
- What, How, and When Should Object Detectors Update in Continually Changing Test Domains? [34.13756022890991]
Test-time adaptation algorithms have been proposed to adapt the model online while performing inference on test data.
We propose a novel online adaption approach for object detection in continually changing test domains.
Our approach surpasses baselines on widely used benchmarks, achieving improvements of up to 4.9%p and 7.9%p in mAP.
arXiv Detail & Related papers (2023-12-12T07:13:08Z)
- Persistent Test-time Adaptation in Recurring Testing Scenarios [12.024233973321756]
Current test-time adaptation (TTA) approaches aim to adapt a machine learning model to environments that change continuously.
Yet, it is unclear whether TTA methods can maintain their adaptability over prolonged periods.
We propose persistent TTA (PeTTA) which senses when the model is diverging towards collapse and adjusts the adaptation strategy.
arXiv Detail & Related papers (2023-11-30T02:24:44Z)
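The PeTTA summary above hinges on sensing divergence toward collapse. As one hypothetical detector (not necessarily PeTTA's actual mechanism), the snippet below measures the entropy of the average prediction over recent batches; a collapsing model concentrates its predictions on few classes, driving this value toward zero.

```python
import torch

def mean_prediction_entropy(probs_history):
    """probs_history: list of (N, C) softmax outputs from recent test batches.
    Low entropy of the averaged prediction suggests collapse onto few classes."""
    mean_probs = torch.stack(list(probs_history)).mean(dim=(0, 1))  # shape (C,)
    return -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum()

# Usage sketch: if this signal falls below a threshold, pause adaptation or
# strengthen regularization toward the source weights instead of updating further.
```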
- From Question to Exploration: Test-Time Adaptation in Semantic Segmentation? [21.27237423511349]
Test-time adaptation (TTA) aims to adapt a model, initially trained on training data, to test data with potential distribution shifts.
We investigate the applicability of existing classic TTA strategies in semantic segmentation.
arXiv Detail & Related papers (2023-10-09T01:59:49Z)
- Towards Real-World Test-Time Adaptation: Tri-Net Self-Training with Balanced Normalization [52.03927261909813]
Existing works mainly consider real-world test-time adaptation under non-i.i.d. data streams and continual domain shift.
We argue that the failure of state-of-the-art methods is first caused by indiscriminately adapting normalization layers to imbalanced testing data.
The final TTA model, termed TRIBE, is built upon a tri-net architecture with balanced batchnorm layers.
arXiv Detail & Related papers (2023-09-26T14:06:26Z)
- REALM: Robust Entropy Adaptive Loss Minimization for Improved Single-Sample Test-Time Adaptation [5.749155230209001]
Fully-test-time adaptation (F-TTA) can mitigate performance loss due to distribution shifts between train and test data.
We present a general framework for improving robustness of F-TTA to noisy samples, inspired by self-paced learning and robust loss functions.
arXiv Detail & Related papers (2023-09-07T18:44:58Z)
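REALM's summary credits self-paced learning and robust loss functions. Purely as an illustration of that family of ideas (not REALM's actual loss), the sketch below gates the per-sample entropy so that high-entropy, likely noisy samples contribute nothing; the margin is an assumed hyperparameter.

```python
import torch

def gated_entropy_loss(logits, margin=0.4):
    """Self-paced-style gating: keep only samples whose prediction entropy is
    below a fraction `margin` of the maximum entropy log(C)."""
    probs = logits.softmax(dim=1)
    ent = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)  # per-sample entropy
    max_ent = torch.log(torch.tensor(float(logits.size(1))))  # log(C)
    keep = (ent < margin * max_ent).float()                   # gate noisy samples
    return (keep * ent).sum() / keep.sum().clamp_min(1.0)
```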
- On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z)
- Improved Test-Time Adaptation for Domain Generalization [48.239665441875374]
Test-time training (TTT) adapts the learned model with test data.
This work addresses two main factors: selecting an appropriate auxiliary TTT task for updating and identifying reliable parameters to update during the test phase.
We introduce additional adaptive parameters for the trained model, and we suggest only updating the adaptive parameters during the test phase.
arXiv Detail & Related papers (2023-04-10T10:12:38Z)
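The entry above proposes adding adaptive parameters and updating only those at test time. Below is a minimal sketch of that idea using a hypothetical per-channel scale-and-shift module; the module design and names are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class AdaptiveScaleShift(nn.Module):
    """Lightweight per-channel scale-and-shift inserted after frozen layers."""
    def __init__(self, num_features):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_features))
        self.beta = nn.Parameter(torch.zeros(num_features))

    def forward(self, x):                     # x: (N, C, H, W)
        return x * self.gamma.view(1, -1, 1, 1) + self.beta.view(1, -1, 1, 1)

def freeze_all_but_adaptive(model, lr=1e-3):
    """Freeze pretrained weights; return an optimizer over only the
    AdaptiveScaleShift parameters, so the test phase touches nothing else."""
    for p in model.parameters():
        p.requires_grad_(False)
    adaptive = []
    for m in model.modules():
        if isinstance(m, AdaptiveScaleShift):
            for p in m.parameters():
                p.requires_grad_(True)
                adaptive.append(p)
    return torch.optim.SGD(adaptive, lr=lr)
```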
- Towards Stable Test-Time Adaptation in Dynamic Wild World [60.98073673220025]
Test-time adaptation (TTA) has been shown to be effective at tackling distribution shifts between training and testing data by adapting a given model on test samples.
Online model updating in TTA can be unstable, and this is often a key obstacle preventing existing TTA methods from being deployed in the real world.
arXiv Detail & Related papers (2023-02-24T02:03:41Z)
- Robust Mean Teacher for Continual and Gradual Test-Time Adaptation [5.744133015573047]
Gradual test-time adaptation (TTA) considers not only a single domain shift, but a sequence of shifts.
We propose and show that in the setting of TTA, the symmetric cross-entropy is better suited as a consistency loss for mean teachers.
We demonstrate the effectiveness of our proposed method 'robust mean teacher' (RMT) on the continual and gradual corruption benchmarks.
arXiv Detail & Related papers (2022-11-23T16:14:45Z)
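The RMT summary above names the symmetric cross-entropy as the mean-teacher consistency loss. A direct rendering follows; the equal 0.5/0.5 weighting of the two directions is an assumption rather than the paper's exact coefficients.

```python
import torch

def symmetric_cross_entropy(student_logits, teacher_logits):
    """Consistency loss between student and (EMA) teacher predictions:
    cross-entropy applied in both directions and averaged."""
    p_s = student_logits.softmax(dim=1)
    p_t = teacher_logits.softmax(dim=1).detach()    # teacher gets no gradient
    log_p_s = student_logits.log_softmax(dim=1)
    log_p_t = teacher_logits.log_softmax(dim=1).detach()
    ce = -(p_t * log_p_s).sum(dim=1).mean()         # teacher targets, student predicts
    rce = -(p_s * log_p_t).sum(dim=1).mean()        # reversed direction
    return 0.5 * ce + 0.5 * rce
```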
- TeST: Test-time Self-Training under Distribution Shift [99.68465267994783]
Test-Time Self-Training (TeST) is a technique that takes as input a model trained on some source data and a novel data distribution at test time.
We find that models adapted using TeST significantly improve over baseline test-time adaptation algorithms.
arXiv Detail & Related papers (2022-09-23T07:47:33Z)
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address distribution shift by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
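As an illustration of class-aware feature alignment (a simplified stand-in, not CAFA's exact objective), the sketch below pulls each pseudo-labeled target feature toward the source mean of its class, scaled by a diagonal source variance; the inputs `src_means` and `src_vars` are assumed to be gathered from source data beforehand.

```python
import torch

def class_aware_alignment_loss(features, pseudo_labels, src_means, src_vars, eps=1e-5):
    """features: (N, D) target features; pseudo_labels: (N,) predicted classes;
    src_means, src_vars: (C, D) per-class source statistics."""
    loss = features.new_zeros(())
    classes = pseudo_labels.unique()
    for c in classes:
        f_c = features[pseudo_labels == c]     # target features of class c
        diff = f_c - src_means[c]              # deviation from source class mean
        loss = loss + (diff.pow(2) / (src_vars[c] + eps)).mean()
    return loss / classes.numel()
```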
- Efficient Test-Time Model Adaptation without Forgetting [60.36499845014649]
Test-time adaptation seeks to tackle potential distribution shifts between training and testing data.
We propose an active sample selection criterion to identify reliable and non-redundant samples.
We also introduce a Fisher regularizer to constrain important model parameters from drastic changes.
arXiv Detail & Related papers (2022-04-06T06:39:40Z)
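The Fisher regularizer mentioned above is sketched here in its standard EWC-style form, penalizing importance-weighted drift from the pretrained weights. The Fisher estimate, anchor parameters, and weight `lam` are assumed inputs; the paper's exact estimator may differ.

```python
import torch

def fisher_regularizer(model, fisher, anchor, lam=2000.0):
    """fisher, anchor: dicts from parameter name to tensors computed before
    adaptation (importance weights and pretrained values, respectively)."""
    reg = 0.0
    for name, p in model.named_parameters():
        if name in fisher:
            # Penalize squared drift of important parameters from their anchors.
            reg = reg + (fisher[name] * (p - anchor[name]).pow(2)).sum()
    return lam * reg
```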