Temporal Test-Time Adaptation with State-Space Models
- URL: http://arxiv.org/abs/2407.12492v2
- Date: Wed, 2 Oct 2024 17:29:54 GMT
- Title: Temporal Test-Time Adaptation with State-Space Models
- Authors: Mona Schirmer, Dan Zhang, Eric Nalisnick,
- Abstract summary: Adapting a model on test samples can help mitigate this drop in performance.
Most test-time adaptation methods have focused on synthetic corruption shifts.
We propose STAD, a probabilistic state-space model that adapts a deployed model to temporal distribution shifts.
- Score: 4.248760709042802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distribution shifts between training and test data are inevitable over the lifecycle of a deployed model, leading to performance decay. Adapting a model on test samples can help mitigate this drop in performance. However, most test-time adaptation methods have focused on synthetic corruption shifts, leaving a variety of distribution shifts underexplored. In this paper, we focus on distribution shifts that evolve gradually over time, which are common in the wild but challenging for existing methods, as we show. To address this, we propose STAD, a probabilistic state-space model that adapts a deployed model to temporal distribution shifts by learning the time-varying dynamics in the last set of hidden features. Without requiring labels, our model infers time-evolving class prototypes that act as a dynamic classification head. Through experiments on real-world temporal distribution shifts, we show that our method excels in handling small batch sizes and label shift.
Related papers
- DOTA: Distributional Test-Time Adaptation of Vision-Language Models [52.98590762456236]
Training-free test-time dynamic adapter (TDA) is a promising approach to address this issue.
We propose a simple yet effective method for DistributiOnal Test-time Adaptation (Dota)
Dota continually estimates the distributions of test samples, allowing the model to continually adapt to the deployment environment.
arXiv Detail & Related papers (2024-09-28T15:03:28Z) - Test-Time Model Adaptation with Only Forward Passes [68.11784295706995]
Test-time adaptation has proven effective in adapting a given trained model to unseen test samples with potential distribution shifts.
We propose a test-time Forward-Optimization Adaptation (FOA) method.
FOA runs on quantized 8-bit ViT, outperforms gradient-based TENT on full-precision 32-bit ViT, and achieves an up to 24-fold memory reduction on ImageNet-C.
arXiv Detail & Related papers (2024-04-02T05:34:33Z) - Invariant Anomaly Detection under Distribution Shifts: A Causal
Perspective [6.845698872290768]
Anomaly detection (AD) is the machine learning task of identifying highly discrepant abnormal samples.
Under the constraints of a distribution shift, the assumption that training samples and test samples are drawn from the same distribution breaks down.
We attempt to increase the resilience of anomaly detection models to different kinds of distribution shifts.
arXiv Detail & Related papers (2023-12-21T23:20:47Z) - Distributionally Robust Post-hoc Classifiers under Prior Shifts [31.237674771958165]
We investigate the problem of training models that are robust to shifts caused by changes in the distribution of class-priors or group-priors.
We present an extremely lightweight post-hoc approach that performs scaling adjustments to predictions from a pre-trained model.
arXiv Detail & Related papers (2023-09-16T00:54:57Z) - Addressing Distribution Shift at Test Time in Pre-trained Language
Models [3.655021726150369]
State-of-the-art pre-trained language models (PLMs) outperform other models when applied to the majority of language processing tasks.
PLMs have been found to degrade in performance under distribution shift.
We present an approach that improves the performance of PLMs at test-time under distribution shift.
arXiv Detail & Related papers (2022-12-05T16:04:54Z) - Wild-Time: A Benchmark of in-the-Wild Distribution Shift over Time [69.77704012415845]
Temporal shifts can considerably degrade performance of machine learning models deployed in the real world.
We benchmark 13 prior approaches, including methods in domain generalization, continual learning, self-supervised learning, and ensemble learning.
Under both evaluation strategies, we observe an average performance drop of 20% from in-distribution to out-of-distribution data.
arXiv Detail & Related papers (2022-11-25T17:07:53Z) - TTAPS: Test-Time Adaption by Aligning Prototypes using Self-Supervision [70.05605071885914]
We propose a novel modification of the self-supervised training algorithm SwAV that adds the ability to adapt to single test samples.
We show the success of our method on the common benchmark dataset CIFAR10-C.
arXiv Detail & Related papers (2022-05-18T05:43:06Z) - Efficient Test-Time Model Adaptation without Forgetting [60.36499845014649]
Test-time adaptation seeks to tackle potential distribution shifts between training and testing data.
We propose an active sample selection criterion to identify reliable and non-redundant samples.
We also introduce a Fisher regularizer to constrain important model parameters from drastic changes.
arXiv Detail & Related papers (2022-04-06T06:39:40Z) - Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.