TEA: Test-time Energy Adaptation
- URL: http://arxiv.org/abs/2311.14402v2
- Date: Tue, 27 Feb 2024 04:29:37 GMT
- Title: TEA: Test-time Energy Adaptation
- Authors: Yige Yuan, Bingbing Xu, Liang Hou, Fei Sun, Huawei Shen, Xueqi Cheng
- Abstract summary: Test-time adaptation (TTA) aims to improve model generalizability when test data diverges from training distribution.
We propose a novel energy-based perspective, enhancing the model's perception of target data distributions.
- Score: 67.4574269851666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Test-time adaptation (TTA) aims to improve model generalizability when test
data diverges from training distribution, offering the distinct advantage of
not requiring access to training data and processes, especially valuable in the
context of large pre-trained models. However, current TTA methods fail to
address the fundamental issue: covariate shift, i.e., the decreased
generalizability can be attributed to the model's reliance on the marginal
distribution of the training data, which may impair model calibration and
introduce confirmation bias. To address this, we propose a novel energy-based
perspective, enhancing the model's perception of target data distributions
without requiring access to training data or processes. Building on this
perspective, we introduce $\textbf{T}$est-time $\textbf{E}$nergy
$\textbf{A}$daptation ($\textbf{TEA}$), which transforms the trained classifier
into an energy-based model and aligns the model's distribution with the test
data's, enhancing its ability to perceive test distributions and thus improving
overall generalizability. Extensive experiments across multiple tasks,
benchmarks and architectures demonstrate TEA's superior generalization
performance against state-of-the-art methods. Further in-depth analyses reveal
that TEA can equip the model with a comprehensive perception of test
distribution, ultimately paving the way toward improved generalization and
calibration.
Related papers
- Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization [28.977757627384165]
Domain Domain (DG) aims to avoid the performance degradation of the model when the distribution shift between the limited training data and unseen test data occurs.
Recently, foundation models with enormous parameters have been pre-trained with huge datasets, demonstrating strong generalization ability.
Our framework achieves SOTA performance on five DG benchmarks, while only requiring training a small number of parameters without adding additional testing cost.
arXiv Detail & Related papers (2024-07-21T07:50:49Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - Diverse Data Augmentation with Diffusions for Effective Test-time Prompt
Tuning [73.75282761503581]
We propose DiffTPT, which leverages pre-trained diffusion models to generate diverse and informative new data.
Our experiments on test datasets with distribution shifts and unseen categories demonstrate that DiffTPT improves the zero-shot accuracy by an average of 5.13%.
arXiv Detail & Related papers (2023-08-11T09:36:31Z) - Consistency Regularization for Generalizable Source-free Domain
Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods ONLY assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - Test-Time Adaptation with Perturbation Consistency Learning [32.58879780726279]
We propose a simple test-time adaptation method to promote the model to make stable predictions for samples with distribution shifts.
Our method can achieve higher or comparable performance with less inference time over strong PLM backbones.
arXiv Detail & Related papers (2023-04-25T12:29:22Z) - A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts [143.14128737978342]
Test-time adaptation, an emerging paradigm, has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions.
Recent progress in this paradigm highlights the significant benefits of utilizing unlabeled data for training self-adapted models prior to inference.
arXiv Detail & Related papers (2023-03-27T16:32:21Z) - CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address this challenge by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed as Class-Aware Feature Alignment (CAFA), which simultaneously encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z) - TsmoBN: Interventional Generalization for Unseen Clients in Federated
Learning [23.519212374186232]
We form a training structural causal model (SCM) to explain the challenges of model generalization in a distributed learning paradigm.
We present a simple yet effective method using test-specific and momentum tracked batch normalization (TsmoBN) to generalize FL models to testing clients.
arXiv Detail & Related papers (2021-10-19T13:46:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.