Related papers: MITA: Bridging the Gap between Model and Data for Test-time Adaptation

MITA: Bridging the Gap between Model and Data for Test-time Adaptation

URL: http://arxiv.org/abs/2410.09398v1
Date: Sat, 12 Oct 2024 07:02:33 GMT
Title: MITA: Bridging the Gap between Model and Data for Test-time Adaptation
Authors: Yige Yuan, Bingbing Xu, Teng Xiao, Liang Hou, Fei Sun, Huawei Shen, Xueqi Cheng,
Abstract summary: Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models. We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
Score: 68.62509948690698
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models. However, existing mainstream TTA methods, predominantly operating at batch level, often exhibit suboptimal performance in complex real-world scenarios, particularly when confronting outliers or mixed distributions. This phenomenon stems from a pronounced over-reliance on statistical patterns over the distinct characteristics of individual instances, resulting in a divergence between the distribution captured by the model and data characteristics. To address this challenge, we propose Meet-In-The-Middle based Test-Time Adaptation ($\textbf{MITA}$), which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions, thereby meeting in the middle. MITA pioneers a significant departure from traditional approaches that focus solely on aligning the model to the data, facilitating a more effective bridging of the gap between model's distribution and data characteristics. Comprehensive experiments with MITA across three distinct scenarios (Outlier, Mixture, and Pure) demonstrate its superior performance over SOTA methods, highlighting its potential to significantly enhance generalizability in practical applications.

Related papers

Rethinking Relation Extraction: Beyond Shortcuts to Generalization with a Debiased Benchmark [53.876493664396506]
Benchmarks are crucial for evaluating machine learning algorithm performance, facilitating comparison and identifying superior solutions. This paper addresses the issue of entity bias in relation extraction tasks, where models tend to rely on entity mentions rather than context. We propose a debiased relation extraction benchmark DREB that breaks the pseudo-correlation between entity mentions and relation types through entity replacement. To establish a new baseline on DREB, we introduce MixDebias, a debiasing method combining data-level and model training-level techniques.
arXiv Detail & Related papers (2025-01-02T17:01:06Z)
Bridging the Gap for Test-Time Multimodal Sentiment Analysis [7.871669754963032]
Multimodal sentiment analysis (MSA) is an emerging research topic that aims to understand and recognize human sentiment or emotions through multiple modalities. In this paper, we propose two strategies: Contrastive Adaptation and Stable Pseudo-label generation (CASP) for test-time adaptation for MSA.
arXiv Detail & Related papers (2024-12-10T02:26:33Z)
Client Contribution Normalization for Enhanced Federated Learning [4.726250115737579]
Mobile devices, including smartphones and laptops, generate decentralized and heterogeneous data. Federated Learning (FL) offers a promising alternative by enabling collaborative training of a global model across decentralized devices without data sharing. This paper focuses on data-dependent heterogeneity in FL and proposes a novel approach leveraging mean latent representations extracted from locally trained models.
arXiv Detail & Related papers (2024-11-10T04:03:09Z)
On conditional diffusion models for PDE simulations [53.01911265639582]
We study score-based diffusion models for forecasting and assimilation of sparse observations. We propose an autoregressive sampling approach that significantly improves performance in forecasting. We also propose a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths.
arXiv Detail & Related papers (2024-10-21T18:31:04Z)
DATTA: Towards Diversity Adaptive Test-Time Adaptation in Dynamic Wild World [6.816521410643928]
This paper proposes a new general method, named Diversity Adaptive Test-Time Adaptation (DATTA), aimed at improving Quality of Experience (QoE) It features three key components: Diversity Discrimination (DD) to assess batch diversity, Diversity Adaptive Batch Normalization (DABN) to tailor normalization methods based on DD insights, and Diversity Adaptive Fine-Tuning (DAFT) to selectively fine-tune the model. Experimental results show that our method achieves up to a 21% increase in accuracy compared to state-of-the-art methodologies.
arXiv Detail & Related papers (2024-08-15T09:50:11Z)
On ADMM in Heterogeneous Federated Learning: Personalization, Robustness, and Fairness [16.595935469099306]
We propose FLAME, an optimization framework by utilizing the alternating direction method of multipliers (ADMM) to train personalized and global models. Our theoretical analysis establishes the global convergence and two kinds of convergence rates for FLAME under mild assumptions. Our experimental findings show that FLAME outperforms state-of-the-art methods in convergence and accuracy, and it achieves higher test accuracy under various attacks.
arXiv Detail & Related papers (2024-07-23T11:35:42Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
TEA: Test-time Energy Adaptation [67.4574269851666]
Test-time adaptation (TTA) aims to improve model generalizability when test data diverges from training distribution. We propose a novel energy-based perspective, enhancing the model's perception of target data distributions.
arXiv Detail & Related papers (2023-11-24T10:49:49Z)
Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition. We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset.
arXiv Detail & Related papers (2023-08-10T08:43:20Z)
Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset. Existing SFDA methods ONLY assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets. We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.