Audio-Visual Continual Test-Time Adaptation without Forgetting
- URL: http://arxiv.org/abs/2602.18528v1
- Date: Fri, 20 Feb 2026 04:55:01 GMT
- Title: Audio-Visual Continual Test-Time Adaptation without Forgetting
- Authors: Sarthak Kumar Maharana, Akshay Mehra, Bhavya Ramakrishna, Yunhui Guo, Guan-Ming Su,
- Abstract summary: continual test-time adaptation involves continually adapting a source audio-visual model at test-time.<n>We propose a method, $texttAV-CTTA$, that improves test-time performance of the models without access to any source data.
- Score: 28.411642462831647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Audio-visual continual test-time adaptation involves continually adapting a source audio-visual model at test-time, to unlabeled non-stationary domains, where either or both modalities can be distributionally shifted, which hampers online cross-modal learning and eventually leads to poor accuracy. While previous works have tackled this problem, we find that SOTA methods suffer from catastrophic forgetting, where the model's performance drops well below the source model due to continual parameter updates at test-time. In this work, we first show that adapting only the modality fusion layer to a target domain not only improves performance on that domain but can also enhance performance on subsequent domains. Based on this strong cross-task transferability of the fusion layer's parameters, we propose a method, $\texttt{AV-CTTA}$, that improves test-time performance of the models without access to any source data. Our approach works by using a selective parameter retrieval mechanism that dynamically retrieves the best fusion layer parameters from a buffer using only a small batch of test data. These parameters are then integrated into the model, adapted to the current test distribution, and saved back for future use. Extensive experiments on benchmark datasets involving unimodal and bimodal corruptions show our proposed $\texttt{AV-CTTA}$ significantly outperforms existing methods while minimizing catastrophic forgetting.
Related papers
- TAPS : Frustratingly Simple Test Time Active Learning for VLMs [0.0]
Test-Time Optimization enables models to adapt to new data during inference by updating parameters on-the-fly.<n>We propose a novel Test-Time Active Learning framework that adaptively queries uncertain samples and updates prompts dynamically.<n>Our framework provides a practical and effective solution for real-world deployment in safety-critical applications such as autonomous systems and medical diagnostics.
arXiv Detail & Related papers (2025-07-26T18:04:49Z) - MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging [29.58798660724693]
Continual model merging integrates independently fine-tuned models sequentially without access to the original training data.<n>We propose MINGLE, a novel framework for Test-Time Continual Model Merging.<n> MINGLE achieves robust generalization, significantly reduces forgetting, and consistently surpasses previous state-of-the-art methods by 7-9% on average.
arXiv Detail & Related papers (2025-05-17T07:24:22Z) - APCoTTA: Continual Test-Time Adaptation for Semantic Segmentation of Airborne LiDAR Point Clouds [14.348191795901101]
Airborne laser scanning (ALS) point cloud segmentation is a fundamental task for large-scale 3D scene understanding.<n> Continuous Test-Time Adaptation (CTTA) offers a solution by adapting a source-pretrained model to evolving, unlabeled target domains.<n>We propose APCoTTA, the first CTTA method tailored for ALS point cloud semantic segmentation.
arXiv Detail & Related papers (2025-05-15T05:21:16Z) - DOTA: Distributional Test-Time Adaptation of Vision-Language Models [69.41389326333771]
Vision-language foundation models can be unreliable when significant distribution gaps exist between training and test data.<n>We propose DOTA (DistributiOnal Test-time Adaptation), a simple yet effective method addressing this limitation.<n>This distribution-centric approach enables the model to continually learn and adapt to the deployment environment.
arXiv Detail & Related papers (2024-09-28T15:03:28Z) - Enhancing Test Time Adaptation with Few-shot Guidance [62.49199492255226]
Deep neural networks often encounter significant performance drops while facing with domain shifts between training (source) and test (target) data.<n>Test Time Adaptation (TTA) methods have been proposed to adapt pre-trained source model to handle out-of-distribution streaming target data.<n>We develop Few-Shot Test Time Adaptation (FS-TTA), a novel and practical setting that utilizes a few-shot support set on top of TTA.
arXiv Detail & Related papers (2024-09-02T15:50:48Z) - Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment [81.78901060731269]
Test-time adaptation (TTA) aims to improve the performance of source-domain pre-trained models on previously unseen, shifted target domains.<n>Traditional TTA methods primarily adapt model weights based on target data streams, making model performance sensitive to the amount and order of target data.<n>The recently proposed diffusion-driven TTA methods mitigate this by adapting model inputs instead of weights, where an unconditional diffusion model, trained on the source domain, transforms target-domain data into a synthetic domain that is expected to approximate the source domain.
arXiv Detail & Related papers (2024-06-06T17:39:09Z) - Test-Time Model Adaptation with Only Forward Passes [68.11784295706995]
Test-time adaptation has proven effective in adapting a given trained model to unseen test samples with potential distribution shifts.
We propose a test-time Forward-Optimization Adaptation (FOA) method.
FOA runs on quantized 8-bit ViT, outperforms gradient-based TENT on full-precision 32-bit ViT, and achieves an up to 24-fold memory reduction on ImageNet-C.
arXiv Detail & Related papers (2024-04-02T05:34:33Z) - CONTRAST: Continual Multi-source Adaptation to Dynamic Distributions [42.293444710522294]
Continual Multi-source Adaptation to Dynamic Distributions (CONTRAST) is a novel method that optimally combines multiple source models to adapt to the dynamic test data.
We show that the proposed method is able to optimally combine the source models and prioritize updates to the model least prone to forgetting.
arXiv Detail & Related papers (2024-01-04T22:23:56Z) - AR-TTA: A Simple Method for Real-World Continual Test-Time Adaptation [1.4530711901349282]
We propose to validate test-time adaptation methods using datasets for autonomous driving, namely CLAD-C and SHIFT.
We observe that current test-time adaptation methods struggle to effectively handle varying degrees of domain shift.
We enhance the well-established self-training framework by incorporating a small memory buffer to increase model stability.
arXiv Detail & Related papers (2023-09-18T19:34:23Z) - Train/Test-Time Adaptation with Retrieval [129.8579208970529]
We introduce Train/Test-Time Adaptation with Retrieval ($rm T3AR$), a method to adapt models both at train and test time.
$rm T3AR$ adapts a given model to the downstream task using refined pseudo-labels and a self-supervised contrastive objective function.
Thanks to the retrieval module, our method gives the user or service provider the possibility to improve model adaptation on the downstream task.
arXiv Detail & Related papers (2023-03-25T02:44:57Z) - On-the-Fly Test-time Adaptation for Medical Image Segmentation [63.476899335138164]
Adapting the source model to target data distribution at test-time is an efficient solution for the data-shift problem.
We propose a new framework called Adaptive UNet where each convolutional block is equipped with an adaptive batch normalization layer.
During test-time, the model takes in just the new test image and generates a domain code to adapt the features of source model according to the test data.
arXiv Detail & Related papers (2022-03-10T18:51:29Z) - Parameter-free Online Test-time Adaptation [19.279048049267388]
We show how test-time adaptation methods fare for a number of pre-trained models on a variety of real-world scenarios.
We propose a particularly "conservative" approach, which addresses the problem with a Laplacian Adjusted Maximum Estimation (LAME)
Our approach exhibits a much higher average accuracy across scenarios than existing methods, while being notably faster and have a much lower memory footprint.
arXiv Detail & Related papers (2022-01-15T00:29:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.