Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation
- URL: http://arxiv.org/abs/2303.10457v1
- Date: Sat, 18 Mar 2023 16:51:19 GMT
- Title: Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation
- Authors: Haozhi Cao, Yuecong Xu, Jianfei Yang, Pengyu Yin, Shenghai Yuan, Lihua Xie
- Abstract summary: Continual Test-Time Adaptation (CTTA) generalizes conventional Test-Time Adaptation (TTA) by assuming that the target domain is dynamic over time rather than stationary.
In this paper, we explore Multi-Modal Continual Test-Time Adaptation (MM-CTTA) as a new extension of CTTA for 3D semantic segmentation.
- Score: 26.674085603033742
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Continual Test-Time Adaptation (CTTA) generalizes conventional Test-Time
Adaptation (TTA) by assuming that the target domain is dynamic over time rather
than stationary. In this paper, we explore Multi-Modal Continual Test-Time
Adaptation (MM-CTTA) as a new extension of CTTA for 3D semantic segmentation.
The key to MM-CTTA is to adaptively attend to the reliable modality while
avoiding catastrophic forgetting during continual domain shifts, which is beyond
the capability of previous TTA or CTTA methods. To fill this gap, we
propose an MM-CTTA method called Continual Cross-Modal Adaptive Clustering
(CoMAC) that addresses this task from two perspectives. On one hand, we propose
an adaptive dual-stage mechanism to generate reliable cross-modal predictions
by attending to the reliable modality based on the class-wise feature-centroid
distance in the latent space. On the other hand, to perform test-time
adaptation without catastrophic forgetting, we design class-wise momentum
queues that capture confident target features for adaptation while
stochastically restoring pseudo-source features to revisit source knowledge. We
further introduce two new benchmarks to facilitate the exploration of MM-CTTA
in the future. Our experimental results show that our method achieves
state-of-the-art performance on both benchmarks.
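The idea of attending to the more reliable modality by class-wise feature-centroid distance can be illustrated with a minimal sketch. This is not the CoMAC implementation: the function names, the Euclidean distance, and the exponential weighting over negative distances are all assumptions chosen for illustration, and the abstract does not specify any of them. The sketch weights the 2D (image) and 3D (point cloud) class probabilities for one point so that the modality whose feature lies closer to the centroid of its own predicted class contributes more.

```python
import math

def centroid_distance(feature, centroid):
    """Euclidean distance between a feature vector and a class centroid."""
    return math.sqrt(sum((f - c) ** 2 for f, c in zip(feature, centroid)))

def fuse_predictions(feat_2d, feat_3d, probs_2d, probs_3d,
                     centroids_2d, centroids_3d):
    """Hypothetical fusion rule (an assumption, not the paper's formula).

    Each modality is scored by the distance between its feature and the
    centroid of the class it predicts; a smaller distance is treated as
    a sign of a more reliable prediction.
    """
    # Class predicted by each modality on its own.
    cls_2d = max(range(len(probs_2d)), key=probs_2d.__getitem__)
    cls_3d = max(range(len(probs_3d)), key=probs_3d.__getitem__)

    d_2d = centroid_distance(feat_2d, centroids_2d[cls_2d])
    d_3d = centroid_distance(feat_3d, centroids_3d[cls_3d])

    # Softmax over negative distances, shifted by the minimum for stability.
    d_min = min(d_2d, d_3d)
    w_2d = math.exp(-(d_2d - d_min))
    w_3d = math.exp(-(d_3d - d_min))
    total = w_2d + w_3d
    w_2d, w_3d = w_2d / total, w_3d / total

    fused = [w_2d * p2 + w_3d * p3 for p2, p3 in zip(probs_2d, probs_3d)]
    return fused, (w_2d, w_3d)

# Toy example with two classes and 2-dimensional features (made-up values).
centroids = [[0.0, 0.0], [1.0, 1.0]]
fused, weights = fuse_predictions(
    feat_2d=[0.1, 0.0], feat_3d=[3.0, 3.0],
    probs_2d=[0.9, 0.1], probs_3d=[0.4, 0.6],
    centroids_2d=centroids, centroids_3d=centroids,
)
print(fused, weights)
```

In the toy example the 2D feature sits close to the centroid of its predicted class while the 3D feature is far from its own, so the fused prediction follows the 2D branch. In a continual setting the centroids themselves would need to be updated over time, which this sketch does not cover.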
Related papers
- Analytic Continual Test-Time Adaptation for Multi-Modality Corruption [23.545997349882857]
Test-Time Adaptation (TTA) aims to help pre-trained models bridge the gap between source and target datasets.
We propose a novel approach, Multi-modality Dynamic Analytic Adapter (MDAA) for MM-CTTA tasks.
MDAA achieves state-of-the-art performance on MM-CTTA while ensuring reliable model adaptation.
arXiv Detail & Related papers (2024-10-29T01:21:24Z)
- TS-TCD: Triplet-Level Cross-Modal Distillation for Time-Series Forecasting Using Large Language Models [15.266543423942617]
We present a novel framework, TS-TCD, which introduces a comprehensive three-tiered cross-modal knowledge distillation mechanism.
Unlike prior work that focuses on isolated alignment techniques, our framework systematically integrates.
Experiments on benchmark time-series demonstrate that TS-TCD achieves state-of-the-art results, outperforming traditional methods in both accuracy and robustness.
arXiv Detail & Related papers (2024-09-23T12:57:24Z)
- Adaptive Cascading Network for Continual Test-Time Adaptation [12.718826132518577]
We study the problem of continual test-time adaptation, where the goal is to adapt a source pre-trained model to a sequence of unlabelled target domains at test time.
Existing methods on test-time training suffer from several limitations.
arXiv Detail & Related papers (2024-07-17T01:12:57Z)
- Exploring Test-Time Adaptation for Object Detection in Continually Changing Environments [13.163784646113214]
Continual Test-Time Adaptation (CTTA) has recently emerged as a promising technique to gradually adapt a source-trained model to continually changing target domains.
We present AMROD, featuring three core components. Firstly, the object-level contrastive learning module extracts object-level features for contrastive learning to refine the feature representation in the target domain.
Secondly, the adaptive monitoring module dynamically skips unnecessary adaptation and updates the category-specific threshold based on predicted confidence scores to enable efficiency and improve the quality of pseudo-labels.
arXiv Detail & Related papers (2024-06-24T08:30:03Z)
- Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework, the Adversarial Modality Modulation Network (AMMNet).
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z)
- Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching [77.133400999703]
Correlation based stereo matching has achieved outstanding performance.
Current methods with a fixed model do not work uniformly well across various datasets.
This paper proposes a new perspective to dynamically calculate correlation for robust stereo matching.
arXiv Detail & Related papers (2023-07-26T09:47:37Z)
- On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z)
- DLTTA: Dynamic Learning Rate for Test-time Adaptation on Cross-domain Medical Images [56.72015587067494]
We propose a novel dynamic learning rate adjustment method for test-time adaptation, called DLTTA.
Our method achieves effective and fast test-time adaptation with consistent performance improvement over current state-of-the-art test-time adaptation methods.
arXiv Detail & Related papers (2022-05-27T02:34:32Z)
- MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation [104.48766162008815]
We propose and explore a new multi-modal extension of test-time adaptation for 3D semantic segmentation.
To take full advantage of multi-modality, the framework has each modality provide regularized self-supervisory signals to the other modalities.
Our regularized pseudo labels produce stable self-learning signals in numerous multi-modal test-time adaptation scenarios.
arXiv Detail & Related papers (2022-04-27T02:28:12Z)
- A Transductive Multi-Head Model for Cross-Domain Few-Shot Learning [72.30054522048553]
We present a new method, Transductive Multi-Head Few-Shot learning (TMHFS), to address the Cross-Domain Few-Shot Learning challenge.
The proposed methods greatly outperform the strong baseline, fine-tuning, on four different target domains.
arXiv Detail & Related papers (2020-06-08T02:39:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.