MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation
- URL: http://arxiv.org/abs/2204.12667v1
- Date: Wed, 27 Apr 2022 02:28:12 GMT
- Title: MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation
- Authors: Inkyu Shin, Yi-Hsuan Tsai, Bingbing Zhuang, Samuel Schulter, Buyu Liu,
Sparsh Garg, In So Kweon, Kuk-Jin Yoon
- Abstract summary: We propose and explore a new multi-modal extension of test-time adaptation for 3D semantic segmentation.
Our framework takes full advantage of multi-modality by having each modality provide regularized self-supervisory signals to the others.
Our regularized pseudo labels produce stable self-learning signals in numerous multi-modal test-time adaptation scenarios.
- Score: 104.48766162008815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Test-time adaptation approaches have recently emerged as a practical solution
for handling domain shift without access to the source domain data. In this
paper, we propose and explore a new multi-modal extension of test-time
adaptation for 3D semantic segmentation. We find that directly applying
existing methods usually results in performance instability at test time
because multi-modal input is not considered jointly. To design a framework that
can take full advantage of multi-modality, where each modality provides
regularized self-supervisory signals to other modalities, we propose two
complementary modules within and across the modalities. First, Intra-modal
Pseudo-label Generation (Intra-PG) is introduced to obtain reliable pseudo
labels within each modality by aggregating information from two models that are
both pre-trained on source data but updated with target data at different
paces. Second, Inter-modal Pseudo-label Refinement (Inter-PR) adaptively
selects more reliable pseudo labels from different modalities based on a
proposed consistency scheme. Experiments demonstrate that our regularized
pseudo labels produce stable self-learning signals in numerous multi-modal
test-time adaptation scenarios for 3D semantic segmentation. Visit our project
website at https://www.nec-labs.com/~mas/MM-TTA.
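As a reading aid, the following is a minimal, hypothetical PyTorch sketch of the pseudo-labeling scheme the abstract describes; it is not the authors' implementation. The EMA momentum, the cosine agreement score, the confidence threshold, and the ignore index are illustrative assumptions: Intra-PG is rendered as fusing a slowly updated (momentum) copy and a directly updated copy of each source-pretrained network, and Inter-PR as keeping, for every 3D point, the pseudo-label from whichever modality's two branches agree more.

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def ema_update(slow_model, fast_model, momentum=0.99):
    """Intra-PG: keep a slow (momentum/EMA) copy of each modality's network,
    so the two branches adapt to the target data at different paces."""
    for p_slow, p_fast in zip(slow_model.parameters(), fast_model.parameters()):
        p_slow.mul_(momentum).add_(p_fast, alpha=1.0 - momentum)


@torch.no_grad()
def intra_pg(slow_logits, fast_logits):
    """Fuse the slow and fast branches of one modality into a single
    prediction, and score how much the two branches agree per point."""
    p_slow = slow_logits.softmax(dim=-1)   # (N, C)
    p_fast = fast_logits.softmax(dim=-1)   # (N, C)
    fused = 0.5 * (p_slow + p_fast)        # aggregated pseudo-label distribution
    agreement = F.cosine_similarity(p_slow, p_fast, dim=-1)  # (N,)
    return fused, agreement


@torch.no_grad()
def inter_pr(fused_2d, agree_2d, fused_3d, agree_3d, threshold=0.9):
    """Inter-PR: per point, select the modality whose two branches are more
    consistent; drop points where neither modality is consistent enough."""
    take_3d = agree_3d > agree_2d                              # (N,)
    fused = torch.where(take_3d.unsqueeze(-1), fused_3d, fused_2d)
    pseudo = fused.argmax(dim=-1)                              # (N,)
    keep = torch.maximum(agree_2d, agree_3d) > threshold
    pseudo[~keep] = -100   # ignore_index: no supervision on these points
    return pseudo
```

Under these assumptions, each adaptation step would update the fast models with a loss such as `F.cross_entropy(fast_logits, pseudo, ignore_index=-100)`, call `ema_update` for both modalities, and repeat on the next test batch.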
Related papers
- Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation [61.91492500828508]
Few-shot 3D point cloud segmentation (FS-PCS) aims at generalizing models to segment novel categories with minimal support samples.
We introduce a cost-free multimodal FS-PCS setup, utilizing textual labels and the potentially available 2D image modality.
We propose a simple yet effective Test-time Adaptive Cross-modal Seg (TACC) technique to mitigate training bias.
arXiv Detail & Related papers (2024-10-29T19:28:41Z)
- Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models [11.545127156146368]
We introduce Dual Prototype Evolving (DPE), a novel test-time adaptation approach for pre-trained vision-language models (VLMs).
We create and evolve two sets of prototypes--textual and visual--to progressively capture more accurate multi-modal representations for target classes during test time.
Our proposed DPE consistently outperforms previous state-of-the-art methods while also exhibiting competitive computational efficiency.
arXiv Detail & Related papers (2024-10-16T17:59:49Z)
- Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection [64.08296187555095]
Uni$^2$Det is a framework for unified and universal multi-dataset training on 3D detection.
We introduce multi-stage prompting modules for multi-dataset 3D detection.
Results on zero-shot cross-dataset transfer validate the generalization capability of our proposed method.
arXiv Detail & Related papers (2024-09-30T17:57:50Z)
- UniTTA: Unified Benchmark and Versatile Framework Towards Realistic Test-Time Adaptation [66.05528698010697]
Test-Time Adaptation aims to adapt pre-trained models to the target domain during testing.
Researchers have identified various challenging scenarios and developed diverse methods to address these challenges.
We propose a Unified Test-Time Adaptation benchmark, which is comprehensive and widely applicable.
arXiv Detail & Related papers (2024-07-29T15:04:53Z)
- Adaptive Test-Time Personalization for Federated Learning [51.25437606915392]
We introduce a novel setting called test-time personalized federated learning (TTPFL).
In TTPFL, clients locally adapt a global model in an unsupervised way, without relying on any labeled data at test time.
We propose a novel algorithm called ATP to adaptively learn the adaptation rates for each module in the model from distribution shifts among source domains.
arXiv Detail & Related papers (2023-10-28T20:42:47Z)
- Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation [26.674085603033742]
Continual Test-Time Adaptation (CTTA) generalizes conventional Test-Time Adaptation (TTA) by assuming that the target domain is dynamic over time rather than stationary.
In this paper, we explore Multi-Modal Continual Test-Time Adaptation (MM-CTTA) as a new extension of CTTA for 3D semantic segmentation.
arXiv Detail & Related papers (2023-03-18T16:51:19Z)
- Semi-Supervised Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport [24.930976128926314]
We propose a novel Multi-modal Multi-instance Multi-label Deep Network (M3DN).
M3DN considers M3 learning in an end-to-end multi-modal deep network and utilizes a consistency principle among the bag-level predictions of different modalities.
Thereby its semi-supervised extension M3DNS can better predict labels and exploit label correlations simultaneously.
arXiv Detail & Related papers (2021-04-17T09:18:28Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We achieve new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)