MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation
- URL: http://arxiv.org/abs/2204.12667v1
- Date: Wed, 27 Apr 2022 02:28:12 GMT
- Title: MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation
- Authors: Inkyu Shin, Yi-Hsuan Tsai, Bingbing Zhuang, Samuel Schulter, Buyu Liu,
Sparsh Garg, In So Kweon, Kuk-Jin Yoon
- Abstract summary: We propose and explore a new multi-modal extension of test-time adaptation for 3D semantic segmentation.
To design a framework that can take full advantage of multi-modality, each modality provides regularized self-supervisory signals to other modalities.
Our regularized pseudo labels produce stable self-learning signals in numerous multi-modal test-time adaptation scenarios.
- Score: 104.48766162008815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Test-time adaptation approaches have recently emerged as a practical solution
for handling domain shift without access to the source domain data. In this
paper, we propose and explore a new multi-modal extension of test-time
adaptation for 3D semantic segmentation. We find that directly applying
existing methods usually results in performance instability at test time
because multi-modal input is not considered jointly. To design a framework that
can take full advantage of multi-modality, where each modality provides
regularized self-supervisory signals to other modalities, we propose two
complementary modules within and across the modalities. First, Intra-modal
Pseudo-label Generation (Intra-PG) is introduced to obtain reliable pseudo
labels within each modality by aggregating information from two models that are
both pre-trained on source data but updated with target data at different
paces. Second, Inter-modal Pseudo-label Refinement (Inter-PR) adaptively
selects more reliable pseudo labels from different modalities based on a
proposed consistency scheme. Experiments demonstrate that our regularized
pseudo labels produce stable self-learning signals in numerous multi-modal
test-time adaptation scenarios for 3D semantic segmentation. Visit our project
website at https://www.nec-labs.com/~mas/MM-TTA.
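The two modules described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the probability-averaging rule for Intra-PG and the dot-product agreement measure used for Inter-PR's consistency scheme are assumptions, as are all function names.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def intra_pg(fast_logits, slow_logits):
    """Intra-modal Pseudo-label Generation (sketch): aggregate the
    predictions of two source-pretrained models that are updated on
    target data at different paces (here, by averaging probabilities)."""
    return 0.5 * (softmax(fast_logits) + softmax(slow_logits))

def consistency(fast_logits, slow_logits):
    """Per-point agreement between the fast and slow models of one
    modality (here, the dot product of their class probabilities;
    higher means more consistent)."""
    p, q = softmax(fast_logits), softmax(slow_logits)
    return (p * q).sum(axis=-1)

def inter_pr(fast_2d, slow_2d, fast_3d, slow_3d):
    """Inter-modal Pseudo-label Refinement (sketch): for each point,
    select the pseudo label from the modality whose fast/slow models
    agree more.  Inputs are per-point class logits, shape (N, C)."""
    probs_2d = intra_pg(fast_2d, slow_2d)
    probs_3d = intra_pg(fast_3d, slow_3d)
    pick_2d = consistency(fast_2d, slow_2d) >= consistency(fast_3d, slow_3d)
    fused = np.where(pick_2d[:, None], probs_2d, probs_3d)
    return fused.argmax(axis=-1)  # pseudo label per point
```

The fused pseudo labels would then supervise both modality branches during test-time self-training; the key point is that each modality's label for a point is only trusted where its own slow and fast models agree.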
Related papers
- Learning Modality-agnostic Representation for Semantic Segmentation from Any Modalities [8.517830626176641]
Any2Seg is a novel framework that can achieve robust segmentation from any combination of modalities in any visual conditions.
Experiments on two benchmarks with four modalities demonstrate that Any2Seg achieves the state-of-the-art under the multi-modal setting.
arXiv Detail & Related papers (2024-07-16T03:34:38Z)
- U3M: Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation [63.31007867379312]
We introduce U3M: An Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation.
We employ feature fusion at multiple scales to ensure the effective extraction and integration of both global and local features.
Experimental results demonstrate that our approach achieves superior performance across multiple datasets.
arXiv Detail & Related papers (2024-05-24T08:58:48Z)
- Adaptive Test-Time Personalization for Federated Learning [51.25437606915392]
We introduce a novel setting called test-time personalized federated learning (TTPFL).
In TTPFL, clients locally adapt a global model in an unsupervised way without relying on any labeled data at test time.
We propose a novel algorithm called ATP that adaptively learns the adaptation rate for each module in the model from distribution shifts among source domains.
arXiv Detail & Related papers (2023-10-28T20:42:47Z)
- On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z)
- Missing Modality Robustness in Semi-Supervised Multi-Modal Semantic Segmentation [27.23513712371972]
We propose a simple yet efficient multi-modal fusion mechanism, Linear Fusion.
We also propose M3L: Multi-modal Teacher for Masked Modality Learning.
Our proposal shows an absolute improvement of up to 10% in robust mIoU over the most competitive baselines.
arXiv Detail & Related papers (2023-04-21T05:52:50Z)
- Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation [26.674085603033742]
Continual Test-Time Adaptation (CTTA) generalizes conventional Test-Time Adaptation (TTA) by assuming that the target domain is dynamic over time rather than stationary.
In this paper, we explore Multi-Modal Continual Test-Time Adaptation (MM-CTTA) as a new extension of CTTA for 3D semantic segmentation.
arXiv Detail & Related papers (2023-03-18T16:51:19Z)
- Semi-Supervised Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport [24.930976128926314]
We propose a novel Multi-modal Multi-instance Multi-label Deep Network (M3DN).
M3DN considers M3 learning in an end-to-end multi-modal deep network and utilizes a consistency principle among the bag-level predictions of different modalities.
Thereby M3DNS can better predict labels and exploit label correlations simultaneously.
arXiv Detail & Related papers (2021-04-17T09:18:28Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve model generalization in few-shot settings.
We perform empirical comparisons on 10 public NER datasets with varying proportions of labeled data.
We establish new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- mDALU: Multi-Source Domain Adaptation and Label Unification with Partial Datasets [102.62639692656458]
This paper treats the task as a multi-source domain adaptation and label unification problem.
Our method consists of a partially-supervised adaptation stage and a fully-supervised adaptation stage.
We verify the method on three tasks: image classification, 2D semantic image segmentation, and joint 2D-3D semantic segmentation.
arXiv Detail & Related papers (2020-12-15T15:58:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.