Related papers: Improving Multimodal Distillation for 3D Semantic Segmentation under Domain Shift

Improving Multimodal Distillation for 3D Semantic Segmentation under Domain Shift

URL: http://arxiv.org/abs/2511.17455v1
Date: Fri, 21 Nov 2025 17:57:43 GMT
Title: Improving Multimodal Distillation for 3D Semantic Segmentation under Domain Shift
Authors: Björn Michele, Alexandre Boulch, Gilles Puy, Tuan-Hung Vu, Renaud Marlet, Nicolas Courty,
Abstract summary: We conduct an exhaustive study to identify recipes for exploiting vision foundation models (VFMs) in unsupervised domain adaptation for semantic segmentation of lidar point clouds.<n>The resulting pipeline achieves state-of-the-art results in four widely-recognized and challenging settings.
Score: 62.50795372173394
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Semantic segmentation networks trained under full supervision for one type of lidar fail to generalize to unseen lidars without intervention. To reduce the performance gap under domain shifts, a recent trend is to leverage vision foundation models (VFMs) providing robust features across domains. In this work, we conduct an exhaustive study to identify recipes for exploiting VFMs in unsupervised domain adaptation for semantic segmentation of lidar point clouds. Building upon unsupervised image-to-lidar knowledge distillation, our study reveals that: (1) the architecture of the lidar backbone is key to maximize the generalization performance on a target domain; (2) it is possible to pretrain a single backbone once and for all, and use it to address many domain shifts; (3) best results are obtained by keeping the pretrained backbone frozen and training an MLP head for semantic segmentation. The resulting pipeline achieves state-of-the-art results in four widely-recognized and challenging settings. The code will be available at: https://github.com/valeoai/muddos.

Related papers

Rein++: Efficient Generalization and Adaptation for Semantic Segmentation with Vision Foundation Models [47.66611300605174]
Rein++ is an efficient VFM-based segmentation framework.<n>It demonstrates superior generalization from limited data.<n>It enables effective adaptation to diverse unlabeled scenarios.
arXiv Detail & Related papers (2025-08-03T08:53:30Z)
VLMs meet UDA: Boosting Transferability of Open Vocabulary Segmentation with Unsupervised Domain Adaptation [3.776249047528669]
This paper proposes enhancing segmentation accuracy across diverse domains by integrating Vision-Language reasoning with key strategies for Unsupervised Domain Adaptation (UDA)<n>We improve the fine-grained segmentation capabilities of VLMs through multi-scale contextual data, robust text embeddings with prompt augmentation, and layer-wise fine-tuning in our proposed Foundational-Retaining Open Vocabulary (FROVSS) framework.<n>The resulting UDA-FROV framework is the first UDA approach to effectively adapt across domains without requiring shared categories.
arXiv Detail & Related papers (2024-12-12T12:49:42Z)
Visual Foundation Models Boost Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation [17.875516787157018]
We study how to harness the knowledge priors learned by 2D visual foundation models to produce more accurate labels for unlabeled target domains. Our method is evaluated on various autonomous driving datasets and the results demonstrate a significant improvement for 3D segmentation task.
arXiv Detail & Related papers (2024-03-15T03:58:17Z)
Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation [65.78246406460305]
compositional semantic mixing represents the first unsupervised domain adaptation technique for point cloud segmentation. We present a two-branch symmetric network architecture capable of concurrently processing point clouds from a source domain (e.g. synthetic) and point clouds from a target domain (e.g. real-world)
arXiv Detail & Related papers (2023-08-28T14:43:36Z)
Walking Your LiDOG: A Journey Through Multiple Domains for LiDAR Semantic Segmentation [16.857861952638498]
We study domain generalization for LiDAR semantic segmentation (DG-LSS) Our results confirm a significant gap between methods, evaluated in a cross-domain setting. We propose the first method specifically designed for DG-LSS, which obtains $34.88$ mIoU on the target domain.
arXiv Detail & Related papers (2023-04-23T17:43:29Z)
Generalized Semantic Segmentation by Self-Supervised Source Domain Projection and Multi-Level Contrastive Learning [79.0660895390689]
Deep networks trained on the source domain show degraded performance when tested on unseen target domain data. We propose a Domain Projection and Contrastive Learning (DPCL) approach for generalized semantic segmentation.
arXiv Detail & Related papers (2023-03-03T13:07:14Z)
Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with wide ranges of application potentials. We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field. Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z)
Domain Adaptation for Semantic Segmentation via Patch-Wise Contrastive Learning [62.7588467386166]
We leverage contrastive learning to bridge the domain gap by aligning the features of structurally similar label patches across domains. Our approach consistently outperforms state-of-the-art unsupervised and semi-supervised methods on two challenging domain adaptive segmentation tasks.
arXiv Detail & Related papers (2021-04-22T13:39:12Z)
Towards Adaptive Semantic Segmentation by Progressive Feature Refinement [16.40758125170239]
We propose an innovative progressive feature refinement framework, along with domain adversarial learning to boost the transferability of segmentation networks. As a result, the segmentation models trained with source domain images can be transferred to a target domain without significant performance degradation.
arXiv Detail & Related papers (2020-09-30T04:17:48Z)
Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation [97.8552697905657]
A key challenge of this task is how to alleviate the data distribution discrepancy between the source and target domains. We propose Alleviating Semantic-level Shift (ASS), which can successfully promote the distribution consistency from both global and local views. We apply our ASS to two domain adaptation tasks, from GTA5 to Cityscapes and from Synthia to Cityscapes.
arXiv Detail & Related papers (2020-04-02T03:25:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.