Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts
- URL: http://arxiv.org/abs/2505.23926v1
- Date: Thu, 29 May 2025 18:21:47 GMT
- Title: Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts
- Authors: Xuweiyi Chen, Wentao Zhou, Aruni RoyChowdhury, Zezhou Cheng,
- Abstract summary: We propose Point-MoE, a Mixture-of-Experts architecture designed to enable cross-domain generalization in 3D perception. Standard point cloud backbones degrade significantly in performance when trained on mixed-domain data. Point-MoE with a simple top-k routing strategy can automatically specialize experts, even without access to domain labels.
- Score: 7.787211625411271
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While scaling laws have transformed natural language processing and computer vision, 3D point cloud understanding has yet to reach that stage. This can be attributed to both the comparatively smaller scale of 3D datasets, as well as the disparate sources of the data itself. Point clouds are captured by diverse sensors (e.g., depth cameras, LiDAR) across varied domains (e.g., indoor, outdoor), each introducing unique scanning patterns, sampling densities, and semantic biases. Such domain heterogeneity poses a major barrier towards training unified models at scale, especially under the realistic constraint that domain labels are typically inaccessible at inference time. In this work, we propose Point-MoE, a Mixture-of-Experts architecture designed to enable large-scale, cross-domain generalization in 3D perception. We show that standard point cloud backbones degrade significantly in performance when trained on mixed-domain data, whereas Point-MoE with a simple top-k routing strategy can automatically specialize experts, even without access to domain labels. Our experiments demonstrate that Point-MoE not only outperforms strong multi-domain baselines but also generalizes better to unseen domains. This work highlights a scalable path forward for 3D understanding: letting the model discover structure in diverse 3D data, rather than imposing it via manual curation or domain supervision.
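The abstract's core mechanism, top-k expert routing without domain labels, can be illustrated with a minimal sketch. This is not the paper's implementation: the gating layer, per-expert maps, and dimensions below are hypothetical stand-ins, and a real Point-MoE layer would operate on learned point features inside a point cloud backbone.

```python
import math
import random

random.seed(0)

DIM, NUM_EXPERTS, TOP_K = 8, 4, 2

# Hypothetical parameters: a linear gating layer and simple linear "experts"
# stand in for the learned networks in an actual MoE layer.
gate_w = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
expert_w = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
            for _ in range(NUM_EXPERTS)]

def matvec(w, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    s = sum(exps)
    return [e / s for e in exps]

def top_k_moe(point_feature):
    # 1. Gating: score every expert for this point feature.
    scores = matvec(gate_w, point_feature)
    # 2. Keep only the top-k experts; note no domain label is consulted,
    #    so specialization must emerge from the data itself.
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    # 3. Renormalize the selected scores into mixture weights.
    weights = softmax([scores[i] for i in top])
    # 4. Output is the weighted sum of the chosen experts' outputs.
    out = [0.0] * DIM
    for w, i in zip(weights, top):
        for d, v in enumerate(matvec(expert_w[i], point_feature)):
            out[d] += w * v
    return out, top

feature = [random.gauss(0, 1) for _ in range(DIM)]
output, chosen = top_k_moe(feature)
```

With this routing, points from different sensors or scenes can land on different expert subsets, which is the behavior the paper reports emerging automatically during mixed-domain training.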
Related papers
- DG-MVP: 3D Domain Generalization via Multiple Views of Point Clouds for Classification [10.744510913722817]
Deep neural networks have achieved significant success in 3D point cloud classification. In this paper, we focus on the 3D point cloud domain generalization problem and propose a novel method that generalizes to unseen point cloud domains.
arXiv Detail & Related papers (2025-04-16T19:43:32Z) - One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection [71.78795573911512]
We propose OneDet3D, a universal one-for-all model that addresses 3D detection across different domains.
We propose domain-aware partitioning in scatter and context, guided by a routing mechanism, to address the data interference issue.
The fully sparse structure and anchor-free head further accommodate point clouds with significant scale disparities.
arXiv Detail & Related papers (2024-11-03T14:21:56Z) - Point Cloud Mixture-of-Domain-Experts Model for 3D Self-supervised Learning [50.55005524072687]
Point clouds, as a primary representation of 3D data, can be categorized into scene domain point clouds and object domain point clouds. In this paper, we propose to learn a comprehensive Point cloud Mixture-of-Domain-Experts model (Point-MoDE) via a block-to-scene pre-training strategy.
arXiv Detail & Related papers (2024-10-13T15:51:20Z) - Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from well-trained transformers on massive images.
Experiments on the PointDA-10 and Sim-to-Real datasets verify that the proposed method consistently achieves state-of-the-art performance in unsupervised domain adaptation (UDA) for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z) - View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields [52.08335264414515]
We learn a novel feature field within a Neural Radiance Field (NeRF) representing a 3D scene.
Our method takes view-inconsistent multi-granularity 2D segmentations as input and produces a hierarchy of 3D-consistent segmentations as output.
We evaluate our method and several baselines on synthetic datasets with multi-view images and multi-granular segmentation, showcasing improved accuracy and viewpoint-consistency.
arXiv Detail & Related papers (2024-05-30T04:14:58Z) - SSDA3D: Semi-supervised Domain Adaptation for 3D Object Detection from Point Cloud [125.9472454212909]
We present a novel Semi-Supervised Domain Adaptation method for 3D object detection (SSDA3D).
SSDA3D includes an Inter-domain Adaptation stage and an Intra-domain Generalization stage.
Experiments show that, with only 10% of labeled target data, our SSDA3D can surpass the fully-supervised oracle model trained with 100% of target labels.
arXiv Detail & Related papers (2022-12-06T09:32:44Z) - MetaSets: Meta-Learning on Point Sets for Generalizable Representations [100.5981809166658]
We study a new problem of 3D Domain Generalization (3DDG), with the goal of generalizing the model to unseen domains of point clouds without access to them during training.
We propose to tackle this problem via MetaSets, which meta-learns point cloud representations from a group of classification tasks on carefully-designed transformed point sets.
We design two benchmarks for Sim-to-Real transfer of 3D point clouds. Experimental results show that MetaSets outperforms existing 3D deep learning methods by large margins.
arXiv Detail & Related papers (2022-04-15T03:24:39Z) - Domain Adaptation for Real-World Single View 3D Reconstruction [1.611271868398988]
Unsupervised domain adaptation can be used to transfer knowledge from the labeled synthetic source domain to the unlabeled real target domain.
We propose a novel architecture which takes advantage of the fact that in this setting, target domain data is unsupervised with regards to the 3D model but supervised for class labels.
Experiments use ShapeNet as the source domain and domains within the Object Domain Suite (ODDS) dataset as the target.
arXiv Detail & Related papers (2021-08-24T22:02:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.