MME: Mixture of Mesh Experts with Random Walk Transformer Gating
- URL: http://arxiv.org/abs/2603.00828v1
- Date: Sat, 28 Feb 2026 22:13:00 GMT
- Title: MME: Mixture of Mesh Experts with Random Walk Transformer Gating
- Authors: Amir Belder, Ayellet Tal
- Abstract summary: We present a novel Mixture of Experts (MoE) framework designed to harness the complementary strengths of diverse approaches. We propose a new gate architecture that encourages each expert to specialise in the classes it excels in. Our framework achieves state-of-the-art results in mesh classification, retrieval, and semantic segmentation tasks.
- Score: 13.564417897372875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, various methods have been proposed for mesh analysis, each offering distinct advantages and often excelling on different object classes. We present a novel Mixture of Experts (MoE) framework designed to harness the complementary strengths of these diverse approaches. We propose a new gate architecture that encourages each expert to specialise in the classes it excels in. Our design is guided by two key ideas: (1) random walks over the mesh surface effectively capture the regions that individual experts attend to, and (2) an attention mechanism that enables the gate to focus on the areas most informative for each expert's decision-making. To further enhance performance, we introduce a dynamic loss balancing scheme that adjusts a trade-off between diversity and similarity losses throughout the training, where diversity prompts expert specialization, and similarity enables knowledge sharing among the experts. Our framework achieves state-of-the-art results in mesh classification, retrieval, and semantic segmentation tasks. Our code is available at: https://github.com/amirbelder/MME-Mixture-of-Mesh-Experts.
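The abstract names two concrete mechanisms: a gate that scores experts from random walks over the mesh surface using attention, and a dynamic trade-off between diversity and similarity losses during training. The following is a minimal PyTorch sketch of both ideas, not the authors' implementation; the module names, feature dimensions, pooling, loss definitions, and linear schedule are illustrative assumptions (the actual method is in the repository linked above).

```python
# Minimal sketch (illustrative, NOT the authors' code) of a random-walk
# transformer gate and a dynamic diversity/similarity loss trade-off.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RandomWalkGate(nn.Module):
    """Scores experts from random-walk token sequences via self-attention."""

    def __init__(self, walk_feat_dim: int, num_experts: int, d_model: int = 128):
        super().__init__()
        self.embed = nn.Linear(walk_feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.to_logits = nn.Linear(d_model, num_experts)

    def forward(self, walks: torch.Tensor) -> torch.Tensor:
        # walks: (batch, walk_length, walk_feat_dim), per-step features sampled
        # along random walks on the mesh surface (assumed preprocessing).
        tokens = self.encoder(self.embed(walks))           # attend over walk steps
        pooled = tokens.mean(dim=1)                        # summarize each walk
        return F.softmax(self.to_logits(pooled), dim=-1)   # per-expert weights


def mixture_prediction(expert_logits: torch.Tensor, gate_weights: torch.Tensor) -> torch.Tensor:
    # expert_logits: (batch, num_experts, num_classes); gate_weights: (batch, num_experts)
    return torch.einsum("bec,be->bc", expert_logits, gate_weights)


def dynamic_diversity_similarity_loss(expert_logits: torch.Tensor,
                                      epoch: int, total_epochs: int) -> torch.Tensor:
    """Hypothetical schedule that shifts weight between the two losses over training."""
    probs = F.softmax(expert_logits, dim=-1)               # (batch, experts, classes)
    # Diversity term: mean pairwise cosine similarity between expert outputs;
    # minimizing it pushes experts toward different predictions (specialization).
    normed = F.normalize(probs, dim=-1)
    pairwise = torch.einsum("bec,bfc->bef", normed, normed)
    eye = torch.eye(probs.shape[1], device=probs.device)
    diversity_loss = (pairwise * (1.0 - eye)).mean()
    # Similarity term: distance of each expert to the ensemble mean;
    # minimizing it encourages knowledge sharing among experts.
    mean_probs = probs.mean(dim=1, keepdim=True)
    similarity_loss = F.mse_loss(probs, mean_probs.expand_as(probs))
    # Illustrative linear schedule; the paper's actual balancing rule may differ.
    alpha = epoch / max(total_epochs - 1, 1)
    return alpha * diversity_loss + (1.0 - alpha) * similarity_loss
```

A typical training step under these assumptions would sample walks per mesh, compute gate weights with RandomWalkGate, run each expert, combine their logits with mixture_prediction, and add the scheduled loss to the task loss.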
Related papers
- pMoE: Prompting Diverse Experts Together Wins More in Visual Adaptation [68.3777121585281]
We propose a novel Mixture-of-Experts prompt tuning method called pMoE. The proposed pMoE significantly enhances the model's versatility and applicability to a broad spectrum of tasks. We conduct extensive experiments across 47 adaptation tasks, including both classification and segmentation in general and medical domains.
arXiv Detail & Related papers (2026-02-26T12:27:06Z)
- Training Diverse Graph Experts for Ensembles: A Systematic Empirical Study [15.65200571307458]
We present the first systematic empirical study of expert-level diversification techniques for GNN ensembles. We evaluate 20 diversification strategies across 14 node classification benchmarks. Our comprehensive evaluation examines each technique in terms of expert diversity, complementarity, and ensemble performance.
arXiv Detail & Related papers (2025-10-21T07:40:51Z)
- Generalized Fine-Grained Category Discovery with Multi-Granularity Conceptual Experts [81.68203255687051]
Generalized Category Discovery is an open-world problem that clusters unlabeled data by leveraging knowledge from partially labeled categories. Existing approaches fail to exploit multi-granularity conceptual information in visual data. We propose a Multi-Granularity Experts framework that integrates multi-granularity knowledge for accurate category discovery.
arXiv Detail & Related papers (2025-09-30T13:25:11Z)
- Multi-Task Dense Prediction Fine-Tuning with Mixture of Fine-Grained Experts [22.936728143586443]
Multi-task learning (MTL) for dense prediction has shown promising results but still faces challenges in balancing shared representations with task-specific specialization. We introduce a novel Fine-Grained Mixture of Experts architecture that explores MoE-based MTL models through a combination of three key innovations and fine-tuning.
arXiv Detail & Related papers (2025-07-25T08:59:30Z)
- Advancing Expert Specialization for Better MoE [22.88847592702946]
Mixture-of-Experts (MoE) models enable efficient scaling of large language models (LLMs) by activating only a subset of experts per input. We observe that the commonly used auxiliary load balancing loss often leads to expert overlap and overly uniform routing. We propose a simple yet effective solution that introduces two complementary objectives.
arXiv Detail & Related papers (2025-05-28T13:09:47Z)
- On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating [75.29576838162714]
DeepSeekMoE stands out because of two unique features: the deployment of a shared expert strategy and the normalized sigmoid gating mechanism. We perform a convergence analysis of the expert estimation task to highlight the gains in sample efficiency for both the shared expert strategy and the normalized sigmoid gating.
arXiv Detail & Related papers (2025-05-16T04:58:18Z)
- LFME: A Simple Framework for Learning from Multiple Experts in Domain Generalization [61.16890890570814]
Domain generalization (DG) methods aim to maintain good performance in an unseen target domain by using training data from multiple source domains.
This work introduces a simple yet effective framework, dubbed learning from multiple experts (LFME), which aims to make the target model an expert in all source domains to improve DG.
arXiv Detail & Related papers (2024-10-22T13:44:10Z)
- MEDOE: A Multi-Expert Decoder and Output Ensemble Framework for Long-tailed Semantic Segmentation [36.03023287593103]
Long-tailed distribution of semantic categories causes unsatisfactory performance in semantic segmentation on tail categories.
We propose MEDOE, a novel framework for long-tailed semantic segmentation via contextual information ensemble-and-grouping.
Experimental results show that the proposed framework outperforms the current methods on both Cityscapes and ADE20K datasets.
arXiv Detail & Related papers (2023-08-16T08:30:44Z)
- Unpaired Multi-modal Segmentation via Knowledge Distillation [77.39798870702174]
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
arXiv Detail & Related papers (2020-01-06T20:03:17Z)
- Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification [106.08067870620218]
We propose a self-paced knowledge distillation framework, termed Learning From Multiple Experts (LFME).
We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model.
We conduct extensive experiments and demonstrate that our method is able to achieve superior performances compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-06T12:57:36Z)