Related papers: MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts

URL: http://arxiv.org/abs/2405.01029v2
Date: Mon, 6 May 2024 11:35:57 GMT
Title: MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts
Authors: Jianan Zhou, Zhiguang Cao, Yaoxin Wu, Wen Song, Yining Ma, Jie Zhang, Chi Xu
Abstract summary: We propose a multi-task vehicle routing solver with mixture-of-experts (MVMoE). We develop a hierarchical gating mechanism for the MVMoE, delivering a good trade-off between empirical performance and computational complexity. Experimentally, our method significantly promotes zero-shot generalization performance on 10 unseen VRP variants.
Score: 26.790392171537754
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning to solve vehicle routing problems (VRPs) has garnered much attention. However, most neural solvers are only structured and trained independently on a specific problem, making them less generic and practical. In this paper, we aim to develop a unified neural solver that can cope with a range of VRP variants simultaneously. Specifically, we propose a multi-task vehicle routing solver with mixture-of-experts (MVMoE), which greatly enhances the model capacity without a proportional increase in computation. We further develop a hierarchical gating mechanism for the MVMoE, delivering a good trade-off between empirical performance and computational complexity. Experimentally, our method significantly promotes zero-shot generalization performance on 10 unseen VRP variants, and showcases decent results on the few-shot setting and real-world benchmark instances. We further conduct extensive studies on the effect of MoE configurations in solving VRPs, and observe the superiority of hierarchical gating when facing out-of-distribution data. The source code is available at: https://github.com/RoyalSkye/Routing-MVMoE.

Related papers

LRM-1B: Towards Large Routing Model [26.18687224390521]
Vehicle routing problems (VRPs) are central to optimization with significant practical implications. Recent advancements in neural combinatorial optimization (NCO) have demonstrated promising results by leveraging neural networks to solve VRPs. This study introduces a Large Routing Model with 1 billion parameters (LRM-1B) designed to address diverse VRP scenarios.
arXiv Detail & Related papers (2025-07-04T05:10:20Z)
MTL-KD: Multi-Task Learning Via Knowledge Distillation for Generalizable Neural Vehicle Routing Solver [9.61561012521585]
This work introduces a novel multi-task learning method driven by knowledge distillation (MTL-KD). The proposed MTL-KD method transfers policy knowledge from multiple distinct RL-based single-task models to a single heavy decoder model, facilitating label-free training and effectively improving the model's generalization ability across diverse tasks. Experimental results on 6 seen and 10 unseen VRP variants with up to 1000 nodes indicate that our proposed method consistently achieves superior performance on both uniform and real-world benchmarks.
arXiv Detail & Related papers (2025-06-03T14:35:36Z)
DynMoLE: Boosting Mixture of LoRA Experts Fine-Tuning with a Hybrid Routing Mechanism [5.988126768890861]
DynMoLE is a hybrid routing strategy that dynamically adjusts expert selection based on the Tsallis entropy of the router's probability distribution. Our experiments on commonsense reasoning benchmarks demonstrate that DynMoLE achieves substantial performance improvements.
arXiv Detail & Related papers (2025-04-01T11:14:19Z)
Mixture of Routers [4.248666380057258]
We propose an efficient fine-tuning method called Mixture of Routers (MoR). MoR uses multiple sub-routers for joint selection and uses a learnable main router to determine the weights of the sub-routers. Results show that MoR outperforms baseline models on most tasks, achieving an average performance improvement of 1%.
arXiv Detail & Related papers (2025-03-30T08:39:09Z)
TuneNSearch: a hybrid transfer learning and local search approach for solving vehicle routing problems [43.89334324926175]
TuneNSearch is a hybrid transfer learning and local search approach for addressing different variants of vehicle routing problems (VRPs). We first pre-train a reinforcement learning model on the multi-depot VRP, followed by a short fine-tuning phase to adapt it to different variants. Results show that TuneNSearch outperforms many existing state-of-the-art models trained for each VRP variant, requiring only one-fifth of the training epochs.
arXiv Detail & Related papers (2025-03-16T21:34:11Z)
DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding [76.3876070043663]
We propose DriveLMM-o1, a dataset and benchmark designed to advance step-wise visual reasoning for autonomous driving. Our benchmark features over 18k VQA examples in the training set and more than 4k in the test set, covering diverse questions on perception, prediction, and planning. Our model achieves a +7.49% gain in final answer accuracy, along with a 3.62% improvement in reasoning score over the previous best open-source model.
arXiv Detail & Related papers (2025-03-13T17:59:01Z)
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications. FactorLLM achieves comparable performance to the source model, securing up to 85% model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z)
Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is a challenging task due to the complexity of the loss landscape and the expensive computational cost. We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion. By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
arXiv Detail & Related papers (2024-06-14T07:16:18Z)
Cross-Problem Learning for Solving Vehicle Routing Problems [24.212686893913826]
Existing neural heuristics often train a deep architecture from scratch for each specific vehicle routing problem (VRP). This paper proposes cross-problem learning to assist training for different downstream VRP variants.
arXiv Detail & Related papers (2024-04-17T18:17:50Z)
Learning to Deliver: a Foundation Model for the Montreal Capacitated Vehicle Routing Problem [5.295700401553376]
We present the Foundation Model for the Montreal Capacitated Vehicle Routing Problem (FM-MCVRP). FM-MCVRP is a novel Deep Learning (DL) model that approximates high-quality solutions to a variant of the Capacitated Vehicle Routing Problem (CVRP). We show that FM-MCVRP produces better MCVRP solutions than the training data and generalizes to larger sized problem instances not seen during training.
arXiv Detail & Related papers (2024-02-28T16:02:29Z)
Multi-Task Learning for Routing Problem with Cross-Problem Zero-Shot Generalization [18.298695520665348]
Vehicle routing problems (VRPs) can be found in numerous real-world applications. In this work, we make the first attempt to tackle the crucial challenge of cross-problem generalization. Our proposed model can successfully solve VRPs with unseen attribute combinations in a zero-shot generalization manner.
arXiv Detail & Related papers (2024-02-23T13:25:23Z)
A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs). MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
Towards Omni-generalizable Neural Methods for Vehicle Routing Problems [14.210085924625705]
This paper studies a challenging yet realistic setting, which considers generalization across both size and distribution in VRPs. We propose a generic meta-learning framework, which enables effective training of a model with the capability of fast adaptation to new tasks during inference.
arXiv Detail & Related papers (2023-05-31T06:14:34Z)
Reinforcement Learning for Branch-and-Bound Optimisation using Retrospective Trajectories [72.15369769265398]
Machine learning has emerged as a promising paradigm for branching. We propose retro branching, a simple yet effective approach to RL for branching. We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables.
arXiv Detail & Related papers (2022-05-28T06:08:07Z)
Progressive Multi-stage Interactive Training in Mobile Network for Fine-grained Recognition [8.727216421226814]
We propose a Progressive Multi-Stage Interactive training method with a Recursive Mosaic Generator (RMG-PMSI). First, we propose a Recursive Mosaic Generator (RMG) that generates images with different granularities in different phases. Then, the features of different stages pass through a Multi-Stage Interaction (MSI) module, which strengthens and complements the corresponding features of different stages. Experiments on three prestigious fine-grained benchmarks show that RMG-PMSI can significantly improve the performance with good robustness and transferability.
arXiv Detail & Related papers (2021-12-08T10:50:03Z)
Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems. Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC. We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks. We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.