RouteFinder: Towards Foundation Models for Vehicle Routing Problems
- URL: http://arxiv.org/abs/2406.15007v2
- Date: Wed, 09 Oct 2024 14:23:24 GMT
- Title: RouteFinder: Towards Foundation Models for Vehicle Routing Problems
- Authors: Federico Berto, Chuanbo Hua, Nayeli Gast Zepeda, André Hottung, Niels Wouda, Leon Lan, Junyoung Park, Kevin Tierney, Jinkyoo Park
- Abstract summary: RouteFinder is a framework to tackle different Vehicle Routing Problem (VRP) variants.
Our core idea is that a foundation model for VRPs should be able to represent variants by treating each as a subset of a generalized problem equipped with different attributes.
- Score: 21.3310292139361
- License:
- Abstract: This paper introduces RouteFinder, a comprehensive foundation model framework to tackle different Vehicle Routing Problem (VRP) variants. Our core idea is that a foundation model for VRPs should be able to represent variants by treating each as a subset of a generalized problem equipped with different attributes. We propose a unified VRP environment capable of efficiently handling any attribute combination. The RouteFinder model leverages a modern transformer-based encoder and global attribute embeddings to improve task representation. Additionally, we introduce two reinforcement learning techniques to enhance multi-task performance: mixed batch training, which enables training on different variants at once, and multi-variant reward normalization to balance different reward scales. Finally, we propose efficient adapter layers that enable fine-tuning for new variants with unseen attributes. Extensive experiments on 24 VRP variants show RouteFinder achieves competitive results. Our code is openly available at https://github.com/ai4co/routefinder.
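To make the abstract's core idea concrete, the following is a minimal, hypothetical sketch (not the actual RouteFinder code from the linked repository) of how VRP variants can be treated as attribute subsets of one generalized problem, combined with mixed batch sampling and per-variant reward normalization as described above. All class, attribute, and function names here are illustrative assumptions.

```python
# Illustrative sketch only: VRP variants as attribute subsets of a generalized
# problem, mixed-batch sampling across variants, and per-variant reward
# normalization for multi-task RL. Names are hypothetical, not RouteFinder's API.
from dataclasses import dataclass
import random

# Global attribute set of the generalized VRP; each variant enables a subset.
ATTRIBUTES = ("capacity", "open_route", "backhaul", "duration_limit", "time_window")


@dataclass(frozen=True)
class VRPVariant:
    name: str
    active: frozenset  # subset of ATTRIBUTES used by this variant

    def attribute_mask(self):
        """0/1 mask over the global attribute set; inactive attributes are
        zeroed so a single model can embed every variant uniformly."""
        return [1.0 if a in self.active else 0.0 for a in ATTRIBUTES]


# A few example variants expressed as attribute combinations.
CVRP = VRPVariant("CVRP", frozenset({"capacity"}))
VRPTW = VRPVariant("VRPTW", frozenset({"capacity", "time_window"}))
OVRPB = VRPVariant("OVRPB", frozenset({"capacity", "open_route", "backhaul"}))
VARIANTS = [CVRP, VRPTW, OVRPB]


def sample_mixed_batch(batch_size, rng=random):
    """Mixed batch training: instances in one batch may come from different
    variants, so a single gradient step sees several tasks at once."""
    return [rng.choice(VARIANTS) for _ in range(batch_size)]


class RewardNormalizer:
    """Multi-variant reward normalization: keep running statistics per variant
    so tasks with very different reward scales contribute comparably."""

    def __init__(self, alpha=0.01, eps=1e-6):
        self.alpha, self.eps = alpha, eps
        self.mean, self.var = {}, {}

    def normalize(self, variant, reward):
        m = self.mean.setdefault(variant.name, reward)
        v = self.var.setdefault(variant.name, 1.0)
        m = (1 - self.alpha) * m + self.alpha * reward
        v = (1 - self.alpha) * v + self.alpha * (reward - m) ** 2
        self.mean[variant.name], self.var[variant.name] = m, v
        return (reward - m) / (v ** 0.5 + self.eps)


if __name__ == "__main__":
    normalizer = RewardNormalizer()
    for variant in sample_mixed_batch(batch_size=4):
        raw_reward = -random.uniform(5, 50)  # e.g. negative tour length
        print(variant.name, variant.attribute_mask(),
              round(normalizer.normalize(variant, raw_reward), 3))
```

In this reading, fine-tuning to a new variant with unseen attributes would amount to extending the attribute set and training lightweight adapter layers on top of the shared encoder, rather than retraining the whole model.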
Related papers
- ODRL: A Benchmark for Off-Dynamics Reinforcement Learning [59.72217833812439]
We introduce ODRL, the first benchmark tailored for evaluating off-dynamics RL methods.
ODRL contains four experimental settings where the source and target domains can be either online or offline.
We conduct extensive benchmarking experiments, which show that no method has universal advantages across varied dynamics shifts.
arXiv Detail & Related papers (2024-10-28T05:29:38Z)
- Prompt Learning for Generalized Vehicle Routing [17.424910810870273]
This work investigates an efficient prompt learning approach in neural combinatorial optimization for cross-distribution adaptation.
The proposed model learns a set of prompts among various distributions and then selects the best-matched one to prompt a pre-trained attention model for each problem instance.
It also outperforms existing generalized models on both in-distribution prediction and zero-shot generalization to a diverse set of new tasks.
arXiv Detail & Related papers (2024-05-20T15:42:23Z)
- MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts [26.790392171537754]
We propose a multi-task vehicle routing solver with mixture-of-experts (MVMoE).
We develop a hierarchical gating mechanism for the MVMoE, delivering a good trade-off between empirical performance and computational complexity.
Experimentally, our method significantly promotes zero-shot generalization performance on 10 unseen VRP variants.
arXiv Detail & Related papers (2024-05-02T06:02:07Z)
- Cross-Problem Learning for Solving Vehicle Routing Problems [24.212686893913826]
Existing neural methods often train a deep architecture from scratch for each specific vehicle routing problem (VRP).
This paper proposes cross-problem learning to assist training for different downstream VRP variants.
arXiv Detail & Related papers (2024-04-17T18:17:50Z)
- Bi-directional Adapter for Multi-modal Tracking [67.01179868400229]
We propose a novel multi-modal visual prompt tracking model based on a universal bi-directional adapter.
We develop a simple but effective light feature adapter to transfer modality-specific information from one modality to another.
Our model achieves superior tracking performance in comparison with both the full fine-tuning methods and the prompt learning-based methods.
arXiv Detail & Related papers (2023-12-17T05:27:31Z)
- Towards Omni-generalizable Neural Methods for Vehicle Routing Problems [14.210085924625705]
This paper studies a challenging yet realistic setting, which considers generalization across both size and distribution in VRPs.
We propose a generic meta-learning framework, which enables effective training of a model capable of fast adaptation to new tasks during inference.
arXiv Detail & Related papers (2023-05-31T06:14:34Z)
- Multi-Head Adapter Routing for Cross-Task Generalization [56.75667096355806]
Polytropon learns an inventory of adapters and a routing function that selects a subset of adapters for each task during both pre-training and few-shot adaptation.
We find that routing is most beneficial during multi-task pre-training rather than during few-shot adaptation.
arXiv Detail & Related papers (2022-11-07T19:35:55Z)
- StableMoE: Stable Routing Strategy for Mixture of Experts [109.0602120199226]
The Mixture-of-Experts (MoE) technique can scale up the model size of Transformers with an affordable computational overhead.
We propose StableMoE with two training stages to address the routing fluctuation problem.
Results show that StableMoE outperforms existing MoE methods in terms of both convergence speed and performance.
arXiv Detail & Related papers (2022-04-18T16:48:19Z)
- MutualNet: Adaptive ConvNet via Mutual Learning from Different Model Configurations [51.85020143716815]
We propose MutualNet to train a single network that can run at a diverse set of resource constraints.
Our method trains a cohort of model configurations with various network widths and input resolutions.
MutualNet is a general training methodology that can be applied to various network structures.
arXiv Detail & Related papers (2021-05-14T22:30:13Z)
- Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search [60.965024145243596]
One-shot weight sharing methods have recently drawn great attention in neural architecture search due to their high efficiency and competitive performance.
To alleviate this problem, we present a simple yet effective architecture distillation method.
We introduce the concept of prioritized path, which refers to the architecture candidates exhibiting superior performance during training.
Since the prioritized paths are changed on the fly depending on their performance and complexity, the final obtained paths are the cream of the crop.
arXiv Detail & Related papers (2020-10-29T17:55:05Z)