Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture
- URL: http://arxiv.org/abs/2406.06652v2
- Date: Mon, 17 Jun 2024 14:02:57 GMT
- Title: Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture
- Authors: Yubin Xiao, Di Wang, Xuan Wu, Yuesong Wu, Boyang Li, Wei Du, Liupu Wang, You Zhou,
- Abstract summary: We propose a plug-and-play Entropy-based Scaling Factor (ESF) and a Distribution-Specific (DS) decoder.
ESF adjusts the attention weight pattern of the model towards familiar ones discovered during training when solving VRPs of varying sizes.
DS decoder explicitly models VRPs of multiple training distribution patterns through multiple auxiliary light decoders, expanding the model representation space.
- Score: 9.244633039170186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural models produce promising results when solving Vehicle Routing Problems (VRPs), but often fall short in generalization. Recent attempts to enhance model generalization often incur unnecessarily large training cost or cannot be directly applied to other models solving different VRP variants. To address these issues, we take a novel perspective on model architecture in this study. Specifically, we propose a plug-and-play Entropy-based Scaling Factor (ESF) and a Distribution-Specific (DS) decoder to enhance the size and distribution generalization, respectively. ESF adjusts the attention weight pattern of the model towards familiar ones discovered during training when solving VRPs of varying sizes. The DS decoder explicitly models VRPs of multiple training distribution patterns through multiple auxiliary light decoders, expanding the model representation space to encompass a broader range of distributional scenarios. We conduct extensive experiments on both synthetic and widely recognized real-world benchmarking datasets and compare the performance with seven baseline models. The results demonstrate the effectiveness of using ESF and DS decoder to obtain a more generalizable model and showcase their applicability to solve different VRP variants, i.e., travelling salesman problem and capacitated VRP. Notably, our proposed generic components require minimal computational resources, and can be effortlessly integrated into conventional generalization strategies to further elevate model generalization.
Related papers
- DiffSG: A Generative Solver for Network Optimization with Diffusion Model [75.27274046562806]
Diffusion generative models can consider a broader range of solutions and exhibit stronger generalization by learning parameters.
We propose a new framework, which leverages intrinsic distribution learning of diffusion generative models to learn high-quality solutions.
arXiv Detail & Related papers (2024-08-13T07:56:21Z) - Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z) - Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding [90.77521413857448]
Deep generative models are anchored in three core capabilities -- generating new instances, reconstructing inputs, and learning compact representations.
We introduce Generalized generative adversarial-Decoding Diffusion Probabilistic Models (EDDPMs)
EDDPMs generalize the Gaussian noising-denoising in standard diffusion by introducing parameterized encoding-decoding.
Experiments on text, proteins, and images demonstrate the flexibility to handle diverse data and tasks.
arXiv Detail & Related papers (2024-02-29T10:08:57Z) - Multi-Task Learning for Routing Problem with Cross-Problem Zero-Shot Generalization [18.298695520665348]
Vehicle routing problems (VRPs) can be found in numerous real-world applications.
In this work, we make the first attempt to tackle the crucial challenge of cross-problem generalization.
Our proposed model can successfully solve VRPs with unseen attribute combinations in a zero-shot generalization manner.
arXiv Detail & Related papers (2024-02-23T13:25:23Z) - SimSCOOD: Systematic Analysis of Out-of-Distribution Generalization in
Fine-tuned Source Code Models [58.78043959556283]
We study the behaviors of models under different fine-tuning methodologies, including full fine-tuning and Low-Rank Adaptation (LoRA) fine-tuning methods.
Our analysis uncovers that LoRA fine-tuning consistently exhibits significantly better OOD generalization performance than full fine-tuning across various scenarios.
arXiv Detail & Related papers (2022-10-10T16:07:24Z) - A General Framework for Sample-Efficient Function Approximation in
Reinforcement Learning [132.45959478064736]
We propose a general framework that unifies model-based and model-free reinforcement learning.
We propose a novel estimation function with decomposable structural properties for optimization-based exploration.
Under our framework, a new sample-efficient algorithm namely OPtimization-based ExploRation with Approximation (OPERA) is proposed.
arXiv Detail & Related papers (2022-09-30T17:59:16Z) - Attention, Filling in The Gaps for Generalization in Routing Problems [5.210197476419621]
This paper aims at encouraging the consolidation of the field through understanding and improving current existing models.
We first target model discrepancies by adapting the Kool et al. method and its loss function for Sparse Dynamic Attention.
We then target inherent differences through the use of a mixed instance training method that has been shown to outperform single instance training in certain scenarios.
arXiv Detail & Related papers (2022-07-14T21:36:51Z) - Bottlenecks CLUB: Unifying Information-Theoretic Trade-offs Among
Complexity, Leakage, and Utility [8.782250973555026]
Bottleneck problems are an important class of optimization problems that have recently gained increasing attention in the domain of machine learning and information theory.
We propose a general family of optimization problems, termed as complexity-leakage-utility bottleneck (CLUB) model.
We show that the CLUB model generalizes all these problems as well as most other information-theoretic privacy models.
arXiv Detail & Related papers (2022-07-11T14:07:48Z) - On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery has proposed to factorize the data generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z) - Learning to Solve Routing Problems via Distributionally Robust
Optimization [14.506553345693536]
Recent deep models for solving routing problems assume a single distribution of nodes for training, which severely impairs their cross-distribution generalization ability.
We exploit group distributionally robust optimization (group DRO) to tackle this issue, where we jointly optimize the weights for different groups of distributions and the parameters for the deep model in an interleaved manner during training.
We also design a module based on convolutional neural network, which allows the deep model to learn more informative latent pattern among the nodes.
arXiv Detail & Related papers (2022-02-15T08:06:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.