Distance-aware Attention Reshaping: Enhance Generalization of Neural
Solver for Large-scale Vehicle Routing Problems
- URL: http://arxiv.org/abs/2401.06979v1
- Date: Sat, 13 Jan 2024 05:01:14 GMT
- Title: Distance-aware Attention Reshaping: Enhance Generalization of Neural
Solver for Large-scale Vehicle Routing Problems
- Authors: Yang Wang and Ya-Hui Jia and Wei-Neng Chen and Yi Mei
- Abstract summary: We propose a distance-aware attention reshaping method, assisting neural solvers in solving large-scale vehicle routing problems.
We utilize the Euclidean distance information between current nodes to adjust attention scores.
Experimental results show that the proposed method significantly outperforms existing state-of-the-art neural solvers on the large-scale CVRPLib dataset.
- Score: 5.190244678604757
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural solvers based on attention mechanism have demonstrated remarkable
effectiveness in solving vehicle routing problems. However, when generalizing
from small-scale to large-scale instances, we observe that the attention scores
of existing neural solvers become dispersed, which leads to poor performance.
To address this issue, this paper proposes a distance-aware
attention reshaping method, assisting neural solvers in solving large-scale
vehicle routing problems. Specifically, without the need for additional
training, we utilize the Euclidean distance information between current nodes
to adjust attention scores. This enables a neural solver trained on small-scale
instances to make rational choices when solving a large-scale problem.
Experimental results show that the proposed method significantly outperforms
existing state-of-the-art neural solvers on the large-scale CVRPLib dataset.
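The abstract describes adjusting attention scores with Euclidean distance at inference time, without retraining. A minimal sketch of that idea follows; the penalty form `scores - alpha * dist` and the weight `alpha` are illustrative assumptions, not the paper's exact reshaping rule:

```python
import numpy as np

def reshape_attention(scores, coords, current, alpha=1.0):
    """Distance-aware reshaping sketch: penalize attention logits by the
    Euclidean distance from the current node, then renormalize.

    scores : (n,) raw attention logits over candidate nodes
    coords : (n, 2) node coordinates
    current: index of the node the solver currently occupies
    alpha  : hypothetical scaling weight for the distance penalty
    """
    dist = np.linalg.norm(coords - coords[current], axis=1)
    adjusted = scores - alpha * dist   # nearer nodes keep more attention mass
    adjusted[current] = -np.inf        # mask out revisiting the current node
    exp = np.exp(adjusted - adjusted.max())   # stable softmax
    return exp / exp.sum()
```

With equal raw logits, the reshaped distribution concentrates on nearby nodes, which is the behavior the abstract attributes to the method on large instances.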
Related papers
- Neural Networks for Vehicle Routing Problem [0.0]
Route optimization can be viewed as a new challenge for neural networks.
Recent developments in machine learning provide a new toolset for tackling complex problems.
Neural networks have mainly been applied to classification and regression tasks.
arXiv Detail & Related papers (2024-09-17T15:45:30Z) - Deep Reinforcement Learning for Picker Routing Problem in Warehousing [0.6562256987706128]
We introduce an attention-based neural network for modeling picker tours, which is trained using reinforcement learning.
A key advantage of our proposed method is its ability to offer an option to reduce the perceived complexity of routes.
arXiv Detail & Related papers (2024-02-05T21:25:45Z) - Sparse Multitask Learning for Efficient Neural Representation of Motor
Imagery and Execution [30.186917337606477]
We introduce a sparse multitask learning framework for motor imagery (MI) and motor execution (ME) tasks.
Given a dual-task CNN model for MI-ME classification, we apply a saliency-based sparsification approach to prune superfluous connections.
Our results indicate that this tailored sparsity can mitigate overfitting and improve test performance with a small amount of data.
arXiv Detail & Related papers (2023-12-10T09:06:16Z) - Adaptive recurrent vision performs zero-shot computation scaling to
unseen difficulty levels [6.053394076324473]
We investigate whether adaptive computation can also enable vision models to extrapolate solutions beyond their training distribution's difficulty level.
We combine convolutional recurrent neural networks (ConvRNNs) with a learnable halting mechanism based on Graves (2016), evaluated on the PathFinder and Mazes visual reasoning tasks.
We show that 1) AdRNNs learn to dynamically halt processing early (or late) to solve easier (or harder) problems, and 2) these RNNs zero-shot generalize to more difficult problem settings not shown during training by dynamically increasing the number of recurrent iterations at test time.
arXiv Detail & Related papers (2023-11-12T21:07:04Z) - Genetic Algorithms with Neural Cost Predictor for Solving Hierarchical Vehicle Routing Problems [20.684353068460375]
When vehicle routing decisions are intertwined with higher-level decisions, the resulting optimization problems pose significant challenges for computation.
We propose a novel deep-learning-based approach called Genetic Algorithm with Neural Cost Predictor (GANCP) to tackle the challenge.
In particular, our proposed neural network learns the objective values of the HGS-CVRP open-source package that solves capacitated vehicle routing problems.
arXiv Detail & Related papers (2023-10-22T02:46:37Z) - Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z) - Zonotope Domains for Lagrangian Neural Network Verification [102.13346781220383]
We decompose the problem of verifying a deep neural network into the verification of many 2-layer neural networks.
Our technique yields bounds that improve upon both linear programming and Lagrangian-based verification techniques.
arXiv Detail & Related papers (2022-10-14T19:31:39Z) - DeepSplit: Scalable Verification of Deep Neural Networks via Operator
Splitting [70.62923754433461]
Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem.
We propose a novel method that can directly solve a convex relaxation of the problem to high accuracy, by splitting it into smaller subproblems that often have analytical solutions.
arXiv Detail & Related papers (2021-06-16T20:43:49Z) - Unlocking Pixels for Reinforcement Learning via Implicit Attention [61.666538764049854]
We make use of new efficient attention algorithms, recently shown to be highly effective for Transformers.
This allows our attention-based controllers to scale to larger visual inputs, and facilitate the use of smaller patches.
In addition, we propose a new efficient algorithm approximating softmax attention with what we call hybrid random features.
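The summary above mentions approximating softmax attention with random features. As a rough illustration only (a Performer-style positive-random-feature estimator, not the paper's exact "hybrid random features" construction), the linear-time trick can be sketched as:

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Exact softmax attention, for comparison. O(n^2) in sequence length."""
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def random_feature_attention(Q, K, V, m=4096, seed=0):
    """Approximate softmax attention with positive random features:
    exp(q.k) ~= E[phi(q).phi(k)] for phi(x) = exp(Wx - |x|^2 / 2) / sqrt(m),
    W ~ N(0, I). Avoids materializing the n x n attention matrix."""
    d = Q.shape[-1]
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((m, d))
    scale = d ** -0.25   # fold the 1/sqrt(d) temperature into q and k

    def phi(X):
        Xs = X * scale
        return np.exp(Xs @ W.T - 0.5 * (Xs ** 2).sum(-1, keepdims=True)) / np.sqrt(m)

    Qf, Kf = phi(Q), phi(K)
    num = Qf @ (Kf.T @ V)          # (n_q, d_v) in linear time
    den = Qf @ Kf.sum(axis=0)      # per-query normalizer
    return num / den[:, None]
```

Because the features are strictly positive, the estimated attention weights stay non-negative, and accuracy improves as the feature count `m` grows.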
arXiv Detail & Related papers (2021-02-08T17:00:26Z) - Untangling tradeoffs between recurrence and self-attention in neural
networks [81.30894993852813]
We present a formal analysis of how self-attention affects gradient propagation in recurrent networks.
We prove that it mitigates the problem of vanishing gradients when trying to capture long-term dependencies.
We propose a relevancy screening mechanism that allows for a scalable use of sparse self-attention with recurrence.
arXiv Detail & Related papers (2020-06-16T19:24:25Z) - Beyond Dropout: Feature Map Distortion to Regularize Deep Neural
Networks [107.77595511218429]
In this paper, we investigate the empirical Rademacher complexity related to intermediate layers of deep neural networks.
We propose a feature distortion method (Disout) for addressing the aforementioned problem.
The superiority of the proposed feature map distortion for producing deep neural network with higher testing performance is analyzed and demonstrated.
arXiv Detail & Related papers (2020-02-23T13:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.