On Distributional Dependent Performance of Classical and Neural Routing Solvers
- URL: http://arxiv.org/abs/2508.02510v1
- Date: Mon, 04 Aug 2025 15:17:08 GMT
- Title: On Distributional Dependent Performance of Classical and Neural Routing Solvers
- Authors: Daniela Thyssens, Tim Dernedde, Wilson Sentanoe, Lars Schmidt-Thieme
- Abstract summary: Neural Combinatorial Optimization (NCO) aims to learn to solve a class of problems by learning the underlying distribution of problem instances. This work explores a novel approach to formulating the distribution of problem instances to learn from and, more importantly, to planting a structure in the sampled problem instances. We evaluate representative NCO methods and specialized Operations Research meta-heuristics on this novel task and demonstrate that the performance gap between neural routing solvers and highly specialized meta-heuristics decreases when learning from sub-samples drawn from a fixed base node distribution.
- Score: 5.359176539960004
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Combinatorial Optimization aims to learn to solve a class of combinatorial problems through data-driven methods, notably by employing neural networks that learn the underlying distribution of problem instances. While neural methods have so far struggled to outperform highly engineered, problem-specific meta-heuristics, this work explores a novel approach to formulating the distribution of problem instances to learn from and, more importantly, to planting a structure in the sampled problem instances. Applied to routing problems, we generate large problem instances that represent custom base problem-instance distributions from which training instances are sampled. The test instances used to evaluate the methods on the routing task consist of unseen problems sampled from the same underlying large problem instance. We evaluate representative NCO methods and specialized Operations Research meta-heuristics on this novel task and demonstrate that the performance gap between neural routing solvers and highly specialized meta-heuristics decreases when learning from sub-samples drawn from a fixed base node distribution.
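A minimal sketch of the task setup described in the abstract, assuming a Gaussian-mixture layout for the large base instance; the paper's exact generator and sizes may differ, and all names below are illustrative:

```python
# Sketch: training/test routing instances are node sub-samples of one large,
# fixed "base" instance; the mixture layout plants structure in the nodes.
import numpy as np

rng = np.random.default_rng(0)

def make_base_instance(n_nodes=10_000, n_clusters=8):
    """Large base instance: clustered node coordinates in the unit square."""
    centers = rng.uniform(0.0, 1.0, size=(n_clusters, 2))
    labels = rng.integers(0, n_clusters, size=n_nodes)
    return centers[labels] + rng.normal(scale=0.03, size=(n_nodes, 2))

def sample_sub_instance(base_nodes, size=100):
    """One problem instance = a node subset drawn from the base instance."""
    idx = rng.choice(len(base_nodes), size=size, replace=False)
    return base_nodes[idx]

base = make_base_instance()
train_instances = [sample_sub_instance(base) for _ in range(1_000)]
test_instances = [sample_sub_instance(base) for _ in range(100)]  # unseen subsets
```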
Related papers
- Generative Diffusion Models for Resource Allocation in Wireless Networks [77.36145730415045]
We train a policy to imitate an expert and generate new samples from the optimal distribution.
We achieve near-optimal performance through the sequential execution of the generated samples.
We present numerical results in a case study of power control.
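As a loose illustration of the generate-then-execute idea, the toy sketch below draws candidate power allocations from a stand-in sampler and keeps the best found during sequential execution; `sample_allocation` and `utility` are hypothetical placeholders, not the paper's trained diffusion policy or objective:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_allocation(n_links):
    """Stand-in for sampling from a learned generative policy."""
    p = rng.random(n_links)
    return p / p.sum()  # normalized power allocation

def utility(p, gains):
    """Toy sum-rate-style objective for a power-control case study."""
    return float(np.sum(np.log1p(gains * p)))

gains = rng.exponential(size=8)
best_p, best_u = None, -np.inf
for _ in range(16):  # sequential execution of the generated samples
    p = sample_allocation(len(gains))
    u = utility(p, gains)
    if u > best_u:
        best_p, best_u = p, u
```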
arXiv Detail & Related papers (2025-04-28T21:44:31Z)
- Learning to Reduce Search Space for Generalizable Neural Routing Solver [12.396576646539252]
Constructive neural combinatorial optimization (NCO) has attracted growing research attention due to its ability to solve complex routing problems without relying on handcrafted rules.
Existing NCO methods face challenges in generalizing to large-scale problems due to high computational complexity and inefficient capture of structural patterns.
We propose a novel learning-based search space reduction method that adaptively selects a small set of promising candidate nodes at each step of the constructive NCO process.
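The per-step selection might look roughly like the sketch below, where a nearest-neighbour score stands in for the learned selector; all names and the scoring rule are illustrative assumptions:

```python
import numpy as np

def top_k_candidates(coords, current, unvisited, k=10):
    """Keep only the k most promising unvisited nodes (here: the nearest)."""
    nodes = np.asarray(sorted(unvisited))
    dists = np.linalg.norm(coords[nodes] - coords[current], axis=1)
    return nodes[np.argsort(dists)[:k]]

# Greedy construction restricted to the reduced candidate set.
rng = np.random.default_rng(2)
coords = rng.random((200, 2))
tour, unvisited = [0], set(range(1, 200))
while unvisited:
    cands = top_k_candidates(coords, tour[-1], unvisited)
    nxt = int(cands[0])  # a learned policy would choose among cands instead
    tour.append(nxt)
    unvisited.remove(nxt)
```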
arXiv Detail & Related papers (2025-03-05T03:25:09Z)
- The Effects of Multi-Task Learning on ReLU Neural Network Functions [17.786058035763254]
This paper studies the properties of multi-task shallow ReLU neural network learning problems, wherein the network is trained to fit a dataset with minimal sum of squared weights.
Remarkably, the solutions learned for each individual task resemble those obtained by solving a kernel regression problem, revealing a novel connection between neural networks and kernel methods.
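In symbols, the training objective described above can be written as follows; the notation (hidden units with weights $w_j$ and biases $b_j$, outer weights $v_j$, ReLU $\sigma(u)=\max(0,u)$, tasks $t$) is ours and may differ from the paper's:

```latex
\min_{\{w_j,\, b_j,\, v_j\}} \ \sum_j \big( \|w_j\|_2^2 + \|v_j\|_2^2 \big)
\quad \text{s.t.} \quad
\sum_j v_{j,t}\, \sigma\!\big(w_j^\top x_i + b_j\big) = y_{i,t}
\quad \text{for all samples } i \text{ and tasks } t.
```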
arXiv Detail & Related papers (2024-10-29T03:27:08Z)
- Too Big, so Fail? -- Enabling Neural Construction Methods to Solve Large-Scale Routing Problems [10.832715681422842]
We show that even state-of-the-art neural construction methods are outperformed by simple iterations.
We propose to use the ruin recreate principle that alternates between completely destroying a localized part of the solution and then recreating an improved variant.
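A minimal sketch of the ruin-and-recreate loop on a TSP tour, with a greedy cheapest-insertion repair standing in for the paper's neural recreate step; sizes and the acceptance rule are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def tour_length(coords, tour):
    pts = coords[tour]
    return float(np.linalg.norm(pts - np.roll(pts, -1, axis=0), axis=1).sum())

def ruin(tour, size=10):
    """Destroy a localized part of the solution: a contiguous segment."""
    i = int(rng.integers(0, len(tour) - size))
    return tour[:i] + tour[i + size:], tour[i:i + size]

def recreate(coords, partial, removed):
    """Greedy cheapest-insertion repair of the removed nodes."""
    for node in removed:
        best_pos, best_inc = 0, float("inf")
        for pos in range(len(partial)):
            a, b = partial[pos - 1], partial[pos]  # pos-1 wraps: cyclic tour
            inc = (np.linalg.norm(coords[a] - coords[node])
                   + np.linalg.norm(coords[node] - coords[b])
                   - np.linalg.norm(coords[a] - coords[b]))
            if inc < best_inc:
                best_pos, best_inc = pos, inc
        partial.insert(best_pos, node)
    return partial

coords = rng.random((100, 2))
tour = list(range(100))
for _ in range(200):  # alternate ruin and recreate, keep improvements
    partial, removed = ruin(tour)
    cand = recreate(coords, partial, removed)
    if tour_length(coords, cand) < tour_length(coords, tour):
        tour = cand
```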
arXiv Detail & Related papers (2023-09-29T09:36:37Z)
- Learning to Extrapolate: A Transductive Approach [44.74850954809099]
We tackle the problem of developing machine learning systems that retain the power of overparameterized function approximators while being able to extrapolate beyond the support of the training data.
We propose a simple strategy based on bilinear embeddings to enable this type of generalization.
We instantiate a simple, practical algorithm applicable to various supervised learning and imitation learning tasks.
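One way a bilinear-embedding predictor could look: the output for a query is a bilinear interaction between its embedding and the embeddings of retained training points. This is our loose reading of the idea, with illustrative architecture details:

```python
import torch

class BilinearPredictor(torch.nn.Module):
    def __init__(self, in_dim, emb_dim=32):
        super().__init__()
        self.phi = torch.nn.Linear(in_dim, emb_dim)  # query embedding
        self.psi = torch.nn.Linear(in_dim, emb_dim)  # anchor embedding
        self.W = torch.nn.Parameter(torch.randn(emb_dim, emb_dim) * 0.01)

    def forward(self, x, anchors_x, anchors_y):
        # score_i = phi(x)^T W psi(x_i); predict with softmax-weighted labels
        scores = self.phi(x) @ self.W @ self.psi(anchors_x).T
        return torch.softmax(scores, dim=-1) @ anchors_y

model = BilinearPredictor(in_dim=4)
pred = model(torch.randn(8, 4), torch.randn(20, 4), torch.randn(20, 1))  # (8, 1)
```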
arXiv Detail & Related papers (2023-04-27T17:00:51Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning [93.38239238988719]
We propose to equip deep neural networks with the ability to learn sample relationships within each mini-batch.
BatchFormer is applied to the batch dimension of each mini-batch to implicitly explore sample relationships during training.
We perform extensive experiments on more than ten datasets, and the proposed method achieves significant improvements on different data-scarcity applications.
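A rough sketch of the core idea as we read it: run a transformer encoder across the batch dimension during training so that samples attend to each other, and bypass it at inference; hyperparameters are illustrative:

```python
import torch

class BatchFormerBlock(torch.nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.encoder = torch.nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=False)

    def forward(self, feats, training=True):
        if not training:  # no cross-sample mixing at test time
            return feats
        # (batch, dim) -> (batch, 1, dim): the batch axis becomes the sequence
        # axis, so attention runs across samples in the mini-batch.
        return self.encoder(feats.unsqueeze(1)).squeeze(1)

block = BatchFormerBlock(dim=128)
out = block(torch.randn(32, 128))  # (32, 128), samples have attended to each other
```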
arXiv Detail & Related papers (2022-03-03T05:31:33Z)
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
- Decentralized Personalized Federated Learning for Min-Max Problems [79.61785798152529]
This paper is the first to study personalized federated learning (PFL) for saddle-point problems, which encompass a broader range of optimization problems.
We propose new algorithms to address this problem and provide a theoretical analysis of the smooth (strongly) convex-(strongly) concave saddle point problems.
Numerical experiments for bilinear problems and neural networks with adversarial noise demonstrate the effectiveness of the proposed methods.
arXiv Detail & Related papers (2021-06-14T10:36:25Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions so that they learn in their task-specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains help improve the learning performance on each task.
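One natural way to write this coupling, in our own notation (the paper's exact formulation may differ): each task $t$ fits its own parameters $w_t$, while a proximity term keeps all tasks close to a shared centroid $\bar{w}$:

```latex
\min_{w_1,\dots,w_T,\ \bar w}\ \sum_{t=1}^{T}
\frac{1}{n_t}\sum_{i=1}^{n_t} \ell\big(f_{w_t}(x_{t,i}),\, y_{t,i}\big)
\;+\; \lambda \sum_{t=1}^{T} \|w_t - \bar w\|_2^2
```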
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
- Learning the Travelling Salesperson Problem Requires Rethinking Generalization [9.176056742068813]
End-to-end training of neural network solvers for graph optimization problems such as the Travelling Salesperson Problem (TSP) has seen a surge of interest recently.
While state-of-the-art learning-driven approaches perform closely to classical solvers when trained on trivially small sizes, they are unable to generalize the learnt policy to larger instances at practical scales.
This work presents an end-to-end neural optimization pipeline that unifies several recent papers in order to identify the inductive biases, model architectures and learning algorithms that promote generalization to instances larger than those seen in training.
arXiv Detail & Related papers (2020-06-12T10:14:15Z)