A Game-Theoretic Approach for Improving Generalization Ability of TSP
Solvers
- URL: http://arxiv.org/abs/2110.15105v2
- Date: Fri, 29 Oct 2021 09:06:30 GMT
- Title: A Game-Theoretic Approach for Improving Generalization Ability of TSP
Solvers
- Authors: Chenguang Wang, Yaodong Yang, Oliver Slumbers, Congying Han, Tiande
Guo, Haifeng Zhang, Jun Wang
- Abstract summary: We introduce a two-player zero-sum framework between a trainable Solver and a Data Generator.
We show that our framework achieves the most generalizable performance on different Traveling Salesman Problem (TSP) tasks.
- Score: 16.98434288039677
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we shed new light on the generalization ability of deep
learning-based solvers for Traveling Salesman Problems (TSP). Specifically, we
introduce a two-player zero-sum framework between a trainable Solver and
a Data Generator, where the Solver aims to solve the task instances
provided by the Generator, and the Generator aims to generate increasingly
difficult instances for improving the Solver. Grounded in Policy Space
Response Oracle (PSRO) methods, our two-player framework outputs a population
of best-responding Solvers, over which we can mix and output a combined model
that achieves the least exploitability against the Generator, and thereby the
most generalizable performance on different TSP tasks. We conduct experiments
on a variety of TSP instances of different types and sizes. Results suggest
that our Solvers achieve state-of-the-art performance even on tasks the
Solver has never seen, whilst the performance of other deep learning-based Solvers
drops sharply due to over-fitting. On real-world instances from
TSPLib, our method also attains a 12% improvement, in terms
of optimality gap, over the best baseline model. To demonstrate the principle of
our framework, we study the learning outcome of the proposed two-player game
and demonstrate that the exploitability of the Solver population decreases
during training, and it eventually approximates the Nash equilibrium along with
the Generator.
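The population loop described in the abstract can be sketched with a toy zero-sum meta-game in place of real TSP training. The sketch below is illustrative only: rock-paper-scissors stands in for the Solver-vs-Generator game, each iteration adds one best response to the opponent's uniform mixture (the simplest PSRO meta-strategy, i.e. fictitious play), and exploitability of the Solver mixture is tracked. None of this reflects the paper's actual neural implementation.

```python
# Toy PSRO-style loop: populations of pure strategies grow by best responses,
# and the uniform Solver mixture becomes harder to exploit over iterations.
ACTIONS = (0, 1, 2)
LOSS = [[0, 1, -1],            # LOSS[s][g]: Solver's loss = Generator's payoff
        [-1, 0, 1],
        [1, -1, 0]]

def avg_loss(s, gen_pop):
    # Solver strategy s against the uniform mixture over the Generator population
    return sum(LOSS[s][g] for g in gen_pop) / len(gen_pop)

def exploitability(sol_pop):
    # Payoff of the best pure Generator response to the uniform Solver mixture;
    # since the game's value is 0, this is exactly the exploitability.
    return max(sum(LOSS[s][g] for s in sol_pop) / len(sol_pop) for g in ACTIONS)

sol_pop, gen_pop = [0], [0]
trace = [exploitability(sol_pop)]
for _ in range(300):
    # Each side appends a best response to the other's current population.
    sol_pop.append(min(ACTIONS, key=lambda s: avg_loss(s, gen_pop)))
    gen_pop.append(max(ACTIONS, key=lambda g: sum(LOSS[s][g] for s in sol_pop)))
    trace.append(exploitability(sol_pop))

print(round(trace[0], 3), round(trace[-1], 3))
```

As in the paper's analysis, the exploitability of the mixed Solver population shrinks as the populations approach the (here uniform) Nash equilibrium.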
Related papers
- Neural Solver Selection for Combinatorial Optimization [23.449310200885893]
We propose the first general framework to coordinate neural solvers, which involves feature extraction, selection model, and selection strategy.
We show that the proposed framework can effectively distribute instances and the resulting composite solver can achieve significantly better performance.
arXiv Detail & Related papers (2024-10-13T02:05:41Z)
- Leveraging Constraint Programming in a Deep Learning Approach for Dynamically Solving the Flexible Job-Shop Scheduling Problem [1.3927943269211593]
This paper aims to integrate constraint programming (CP) within a deep learning (DL) based methodology, leveraging the benefits of both.
We introduce a method that involves training a DL model using optimal solutions generated by CP, ensuring the model learns from high-quality data.
Our hybrid approach has been extensively tested on three public FJSSP benchmarks, demonstrating superior performance over five state-of-the-art DRL approaches.
arXiv Detail & Related papers (2024-03-14T10:16:57Z)
- Self-Labeling the Job Shop Scheduling Problem [15.723699332053558]
We show that generative models can be trained by sampling multiple solutions and using the best one according to the problem objective as a pseudo-label.
We prove the robustness of SLIM to various parameters and its generality by applying it to the Traveling Salesman Problem.
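The best-of-N pseudo-labeling idea in this entry can be sketched on a toy TSP instance. In the sketch below, a random tour sampler stands in for the generative model being trained, and the instance size, sample count, and names are all illustrative assumptions, not details from the paper.

```python
import random, math

# Self-labeling sketch: sample several candidate tours for one toy TSP
# instance and keep the best one (shortest) as the pseudo-label that a
# generative model would then be trained to imitate.
random.seed(0)
cities = [(random.random(), random.random()) for _ in range(8)]

def tour_length(tour):
    # Total length of the closed tour visiting the cities in the given order.
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def sample_tours(n_samples=64):
    # Stand-in for sampling multiple solutions from a generative model.
    tours = []
    for _ in range(n_samples):
        t = list(range(len(cities)))
        random.shuffle(t)
        tours.append(t)
    return tours

candidates = sample_tours()
pseudo_label = min(candidates, key=tour_length)   # best-of-N by the objective
print(round(tour_length(pseudo_label), 3))
```

The key point is that no ground-truth optimal solution is needed: the problem objective itself selects the training target.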
arXiv Detail & Related papers (2024-01-22T11:08:36Z)
- Promoting Generalization for Exact Solvers via Adversarial Instance Augmentation [62.738582127114704]
Adar is a framework for understanding and improving the generalization of both imitation-learning-based (IL-based) and reinforcement-learning-based (RL-based) solvers.
arXiv Detail & Related papers (2023-10-22T03:15:36Z)
- Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while automatically balancing exploration and exploitation.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z) - ASP: Learn a Universal Neural Solver! [16.98189196303338]
We propose ASP: Adaptive Staircase Policy Space Response Oracle to address these generalization issues.
ASP consists of two components: Distributional Exploration and Persistent Scale Adaption.
Our results show that ASP can help neural solvers explore and adapt to unseen distributions and varying scales, achieving superior performance.
arXiv Detail & Related papers (2023-03-01T12:47:14Z) - Learning To Dive In Branch And Bound [95.13209326119153]
We propose L2Dive to learn specific diving heuristics with graph neural networks.
We train generative models to predict variable assignments and leverage the duality of linear programs to make diving decisions.
arXiv Detail & Related papers (2023-01-24T12:01:45Z) - Learning to Solve Routing Problems via Distributionally Robust
Optimization [14.506553345693536]
Recent deep models for solving routing problems assume a single distribution of nodes for training, which severely impairs their cross-distribution generalization ability.
We exploit group distributionally robust optimization (group DRO) to tackle this issue, where we jointly optimize the weights for different groups of distributions and the parameters for the deep model in an interleaved manner during training.
We also design a module based on a convolutional neural network, which allows the deep model to learn more informative latent patterns among the nodes.
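The interleaved group-DRO update this entry describes can be sketched with a scalar toy model: group weights are raised multiplicatively on whichever group currently suffers the larger loss, and the model then takes a gradient step on the weighted loss. The two quadratic "groups", the step sizes, and the averaged iterate below are all illustrative assumptions, not the paper's setup.

```python
import math

# Toy group DRO: a scalar model theta faces two groups of instances with
# quadratic losses; weights and parameters are updated in an interleaved way.
targets = [0.0, 2.0]                 # one stand-in "distribution" per group
theta, weights = 0.0, [0.5, 0.5]
eta_w, eta_theta = 0.2, 0.1
theta_sum = 0.0

for _ in range(300):
    losses = [(theta - t) ** 2 for t in targets]
    # (a) exponentiated-gradient step: up-weight the worse-performing group
    weights = [w * math.exp(eta_w * l) for w, l in zip(weights, losses)]
    z = sum(weights)
    weights = [w / z for w in weights]
    # (b) gradient step on the weighted loss sum_g w_g * (theta - t_g)^2
    grad = sum(w * 2 * (theta - t) for w, t in zip(weights, targets))
    theta -= eta_theta * grad
    theta_sum += theta

theta_avg = theta_sum / 300          # averaged iterate of the minimax game
print(round(theta_avg, 2))
```

The averaged iterate settles near the minimax point (here theta = 1, where both group losses are equal), rather than at the minimizer of either group alone.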
arXiv Detail & Related papers (2022-02-15T08:06:44Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep
Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
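The two ingredients in this entry can be sketched on a tiny Q-ensemble over discrete actions: a UCB rule (ensemble mean plus a multiple of the ensemble standard deviation) picks the action, and a weight of the form sigmoid(-std * T) + 0.5 down-weights Bellman targets where the ensemble disagrees. The functional forms follow the abstract's description; the ensemble values, lambda, and temperature below are toy assumptions.

```python
import math, statistics

# SUNRISE-style sketch: three ensemble members' Q-estimates for three actions.
Q_ENSEMBLE = [
    [1.0, 0.5, 0.2],
    [0.8, 1.5, 0.1],
    [1.2, 1.0, 0.3],
]

def ucb_action(q_ensemble, lam=1.0):
    # Pick the action with the highest upper-confidence bound across members.
    n_actions = len(q_ensemble[0])
    scores = []
    for a in range(n_actions):
        qs = [member[a] for member in q_ensemble]
        scores.append(statistics.mean(qs) + lam * statistics.pstdev(qs))
    return max(range(n_actions), key=scores.__getitem__)

def backup_weight(q_ensemble, a, temperature=10.0):
    # Weight in (0.5, 1]: low ensemble disagreement -> weight near 1,
    # so uncertain Bellman targets contribute less to the backup.
    std = statistics.pstdev(member[a] for member in q_ensemble)
    return 1 / (1 + math.exp(std * temperature)) + 0.5

a = ucb_action(Q_ENSEMBLE)
print(a, round(backup_weight(Q_ENSEMBLE, a), 3))
```

Note the two uses of the same uncertainty signal: high ensemble spread makes an action more attractive to explore, but makes its value estimate less trusted as a learning target.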
arXiv Detail & Related papers (2020-07-09T17:08:44Z) - Forgetful Experience Replay in Hierarchical Reinforcement Learning from
Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beats all the solutions for the famous MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information (including this list) and is not responsible for any consequences.