ASP: Learn a Universal Neural Solver!
- URL: http://arxiv.org/abs/2303.00466v1
- Date: Wed, 1 Mar 2023 12:47:14 GMT
- Title: ASP: Learn a Universal Neural Solver!
- Authors: Chenguang Wang, Zhouliang Yu, Stephen McAleer, Tianshu Yu, Yaodong
Yang
- Abstract summary: We propose ASP: Adaptive Staircase Policy Space Response Oracle to address these generalization issues.
ASP consists of two components: Distributional Exploration and Persistent Scale Adaption.
Our results show that ASP can help neural solvers explore and adapt to unseen distributions and varying scales, achieving superior performance.
- Score: 16.98189196303338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Applying machine learning to combinatorial optimization problems has the
potential to improve both efficiency and accuracy. However, existing
learning-based solvers often struggle with generalization when faced with
changes in problem distributions and scales. In this paper, we propose a new
approach called ASP: Adaptive Staircase Policy Space Response Oracle to address
these generalization issues and learn a universal neural solver. ASP consists
of two components: Distributional Exploration, which enhances the solver's
ability to handle unknown distributions using Policy Space Response Oracles,
and Persistent Scale Adaption, which improves scalability through curriculum
learning. We have tested ASP on several challenging COPs, including the
traveling salesman problem, the vehicle routing problem, and the prize
collecting TSP, as well as the real-world instances from TSPLib and CVRPLib.
Our results show that even with the same model size and weak training signal,
ASP can help neural solvers explore and adapt to unseen distributions and
varying scales, achieving superior performance. In particular, compared with
the same neural solvers under a standard training pipeline, ASP reduces the
optimality gap by 90.9% and 47.43% on generated and real-world instances for
TSP, and by 19% and 45.57% for CVRP.
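The abstract describes Persistent Scale Adaption only at a high level: a curriculum that gradually exposes the solver to larger instances. Below is a minimal sketch of one plausible staircase curriculum loop; all function names, the skill model, and the thresholds are illustrative assumptions, not the paper's implementation.

```python
def eval_gap(solver, scale):
    """Hypothetical evaluation: returns the solver's optimality gap
    on instances of the given scale (stub for illustration)."""
    return solver["skill"].get(scale, 1.0)

def train_step(solver, scale):
    """Hypothetical training update: one epoch on instances of `scale`.
    Here training simply halves the gap at that scale."""
    solver["skill"][scale] = solver["skill"].get(scale, 1.0) * 0.5

def staircase_curriculum(solver, scales, gap_threshold=0.05, max_epochs=100):
    """Staircase curriculum sketch: train at the current scale until the
    optimality gap falls below a threshold, then advance one 'stair'."""
    level = 0
    for _ in range(max_epochs):
        scale = scales[level]
        train_step(solver, scale)
        if eval_gap(solver, scale) < gap_threshold and level < len(scales) - 1:
            level += 1  # solver mastered this scale; move to a larger one
    return level

solver = {"skill": {}}
final_level = staircase_curriculum(solver, scales=[20, 50, 100, 200])
```

The design point the staircase captures is that the solver never trains on a scale it has not yet earned: each larger problem size is unlocked only after performance on the current size meets the threshold.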
Related papers
- Any Image Restoration with Efficient Automatic Degradation Adaptation [132.81912195537433]
We propose a unified manner to achieve joint embedding by leveraging the inherent similarities across various degradations for efficient and comprehensive restoration.
Our network sets new SOTA records while reducing model complexity by approximately 82% in trainable parameters and 85% in FLOPs.
arXiv Detail & Related papers (2024-07-18T10:26:53Z) - Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning [18.579378919155864]
We propose Adaptive $Q$-Network (AdaQN) to account for the non-stationarity of the optimization procedure without requiring additional samples.
AdaQN is theoretically sound, and we empirically validate it in MuJoCo control problems and Atari 2600 games.
arXiv Detail & Related papers (2024-05-25T11:57:43Z) - Instance-Conditioned Adaptation for Large-scale Generalization of Neural Combinatorial Optimization [15.842155380912002]
This work proposes a novel Instance-Conditioned Adaptation Model (ICAM) for better large-scale generalization of neural optimization.
In particular, we design a powerful yet lightweight instance-conditioned Routing adaptation module for the NCO model.
We develop an efficient three-stage reinforcement learning-based training scheme that enables the model to learn cross-scale features without any labeled optimal solution.
arXiv Detail & Related papers (2024-05-03T08:00:19Z) - FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup
for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z) - Pointerformer: Deep Reinforced Multi-Pointer Transformer for the
Traveling Salesman Problem [67.32731657297377]
Traveling Salesman Problem (TSP) is a classic routing optimization problem originally arising in the domain of transportation and logistics.
Recently, Deep Reinforcement Learning has been increasingly employed to solve TSP due to its high inference efficiency.
We propose a novel end-to-end DRL approach, referred to as Pointerformer, based on multi-pointer Transformer.
arXiv Detail & Related papers (2023-04-19T03:48:32Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - Learning to Optimize Permutation Flow Shop Scheduling via Graph-based
Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model relative to the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z) - Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
arXiv Detail & Related papers (2022-06-05T01:14:46Z) - Hyperparameter Tuning for Deep Reinforcement Learning Applications [0.3553493344868413]
We propose a distributed variable-length genetic algorithm framework to tune hyperparameters for various RL applications.
Our results show that with more generations, the framework finds optimal solutions that require fewer training episodes and are computationally cheap, while being more robust for deployment.
arXiv Detail & Related papers (2022-01-26T20:43:13Z) - A Game-Theoretic Approach for Improving Generalization Ability of TSP
Solvers [16.98434288039677]
We introduce a two-player zero-sum framework between a trainable Solver and a Data Generator.
We show that our framework achieves the most generalizable performance on different Traveling Salesman Problem tasks.
arXiv Detail & Related papers (2021-10-28T13:35:22Z) - Sample-Efficient Automated Deep Reinforcement Learning [33.53903358611521]
We propose a population-based automated RL framework to meta-optimize arbitrary off-policy RL algorithms.
By sharing the collected experience across the population, we substantially increase the sample efficiency of the meta-optimization.
We demonstrate the capabilities of our sample-efficient AutoRL approach in a case study with the popular TD3 algorithm in the MuJoCo benchmark suite.
arXiv Detail & Related papers (2020-09-03T10:04:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.