On Solving Close Enough Orienteering Problems with Overlapped Neighborhoods
- URL: http://arxiv.org/abs/2310.04257v3
- Date: Wed, 15 May 2024 08:13:01 GMT
- Title: On Solving Close Enough Orienteering Problems with Overlapped Neighborhoods
- Authors: Qiuchen Qian, Yanran Wang, David Boyle
- Abstract summary: Close Enough Traveling Salesman Problem (CETSP) is a well-known variant of TSP whereby the agent may complete its mission at any point within a target neighborhood.
Heuristics based on overlapped neighborhoods, known as Steiner Zones (SZ), have gained attention in addressing CETSP.
Here we show how such limitations can be converted into advantages in a Close Enough Orienteering Problem (CEOP) by aggregating prizes across overlapped neighborhoods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Close Enough Traveling Salesman Problem (CETSP) is a well-known variant of TSP whereby the agent may complete its mission at any point within a target neighborhood. Heuristics based on overlapped neighborhoods, known as Steiner Zones (SZ), have gained attention in addressing CETSP. While SZs offer effective approximations to the original graph, their inherent overlap imposes constraints on search space, potentially conflicting with global optimization objectives. Here we show how such limitations can be converted into advantages in a Close Enough Orienteering Problem (CEOP) by aggregating prizes across overlapped neighborhoods. We further extend classic CEOP with Non-uniform Neighborhoods (CEOP-N) by introducing non-uniform costs for prize collection. To tackle CEOP and CEOP-N, we develop a new approach featuring a Randomized Steiner Zone Discretization (RSZD) scheme coupled with a hybrid algorithm based on Particle Swarm Optimization (PSO) and Ant Colony System (ACS), CRaSZe-AntS. The RSZD scheme identifies sub-regions for PSO exploration, and ACS determines the discrete visiting sequence. We evaluate the RSZD's discretization performance on CEOP instances derived from established CETSP instances and compare CRaSZe-AntS against the most relevant state-of-the-art heuristic focused on single-neighborhood optimization for CEOP instances. We also compare the performance of the interior search within SZs and the boundary search on individual neighborhoods in the context of CEOP-N. Our experimental results show that CRaSZe-AntS can yield comparable solution quality with significantly reduced computation time compared to the single neighborhood strategy, where we observe an average 140.44% increase in prize collection and a 55.18% reduction in algorithm execution time. CRaSZe-AntS is thus highly effective in solving emerging CEOP-N, examples of which include truck-and-drone delivery scenarios.
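The prize-aggregation idea at the heart of the paper can be illustrated with a minimal sketch (names and the greedy grouping are illustrative, not the paper's RSZD scheme): treat each target as a disc, group targets whose neighborhoods mutually overlap, and sum their prizes so that a single waypoint inside the shared region collects all of them at once.

```python
import math

def discs_overlap(c1, r1, c2, r2):
    """Two discs overlap iff the distance between their centers is at most r1 + r2."""
    return math.dist(c1, c2) <= r1 + r2

def aggregate_prizes(targets):
    """Greedily merge targets with mutually overlapping neighborhoods into zones,
    summing prizes so one visit inside a zone collects all of them.
    `targets` is a list of (center, radius, prize) tuples.
    Note: this checks only pairwise overlap with current members, which is a
    simplification; a true Steiner Zone requires a common intersection region."""
    zones = []  # each zone: (list of member indices, total prize)
    assigned = [False] * len(targets)
    for i, (ci, ri, pi) in enumerate(targets):
        if assigned[i]:
            continue
        members, prize = [i], pi
        assigned[i] = True
        for j, (cj, rj, pj) in enumerate(targets):
            if not assigned[j] and all(
                discs_overlap(targets[m][0], targets[m][1], cj, rj) for m in members
            ):
                members.append(j)
                prize += pj
                assigned[j] = True
        zones.append((members, prize))
    return zones
```

For example, two unit discs centered at (0, 0) and (1, 0) merge into one zone worth the sum of their prizes, while a distant target stays in its own zone; the hybrid PSO/ACS search then only needs to sequence the zones.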
Related papers
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.
To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.
Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
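The combined objective described above can be sketched as follows; this is a hedged illustration of a DPO-style preference loss plus an SFT term, with hypothetical hyperparameters, not the paper's exact formulation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def combined_loss(logp_chosen, logp_rejected, beta=0.1, lam=1.0):
    """Sketch of a combined objective: a preference-optimization loss on the
    log-probability margin plus a supervised (negative log-likelihood) term
    on the chosen response, which acts as a regularizer against
    overoptimization. `beta` and `lam` are illustrative, not from the paper."""
    pref_loss = -math.log(sigmoid(beta * (logp_chosen - logp_rejected)))
    sft_loss = -logp_chosen  # NLL of the preferred response
    return pref_loss + lam * sft_loss
```

Raising the likelihood of the chosen response lowers both terms, so the SFT term keeps the policy anchored to the preference data while the margin term separates chosen from rejected responses.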
arXiv Detail & Related papers (2024-05-26T05:38:50Z)
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [63.32053223422317]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
In particular, we focus on characterizing the variance over values induced by a distribution over MDPs.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
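The quantity that the UBE characterizes can be estimated brute-force by Monte Carlo, which makes the summary concrete; the sketch below (illustrative setup, not the paper's method) samples transition models from a Dirichlet posterior, solves each MDP in closed form, and measures the spread of the resulting values.

```python
import numpy as np

def value_variance(reward, dirichlet_alpha, gamma=0.9, n_samples=1000, seed=0):
    """Monte-Carlo estimate of the variance over state values induced by a
    posterior distribution over MDP transition models. A UBE approximates
    this quantity without sampling; names and setup here are illustrative.
    `dirichlet_alpha[s]` holds Dirichlet parameters for row s of P."""
    rng = np.random.default_rng(seed)
    n = len(reward)
    values = np.empty((n_samples, n))
    for k in range(n_samples):
        # Sample one transition matrix from the posterior, row by row.
        P = np.stack([rng.dirichlet(dirichlet_alpha[s]) for s in range(n)])
        # Policy evaluation in closed form: V = (I - gamma * P)^{-1} r
        values[k] = np.linalg.solve(np.eye(n) - gamma * P, reward)
    return values.var(axis=0)
```

A more concentrated posterior (larger Dirichlet parameters) yields a smaller variance over values, which is the epistemic uncertainty a risk-aware optimizer would penalize.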
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
- Personalized Federated $\mathcal{X}$-armed Bandit [44.7483638955294]
We study the personalized federated $\mathcal{X}$-armed bandit problem, where the heterogeneous local objectives of the clients are optimized simultaneously in the federated learning paradigm.
We propose the PF-PNE algorithm with a unique double elimination strategy, which safely eliminates non-optimal regions while encouraging federated collaboration through biased but effective evaluations of the local objectives.
arXiv Detail & Related papers (2023-10-25T03:11:32Z)
- High-dimensional Contextual Bandit Problem without Sparsity [8.782204980889077]
We propose an explore-then-commit (EtC) algorithm to address this problem and examine its performance.
We derive the optimal rate of the EtC algorithm in terms of $T$ and show that this rate can be achieved by balancing exploration and exploitation.
We introduce an adaptive explore-then-commit (AEtC) algorithm that adaptively finds the optimal balance.
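The explore-then-commit template the summary refers to can be sketched in its classic K-armed form (a textbook version, not the paper's high-dimensional contextual setting; the adaptive variant would instead tune the exploration budget online):

```python
import random

def explore_then_commit(arms, horizon, m, seed=0):
    """Classic explore-then-commit (EtC): pull each arm m times, then commit
    to the empirically best arm for the remainder of the horizon. `arms` is
    a list of callables mapping an RNG to a sampled reward. Returns the
    committed arm index and the total accumulated reward."""
    rng = random.Random(seed)
    k = len(arms)
    means = [0.0] * k
    t = 0
    # Exploration phase: m pulls per arm, tracking empirical means.
    for a in range(k):
        for _ in range(m):
            means[a] += arms[a](rng) / m
            t += 1
    best = max(range(k), key=lambda a: means[a])
    total = sum(means[a] * m for a in range(k))
    # Commit phase: play the empirical best arm until the horizon.
    while t < horizon:
        total += arms[best](rng)
        t += 1
    return best, total
```

The rate mentioned above comes from balancing the two phases: a larger m reduces the chance of committing to a suboptimal arm but spends more of the horizon on exploration.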
arXiv Detail & Related papers (2023-06-19T15:29:32Z)
- High-Similarity-Pass Attention for Single Image Super-Resolution [81.56822938033118]
Recent developments in the field of non-local attention (NLA) have led to a renewed interest in self-similarity-based single image super-resolution (SISR).
We introduce a concise yet effective soft thresholding operation to obtain high-similarity-pass attention (HSPA).
To demonstrate the effectiveness of the HSPA, we constructed a deep high-similarity-pass attention network (HSPAN).
arXiv Detail & Related papers (2023-05-25T06:24:14Z)
- Accelerating the Genetic Algorithm for Large-scale Traveling Salesman Problems by Cooperative Coevolutionary Pointer Network with Reinforcement Learning [2.7716102039510564]
We propose a two-stage optimization strategy for solving the Large-scale Traveling Salesman Problems (LSTSPs) named CCPNRL-GA.
First, we hypothesize that the participation of a well-performed individual as an elite can accelerate the convergence of optimization.
We validate the performance of our proposal on 10 LSTSPs and compare it with traditional EAs.
arXiv Detail & Related papers (2022-09-27T00:06:04Z)
- On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game [140.19656665344917]
We study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function.
We tackle this problem under the context of function approximation, leveraging powerful function approximators.
We establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.
arXiv Detail & Related papers (2021-10-19T07:26:33Z)
- Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks [59.419152768018506]
We show that any optimal policy necessarily satisfies the k-SP constraint.
We propose a novel cost function that penalizes a policy for violating the SP constraint, instead of completely excluding it.
Our experiments on MiniGrid, DeepMind Lab, Atari, and Fetch show that the proposed method significantly improves proximal policy optimization (PPO).
arXiv Detail & Related papers (2021-07-13T21:39:21Z)
- An Improved Approach for Estimating Social POI Boundaries With Textual Attributes on Social Media [3.590202054885437]
How to perform density-based clustering by exploiting textual attributes on social media has been insufficiently explored.
We present a new approach and algorithm, built upon our earlier work on social POI boundary estimation (SoBEst).
Our study is motivated by the following empirical observation: a fixed representative coordinate of each POI that SoBEst basically assumes may be far away from the centroid of the estimated social POI boundary for certain POIs.
arXiv Detail & Related papers (2020-12-18T00:41:44Z)
- CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee [61.176159046544946]
In safe reinforcement learning (SRL) problems, an agent explores the environment to maximize an expected total reward and avoids violation of certain constraints.
This is the first-time analysis of SRL algorithms with global optimal policies.
arXiv Detail & Related papers (2020-11-11T16:05:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.