On Solving Close Enough Orienteering Problems with Overlapped Neighborhoods
- URL: http://arxiv.org/abs/2310.04257v3
- Date: Wed, 15 May 2024 08:13:01 GMT
- Title: On Solving Close Enough Orienteering Problems with Overlapped Neighborhoods
- Authors: Qiuchen Qian, Yanran Wang, David Boyle
- Abstract summary: The Close Enough Traveling Salesman Problem (CETSP) is a well-known variant of TSP whereby the agent may complete its mission at any point within a target neighborhood.
Heuristics based on overlapped neighborhoods, known as Steiner Zones (SZ), have gained attention in addressing CETSP.
Here we show how such limitations can be converted into advantages in a Close Enough Orienteering Problem (CEOP) by aggregating prizes across overlapped neighborhoods.
- Score: 2.990411348977783
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Close Enough Traveling Salesman Problem (CETSP) is a well-known variant of TSP whereby the agent may complete its mission at any point within a target neighborhood. Heuristics based on overlapped neighborhoods, known as Steiner Zones (SZ), have gained attention in addressing CETSP. While SZs offer effective approximations to the original graph, their inherent overlap imposes constraints on search space, potentially conflicting with global optimization objectives. Here we show how such limitations can be converted into advantages in a Close Enough Orienteering Problem (CEOP) by aggregating prizes across overlapped neighborhoods. We further extend classic CEOP with Non-uniform Neighborhoods (CEOP-N) by introducing non-uniform costs for prize collection. To tackle CEOP and CEOP-N, we develop a new approach featuring a Randomized Steiner Zone Discretization (RSZD) scheme coupled with a hybrid algorithm based on Particle Swarm Optimization (PSO) and Ant Colony System (ACS), CRaSZe-AntS. The RSZD scheme identifies sub-regions for PSO exploration, and ACS determines the discrete visiting sequence. We evaluate the RSZD's discretization performance on CEOP instances derived from established CETSP instances and compare CRaSZe-AntS against the most relevant state-of-the-art heuristic focused on single-neighborhood optimization for CEOP instances. We also compare the performance of the interior search within SZs and the boundary search on individual neighborhoods in the context of CEOP-N. Our experimental results show that CRaSZe-AntS can yield comparable solution quality with significantly reduced computation time compared to the single neighborhood strategy, where we observe an average 140.44% increase in prize collection and a 55.18% reduction in algorithm execution time. CRaSZe-AntS is thus highly effective in solving emerging CEOP-N, examples of which include truck-and-drone delivery scenarios.
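The abstract outlines a three-stage pipeline: aggregate prizes over overlapping neighborhoods (Steiner Zones), discretize each zone into candidate waypoints (RSZD), and sequence the visits under a travel budget (ACS, with PSO refining interior points). The Python sketch below illustrates that structure under toy assumptions only: disks stand in for neighborhoods, a pairwise-overlap merge stands in for SZ construction, and a greedy budgeted tour stands in for the ACS stage. None of the names or heuristics here are the authors' implementation.

```python
import math
import random

def group_zones(disks):
    """disks: list of (cx, cy, radius, prize) tuples. Greedily merge disks
    that pairwise overlap; each group's prize is the sum over its members
    (a crude stand-in for Steiner zone construction)."""
    zones = []
    for d in disks:
        for z in zones:
            if all(math.dist(d[:2], m[:2]) <= d[2] + m[2] for m in z):
                z.append(d)
                break
        else:
            zones.append([d])
    return zones

def discretize(zone, k=8):
    """RSZD-like step: sample k candidate waypoints around the zone centroid."""
    cx = sum(m[0] for m in zone) / len(zone)
    cy = sum(m[1] for m in zone) / len(zone)
    r = min(m[2] for m in zone)
    return [(cx + random.uniform(-r, r), cy + random.uniform(-r, r))
            for _ in range(k)]

def greedy_route(zones, depot, budget):
    """Visit zones by prize-per-distance until the travel budget is spent
    (a greedy stand-in for the ACS sequencing stage; the PSO interior
    refinement is omitted)."""
    pos, prize, spent = depot, 0.0, 0.0
    remaining = list(zones)
    while remaining:
        best = None
        for z in remaining:
            p = min(discretize(z), key=lambda q: math.dist(pos, q))
            ratio = sum(m[3] for m in z) / (math.dist(pos, p) + 1e-9)
            if best is None or ratio > best[0]:
                best = (ratio, z, p)
        _, z, p = best
        if spent + math.dist(pos, p) + math.dist(p, depot) > budget:
            break  # crude stop: the best pick no longer fits the budget
        spent += math.dist(pos, p)
        prize += sum(m[3] for m in z)
        remaining.remove(z)
        pos = p
    return prize, spent + math.dist(pos, depot)

disks = [(2.0, 1.0, 1.0, 5.0), (3.0, 1.5, 1.0, 3.0), (8.0, 6.0, 1.5, 7.0)]
zones = group_zones(disks)
prize, travel = greedy_route(zones, depot=(0.0, 0.0), budget=20.0)
print(f"{len(zones)} zones, prize {prize}, travel {travel:.2f}")
```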
Related papers
- Efficient Learning of POMDPs with Known Observation Model in Average-Reward Setting [56.92178753201331]
We propose the Observation-Aware Spectral (OAS) estimation technique, which enables the POMDP parameters to be learned from samples collected using a belief-based policy.
We show the consistency of the OAS procedure, and we prove a regret guarantee of order $\mathcal{O}(\sqrt{T}\log(T))$ for the proposed OAS-UCRL algorithm.
arXiv Detail & Related papers (2024-10-02T08:46:34Z)
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.
To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.
Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
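A minimal sketch of the combined objective this entry describes, assuming a DPO-style preference loss plus an SFT likelihood term; the hyperparameters `beta` and `lam` and the placeholder log-probabilities are illustrative, not values from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def combined_loss(logp_chosen, logp_rejected,
                  ref_logp_chosen, ref_logp_rejected,
                  beta=0.1, lam=1.0):
    # Preference-optimization term (DPO-style implicit reward margin).
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    pref_loss = -math.log(sigmoid(margin))
    # Supervised (SFT) term: maximize likelihood of the chosen response.
    sft_loss = -logp_chosen
    return pref_loss + lam * sft_loss

print(combined_loss(-12.0, -15.0, -13.0, -14.5))
```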
arXiv Detail & Related papers (2024-05-26T05:38:50Z)
- Parameter optimization comparison in QAOA using Stochastic Hill Climbing with Random Re-starts and Local Search with entangled and non-entangled mixing operators [0.0]
This study investigates the efficacy of Stochastic Hill Climbing with Random Restarts (SHC-RR) compared to Local Search (LS) strategies.
Our results consistently show that SHC-RR outperforms LS approaches despite its ostensibly simpler optimization mechanism.
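For reference, a generic SHC-RR loop on a toy objective; the step size, restart count, and objective are illustrative assumptions and do not reflect the QAOA parameter landscapes studied in the paper.

```python
import random

def shc_rr(objective, dim, restarts=10, steps=200, step=0.1, rng=random):
    best_x, best_v = None, float("-inf")
    for _ in range(restarts):
        x = [rng.uniform(-1, 1) for _ in range(dim)]      # random restart
        v = objective(x)
        for _ in range(steps):
            cand = [xi + rng.gauss(0, step) for xi in x]  # stochastic move
            cv = objective(cand)
            if cv > v:                                    # keep improvements
                x, v = cand, cv
        if v > best_v:
            best_x, best_v = x, v
    return best_x, best_v

# Toy objective whose maximum is at (0.5, ..., 0.5).
x, v = shc_rr(lambda p: -sum((pi - 0.5) ** 2 for pi in p), dim=4)
print(round(v, 4))
```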
arXiv Detail & Related papers (2024-05-14T20:12:17Z)
- Personalized Federated X-armed Bandit [44.7483638955294]
We study the personalized federated $\mathcal{X}$-armed bandit problem, where the heterogeneous local objectives of the clients are optimized simultaneously in the federated learning paradigm.
We propose the PF-PNE algorithm with a unique double elimination strategy, which safely eliminates the non-optimal regions while encouraging federated collaboration through biased but effective evaluations of the local objectives.
arXiv Detail & Related papers (2023-10-25T03:11:32Z)
- High-dimensional Contextual Bandit Problem without Sparsity [8.782204980889077]
We propose an explore-then-commit (EtC) algorithm to address this problem and examine its performance.
We derive the optimal rate of the EtC algorithm in terms of $T$ and show that this rate can be achieved by balancing exploration and exploitation.
We further introduce an adaptive explore-then-commit (AEtC) algorithm that finds the optimal balance.
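A minimal sketch of the plain explore-then-commit pattern this entry builds on, for a toy multi-armed bandit; the horizon, exploration length, and Gaussian arms are assumptions for illustration, and the paper's high-dimensional contextual setting is not modeled here.

```python
import random

def etc(arms, horizon, explore_per_arm):
    means = [0.0] * len(arms)
    counts = [0] * len(arms)
    reward, t = 0.0, 0
    # Exploration phase: pull each arm a fixed number of times.
    for a, pull in enumerate(arms):
        for _ in range(explore_per_arm):
            r = pull()
            counts[a] += 1
            means[a] += (r - means[a]) / counts[a]
            reward += r
            t += 1
    # Commit phase: play the empirically best arm for the remainder.
    best = max(range(len(arms)), key=lambda a: means[a])
    while t < horizon:
        reward += arms[best]()
        t += 1
    return reward

arms = [lambda: random.gauss(0.2, 1), lambda: random.gauss(0.5, 1)]
print(etc(arms, horizon=1000, explore_per_arm=50))
```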
arXiv Detail & Related papers (2023-06-19T15:29:32Z)
- High-Similarity-Pass Attention for Single Image Super-Resolution [81.56822938033118]
Recent developments in the field of non-local attention (NLA) have led to a renewed interest in self-similarity-based single image super-resolution (SISR).
We introduce a concise yet effective soft thresholding operation to obtain high-similarity-pass attention (HSPA).
To demonstrate the effectiveness of the HSPA, we constructed a deep high-similarity-pass attention network (HSPAN).
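A small sketch of the soft-thresholding idea named above, applied to toy similarity scores: values below a threshold are zeroed and the rest shrunk, yielding sparse "high-similarity-pass" weights. The learned, in-network version in the paper is more involved; the threshold and scores here are placeholders.

```python
def soft_threshold_attention(scores, tau):
    shrunk = [max(s - tau, 0.0) for s in scores]   # soft thresholding
    total = sum(shrunk)
    if total == 0.0:
        return [0.0] * len(scores)
    return [w / total for w in shrunk]             # sparse attention weights

print(soft_threshold_attention([0.9, 0.2, 0.6, 0.1], tau=0.3))
```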
arXiv Detail & Related papers (2023-05-25T06:24:14Z)
- On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game [140.19656665344917]
We study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function.
We tackle this problem under the context of function approximation, leveraging powerful function approximators.
We establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.
arXiv Detail & Related papers (2021-10-19T07:26:33Z)
- Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks [59.419152768018506]
We show that any optimal policy necessarily satisfies the k-SP constraint.
We propose a novel cost function that penalizes the policy violating SP constraint, instead of completely excluding it.
Our experiments on MiniGrid, DeepMind Lab, Atari, and Fetch show that the proposed method significantly improves proximal policy optimization (PPO).
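A hedged sketch of the penalty idea in this entry: instead of excluding trajectories that violate the k-SP constraint, subtract a penalty from the reward; the violation test and penalty weight are placeholders, not the paper's formulation.

```python
def shaped_reward(reward, violates_sp, penalty=0.1):
    """Environment reward minus a penalty when the k-SP constraint is broken."""
    return reward - (penalty if violates_sp else 0.0)

# A violating step keeps a (reduced) learning signal rather than being masked.
print(shaped_reward(1.0, violates_sp=True))   # 0.9
print(shaped_reward(1.0, violates_sp=False))  # 1.0
```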
arXiv Detail & Related papers (2021-07-13T21:39:21Z)
- An Improved Approach for Estimating Social POI Boundaries With Textual Attributes on Social Media [3.590202054885437]
How to perform density-based clustering by exploiting textual attributes on social media remains insufficiently explored.
We present a new approach and algorithm, built upon our earlier work on social POI boundary estimation (SoBEst).
Our study is motivated by the following empirical observation: the fixed representative coordinate of each POI that SoBEst assumes may lie far from the centroid of the estimated social POI boundary for certain POIs.
arXiv Detail & Related papers (2020-12-18T00:41:44Z)
- CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee [61.176159046544946]
In safe reinforcement learning (SRL) problems, an agent explores the environment to maximize an expected total reward and avoids violation of certain constraints.
This is the first analysis of SRL algorithms with convergence guarantees to globally optimal policies.
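An illustrative sketch of the alternating primal update commonly associated with CRPO: take a policy step on the reward when constraint estimates are within tolerance, otherwise take a step that reduces a violated constraint. The update functions and tolerance here are hypothetical placeholders.

```python
def crpo_step(policy, reward_grad_step, constraint_costs,
              constraint_grad_steps, limits, tol=0.05):
    for i, (cost, limit) in enumerate(zip(constraint_costs, limits)):
        if cost > limit + tol:                        # constraint i violated
            return constraint_grad_steps[i](policy)   # reduce its cost
    return reward_grad_step(policy)                   # otherwise improve reward

# Toy usage: the single constraint is violated, so its update runs.
step = crpo_step(policy={"w": 0.0},
                 reward_grad_step=lambda p: {"w": p["w"] + 0.1},
                 constraint_costs=[0.2],
                 constraint_grad_steps=[lambda p: {"w": p["w"] - 0.1}],
                 limits=[0.1])
print(step)  # {'w': -0.1}
```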
arXiv Detail & Related papers (2020-11-11T16:05:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.