Learning Admissible Heuristics for A*: Theory and Practice
- URL: http://arxiv.org/abs/2509.22626v1
- Date: Fri, 26 Sep 2025 17:51:26 GMT
- Title: Learning Admissible Heuristics for A*: Theory and Practice
- Authors: Ehsan Futuhi, Nathan R. Sturtevant
- Abstract summary: Deep learning approaches often disregard admissibility and provide limited guarantees on generalization beyond the training data. This paper addresses both of these limitations. First, we pose heuristic learning as a constrained optimization problem and introduce Cross-Entropy Admissibility (CEA), a loss function that enforces admissibility during training. On the Rubik's Cube domain, this method yields near-admissible heuristics with significantly stronger guidance than compressed pattern database (PDB) heuristics.
- Score: 8.408138419383747
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Heuristic functions are central to the performance of search algorithms such as A-star, where admissibility - the property of never overestimating the true shortest-path cost - guarantees solution optimality. Recent deep learning approaches often disregard admissibility and provide limited guarantees on generalization beyond the training data. This paper addresses both of these limitations. First, we pose heuristic learning as a constrained optimization problem and introduce Cross-Entropy Admissibility (CEA), a loss function that enforces admissibility during training. On the Rubik's Cube domain, this method yields near-admissible heuristics with significantly stronger guidance than compressed pattern database (PDB) heuristics. Theoretically, we study the sample complexity of learning heuristics. By leveraging PDB abstractions and the structural properties of graphs such as the Rubik's Cube, we tighten the bound on the number of training samples needed for A-star to generalize. Replacing a general hypothesis class with a ReLU neural network gives bounds that depend primarily on the network's width and depth, rather than on graph size. Using the same network, we also provide the first generalization guarantees for goal-dependent heuristics.
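The abstract does not spell out the exact form of the CEA loss, so the following is only a minimal sketch of one plausible admissibility-penalized objective in PyTorch: the network classifies states into integer cost-to-go values, and any probability mass placed on values above the true cost is penalized. The names and constants (HeuristicNet, NUM_COSTS, PENALTY_WEIGHT) are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch, not the paper's exact CEA formulation: cross-entropy over
# integer cost-to-go classes plus a penalty on probability mass assigned to
# classes that overestimate the true cost (i.e. inadmissible predictions).
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_COSTS = 21          # assumed bound on cost-to-go values (0..20)
PENALTY_WEIGHT = 10.0   # assumed strength of the admissibility penalty

class HeuristicNet(nn.Module):
    def __init__(self, state_dim: int, hidden: int = 256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, NUM_COSTS),   # logits over integer cost-to-go values
        )

    def forward(self, states: torch.Tensor) -> torch.Tensor:
        return self.body(states)

def admissibility_penalized_loss(logits: torch.Tensor,
                                 true_cost: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on the true cost class, plus a penalty on the total
    probability mass placed on cost classes strictly above the true cost."""
    ce = F.cross_entropy(logits, true_cost)
    probs = logits.softmax(dim=-1)                              # (batch, NUM_COSTS)
    costs = torch.arange(NUM_COSTS, device=logits.device)
    over_mask = costs.unsqueeze(0) > true_cost.unsqueeze(1)     # over-estimating classes
    over_mass = (probs * over_mask).sum(dim=-1)
    return ce + PENALTY_WEIGHT * over_mass.mean()
```

At search time, a conservative statistic of the predicted distribution (for example its expectation or a low quantile) would serve as the heuristic value, keeping over-estimation rare.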
Related papers
- UPath: Universal Planner Across Topological Heterogeneity For Grid-Based Pathfinding [43.22339935902436]
In this work, we close this gap by designing a universal predictor: a model trained once, but capable of generalizing across tasks. Our empirical evaluation shows that the suggested approach reduces the computational effort of A* by up to a factor of 2.2, while still providing solutions within 3% of the optimal cost on average.
arXiv Detail & Related papers (2026-02-27T08:34:56Z) - Which Algorithms Can Graph Neural Networks Learn? [14.935565203071528]
We propose a general theoretical framework that characterizes the sufficient conditions under which MPNNs can learn an algorithm. Our framework applies to a broad class of algorithms, including single-source shortest paths, minimum spanning trees, and general dynamic programming problems.
arXiv Detail & Related papers (2026-02-13T17:09:50Z) - RL for Reasoning by Adaptively Revealing Rationales [36.50924054394857]
Supervised fine-tuning (SFT) relies on dense ground-truth labels, which become increasingly costly as sequence length grows. We address this with adaptive backtracking (AdaBack), a per-sample curriculum learning algorithm that reveals only a partial prefix of the target output during training. We show that our adaptive curriculum over partial answers reliably solves problems that are otherwise intractable.
arXiv Detail & Related papers (2025-06-22T17:46:14Z) - Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits [58.63897489864948]
Reinforcement learning with outcome-based feedback faces a fundamental challenge: how do we assign credit to the right actions? This paper provides the first comprehensive analysis of this problem in online RL with general function approximation.
arXiv Detail & Related papers (2025-05-26T17:44:08Z) - Minimum Description Length and Generalization Guarantees for
Representation Learning [16.2444595840653]
This paper presents a framework that allows us to derive upper bounds on the generalization error of a representation learning algorithm.
Rather than the mutual information between the encoder's input and the representation, our new bounds involve the "multi-letter" relative entropy.
To the best knowledge of the authors, the established generalization bounds are the first of their kind for Information Bottleneck (IB) type encoders and representation learning.
arXiv Detail & Related papers (2024-02-05T18:12:28Z) - Learning To Dive In Branch And Bound [95.13209326119153]
We propose L2Dive to learn specific diving heuristics with graph neural networks.
We train generative models to predict variable assignments and leverage the duality of linear programs to make diving decisions.
arXiv Detail & Related papers (2023-01-24T12:01:45Z) - TransPath: Learning Heuristics For Grid-Based Pathfinding via
Transformers [64.88759709443819]
We suggest learning the instance-dependent proxies that are supposed to notably increase the efficiency of the search.
The first proxy we suggest to learn is the correction factor, i.e. the ratio between the instance-independent cost-to-go estimate and the perfect one.
The second proxy is the path probability, which indicates how likely a grid cell is to lie on the shortest path.
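As a rough illustration of how the first proxy could be applied at query time (the interface below is an assumption, not TransPath's actual API): since the correction factor is the ratio of the instance-independent estimate to the perfect cost-to-go, the improved estimate is recovered by dividing the base heuristic by the predicted factor.

```python
# Illustrative sketch only: applying a predicted correction factor to an
# instance-independent grid heuristic (e.g. octile distance). Names are assumptions.
def corrected_heuristic(h_base: float, predicted_cf: float, eps: float = 1e-6) -> float:
    """cf ~ h_base / h_perfect, so h_perfect is approximated as h_base / cf."""
    return h_base / max(predicted_cf, eps)
```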
arXiv Detail & Related papers (2022-12-22T14:26:11Z) - Learning Non-Vacuous Generalization Bounds from Optimization [8.294831479902658]
We present a simple yet non-vacuous generalization bound from the optimization perspective.
We achieve this goal by leveraging that the hypothesis set accessed by gradient algorithms is essentially fractal-like.
Numerical studies demonstrate that our approach is able to yield plausible generalization guarantees for modern neural networks.
arXiv Detail & Related papers (2022-06-09T08:59:46Z) - Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z) - A Simple Approach to Improve Single-Model Deep Uncertainty via
Distance-Awareness [33.09831377640498]
We study approaches to improving the uncertainty properties of a single network, based on a single, deterministic representation.
We propose Spectral-normalized Neural Gaussian Process (SNGP), a simple method that improves the distance-awareness ability of modern DNNs.
On a suite of vision and language understanding benchmarks, SNGP outperforms other single-model approaches in prediction, calibration and out-of-domain detection.
arXiv Detail & Related papers (2022-05-01T05:46:13Z) - Generalization Through The Lens Of Leave-One-Out Error [22.188535244056016]
We show that the leave-one-out error provides a tractable way to estimate the generalization ability of deep neural networks in the kernel regime.
arXiv Detail & Related papers (2022-03-07T14:56:00Z) - Path Regularization: A Convexity and Sparsity Inducing Regularization
for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape.
We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
arXiv Detail & Related papers (2021-10-18T18:00:36Z) - Polynomial-Time Exact MAP Inference on Discrete Models with Global
Dependencies [83.05591911173332]
The junction tree algorithm is the most general solution for exact MAP inference with run-time guarantees.
We propose a new graph transformation technique via node cloning which ensures a polynomial run-time for solving our target problem independently of the form of a corresponding clique tree.
arXiv Detail & Related papers (2019-12-27T13:30:29Z)