Related papers: Approximate Dec-POMDP Solving Using Multi-Agent A*

Approximate Dec-POMDP Solving Using Multi-Agent A*

URL: http://arxiv.org/abs/2405.05662v1
Date: Thu, 9 May 2024 10:33:07 GMT
Title: Approximate Dec-POMDP Solving Using Multi-Agent A*
Authors: Wietze Koops, Sebastian Junges, Nils Jansen,
Abstract summary: We present an A*-based algorithm to compute policies for finite-horizon Dec-POMDPs. Our goal is to sacrifice optimality in favor of scalability for larger horizons.
Score: 8.728372851272727
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present an A*-based algorithm to compute policies for finite-horizon Dec-POMDPs. Our goal is to sacrifice optimality in favor of scalability for larger horizons. The main ingredients of our approach are (1) using clustered sliding window memory, (2) pruning the A* search tree, and (3) using novel A* heuristics. Our experiments show competitive performance to the state-of-the-art. Moreover, for multiple benchmarks, we achieve superior performance. In addition, we provide an A* algorithm that finds upper bounds for the optimum, tailored towards problems with long horizons. The main ingredient is a new heuristic that periodically reveals the state, thereby limiting the number of reachable beliefs. Our experiments demonstrate the efficacy and scalability of the approach.

Related papers

Search Strategy Generation for Branch and Bound Using Genetic Programming [0.0]
We introduce GP2S (Genetic Programming for Search Strategy), a novel machine learning approach that automatically generates a B&B search strategy. We compare our approach with the standard method of the SCIP solver, a recent graph neural network-based method, and handcrafteds. Our method is at most 2% slower than the best baseline and consistently outperforms SCIP, achieving an average speedup of 11.3%.
arXiv Detail & Related papers (2024-12-12T16:57:46Z)
Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives [16.101435842520473]
This paper studies the challenging yet important problem in POMDPs known as the (indefinite-horizon) Maximal Reachability Probability Problem. Inspired by the success of point-based methods developed for discounted problems, we study their extensions to MRPP. We present a novel algorithm that leverages the strengths of these techniques for efficient exploration of the belief space.
arXiv Detail & Related papers (2024-06-05T02:33:50Z)
Branches: Efficiently Seeking Optimal Sparse Decision Trees with AO* [12.403737756721467]
Decision Tree (DT) Learning is a fundamental problem in Interpretable Machine Learning, yet it poses a formidable challenge. We formulate the problem as an AND/OR graph search which we solve with a novel AO*-type algorithm called Branches. We prove both optimality and complexity guarantees for Branches and we show that it is more efficient than the state of the art theoretically and on a variety of experiments.
arXiv Detail & Related papers (2024-06-04T10:11:46Z)
Runtime Analysis of a Multi-Valued Compact Genetic Algorithm on Generalized OneMax [2.07180164747172]
We provide a first runtime analysis of a generalized OneMax function. We show that the r-cGA solves this r-valued OneMax problem efficiently. At the end of experiments, we state one conjecture related to the expected runtime of another variant of multi-valued OneMax function.
arXiv Detail & Related papers (2024-04-17T10:40:12Z)
Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning [39.40838358438744]
Linear Programs (ILPs) are powerful tools for modeling and solving a large number of optimization problems. Large Neighborhood Search (LNS), as a algorithm, can find high quality solutions to ILPs faster than Branch and Bound. We propose a novel approach, CL-LNS, that delivers state-of-the-art anytime performance on several ILP benchmarks measured by metrics.
arXiv Detail & Related papers (2023-02-03T07:15:37Z)
Differentially-Private Hierarchical Clustering with Provable Approximation Guarantees [79.59010418610625]
We study differentially private approximation algorithms for hierarchical clustering. We show strong lower bounds for the problem: that any $epsilon$-DP algorithm must exhibit $O(|V|2/ epsilon)$-additive error for an input dataset. We propose a private $1+o(1)$ approximation algorithm which also recovers the blocks exactly.
arXiv Detail & Related papers (2023-01-31T19:14:30Z)
Coordinate Descent for SLOPE [6.838073951329198]
Sorted L-One Penalized Estimation (SLOPE) is a generalization of the lasso with appealing statistical properties. Current software packages that fit SLOPE rely on algorithms that perform poorly in high dimensions. We propose a new fast algorithm to solve the SLOPE optimization problem, which combines proximal gradient descent and proximal coordinate descent steps.
arXiv Detail & Related papers (2022-10-26T15:20:30Z)
Planning and Learning with Adaptive Lookahead [74.39132848733847]
Policy Iteration (PI) algorithm alternates between greedy one-step policy improvement and policy evaluation. Recent literature shows that multi-step lookahead policy improvement leads to a better convergence rate at the expense of increased complexity per iteration. We propose for the first time to dynamically adapt the multi-step lookahead horizon as a function of the state and of the value estimate.
arXiv Detail & Related papers (2022-01-28T20:26:55Z)
Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network. We design two optimal algorithms that attain these lower bounds. We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z)
Exact and Approximate Hierarchical Clustering Using A* [51.187990314731344]
We introduce a new approach based on A* search for clustering. We overcome the prohibitively large search space by combining A* with a novel emphtrellis data structure. We empirically demonstrate that our method achieves substantially higher quality results than baselines for a particle physics use case and other clustering benchmarks.
arXiv Detail & Related papers (2021-04-14T18:15:27Z)
An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions. We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
Efficient Computation of Expectations under Spanning Tree Distributions [67.71280539312536]
We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models. Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms.
arXiv Detail & Related papers (2020-08-29T14:58:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.