Scaling Multi-Agent Epistemic Planning through GNN-Derived Heuristics
- URL: http://arxiv.org/abs/2508.12840v2
- Date: Tue, 14 Oct 2025 11:07:20 GMT
- Title: Scaling Multi-Agent Epistemic Planning through GNN-Derived Heuristics
- Authors: Giovanni Briglia, Francesco Fabiano, Stefano Mariani,
- Abstract summary: Multi-agent Epistemic Planning (MEP) is an autonomous planning framework for reasoning about both the physical world and the beliefs of agents. MEP requires states to be represented as Kripke structures, i.e., directed labeled graphs. We exploit Graph Neural Networks (GNNs) to learn patterns and relational structures within epistemic states to guide the planning process.
- Score: 0.9786469751894747
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-agent Epistemic Planning (MEP) is an autonomous planning framework for reasoning about both the physical world and the beliefs of agents, with applications in domains where information flow and awareness among agents are critical. The richness of MEP requires states to be represented as Kripke structures, i.e., directed labeled graphs. This representation limits the applicability of existing heuristics, hindering the scalability of epistemic solvers, which must explore an exponential search space without guidance, often resulting in intractability. To address this, we exploit Graph Neural Networks (GNNs) to learn patterns and relational structures within epistemic states to guide the planning process. GNNs, which naturally capture the graph-like nature of Kripke models, allow us to derive meaningful estimates of state quality -- e.g., the distance from the nearest goal -- by generalizing knowledge obtained from previously solved planning instances. We integrate these predictive heuristics into an epistemic planning pipeline and evaluate them against standard baselines, showing improvements in the scalability of multi-agent epistemic planning.
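As a rough illustration of the idea in the abstract, the sketch below encodes an epistemic state as a directed labeled graph and runs a greedy best-first search guided by a pluggable heuristic. Everything named here (`KripkeState`, `announce`, `edge_count_heuristic`, the toy fluents `p` and `q`, the agent `a`) is a hypothetical stand-in and not taken from the paper. In particular, `edge_count_heuristic` is a hand-written placeholder where a trained GNN's distance-to-goal estimate would be plugged in.

```python
import heapq
from dataclasses import dataclass
from itertools import count
from typing import Callable, FrozenSet, Iterator, List, Optional, Tuple

World = Tuple[str, FrozenSet[str]]  # (world id, fluents true in that world)
Edge = Tuple[str, str, str]         # (source world, agent label, target world)


@dataclass(frozen=True)
class KripkeState:
    """Epistemic state as a directed labeled graph (hypothetical encoding)."""
    worlds: FrozenSet[World]
    edges: FrozenSet[Edge]
    pointed: str  # id of the designated (actual) world


def knows(state: KripkeState, agent: str, fluent: str) -> bool:
    """Agent knows fluent if it holds in every world the agent reaches from the pointed world."""
    valuation = dict(state.worlds)
    return all(fluent in valuation[t]
               for s, a, t in state.edges
               if s == state.pointed and a == agent)


def announce(state: KripkeState, fluent: str) -> Optional[KripkeState]:
    """Toy public announcement: keep only worlds where fluent holds.
    Returns None when fluent is false in the pointed world (action not applicable)."""
    kept = frozenset(w for w in state.worlds if fluent in w[1])
    ids = {w[0] for w in kept}
    if state.pointed not in ids:
        return None
    edges = frozenset(e for e in state.edges if e[0] in ids and e[2] in ids)
    return KripkeState(kept, edges, state.pointed)


def successors(state: KripkeState) -> Iterator[Tuple[str, KripkeState]]:
    """Enumerate applicable toy actions and the states they lead to."""
    for fluent in ("p", "q"):
        nxt = announce(state, fluent)
        if nxt is not None:
            yield f"announce_{fluent}", nxt


def edge_count_heuristic(state: KripkeState) -> float:
    """Placeholder score (fewer edges = less uncertainty); NOT the paper's GNN."""
    return float(len(state.edges))


def greedy_best_first(start: KripkeState,
                      is_goal: Callable[[KripkeState], bool],
                      expand: Callable[[KripkeState], Iterator[Tuple[str, KripkeState]]],
                      heuristic: Callable[[KripkeState], float],
                      max_expansions: int = 10_000) -> Optional[List[str]]:
    """Always expand the frontier state with the lowest heuristic value."""
    tie = count()  # tie-breaker so the heap never compares states directly
    frontier = [(heuristic(start), next(tie), start, [])]
    seen = {start}
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan
        expansions += 1
        for action, nxt in expand(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), next(tie), nxt, plan + [action]))
    return None


# Toy instance: agent "a" cannot tell w1 (p and q hold) from w2 (only q holds).
w1 = ("w1", frozenset({"p", "q"}))
w2 = ("w2", frozenset({"q"}))
start = KripkeState(
    worlds=frozenset({w1, w2}),
    edges=frozenset((s, "a", t) for s in ("w1", "w2") for t in ("w1", "w2")),
    pointed="w1",
)
plan = greedy_best_first(start, lambda s: knows(s, "a", "p"), successors, edge_count_heuristic)
print(plan)
```

The search routine only depends on the `heuristic` callable, so a learned model could replace the placeholder without touching the planner, assuming it maps a state graph to a single numeric score.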
Related papers
- Diagnosing Generalization Failures from Representational Geometry Markers [8.403001493770427]
We study generalization failures inspired by medical biomarkers.
We design and test network markers to probe structure, function links, identify prognostic indicators, and validate predictions in real-world settings.
This work demonstrates that representational geometry can expose hidden vulnerabilities, offering more robust guidance for model selection and AI interpretability.
arXiv Detail & Related papers (2026-03-02T13:59:19Z)
- TodoEvolve: Learning to Architect Agent Planning Systems [68.48983335970901]
TodoEvolve is a meta-planning paradigm that autonomously synthesizes and dynamically revises task-specific planning.
PlanFactory provides a common interface for heterogeneous planning patterns.
TodoEvolve consistently surpasses carefully engineered planning modules while maintaining economical API costs and runtime overhead.
arXiv Detail & Related papers (2026-02-08T06:37:01Z)
- Towards agent-based-model informed neural networks [0.5787117733071417]
We present a framework for designing neural networks consistent with the underlying principles of agent-based models.
We validate the framework across three case studies of increasing complexity.
arXiv Detail & Related papers (2025-12-05T14:50:50Z)
- A Pre-training Framework for Relational Data with Information-theoretic Principles [57.93973948947743]
We introduce Task Vector Estimation (TVE), a novel pre-training framework that constructs supervisory signals via set-based aggregation over relational graphs.
TVE consistently outperforms traditional pre-training baselines.
Our findings advocate for pre-training objectives that encode task heterogeneity and temporal structure as design principles for predictive modeling on relational databases.
arXiv Detail & Related papers (2025-07-14T00:17:21Z)
- Dynamic Graph Structure Estimation for Learning Multivariate Point Process using Spiking Neural Networks [14.77536193242342]
Spiking Dynamic Graph Network is a novel framework that leverages the temporal processing capabilities of spiking neural networks (SNNs) and spike-timing-dependent plasticity (STDP).
It adapts to any dataset by learning dynamic-temporal dependencies directly from event data, enhancing generalizability and modeling.
Our evaluations, conducted on both synthetic and real-world datasets including NYC Taxi, 911, Reddit, and Stack Overflow, demonstrate superior accuracy while maintaining computational efficiency.
arXiv Detail & Related papers (2025-04-01T23:23:10Z)
- Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics within lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z)
- A Planning Ontology to Represent and Exploit Planning Knowledge for Performance Efficiency [6.87593454486392]
We consider the problem of automated planning, where the objective is to find a sequence of actions that will move an agent from an initial state of the world to a desired goal state.
We hypothesize that, given a large number of available planners and diverse planning domains, they carry essential information that can be leveraged to identify suitable planners and improve their performance for a domain.
arXiv Detail & Related papers (2023-07-25T14:51:07Z)
- A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning [104.3643447579578]
We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state.
The design allows agents to learn to plan effectively, by attending to the relevant objects, leading to better out-of-distribution generalization.
arXiv Detail & Related papers (2021-06-03T19:35:19Z)
- Cloth Manipulation Planning on Basis of Mesh Representations with Incomplete Domain Knowledge and Voxel-to-Mesh Estimation [0.0]
We consider the problem of open-goal planning for robotic cloth manipulation.
The core of our system is a neural network trained as a forward model of cloth behaviour under manipulation.
We introduce a neural network-based routine for estimating mesh representations from voxel input, and perform planning in mesh format internally.
arXiv Detail & Related papers (2021-03-15T04:59:14Z)
- Forethought and Hindsight in Credit Assignment [62.05690959741223]
We work to understand the gains and peculiarities of planning employed as forethought via forward models or as hindsight operating with backward models.
We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated.
arXiv Detail & Related papers (2020-10-26T16:00:47Z)
- The efficacy of Neural Planning Metrics: A meta-analysis of PKL on nuScenes [77.83263286776938]
A high-performing object detection system plays a crucial role in autonomous driving (AD).
The performance, typically evaluated in terms of mean Average Precision, does not take into account orientation and distance of the actors in the scene.
Recently, Philion et al. proposed a neural planning metric (PKL), based on the KL divergence of a planner's trajectory and the groundtruth route.
arXiv Detail & Related papers (2020-10-19T09:32:48Z)
- Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors [124.30562402952319]
The ability to predict and plan into the future is fundamental for agents acting in the world.
Current learning approaches for visual prediction and planning fail on long-horizon tasks.
We propose a framework for visual prediction and planning that is able to overcome both of these limitations.
arXiv Detail & Related papers (2020-06-23T17:58:56Z)
- Goal-Directed Planning for Habituated Agents by Active Inference Using a Variational Recurrent Neural Network [5.000272778136268]
This study shows that the predictive coding (PC) and active inference (AIF) frameworks can develop better generalization by learning a prior distribution in a low dimensional latent state space.
In our proposed model, learning is carried out by inferring optimal latent variables as well as synaptic weights for maximizing the evidence lower bound.
Our proposed model was evaluated with both simple and complex robotic tasks in simulation, which demonstrated sufficient generalization in learning with limited training data.
arXiv Detail & Related papers (2020-05-27T06:43:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.