Increasing the Value of Information During Planning in Uncertain Environments
- URL: http://arxiv.org/abs/2409.13754v1
- Date: Sat, 14 Sep 2024 22:04:34 GMT
- Title: Increasing the Value of Information During Planning in Uncertain Environments
- Authors: Gaurab Pokharel,
- Abstract summary: We develop a new algorithm that improves upon state-of-the-art online planning by better reflecting on the value of actions that gather information.
Results indicate that our new algorithm performs significantly better than POMCP.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prior studies have demonstrated that for many real-world problems, POMDPs can be solved through online algorithms both quickly and with near optimality. However, on an important set of problems where there is a large time delay between when the agent can gather information and when it needs to use that information, these solutions fail to adequately consider the value of information. As a result, information gathering actions, even when they are critical in the optimal policy, will be ignored by existing solutions, leading to sub-optimal decisions by the agent. In this research, we develop a novel solution that rectifies this problem by introducing a new algorithm that improves upon state-of-the-art online planning by better reflecting on the value of actions that gather information. We do this by adding Entropy to the UCB1 heuristic in the POMCP algorithm. We test this solution on the hallway problem. Results indicate that our new algorithm performs significantly better than POMCP.
Related papers
- Age-Based Scheduling for Mobile Edge Computing: A Deep Reinforcement
Learning Approach [58.911515417156174]
We propose a new definition of Age of Information (AoI) and, based on the redefined AoI, we formulate an online AoI problem for MEC systems.
We introduce Post-Decision States (PDSs) to exploit the partial knowledge of the system's dynamics.
We also combine PDSs with deep RL to further improve the algorithm's applicability, scalability, and robustness.
arXiv Detail & Related papers (2023-12-01T01:30:49Z) - An Improved Artificial Fish Swarm Algorithm for Solving the Problem of
Investigation Path Planning [8.725702964289479]
We propose a chaotic artificial fish swarm algorithm based on multiple population differential evolution (DE-CAFSA)
We incorporate adaptive field of view and step size adjustments, replace random behavior with the 2-opt operation, and introduce chaos theory and sub-optimal solutions.
Experimental results demonstrate that DE-CAFSA outperforms other algorithms on various public datasets of different sizes.
arXiv Detail & Related papers (2023-10-20T09:35:51Z) - The Update-Equivalence Framework for Decision-Time Planning [78.44953498421854]
We introduce an alternative framework for decision-time planning that is not based on solving subgames, but rather on update equivalence.
We derive a provably sound search algorithm for fully cooperative games based on mirror descent and a search algorithm for adversarial games based on magnetic mirror descent.
arXiv Detail & Related papers (2023-04-25T20:28:55Z) - An Efficient Approach to the Online Multi-Agent Path Finding Problem by
Using Sustainable Information [10.367412630626834]
Multi-agent path finding (MAPF) is the problem of moving agents to the goal without collision.
We propose a three-level approach to solve online MAPF utilizing sustainable information.
Our algorithm can be 1.48 times faster than SOTA on average under different agent number settings.
arXiv Detail & Related papers (2023-01-11T13:04:35Z) - Sequential Bayesian Optimization for Adaptive Informative Path Planning
with Multimodal Sensing [34.86734745942814]
We consider the problem of an agent equipped with multiple sensors, each with different sensing accuracy and energy costs.
The agent's goal is to explore the environment and gather information subject to its resource constraints in unknown, partially observable environments.
We formulate the AIPPMS problem as a belief Markov decision process with Gaussian process beliefs and solve it using a sequential Bayesian optimization approach with online planning.
arXiv Detail & Related papers (2022-09-16T00:50:36Z) - Evolutionary Optimization for Proactive and Dynamic Computing Resource
Allocation in Open Radio Access Network [4.9711284100869815]
Intelligent techniques are urged to achieve automatic allocation of the computing resource in Open Radio Access Network (O-RAN)
Existing problem formulation to solve this resource allocation problem is unsuitable as it defines the capacity utility of resource in an inappropriate way.
New formulation that better describes the problem is proposed.
arXiv Detail & Related papers (2022-01-12T08:52:04Z) - Policy Information Capacity: Information-Theoretic Measure for Task
Complexity in Deep Reinforcement Learning [83.66080019570461]
We propose two environment-agnostic, algorithm-agnostic quantitative metrics for task difficulty.
We show that these metrics have higher correlations with normalized task solvability scores than a variety of alternatives.
These metrics can also be used for fast and compute-efficient optimizations of key design parameters.
arXiv Detail & Related papers (2021-03-23T17:49:50Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - POMP: Pomcp-based Online Motion Planning for active visual search in
indoor environments [89.43830036483901]
We focus on the problem of learning an optimal policy for Active Visual Search (AVS) of objects in known indoor environments with an online setup.
Our POMP method uses as input the current pose of an agent and a RGB-D frame.
We validate our method on the publicly available AVD benchmark, achieving an average success rate of 0.76 with an average path length of 17.1.
arXiv Detail & Related papers (2020-09-17T08:23:50Z) - Learning What to Defer for Maximum Independent Sets [84.00112106334655]
We propose a novel DRL scheme, coined learning what to defer (LwD), where the agent adaptively shrinks or stretch the number of stages by learning to distribute the element-wise decisions of the solution at each stage.
We apply the proposed framework to the maximum independent set (MIS) problem, and demonstrate its significant improvement over the current state-of-the-art DRL scheme.
arXiv Detail & Related papers (2020-06-17T02:19:31Z) - sKPNSGA-II: Knee point based MOEA with self-adaptive angle for Mission
Planning Problems [2.191505742658975]
Some problems have many objectives which lead to a large number of non-dominated solutions.
This paper presents a new algorithm that has been designed to obtain the most significant solutions.
This new algorithm has been applied to the real world application in Unmanned Air Vehicle (UAV) Mission Planning Problem.
arXiv Detail & Related papers (2020-02-20T17:07:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.