Related papers: Adaptive Sampling using POMDPs with Domain-Specific Considerations

URL: http://arxiv.org/abs/2109.11595v1
Date: Thu, 23 Sep 2021 19:00:02 GMT
Title: Adaptive Sampling using POMDPs with Domain-Specific Considerations
Authors: Gautam Salhotra, Christopher E. Denniston, David A. Caron, Gaurav S. Sukhatme
Abstract summary: We investigate improving Monte Carlo Tree Search based solvers for adaptive sampling problems. We propose improvements in rollout allocation, the action exploration algorithm, and plan commitment. We show that it is possible to greatly reduce the number of rollouts by increasing the number of actions taken from a single planning tree.
Score: 9.670635276589248
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We investigate improving Monte Carlo Tree Search based solvers for Partially Observable Markov Decision Processes (POMDPs), when applied to adaptive sampling problems. We propose improvements in rollout allocation, the action exploration algorithm, and plan commitment. The first allocates a different number of rollouts depending on how many actions the agent has taken in an episode. We find that rollouts are more valuable after some initial information is gained about the environment. Thus, a linear increase in the number of rollouts, i.e. allocating a fixed number at each step, is not appropriate for adaptive sampling tasks. The second alters which actions the agent chooses to explore when building the planning tree. We find that by using knowledge of the number of rollouts allocated, the agent can more effectively choose actions to explore. The third improvement is in determining how many actions the agent should take from one plan. Typically, an agent will plan to take the first action from the planning tree and then call the planner again from the new state. Using statistical techniques, we show that it is possible to greatly reduce the number of rollouts by increasing the number of actions taken from a single planning tree without affecting the agent's final reward. Finally, we demonstrate experimentally, on simulated and real aquatic data from an underwater robot, that these improvements can be combined, leading to better adaptive sampling. The code for this work is available at https://github.com/uscresl/AdaptiveSamplingPOMCP
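The rollout-allocation idea in the abstract (spend fewer rollouts early in an episode and more once some information has been gathered, rather than a fixed number at every step) can be illustrated with a minimal sketch. This is not the authors' code or their schedule: the function name `allocate_rollouts`, the `skew` parameter, and the power-law weighting are all assumptions made for illustration. The actual implementation is in the repository linked above.

```python
from typing import List


def allocate_rollouts(total_budget: int, num_steps: int, skew: float = 1.0) -> List[int]:
    """Split a fixed rollout budget across the planning steps of an episode.

    skew = 0 reproduces the uniform baseline (same number at every step).
    skew > 0 shifts rollouts toward later steps. The power-law weighting is
    a hypothetical choice for illustration, not the schedule from the paper.
    """
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    if total_budget < 0:
        raise ValueError("total_budget must be non-negative")

    # Weight of step t grows with t when skew > 0.
    weights = [(t + 1) ** skew for t in range(num_steps)]
    total_weight = sum(weights)

    # Round each share down, then hand the leftover rollouts to the latest steps.
    alloc = [int(total_budget * w / total_weight) for w in weights]
    remainder = total_budget - sum(alloc)
    for i in range(remainder):
        alloc[num_steps - 1 - (i % num_steps)] += 1
    return alloc
```

For example, `allocate_rollouts(100, 4, skew=1.0)` returns `[10, 20, 30, 40]`, while `skew=0.0` gives the fixed allocation `[25, 25, 25, 25]`. The total budget is the same in both cases, which makes the two schedules directly comparable.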

Related papers

Value Gradients with Action Adaptive Search Trees in Continuous (PO)MDPs [7.170248667518935]
Solving POMDPs in continuous state, action and observation spaces is key for autonomous planning in real-world mobility and robotics applications. We formulate a novel Multiple Importance Sampling tree for value estimation that allows value information to be shared between sibling action branches. We also propose a novel methodology to compute value gradients with online sampling based on transition likelihoods.
arXiv Detail & Related papers (2025-03-15T15:51:06Z)
Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach [51.76826149868971]
Policy evaluation via Monte Carlo simulation is at the core of many MC Reinforcement Learning (RL) algorithms. We propose as a quality index a surrogate of the mean squared error of a return estimator that uses trajectories of different lengths. We present an adaptive algorithm called Robust and Iterative Data collection strategy Optimization (RIDO).
arXiv Detail & Related papers (2024-10-17T11:47:56Z)
Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations. We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z)
Octo-planner: On-device Language Model for Planner-Action Agents [19.627197141903505]
The Planner-Action framework separates planning and action execution into two distinct components. The planner agent first responds to user queries by decomposing tasks into a sequence of sub-steps, which are then executed by the action agent. We employ model fine-tuning instead of in-context learning, reducing computational costs and energy consumption.
arXiv Detail & Related papers (2024-06-26T05:40:10Z)
Experiment Planning with Function Approximation [49.50254688629728]
We study the problem of experiment planning with function approximation in contextual bandit problems. We propose two experiment planning strategies compatible with function approximation. We show that a uniform sampler achieves competitive optimality rates in the setting where the number of actions is small.
arXiv Detail & Related papers (2024-01-10T14:40:23Z)
Tree-Planner: Efficient Close-loop Task Planning with Large Language Models [63.06270302774049]
Tree-Planner reframes task planning with Large Language Models into three distinct phases. Tree-Planner achieves state-of-the-art performance while maintaining high efficiency.
arXiv Detail & Related papers (2023-10-12T17:59:50Z)
Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment. Standard sampling-based algorithms (SBAs) can be used to search for solutions in the robots' joint space. We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods. We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
arXiv Detail & Related papers (2023-04-01T15:50:18Z)
An Efficient Approach to the Online Multi-Agent Path Finding Problem by Using Sustainable Information [10.367412630626834]
Multi-agent path finding (MAPF) is the problem of moving agents to their goals without collisions. We propose a three-level approach to solve online MAPF utilizing sustainable information. Our algorithm can be 1.48 times faster than SOTA on average across different numbers of agents.
arXiv Detail & Related papers (2023-01-11T13:04:35Z)
DGSAC: Density Guided Sampling and Consensus [4.808421423598809]
Kernel Residual Density is a key differentiator between inliers and outliers. We propose two model selection algorithms: one based on an optimal quadratic program and one greedy. We evaluate our method on a wide variety of tasks, including planar segmentation, motion segmentation, vanishing point estimation, plane fitting to 3D point clouds, and line and circle fitting.
arXiv Detail & Related papers (2020-06-03T17:42:53Z)
Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination. We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
