Adaptive Sampling using POMDPs with Domain-Specific Considerations
- URL: http://arxiv.org/abs/2109.11595v1
- Date: Thu, 23 Sep 2021 19:00:02 GMT
- Title: Adaptive Sampling using POMDPs with Domain-Specific Considerations
- Authors: Gautam Salhotra, Christopher E. Denniston, David A. Caron, Gaurav S. Sukhatme
- Abstract summary: We investigate improving Monte Carlo Tree Search based solvers for adaptive sampling problems.
We propose improvements in rollout allocation, the action exploration algorithm, and plan commitment.
We show that it is possible to greatly reduce the number of rollouts by increasing the number of actions taken from a single planning tree.
- Score: 9.670635276589248
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate improving Monte Carlo Tree Search based solvers for Partially
Observable Markov Decision Processes (POMDPs), when applied to adaptive
sampling problems. We propose improvements in rollout allocation, the action
exploration algorithm, and plan commitment. The first allocates a different
number of rollouts depending on how many actions the agent has taken in an
episode. We find that rollouts are more valuable after some initial information
is gained about the environment. Thus, a linear increase in the number of
rollouts, i.e. allocating a fixed number at each step, is not appropriate for
adaptive sampling tasks. The second alters which actions the agent chooses to
explore when building the planning tree. We find that by using knowledge of the
number of rollouts allocated, the agent can more effectively choose actions to
explore. The third improvement is in determining how many actions the agent
should take from one plan. Typically, an agent will plan to take the first
action from the planning tree and then call the planner again from the new
state. Using statistical techniques, we show that it is possible to greatly
reduce the number of rollouts by increasing the number of actions taken from a
single planning tree without affecting the agent's final reward. Finally, we
demonstrate experimentally, on simulated and real aquatic data from an
underwater robot, that these improvements can be combined, leading to better
adaptive sampling. The code for this work is available at
https://github.com/uscresl/AdaptiveSamplingPOMCP
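As a rough illustration of the first and third ideas in the abstract (non-uniform rollout allocation and committing to more than one action per plan), the following Python sketch may help. It is not the authors' implementation. The abstract does not specify the allocation schedule or the statistical technique, so the per-step weights, the use of Welch's t statistic, the threshold value, and the names `allocate_rollouts`, `welch_t`, and `commit_length` are all assumptions made for this example.

```python
import math
from statistics import mean, variance


def allocate_rollouts(total_budget, weights):
    """Split a rollout budget across planning steps in proportion to weights.

    Equal weights give the fixed-per-step baseline; larger weights on later
    steps shift rollouts to after some information has been gathered.
    The weights themselves are a hypothetical input, not the paper's schedule.
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    raw = [total_budget * w / total_weight for w in weights]
    alloc = [math.floor(r) for r in raw]
    # Hand out rollouts lost to rounding, largest fractional part first.
    leftover = total_budget - sum(alloc)
    order = sorted(range(len(weights)), key=lambda i: raw[i] - alloc[i], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc


def welch_t(a, b):
    """Welch's t statistic for mean(a) - mean(b); each sample needs >= 2 values."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return math.copysign(math.inf, diff) if diff != 0 else 0.0
    return diff / se


def commit_length(levels, t_threshold=2.0):
    """Number of actions to execute from one planning tree.

    levels[d] holds (best_returns, runner_up_returns): rollout returns for the
    best and second-best action at depth d along the best path. The first
    action is always taken; deeper actions are taken only while the best
    action is separated from the runner-up by at least t_threshold
    (an assumed stopping rule).
    """
    committed = 1
    for best, runner_up in levels[1:]:
        if len(best) < 2 or len(runner_up) < 2:
            break  # too few rollouts to compare the two actions
        if welch_t(best, runner_up) < t_threshold:
            break  # the choice is not clear enough; replan from here
        committed += 1
    return committed


if __name__ == "__main__":
    # Baseline: the same number of rollouts at every step.
    print(allocate_rollouts(100, [1, 1, 1, 1]))
    # Assumed schedule that favours later steps.
    print(allocate_rollouts(100, [1, 2, 3, 4]))
    levels = [
        ([1.0, 1.1, 0.9], [1.0, 1.0, 1.1]),
        ([5.0, 5.1, 4.9], [1.0, 1.1, 0.9]),
        ([2.0, 2.1, 1.9], [2.0, 2.05, 1.95]),
    ]
    print(commit_length(levels))
```

In this sketch, every extra committed action saves one full call to the planner, which is where the reduction in rollouts would come from. The threshold trades that saving against the risk of following a stale plan.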
Related papers
- Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach [51.76826149868971]
Policy evaluation via Monte Carlo simulation is at the core of many MC Reinforcement Learning (RL) algorithms.
We propose as a quality index a surrogate of the mean squared error of a return estimator that uses trajectories of different lengths.
We present an adaptive algorithm called Robust and Iterative Data collection strategy Optimization (RIDO)
arXiv Detail & Related papers (2024-10-17T11:47:56Z)
- Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations.
We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z)
- Octo-planner: On-device Language Model for Planner-Action Agents [19.627197141903505]
The Planner-Action framework separates planning and action execution into two distinct components.
The planner agent first responds to user queries by decomposing tasks into a sequence of sub-steps, which are then executed by the action agent.
We employ model fine-tuning instead of in-context learning, reducing computational costs and energy consumption.
arXiv Detail & Related papers (2024-06-26T05:40:10Z)
- Experiment Planning with Function Approximation [49.50254688629728]
We study the problem of experiment planning with function approximation in contextual bandit problems.
We propose two experiment planning strategies compatible with function approximation.
We show that a uniform sampler achieves competitive optimality rates in the setting where the number of actions is small.
arXiv Detail & Related papers (2024-01-10T14:40:23Z)
- Tree-Planner: Efficient Close-loop Task Planning with Large Language Models [63.06270302774049]
Tree-Planner reframes task planning with Large Language Models into three distinct phases.
Tree-Planner achieves state-of-the-art performance while maintaining high efficiency.
arXiv Detail & Related papers (2023-10-12T17:59:50Z)
- Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment.
Standard sampling-based algorithms can be used to search for solutions in the robots' joint space.
We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods.
We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
arXiv Detail & Related papers (2023-04-01T15:50:18Z)
- An Efficient Approach to the Online Multi-Agent Path Finding Problem by Using Sustainable Information [10.367412630626834]
Multi-agent path finding (MAPF) is the problem of moving agents to the goal without collision.
We propose a three-level approach to solve online MAPF utilizing sustainable information.
Our algorithm can be 1.48 times faster than SOTA on average under different agent number settings.
arXiv Detail & Related papers (2023-01-11T13:04:35Z)
- DGSAC: Density Guided Sampling and Consensus [4.808421423598809]
Kernel Residual Density is a key differentiator between inliers and outliers.
We propose two model selection algorithms: one based on an optimal quadratic program and one greedy.
We evaluate our method on a wide variety of tasks like planar segmentation, motion segmentation, vanishing point estimation, plane fitting to 3D point cloud, line, and circle fitting.
arXiv Detail & Related papers (2020-06-03T17:42:53Z)
- Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.