Discover Life Skills for Planning with Bandits via Observing and
Learning How the World Works
- URL: http://arxiv.org/abs/2207.08130v1
- Date: Sun, 17 Jul 2022 10:05:54 GMT
- Title: Discover Life Skills for Planning with Bandits via Observing and
Learning How the World Works
- Authors: Tin Lai
- Abstract summary: We propose a novel approach for planning agents to compose abstract skills via observing and learning from historical interactions with the world.
Our framework operates in a Markov state-space model via a set of actions under unknown pre-conditions.
We show experimentally that this planning approach is highly competitive in high-dimensional state-space domains.
- Score: 3.0839245814393728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel approach for planning agents to compose abstract skills
via observing and learning from historical interactions with the world. Our
framework operates in a Markov state-space model via a set of actions under
unknown pre-conditions. We formulate skills as high-level abstract policies
that propose action plans based on the current state. Each policy learns new
plans by observing the states' transitions while the agent interacts with the
world. Such an approach automatically learns new plans to achieve specific
intended effects, but the success of such plans is often dependent on the
states in which they are applicable. Therefore, we formulate the evaluation of
such plans as infinitely many multi-armed bandit problems, where we balance the
allocation of resources on evaluating the success probability of existing arms
and exploring new options. The result is a planner capable of automatically
learning robust high-level skills under a noisy environment; such skills
implicitly learn the action pre-conditions without explicit knowledge. We show
experimentally that this planning approach is highly competitive in
high-dimensional state-space domains.
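The bandit formulation above can be illustrated with a Thompson-sampling loop, in which each learned plan is an arm with an unknown success probability modeled by a Beta posterior. This is a minimal sketch under our own assumptions: the `PlanBandit` class, its method names, and the demo success rates are illustrative, not the paper's actual implementation.

```python
import random

class PlanBandit:
    """Thompson-sampling sketch: each candidate plan is an arm whose
    unknown success probability is tracked with a Beta posterior."""

    def __init__(self, prior=(1.0, 1.0)):
        self.arms = {}      # plan_id -> [successes, failures]
        self.prior = prior  # Beta(1, 1) prior keeps new arms explorable

    def add_plan(self, plan_id):
        self.arms.setdefault(plan_id, [0, 0])

    def select(self):
        # Sample a success probability for each arm from its posterior
        # and play the arm with the highest sampled value; arms with
        # little data have wide posteriors, so they still get explored.
        best, best_sample = None, -1.0
        for plan_id, (s, f) in self.arms.items():
            sample = random.betavariate(self.prior[0] + s, self.prior[1] + f)
            if sample > best_sample:
                best, best_sample = plan_id, sample
        return best

    def update(self, plan_id, succeeded):
        s, f = self.arms[plan_id]
        self.arms[plan_id] = [s + int(succeeded), f + int(not succeeded)]

# Demo: two hypothetical plans with different hidden success rates.
random.seed(0)
bandit = PlanBandit()
bandit.add_plan("plan_a")   # true success rate 0.8
bandit.add_plan("plan_b")   # true success rate 0.3
pulls = {"plan_a": 0, "plan_b": 0}
for _ in range(2000):
    plan = bandit.select()
    pulls[plan] += 1
    true_p = 0.8 if plan == "plan_a" else 0.3
    bandit.update(plan, random.random() < true_p)
```

Over the 2000 rounds the sampler concentrates its pulls on the more reliable plan while still occasionally probing the weaker one, mirroring the paper's trade-off between evaluating existing arms and exploring new options.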
Related papers
- Agent Planning with World Knowledge Model [88.4897773735576]
We introduce parametric World Knowledge Model (WKM) to facilitate agent planning.
We develop WKM, providing prior task knowledge to guide the global planning and dynamic state knowledge.
We analyze to illustrate that our WKM can effectively alleviate the blind trial-and-error and hallucinatory action issues.
arXiv Detail & Related papers (2024-05-23T06:03:19Z)
- LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning [65.86754998249224]
We develop a novel hybrid planner that leverages a conventional rule-based planner in conjunction with an LLM-based planner.
Our approach navigates complex scenarios that existing planners struggle with, and produces well-reasoned outputs while remaining grounded by working alongside the rule-based approach.
arXiv Detail & Related papers (2023-12-30T02:53:45Z)
- Learning adaptive planning representations with natural language guidance [90.24449752926866]
This paper describes Ada, a framework for automatically constructing task-specific planning representations.
Ada interactively learns a library of planner-compatible high-level action abstractions and low-level controllers adapted to a particular domain of planning tasks.
arXiv Detail & Related papers (2023-12-13T23:35:31Z)
- AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z)
- Statler: State-Maintaining Language Models for Embodied Reasoning [19.884696137429813]
We propose Statler, a framework in which large language models are prompted to maintain an estimate of the world state.
Our framework then conditions each action on the estimate of the current world state.
It significantly outperforms strong competing methods on several robot planning tasks.
arXiv Detail & Related papers (2023-06-30T17:58:02Z)
- Integrating Action Knowledge and LLMs for Task Planning and Situation Handling in Open Worlds [10.077350377962482]
This paper introduces a novel framework, called COWP, for open-world task planning and situation handling.
COWP dynamically augments the robot's action knowledge, including the preconditions and effects of actions, with task-oriented commonsense knowledge.
Experimental results show that our approach outperforms competitive baselines from the literature in the success rate of service tasks.
arXiv Detail & Related papers (2023-05-27T22:30:15Z)
- Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning [2.642698101441705]
Problems which require both long-horizon planning and continuous control capabilities pose significant challenges to existing reinforcement learning agents.
We introduce a novel hierarchical reinforcement learning agent which links temporally extended skills for continuous control with a forward model in a symbolic abstraction of the environment's state for planning.
arXiv Detail & Related papers (2022-07-11T17:13:10Z)
- Active Learning of Abstract Plan Feasibility [17.689758291966502]
We present an active learning approach to efficiently acquire an APF predictor through task-independent, curious exploration on a robot.
We leverage an infeasible subsequence property to prune candidate plans in the active learning strategy, allowing our system to learn from less data.
In a stacking domain where objects have non-uniform mass distributions, we show that our system permits real robot learning of an APF model in four hundred self-supervised interactions.
arXiv Detail & Related papers (2021-07-01T18:17:01Z)
- Latent Skill Planning for Exploration and Transfer [49.25525932162891]
In this paper, we investigate how these two approaches can be integrated into a single reinforcement learning agent.
We leverage the idea of partial amortization for fast adaptation at test time.
We demonstrate the benefits of our design decisions across a suite of challenging locomotion tasks.
arXiv Detail & Related papers (2020-11-27T18:40:03Z)
- Robust Hierarchical Planning with Policy Delegation [6.1678491628787455]
We propose a novel framework and algorithm for hierarchical planning based on the principle of delegation.
We show experimentally that this planning approach is highly competitive with classic planning and reinforcement learning techniques on a variety of domains.
arXiv Detail & Related papers (2020-10-25T04:36:20Z)
- STRIPS Action Discovery [67.73368413278631]
Recent approaches have shown the success of classical planning at synthesizing action models even when all intermediate states are missing.
We propose a new algorithm to synthesize STRIPS action models in an unsupervised manner with a classical planner when action signatures are unknown.
arXiv Detail & Related papers (2020-01-30T17:08:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.