Skip-Plan: Procedure Planning in Instructional Videos via Condensed
Action Space Learning
- URL: http://arxiv.org/abs/2310.00608v1
- Date: Sun, 1 Oct 2023 08:02:33 GMT
- Title: Skip-Plan: Procedure Planning in Instructional Videos via Condensed
Action Space Learning
- Authors: Zhiheng Li, Wenjia Geng, Muheng Li, Lei Chen, Yansong Tang, Jiwen Lu,
Jie Zhou
- Abstract summary: Skip-Plan is a condensed action space learning method for procedure planning in instructional videos.
By skipping uncertain nodes and edges in action chains, we transfer long and complex sequence functions into short but reliable ones.
Our model explores all sorts of reliable sub-relations within an action sequence in the condensed action space.
- Score: 85.84504287685884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose Skip-Plan, a condensed action space learning method
for procedure planning in instructional videos. Current procedure planning
methods all stick to state-action pair prediction at every timestep and
generate actions adjacently. Although this coincides with human intuition, such a
methodology consistently struggles with high-dimensional state supervision and
error accumulation on action sequences. In this work, we abstract the procedure
planning problem as a mathematical chain model. By skipping uncertain nodes and
edges in action chains, we transfer long and complex sequence functions into
short but reliable ones in two ways. First, we skip all the intermediate state
supervision and only focus on action predictions. Second, we decompose
relatively long chains into multiple short sub-chains by skipping unreliable
intermediate actions. By this means, our model explores all sorts of reliable
sub-relations within an action sequence in the condensed action space.
Extensive experiments show Skip-Plan achieves state-of-the-art performance on
the CrossTask and COIN benchmarks for procedure planning.
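The core idea of decomposing a long action chain into shorter sub-chains by skipping unreliable intermediate actions can be sketched in a few lines. The following is a minimal toy illustration, not the paper's actual algorithm: the function name, the `max_skip` parameter, and the policy of keeping the first and last actions fixed are all assumptions made for the example.

```python
from itertools import combinations

def condensed_subchains(actions, max_skip=2):
    """Enumerate sub-chains of an action sequence obtained by dropping up to
    `max_skip` intermediate actions, keeping the first and last actions fixed.
    Toy illustration of the condensed-action-space idea; not Skip-Plan itself."""
    if len(actions) < 3:
        return [tuple(actions)]
    first, *middle, last = actions
    chains = []
    for k in range(max_skip + 1):
        # choose k intermediate positions to skip
        for skipped in combinations(range(len(middle)), k):
            kept = [m for i, m in enumerate(middle) if i not in skipped]
            chains.append((first, *kept, last))
    return chains

# A 4-step task: each sub-chain drops some uncertain middle steps.
print(condensed_subchains(["pour", "stir", "heat", "serve"], max_skip=1))
```

Each resulting sub-chain is a shorter, more reliable prediction target than the full sequence, which is the intuition behind trading chain length for reliability.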
Related papers
- Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling [23.62433580021779]
We advocate a self-refining scheme that iteratively refines a draft plan until an equilibrium is reached.
A nested equilibrium sequence modeling procedure is devised for efficient closed-loop planning.
Our method is evaluated on the VirtualHome-Env benchmark, showing advanced performance with better scaling for inference.
arXiv Detail & Related papers (2024-10-02T11:42:49Z)
- BiKC: Keypose-Conditioned Consistency Policy for Bimanual Robotic Manipulation [48.08416841005715]
We introduce a novel keypose-conditioned consistency policy tailored for bimanual manipulation.
It is a hierarchical imitation learning framework that consists of a high-level keypose predictor and a low-level trajectory generator.
Simulated and real-world experimental results demonstrate that the proposed approach surpasses baseline methods in terms of success rate and operational efficiency.
arXiv Detail & Related papers (2024-06-14T14:49:12Z)
- Task and Motion Planning for Execution in the Real [24.01204729304763]
This work generates task and motion plans that include actions that cannot be fully grounded at planning time.
Execution combines offline planned motions and online behaviors until the task goal is reached.
Forty real-robot trials and motivating demonstrations are performed to evaluate the proposed framework.
Results show faster execution times, fewer actions, and higher success rates on problems where diverse gaps arise.
arXiv Detail & Related papers (2024-06-05T22:30:40Z)
- RAP: Retrieval-Augmented Planner for Adaptive Procedure Planning in Instructional Videos [46.26690150997731]
We propose a new and practical setting, called adaptive procedure planning in instructional videos.
RAP adaptively determines the conclusion of actions using an auto-regressive model architecture.
arXiv Detail & Related papers (2024-03-27T14:22:40Z)
- TwoStep: Multi-agent Task Planning using Classical Planners and Large Language Models [7.653791106386385]
Two-agent planning goal decomposition leads to faster planning times than solving multi-agent PDDL problems directly.
We find that LLM-based approximations of subgoals can achieve multi-agent execution step counts similar to those specified by human experts.
arXiv Detail & Related papers (2024-03-25T22:47:13Z)
- Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty [56.30846158280031]
Task planning for embodied AI has been one of the most challenging problems.
We propose a task-agnostic method named 'planning as in-painting'.
The proposed framework achieves promising performances in various embodied AI tasks.
arXiv Detail & Related papers (2023-12-02T10:07:17Z)
- AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z)
- Contingencies from Observations: Tractable Contingency Planning with Learned Behavior Models [82.34305824719101]
Humans have a remarkable ability to make decisions by accurately reasoning about future events.
We develop a general-purpose contingency planner that is learned end-to-end using high-dimensional scene observations.
We show how this model can tractably learn contingencies from behavioral observations.
arXiv Detail & Related papers (2021-04-21T14:30:20Z)
- STRIPS Action Discovery [67.73368413278631]
Recent approaches have shown the success of classical planning at synthesizing action models even when all intermediate states are missing.
We propose a new algorithm that uses a classical planner to synthesize STRIPS action models in an unsupervised manner when action signatures are unknown.
arXiv Detail & Related papers (2020-01-30T17:08:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.