Language-Conditioned Path Planning
- URL: http://arxiv.org/abs/2308.16893v1
- Date: Thu, 31 Aug 2023 17:56:13 GMT
- Title: Language-Conditioned Path Planning
- Authors: Amber Xie, Youngwoon Lee, Pieter Abbeel, Stephen James
- Abstract summary: Language-Conditioned Collision Functions (LACO) learns a collision function using only a single-view image, language prompt, and robot configuration.
LACO predicts collisions between the robot and the environment, enabling flexible, conditional path planning without the need for object annotations, point cloud data, or ground-truth object meshes.
In both simulation and the real world, we demonstrate that LACO can facilitate complex, nuanced path plans that allow interaction with objects that are safe to collide with, rather than prohibiting any collision.
- Score: 68.13248140217222
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contact is at the core of robotic manipulation. At times, it is desired (e.g.
manipulation and grasping), and at times, it is harmful (e.g. when avoiding
obstacles). However, traditional path planning algorithms focus solely on
collision-free paths, limiting their applicability in contact-rich tasks. To
address this limitation, we propose the domain of Language-Conditioned Path
Planning, where contact-awareness is incorporated into the path planning
problem. As a first step in this domain, we propose Language-Conditioned
Collision Functions (LACO), a novel approach that learns a collision function
using only a single-view image, language prompt, and robot configuration. LACO
predicts collisions between the robot and the environment, enabling flexible,
conditional path planning without the need for manual object annotations, point
cloud data, or ground-truth object meshes. In both simulation and the real
world, we demonstrate that LACO can facilitate complex, nuanced path plans that
allow for interaction with objects that are safe to collide with, rather than
prohibiting any collision.
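To make the idea concrete, below is a minimal sketch of how a learned, language-conditioned collision function can slot into a classical sampling-based planner as a drop-in replacement for a geometric collision checker. The stub `laco_collision_prob`, the threshold, and the prompt strings are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def laco_collision_prob(image, prompt, q):
    """Hypothetical stand-in for the learned LACO network, which maps a
    single-view image, a language prompt, and a robot configuration q to a
    predicted collision probability. A constant is returned here only so
    the planner logic below is runnable."""
    return 0.0

def edge_is_valid(image, prompt, q_a, q_b, threshold=0.5, n_checks=20):
    """Validate a straight-line edge in configuration space by querying the
    collision function at interpolated configurations, exactly where a
    sampling-based planner (e.g., RRT) would call a geometric checker.
    The language prompt determines which contacts count as collisions."""
    for t in np.linspace(0.0, 1.0, n_checks):
        q = (1.0 - t) * q_a + t * q_b
        if laco_collision_prob(image, prompt, q) > threshold:
            return False
    return True

# Usage: the same edge can be valid or invalid depending on the prompt.
image = np.zeros((64, 64, 3))          # placeholder camera observation
q_start, q_goal = np.zeros(7), np.ones(7)
edge_is_valid(image, "avoid all objects", q_start, q_goal)
edge_is_valid(image, "it is okay to touch the curtain", q_start, q_goal)
```

Because the checker is conditioned on the prompt rather than on fixed geometry, the planner itself needs no modification: contact-tolerant behavior comes entirely from how the collision function scores configurations.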
Related papers
- From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control [58.72492647570062]
We introduce our method -- Learnable Latent Codes as Bridges (LCB) -- as an alternate architecture to overcome limitations.
We find that LCB outperforms baselines that leverage pure language as the interface layer on tasks that require reasoning and multi-step behaviors.
arXiv Detail & Related papers (2024-05-08T04:14:06Z)
- Grounding Language Plans in Demonstrations Through Counterfactual Perturbations [25.19071357445557]
Grounding the common-sense reasoning of Large Language Models (LLMs) in physical domains remains a pivotal yet unsolved problem for embodied AI.
We show our approach improves the interpretability and reactivity of imitation learning through 2D navigation and simulated and real robot manipulation tasks.
arXiv Detail & Related papers (2024-03-25T19:04:59Z)
- Controllable Human-Object Interaction Synthesis [77.56877961681462]
We propose Controllable Human-Object Interaction Synthesis (CHOIS) to generate synchronized object motion and human motion in 3D scenes.
Here, language descriptions inform style and intent, and waypoints, which can be effectively extracted from high-level planning, ground the motion in the scene.
Our module seamlessly integrates with a path planning module, enabling the generation of long-term interactions in 3D environments.
arXiv Detail & Related papers (2023-12-06T21:14:20Z)
- Learning Extrinsic Dexterity with Parameterized Manipulation Primitives [8.7221770019454]
We learn a sequence of actions that utilize the environment to change the object's pose.
Our approach can control the object's state through exploiting interactions between the object, the gripper, and the environment.
We evaluate our approach on picking box-shaped objects with various weights, shapes, and friction properties from a constrained table-top workspace.
arXiv Detail & Related papers (2023-10-26T21:28:23Z)
- Neural Potential Field for Obstacle-Aware Local Motion Planning [46.42871544295734]
We propose a neural network model that returns a differentiable collision cost based on robot pose, obstacle map, and robot footprint.
Our architecture includes neural image encoders, which transform obstacle maps and robot footprints into embeddings.
Experiments on a Husky UGV mobile robot show that our approach enables safe, real-time local planning.
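As a rough illustration of the underlying idea (a smooth, differentiable obstacle cost steering a gradient-based local planner), here is a numeric sketch; the Gaussian repulsive potential is an assumed stand-in for the paper's learned neural model, not its architecture.

```python
import numpy as np

def obstacle_cost(pose, obstacles, radius=0.5):
    """Smooth repulsive cost that grows as the 2D pose approaches any
    obstacle point; an assumed stand-in for a learned neural potential."""
    d2 = np.sum((obstacles - pose) ** 2, axis=1)
    return np.sum(np.exp(-d2 / (2.0 * radius ** 2)))

def obstacle_cost_grad(pose, obstacles, radius=0.5):
    """Analytic gradient of obstacle_cost with respect to the pose."""
    diff = obstacles - pose                    # vectors from pose to obstacles
    d2 = np.sum(diff ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * radius ** 2))
    return np.sum(diff * (w / radius ** 2)[:, None], axis=0)

# Gradient descent pushes a waypoint away from nearby obstacles, which is
# how a differentiable collision cost plugs into local trajectory smoothing.
obstacles = np.array([[1.0, 1.0], [2.0, 0.5]])
pose = np.array([1.2, 0.9])
for _ in range(50):
    pose = pose - 0.1 * obstacle_cost_grad(pose, obstacles)
```

Because the cost is differentiable, the same update can be applied to every waypoint of a trajectory jointly, alongside smoothness and goal terms.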
arXiv Detail & Related papers (2023-10-25T05:00:21Z)
- AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z)
- Correcting Robot Plans with Natural Language Feedback [88.92824527743105]
We explore natural language as an expressive and flexible tool for robot correction.
We show that these transformations enable users to correct goals, update robot motions, and recover from planning errors.
Our method makes it possible to compose multiple constraints and generalizes to unseen scenes, objects, and sentences in simulated environments and real-world environments.
arXiv Detail & Related papers (2022-04-11T15:22:43Z)
- INVIGORATE: Interactive Visual Grounding and Grasping in Clutter [56.00554240240515]
INVIGORATE is a robot system that interacts with humans through natural language and grasps a specified object in clutter.
We train separate neural networks for object detection, for visual grounding, for question generation, and for OBR detection and grasping.
We build a partially observable Markov decision process (POMDP) that integrates the learned neural network modules.
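As a sketch of how a POMDP can fuse such learned modules, the grounding network's scores can serve as an observation likelihood that updates a belief over which object the user means; the Bayes update below is a generic illustration, with the function names and scores assumed rather than taken from the paper.

```python
import numpy as np

def update_belief(belief, likelihoods):
    """Bayesian belief update over candidate target objects: the prior
    belief is reweighted by per-object observation likelihoods (e.g.,
    visual-grounding scores or answers to clarifying questions)."""
    posterior = belief * likelihoods
    return posterior / posterior.sum()

# Three detected candidates, initially equally likely to be the target.
belief = np.ones(3) / 3.0
# Hypothetical grounding scores after the user says "the red mug".
belief = update_belief(belief, np.array([0.7, 0.2, 0.1]))
# A clarifying question's answer sharpens the belief further, after which
# the policy can decide whether to ask again or commit to a grasp.
belief = update_belief(belief, np.array([0.9, 0.05, 0.05]))
```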
arXiv Detail & Related papers (2021-08-25T07:35:21Z)
- From Abstractions to Grounded Languages for Robust Coordination of Task Planning Robots [4.496989927037321]
We study the automatic construction of languages that are maximally flexible while being sufficiently explicative for coordination.
Our language expresses a plan for any given task as a "plan sketch" to convey just-enough details while maximizing the flexibility to realize it.
arXiv Detail & Related papers (2019-05-01T22:05:42Z)