LLM Granularity for On-the-Fly Robot Control
- URL: http://arxiv.org/abs/2406.14653v1
- Date: Thu, 20 Jun 2024 18:17:48 GMT
- Title: LLM Granularity for On-the-Fly Robot Control
- Authors: Peng Wang, Mattia Robbiani, Zhihao Guo
- Abstract summary: In circumstances where visuals become unreliable or unavailable, can we rely solely on language to control robots?
This work takes the initial steps to answer this question by: 1) evaluating the responses of assistive robots to language prompts of varying granularities; and 2) exploring the necessity and feasibility of controlling the robot on-the-fly.
- Score: 3.5015824313818578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Assistive robots have attracted significant attention due to their potential to enhance the quality of life for vulnerable individuals like the elderly. The convergence of computer vision, large language models, and robotics has introduced the 'visuolinguomotor' mode for assistive robots, where visuals and linguistics are incorporated into assistive robots to enable proactive and interactive assistance. This raises the question: In circumstances where visuals become unreliable or unavailable, can we rely solely on language to control robots, i.e., is the 'linguomotor' mode viable for assistive robots? This work takes the initial steps to answer this question by: 1) evaluating the responses of assistive robots to language prompts of varying granularities; and 2) exploring the necessity and feasibility of controlling the robot on-the-fly. We have designed and conducted experiments on a Sawyer cobot to support our arguments. A Turtlebot robot case is designed to demonstrate the adaptation of the solution to scenarios where assistive robots need to maneuver to assist. Code will be released on GitHub soon to benefit the community.
Related papers
- Unifying 3D Representation and Control of Diverse Robots with a Single Camera [48.279199537720714]
We introduce Neural Jacobian Fields, an architecture that autonomously learns to model and control robots from vision alone.
Our approach achieves accurate closed-loop control and recovers the causal dynamic structure of each robot.
arXiv Detail & Related papers (2024-07-11T17:55:49Z)
- Commonsense Reasoning for Legged Robot Adaptation with Vision-Language Models [81.55156507635286]
Legged robots are physically capable of navigating a diverse variety of environments and overcoming a wide range of obstructions.
Current learning methods often struggle with generalization to the long tail of unexpected situations without heavy human supervision.
We propose a system, VLM-Predictive Control (VLM-PC), combining two key components that we find to be crucial for eliciting on-the-fly, adaptive behavior selection.
arXiv Detail & Related papers (2024-07-02T21:00:30Z)
- HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation [50.616995671367704]
We present a high-dimensional, simulated robot learning benchmark, HumanoidBench, featuring a humanoid robot equipped with dexterous hands.
Our findings reveal that state-of-the-art reinforcement learning algorithms struggle with most tasks, whereas a hierarchical learning approach achieves superior performance when supported by robust low-level policies.
arXiv Detail & Related papers (2024-03-15T17:45:44Z)
- HuBo-VLM: Unified Vision-Language Model designed for HUman roBOt interaction tasks [5.057755436092344]
Human-robot interaction is an exciting task that aims to guide robots to follow instructions from humans.
HuBo-VLM is proposed to tackle perception tasks associated with human robot interaction.
arXiv Detail & Related papers (2023-08-24T03:47:27Z)
- Giving Robots a Hand: Learning Generalizable Manipulation with Eye-in-Hand Human Video Demonstrations [66.47064743686953]
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation.
Videos of humans performing tasks, on the other hand, are much cheaper to collect since they eliminate the need for expertise in robotic teleoperation.
In this work, we augment narrow robotic imitation datasets with broad unlabeled human video demonstrations to greatly enhance the generalization of eye-in-hand visuomotor policies.
arXiv Detail & Related papers (2023-07-12T07:04:53Z)
- Exploring AI-enhanced Shared Control for an Assistive Robotic Arm [4.999814847776098]
In particular, we explore how Artificial Intelligence (AI) can be integrated into a shared control paradigm.
In particular, we focus on the consequential requirements for the interface between human and robot.
arXiv Detail & Related papers (2023-06-23T14:19:56Z)
- Knowledge-Driven Robot Program Synthesis from Human VR Demonstrations [16.321053835017942]
We present a system for automatically generating executable robot control programs from human task demonstrations in virtual reality (VR).
We leverage common-sense knowledge and game engine-based physics to semantically interpret human VR demonstrations.
We demonstrate our approach in the context of force-sensitive fetch-and-place for a robotic shopping assistant.
arXiv Detail & Related papers (2023-06-05T09:37:53Z)
- Open-World Object Manipulation using Pre-trained Vision-Language Models [72.87306011500084]
For robots to follow instructions from people, they must be able to connect the rich semantic information in human vocabulary.
We develop a simple approach, which leverages a pre-trained vision-language model to extract object-identifying information.
In a variety of experiments on a real mobile manipulator, we find that the approach, Manipulation of Open-World Objects (MOO), generalizes zero-shot to a wide range of novel object categories and environments.
arXiv Detail & Related papers (2023-03-02T01:55:10Z)
- Robots with Different Embodiments Can Express and Influence Carefulness in Object Manipulation [104.5440430194206]
This work investigates the perception of object manipulations performed with a communicative intent by two robots.
We designed the robots' movements to communicate carefulness or not during the transportation of objects.
arXiv Detail & Related papers (2022-08-03T13:26:52Z)
- Know Thyself: Transferable Visuomotor Control Through Robot-Awareness [22.405839096833937]
Training visuomotor robot controllers from scratch on a new robot typically requires generating large amounts of robot-specific data.
We propose a "robot-aware" solution paradigm that exploits readily available robot "self-knowledge".
Our experiments on tabletop manipulation tasks in simulation and on real robots demonstrate that these plug-in improvements dramatically boost the transferability of visuomotor controllers.
arXiv Detail & Related papers (2021-07-19T17:56:04Z)
- Natural Language Interaction to Facilitate Mental Models of Remote Robots [0.0]
High-stakes scenarios require robot operators to have clear mental models of what the robots can and can't do.
We propose that interaction with a conversational assistant, who acts as a mediator, can help the user with understanding the functionality of remote robots.
arXiv Detail & Related papers (2020-03-12T16:03:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.