Scene Editing as Teleoperation: A Case Study in 6DoF Kit Assembly
- URL: http://arxiv.org/abs/2110.04450v1
- Date: Sat, 9 Oct 2021 04:22:21 GMT
- Title: Scene Editing as Teleoperation: A Case Study in 6DoF Kit Assembly
- Authors: Shubham Agrawal, Yulong Li, Jen-Shuo Liu, Steven K. Feiner, Shuran
Song
- Abstract summary: We propose the framework "Scene Editing as Teleoperation" (SEaT).
Instead of controlling the robot, users focus on specifying the task's goal.
A user can perform teleoperation without any expert knowledge of the robot hardware.
- Score: 18.563562557565483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Studies in robot teleoperation have been centered around action
specifications -- from continuous joint control to discrete end-effector pose
control. However, these robot-centric interfaces often require skilled
operators with extensive robotics expertise. To make teleoperation accessible
to non-expert users, we propose the framework "Scene Editing as Teleoperation"
(SEaT), where the key idea is to transform the traditional "robot-centric"
interface into a "scene-centric" interface -- instead of controlling the robot,
users focus on specifying the task's goal by manipulating digital twins of the
real-world objects. As a result, a user can perform teleoperation without any
expert knowledge of the robot hardware. To achieve this goal, we utilize a
category-agnostic scene-completion algorithm that translates the real-world
workspace (with unknown objects) into a manipulable virtual scene
representation and an action-snapping algorithm that refines the user input
before generating the robot's action plan. To train the algorithms, we
procedurally generated a large-scale, diverse kit-assembly dataset that
contains object-kit pairs that mimic real-world object-kitting tasks. Our
experiments in simulation and on a real-world system demonstrate that our
framework improves both the efficiency and success rate for 6DoF kit-assembly
tasks. A user study demonstrates that participants using the SEaT framework
achieve a higher task success rate and report a lower subjective workload than
with an alternative robot-centric interface. A video can be found at
https://www.youtube.com/watch?v=-NdR3mkPbQQ .
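The abstract sketches a three-stage pipeline: category-agnostic scene completion turns the observed workspace into editable digital twins, the user rearranges those twins to specify the goal, and action snapping refines the imprecise user input before a robot action plan is generated. Below is a minimal Python sketch of that control flow under those stated assumptions; every name in it (ObjectTwin, complete_scene, snap_action, plan_robot_actions, teleoperate_one_step) is a hypothetical placeholder for illustration, not the authors' API.

```python
# Minimal sketch of SEaT-style scene-centric teleoperation.
# All names here are hypothetical placeholders, not the authors' code.
from dataclasses import dataclass
from typing import Callable, List, Tuple

import numpy as np


@dataclass
class ObjectTwin:
    """Digital twin of one (possibly unknown) real-world object."""
    geometry: np.ndarray  # completed shape, e.g. a voxel grid or point cloud
    pose: np.ndarray      # 4x4 homogeneous transform in the world frame


def complete_scene(rgbd: np.ndarray) -> List[ObjectTwin]:
    """Category-agnostic scene completion: segment the workspace and infer
    the unobserved geometry of each object (stand-in for a learned model)."""
    raise NotImplementedError


def snap_action(rough_goal: np.ndarray, twin: ObjectTwin) -> np.ndarray:
    """Action snapping: refine the user's coarse 6DoF goal pose so the
    object mates precisely with its target, e.g. a kit cavity."""
    raise NotImplementedError


def plan_robot_actions(start: np.ndarray, goal: np.ndarray) -> list:
    """Turn object start/goal poses into a robot pick-and-place plan.
    Robot-specific; the user never touches this layer directly."""
    raise NotImplementedError


def teleoperate_one_step(
    rgbd: np.ndarray,
    user_edit: Callable[[List[ObjectTwin]], Tuple[ObjectTwin, np.ndarray]],
) -> list:
    twins = complete_scene(rgbd)          # real workspace -> editable virtual scene
    twin, rough_goal = user_edit(twins)   # user drags a digital twin to a goal pose
    goal = snap_action(rough_goal, twin)  # clean up the imprecise user input
    return plan_robot_actions(twin.pose, goal)
```

The design choice the abstract emphasizes is visible in this layering: only the final planning step depends on the robot hardware, so everything the user interacts with lives in the virtual scene.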
Related papers
- Zero-Cost Whole-Body Teleoperation for Mobile Manipulation [8.71539730969424]
MoMa-Teleop is a novel teleoperation method that delegates the base motions to a reinforcement learning agent.
We demonstrate that our approach results in a significant reduction in task completion time across a variety of robots and tasks.
arXiv Detail & Related papers (2024-09-23T15:09:45Z)
- CtRNet-X: Camera-to-Robot Pose Estimation in Real-world Conditions Using a Single Camera [18.971816395021488]
Markerless pose estimation methods have eliminated the need for time-consuming physical setups for camera-to-robot calibration.
We propose a novel framework capable of estimating the robot pose with partially visible robot manipulators.
arXiv Detail & Related papers (2024-09-16T16:22:43Z)
- Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models [53.22792173053473]
We introduce an interactive robotic manipulation framework called Polaris.
Polaris integrates perception and interaction by utilizing GPT-4 alongside grounded vision models.
We propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline.
arXiv Detail & Related papers (2024-08-15T06:40:38Z)
- Open-TeleVision: Teleoperation with Immersive Active Visual Feedback [17.505318269362512]
Open-TeleVision allows operators to actively perceive the robot's surroundings in a stereoscopic manner.
The system mirrors the operator's arm and hand movements on the robot, creating an immersive experience.
We validate the effectiveness of our system by collecting data and training imitation learning policies on four long-horizon, precise tasks.
arXiv Detail & Related papers (2024-07-01T17:55:35Z)
- Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation [65.46610405509338]
We seek to learn a generalizable goal-conditioned policy that enables zero-shot robot manipulation.
Our framework, Track2Act, predicts tracks of how points in an image should move in future time steps based on a goal.
We show that this approach of combining scalably learned track prediction with a residual policy enables diverse generalizable robot manipulation.
arXiv Detail & Related papers (2024-05-02T17:56:55Z)
- What Matters to You? Towards Visual Representation Alignment for Robot Learning [81.30964736676103]
When operating in service of people, robots need to optimize rewards aligned with end-user preferences.
We propose Representation-Aligned Preference-based Learning (RAPL), a method for solving the visual representation alignment problem.
arXiv Detail & Related papers (2023-10-11T23:04:07Z)
- Giving Robots a Hand: Learning Generalizable Manipulation with Eye-in-Hand Human Video Demonstrations [66.47064743686953]
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation.
Videos of humans performing tasks, on the other hand, are much cheaper to collect since they eliminate the need for expertise in robotic teleoperation.
In this work, we augment narrow robotic imitation datasets with broad unlabeled human video demonstrations to greatly enhance the generalization of eye-in-hand visuomotor policies.
arXiv Detail & Related papers (2023-07-12T07:04:53Z)
- Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance [71.36749876465618]
We describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks.
Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples.
We present experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world.
arXiv Detail & Related papers (2022-12-19T22:50:40Z)
- Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation? [54.442692221567796]
Task specification is critical for engagement of non-expert end-users and adoption of personalized robots.
A widely studied approach to task specification is through goals, using either compact state vectors or goal images from the same robot scene.
In this work, we explore alternate and more general forms of goal specification that are expected to be easier for humans to specify and use.
arXiv Detail & Related papers (2022-04-23T19:39:49Z)
- iRoPro: An interactive Robot Programming Framework [2.7651063843287718]
iRoPro allows users with little to no technical background to teach a robot new reusable actions.
We implement iRoPro as an end-to-end system on a Baxter Research Robot.
arXiv Detail & Related papers (2021-12-08T13:53:43Z)