RGB-Only Reconstruction of Tabletop Scenes for Collision-Free
Manipulator Control
- URL: http://arxiv.org/abs/2210.11668v1
- Date: Fri, 21 Oct 2022 01:45:08 GMT
- Title: RGB-Only Reconstruction of Tabletop Scenes for Collision-Free
Manipulator Control
- Authors: Zhenggang Tang, Balakumar Sundaralingam, Jonathan Tremblay, Bowen Wen,
Ye Yuan, Stephen Tyree, Charles Loop, Alexander Schwing, Stan Birchfield
- Abstract summary: We present a system for collision-free control of a robot manipulator that uses only RGB views of the world.
Perceptual input of a tabletop scene is provided by multiple images of an RGB camera that is either handheld or mounted on the robot end effector.
A NeRF-like process is used to reconstruct the 3D geometry of the scene, from which the Euclidean full signed distance function (ESDF) is computed.
A model predictive control algorithm is then used to control the manipulator to reach a desired pose while avoiding obstacles in the ESDF.
- Score: 71.51781695764872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a system for collision-free control of a robot manipulator that
uses only RGB views of the world. Perceptual input of a tabletop scene is
provided by multiple images of an RGB camera (without depth) that is either
handheld or mounted on the robot end effector. A NeRF-like process is used to
reconstruct the 3D geometry of the scene, from which the Euclidean full signed
distance function (ESDF) is computed. A model predictive control algorithm is
then used to control the manipulator to reach a desired pose while avoiding
obstacles in the ESDF. We show results on a real dataset collected and
annotated in our lab.
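The pipeline in the abstract (reconstruct geometry, compute an ESDF, then penalize proximity to obstacles inside an MPC objective) can be illustrated with a small sketch. This is not the paper's implementation: `esdf_from_occupancy` and `collision_cost` are hypothetical helpers, the brute-force distance computation stands in for the paper's NeRF-derived ESDF, and the hinge penalty is one common way an MPC cost might consume a signed distance field.

```python
import numpy as np

def esdf_from_occupancy(occ, voxel_size=1.0):
    """Brute-force Euclidean signed distance field from a boolean occupancy
    grid: positive in free space, negative inside obstacles. Illustrative
    only -- the paper derives the ESDF from a NeRF-like reconstruction."""
    occ = np.asarray(occ, dtype=bool)
    coords = np.argwhere(np.ones_like(occ)).astype(float)  # all voxel centers
    occ_pts = np.argwhere(occ).astype(float)
    free_pts = np.argwhere(~occ).astype(float)
    # Distance from every voxel to the nearest occupied / free voxel.
    d_to_occ = np.linalg.norm(coords[:, None] - occ_pts[None], axis=-1).min(axis=1)
    d_to_free = np.linalg.norm(coords[:, None] - free_pts[None], axis=-1).min(axis=1)
    return ((d_to_occ - d_to_free) * voxel_size).reshape(occ.shape)

def collision_cost(esdf, voxel_index, margin=1.5):
    """Hinge penalty of the kind an MPC objective might use: zero once the
    query point is at least `margin` from the nearest obstacle surface."""
    return max(0.0, margin - esdf[voxel_index])

# Toy scene: a single occupied voxel in a 5x5x5 grid.
occ = np.zeros((5, 5, 5), dtype=bool)
occ[2, 2, 2] = True
esdf = esdf_from_occupancy(occ)
```

In a real controller the ESDF would be queried at sample points along the manipulator's links for every candidate trajectory, and the summed hinge penalties would be traded off against the goal-reaching cost.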
Related papers
- Unifying Scene Representation and Hand-Eye Calibration with 3D Foundation Models [13.58353565350936]
Representing the environment is a central challenge in robotics.
Traditionally, users need to calibrate the camera using a specific external marker, such as a checkerboard or AprilTag.
This paper advocates for the integration of 3D foundation representation into robotic systems equipped with manipulator-mounted RGB cameras.
arXiv Detail & Related papers (2024-04-17T18:29:32Z)
- A Distance-Geometric Method for Recovering Robot Joint Angles From an RGB Image [7.971699294672282]
We present a novel method for retrieving the joint angles of a robot manipulator using only a single RGB image of its current configuration.
Our approach, based on a distance-geometric representation of the configuration space, exploits the knowledge of a robot's kinematic model.
arXiv Detail & Related papers (2023-01-05T12:57:45Z)
- One-Shot Neural Fields for 3D Object Understanding [112.32255680399399]
We present a unified and compact scene representation for robotics.
Each object in the scene is depicted by a latent code capturing geometry and appearance.
This representation can be decoded for various tasks such as novel view rendering, 3D reconstruction, and stable grasp prediction.
arXiv Detail & Related papers (2022-10-21T17:33:14Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- Multi-Modal Fusion for Sensorimotor Coordination in Steering Angle Prediction [8.707695512525717]
Imitation learning is employed to learn sensorimotor coordination for steering angle prediction in an end-to-end fashion.
This work explores the fusion of frame-based RGB and event data for learning end-to-end lateral control.
We propose DRFuser, a novel convolutional encoder-decoder architecture for learning end-to-end lateral control.
arXiv Detail & Related papers (2022-02-11T08:22:36Z)
- Real-time RGBD-based Extended Body Pose Estimation [57.61868412206493]
We present a system for real-time RGBD-based estimation of 3D human pose.
We use a parametric 3D deformable human mesh model (SMPL-X) as the representation.
We train estimators of body pose and facial expression parameters.
arXiv Detail & Related papers (2021-03-05T13:37:50Z)
- Nothing But Geometric Constraints: A Model-Free Method for Articulated Object Pose Estimation [89.82169646672872]
We propose an unsupervised vision-based system to estimate the joint configurations of the robot arm from a sequence of RGB or RGB-D images without knowing the model a priori.
We combine a classical geometric formulation with deep learning and extend the use of epipolar multi-rigid-body constraints to solve this task.
arXiv Detail & Related papers (2020-11-30T20:46:48Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our method is validated on complex quadruped robot dynamics, and the approach can be applied to most robotic platforms.
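One simple way to make an MPC collision constraint uncertainty-aware, in the spirit of this entry, is to inflate the safety margin by the predicted position uncertainty. This is a hypothetical sketch, not the paper's formulation: `risk_adjusted_margin` and `kappa` are illustrative names, and `pos_cov` stands in for the covariance the paper's RNN would infer online.

```python
import numpy as np

def risk_adjusted_margin(base_margin, pos_cov, kappa=2.0):
    """Inflate a collision margin by predicted position uncertainty.
    Hypothetical sketch: kappa scales the worst-case standard deviation
    of the position covariance (largest eigenvalue direction)."""
    sigma_max = float(np.sqrt(np.max(np.linalg.eigvalsh(pos_cov))))
    return base_margin + kappa * sigma_max

cov = np.diag([0.01, 0.04, 0.01])     # toy 3-D position covariance (m^2)
margin = risk_adjusted_margin(0.1, cov)  # 0.1 + 2 * 0.2 = 0.5 m
```

A larger `kappa` buys more conservative (risk-averse) behavior at the cost of a smaller feasible workspace.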
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
- Control of the Final-Phase of Closed-Loop Visual Grasping using Image-Based Visual Servoing [12.368559816913585]
Many current robotic grasping controllers are not closed-loop and therefore fail for moving objects.
We propose the use of image-based visual servoing to guide the robot to the object-relative grasp pose using camera RGB information.
arXiv Detail & Related papers (2020-01-16T05:07:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.