A Distance-Geometric Method for Recovering Robot Joint Angles From an
RGB Image
- URL: http://arxiv.org/abs/2301.02051v2
- Date: Thu, 27 Apr 2023 16:18:38 GMT
- Title: A Distance-Geometric Method for Recovering Robot Joint Angles From an
RGB Image
- Authors: Ivan Bilić, Filip Marić, Ivan Marković, Ivan Petrović
- Abstract summary: We present a novel method for retrieving the joint angles of a robot manipulator using only a single RGB image of its current configuration.
Our approach, based on a distance-geometric representation of the configuration space, exploits the knowledge of a robot's kinematic model.
- Score: 7.971699294672282
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Autonomous manipulation systems operating in domains where human intervention
is difficult or impossible (e.g., underwater, extraterrestrial or hazardous
environments) require a high degree of robustness to sensing and communication
failures. Crucially, motion planning and control algorithms require a stream of
accurate joint angle data provided by joint encoders, the failure of which may
result in an unrecoverable loss of functionality. In this paper, we present a
novel method for retrieving the joint angles of a robot manipulator using only
a single RGB image of its current configuration, opening up an avenue for
recovering system functionality when conventional proprioceptive sensing is
unavailable. Our approach, based on a distance-geometric representation of the
configuration space, exploits the knowledge of a robot's kinematic model with
the goal of training a shallow neural network that performs a 2D-to-3D
regression of distances associated with detected structural keypoints. It is
shown that the resulting Euclidean distance matrix uniquely corresponds to the
observed configuration, where joint angles can be recovered via
multidimensional scaling and a simple inverse kinematics procedure. We evaluate
the performance of our approach on real RGB images of a Franka Emika Panda
manipulator, showing that the proposed method is efficient and exhibits solid
generalization ability. Furthermore, we show that our method can be easily
combined with a dense refinement technique to obtain superior results.
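The paper does not include code here, but the core recovery step it describes, turning a Euclidean distance matrix back into point coordinates via multidimensional scaling, is a classical procedure that can be sketched generically. The snippet below is an illustrative implementation of classical MDS, not the authors' released code; the function name `classical_mds` and the unit-square example are hypothetical.

```python
import numpy as np

def classical_mds(D, dim=3):
    """Recover point coordinates (up to a rigid transform) from a
    Euclidean distance matrix D via classical multidimensional scaling."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    B = -0.5 * J @ (D ** 2) @ J            # Gram matrix of centered points
    eigvals, eigvecs = np.linalg.eigh(B)   # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:dim]  # keep the top `dim` eigenpairs
    scale = np.sqrt(np.clip(eigvals[idx], 0.0, None))
    return eigvecs[:, idx] * scale         # n x dim coordinates

# Illustrative check: distances of a unit square are reproduced exactly.
P = np.array([[0., 0., 0.], [1., 0., 0.], [1., 1., 0.], [0., 1., 0.]])
D = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
X = classical_mds(D)
D_rec = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
assert np.allclose(D, D_rec, atol=1e-6)
```

In the paper's pipeline, the distance matrix would come from the network's 2D-to-3D regression over detected keypoints; the recovered coordinates (defined only up to rotation, translation, and reflection) are then passed to an inverse kinematics step to obtain joint angles.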
Related papers
- Kinematics-based 3D Human-Object Interaction Reconstruction from Single View [10.684643503514849]
Existing methods predict body poses by merely relying on networks trained on some indoor datasets.
We propose a kinematics-based method that can accurately drive the joints of the human body to the human-object contact regions.
arXiv Detail & Related papers (2024-07-19T05:44:35Z)
- Robust Surgical Tool Tracking with Pixel-based Probabilities for Projected Geometric Primitives [28.857732667640068]
Controlling robotic manipulators via visual feedback requires a known coordinate frame transformation between the robot and the camera.
Uncertainties in mechanical systems as well as camera calibration create errors in this coordinate frame transformation.
We estimate the camera-to-base transform and joint angle measurement errors for surgical robotic tools using an image based insertion-shaft detection algorithm and probabilistic models.
arXiv Detail & Related papers (2024-03-08T00:57:03Z)
- Egocentric RGB+Depth Action Recognition in Industry-Like Settings [50.38638300332429]
Our work focuses on recognizing actions from egocentric RGB and Depth modalities in an industry-like environment.
Our framework is based on the 3D Video SWIN Transformer to encode both RGB and Depth modalities effectively.
Our method also secured first place at the multimodal action recognition challenge at ICIAP 2023.
arXiv Detail & Related papers (2023-09-25T08:56:22Z)
- Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain.
GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors.
We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z)
- Occlusion-robust Visual Markerless Bone Tracking for Computer-Assisted Orthopaedic Surgery [41.681134859412246]
We propose an RGB-D sensing-based markerless tracking method that is robust against occlusion.
Using a high-quality commercial RGB-D camera, our proposed visual tracking method achieves an accuracy of 1-2 degrees and 2-4 mm on a model knee.
arXiv Detail & Related papers (2021-08-24T09:49:08Z)
- Domain Adaptive Robotic Gesture Recognition with Unsupervised Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by using temporal cues in videos and inherent correlations in multi-modal data for gesture recognition.
Results show that our approach recovers performance with large gains, up to 12.91% in accuracy and 20.16% in F1-score, without using any annotations on the real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z)
- Online Body Schema Adaptation through Cost-Sensitive Active Learning [63.84207660737483]
The work was implemented in a simulation environment, using the 7DoF arm of the iCub robot simulator.
A cost-sensitive active learning approach is used to select optimal joint configurations.
The results show that cost-sensitive active learning achieves accuracy similar to the standard active learning approach while roughly halving the executed movement.
arXiv Detail & Related papers (2021-01-26T16:01:02Z)
- Nothing But Geometric Constraints: A Model-Free Method for Articulated Object Pose Estimation [89.82169646672872]
We propose an unsupervised vision-based system to estimate the joint configurations of the robot arm from a sequence of RGB or RGB-D images without knowing the model a priori.
We combine a classical geometric formulation with deep learning and extend the use of epipolar multi-rigid-body constraints to solve this task.
arXiv Detail & Related papers (2020-11-30T20:46:48Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our method is validated on complex quadruped robot dynamics, and the approach can be applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
- Depth by Poking: Learning to Estimate Depth from Self-Supervised Grasping [6.382990675677317]
We train a neural network model to estimate depth from RGB-D images.
Our network predicts, for each pixel in an input image, the z position that a robot's end effector would reach if it attempted to grasp or poke at the corresponding position.
We show our approach achieves significantly lower root mean squared error than traditional structured light sensors.
arXiv Detail & Related papers (2020-06-16T03:34:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.