Self-Modeling Robots by Photographing
- URL: http://arxiv.org/abs/2503.05398v1
- Date: Fri, 07 Mar 2025 13:21:18 GMT
- Title: Self-Modeling Robots by Photographing
- Authors: Kejun Hu, Peng Yu, Ning Tan
- Abstract summary: We propose a high-quality, texture-aware, and link-level method for robot self-modeling. We use 3D Gaussians to represent the static morphology and texture of robots, and cluster the 3D Gaussians to construct neural ellipsoid bones. By feeding the kinematic neural network with joint angles, we can utilize the well-trained model to describe the corresponding morphology, kinematics and texture of robots at the link level.
- Score: 4.482658473425829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-modeling enables robots to build task-agnostic models of their morphology and kinematics based on data that can be automatically collected, with minimal human intervention and prior information, thereby enhancing machine intelligence. Recent research has highlighted the potential of data-driven technology in modeling the morphology and kinematics of robots. However, existing self-modeling methods suffer from either low modeling quality or excessive data acquisition costs. Beyond morphology and kinematics, texture is also a crucial component of robots, which is challenging to model and remains unexplored. In this work, a high-quality, texture-aware, and link-level method is proposed for robot self-modeling. We utilize three-dimensional (3D) Gaussians to represent the static morphology and texture of robots, and cluster the 3D Gaussians to construct neural ellipsoid bones, whose deformations are controlled by the transformation matrices generated by a kinematic neural network. The 3D Gaussians and kinematic neural network are trained using data pairs composed of joint angles, camera parameters and multi-view images without depth information. By feeding the kinematic neural network with joint angles, we can utilize the well-trained model to describe the corresponding morphology, kinematics and texture of robots at the link level, and render robot images from different perspectives with the aid of 3D Gaussian splatting. Furthermore, we demonstrate that the established model can be exploited to perform downstream tasks such as motion planning and inverse kinematics.
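The pipeline described in the abstract, a kinematic neural network that maps joint angles to per-link transformation matrices which then deform clustered 3D Gaussians, can be illustrated with a minimal sketch. The code below is not the authors' implementation: the MLP architecture, the axis-angle rotation parameterization, and all names (KinematicNet, deform_gaussians, the joint and link counts) are illustrative assumptions, and the clustering into ellipsoid bones, the covariance handling, and the Gaussian-splatting renderer are omitted.

```python
# Minimal sketch (not the authors' code): a kinematic network maps joint
# angles to one rigid transform per link, and that transform is applied to
# the centers of the 3D Gaussians assigned to the link's "bone" cluster.
import torch
import torch.nn as nn


def axis_angle_to_matrix(axis_angle: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula: (L, 3) axis-angle vectors -> (L, 3, 3) rotations."""
    theta = axis_angle.norm(dim=-1, keepdim=True).clamp(min=1e-8)  # (L, 1)
    k = axis_angle / theta                                         # unit axes
    K = torch.zeros(axis_angle.shape[0], 3, 3)                     # skew-symmetric matrices
    K[:, 0, 1], K[:, 0, 2] = -k[:, 2], k[:, 1]
    K[:, 1, 0], K[:, 1, 2] = k[:, 2], -k[:, 0]
    K[:, 2, 0], K[:, 2, 1] = -k[:, 1], k[:, 0]
    theta = theta.unsqueeze(-1)                                    # (L, 1, 1)
    eye = torch.eye(3).expand_as(K)
    return eye + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)


class KinematicNet(nn.Module):
    """Joint angles -> per-link rotation and translation (one transform per bone)."""

    def __init__(self, num_joints: int, num_links: int, hidden: int = 256):
        super().__init__()
        self.num_links = num_links
        self.mlp = nn.Sequential(
            nn.Linear(num_joints, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_links * 6),  # 3 axis-angle + 3 translation per link
        )

    def forward(self, q: torch.Tensor):
        out = self.mlp(q).view(self.num_links, 6)
        return axis_angle_to_matrix(out[:, :3]), out[:, 3:]  # (L, 3, 3), (L, 3)


def deform_gaussians(mu, link_id, R, t):
    """Move each Gaussian center with the transform of the bone it belongs to.

    mu:      (N, 3) canonical Gaussian centers
    link_id: (N,)   cluster assignment of each Gaussian to a link/bone
    """
    return torch.einsum("nij,nj->ni", R[link_id], mu) + t[link_id]


# In training, the posed Gaussians would be rendered with Gaussian splatting
# and supervised against multi-view images taken at known joint angles.
net = KinematicNet(num_joints=7, num_links=8)
q = torch.zeros(7)                        # a joint configuration
mu = torch.randn(1000, 3)                 # canonical Gaussian centers
link_id = torch.randint(0, 8, (1000,))    # e.g. k-means cluster labels
R, t = net(q)
mu_posed = deform_gaussians(mu, link_id, R, t)
```

Because the chain from joint angles to posed Gaussians (and, through splatting, to pixels) is differentiable, downstream tasks such as the inverse kinematics mentioned in the abstract can in principle be approached by gradient descent on the joint angles. The sketch below reuses the definitions above and takes the mean of one link's Gaussian centers as a crude end-effector proxy, again purely as an assumption rather than the paper's procedure.

```python
def solve_ik(net, mu, link_id, target, ee_link=7, steps=200, lr=1e-2):
    """Gradient-descent IK sketch: find joint angles placing one link near a target point."""
    q = torch.zeros(7, requires_grad=True)          # matches the 7-joint example above
    opt = torch.optim.Adam([q], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        R, t = net(q)
        posed = deform_gaussians(mu, link_id, R, t)
        ee = posed[link_id == ee_link].mean(dim=0)  # crude end-effector proxy
        loss = (ee - target).pow(2).sum()
        loss.backward()
        opt.step()
    return q.detach()
```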
Related papers
- VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation [53.63540587160549]
VidBot is a framework enabling zero-shot robotic manipulation using 3D affordances learned from in-the-wild, monocular RGB-only human videos.
VidBot paves the way for leveraging everyday human videos to make robot learning more scalable.
arXiv Detail & Related papers (2025-03-10T10:04:58Z) - Dynamic 3D Gaussian Tracking for Graph-Based Neural Dynamics Modeling [10.247075501610492]
We introduce a framework to learn object dynamics directly from multi-view RGB videos.
We train a particle-based dynamics model using Graph Neural Networks.
Our method can predict object motions under varying initial configurations and unseen robot actions.
arXiv Detail & Related papers (2024-10-24T17:02:52Z) - Differentiable Robot Rendering [45.23538293501457]
We introduce differentiable robot rendering, a method allowing the visual appearance of a robot body to be directly differentiable with respect to its control parameters.
We demonstrate its capability and usage in applications including reconstruction of robot poses from images and controlling robots through vision language models.
arXiv Detail & Related papers (2024-10-17T17:59:02Z) - DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models [102.13968267347553]
We present DiffuseBot, a physics-augmented diffusion model that generates soft robot morphologies capable of excelling in a wide spectrum of tasks.
We showcase a range of simulated and fabricated robots along with their capabilities.
arXiv Detail & Related papers (2023-11-28T18:58:48Z) - High-Degrees-of-Freedom Dynamic Neural Fields for Robot Self-Modeling and Motion Planning [6.229216953398305]
A robot self-model is a representation of the robot's physical morphology that can be used for motion planning tasks.
We propose a new encoder-based neural density field architecture for dynamic object-centric scenes conditioned on high numbers of degrees of freedom.
In a 7-DOF robot test setup, the learned self-model achieves a Chamfer-L2 distance of 2% of the robot's workspace dimension.
arXiv Detail & Related papers (2023-10-05T16:01:29Z) - Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamic and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamic models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z) - Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z) - RoboCraft: Learning to See, Simulate, and Shape Elasto-Plastic Objects with Graph Networks [32.00371492516123]
We present a model-based planning framework for modeling and manipulating elasto-plastic objects.
Our system, RoboCraft, learns a particle-based dynamics model using graph neural networks (GNNs) to capture the structure of the underlying system.
We show through experiments that with just 10 minutes of real-world robotic interaction data, our robot can learn a dynamics model that can be used to synthesize control signals to deform elasto-plastic objects into various target shapes.
arXiv Detail & Related papers (2022-05-05T20:28:15Z) - Full-Body Visual Self-Modeling of Robot Morphologies [29.76701883250049]
Internal computational models of physical bodies are fundamental to the ability of robots and animals alike to plan and control their actions.
Recent progress in fully data-driven self-modeling has enabled machines to learn their own forward kinematics directly from task-agnostic interaction data.
Here, we propose that instead of directly modeling forward-kinematics, a more useful form of self-modeling is one that could answer space occupancy queries.
arXiv Detail & Related papers (2021-11-11T18:58:07Z) - Scene Synthesis via Uncertainty-Driven Attribute Synchronization [52.31834816911887]
This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes.
Our method combines the strength of both neural network-based and conventional scene synthesis approaches.
arXiv Detail & Related papers (2021-08-30T19:45:07Z) - 3D Neural Scene Representations for Visuomotor Control [78.79583457239836]
We learn models for dynamic 3D scenes purely from 2D visual observations.
A dynamics model, constructed over the learned representation space, enables visuomotor control for challenging manipulation tasks.
arXiv Detail & Related papers (2021-07-08T17:49:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.