Learning to See Physical Properties with Active Sensing Motor Policies
- URL: http://arxiv.org/abs/2311.01405v1
- Date: Thu, 2 Nov 2023 17:19:18 GMT
- Title: Learning to See Physical Properties with Active Sensing Motor Policies
- Authors: Gabriel B. Margolis, Xiang Fu, Yandong Ji, Pulkit Agrawal
- Abstract summary: We present a method for building a vision system that takes the observed terrain as input and predicts its physical properties, without requiring human-labeled images.
We introduce Active Sensing Motor Policies (ASMP), which are trained to explore locomotion behaviors that increase the accuracy of estimating physical parameters.
The trained system is robust and works even with overhead images captured by a drone despite being trained on data collected by cameras attached to a quadruped robot walking on the ground.
- Score: 20.851419392513503
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge of terrain's physical properties inferred from color images can aid
in making efficient robotic locomotion plans. However, unlike image
classification, it is unintuitive for humans to label image patches with
physical properties. Without labeled data, building a vision system that takes
as input the observed terrain and predicts physical properties remains
challenging. We present a method that overcomes this challenge by
self-supervised labeling of images captured by robots during real-world
traversal with physical property estimators trained in simulation. To ensure
accurate labeling, we introduce Active Sensing Motor Policies (ASMP), which are
trained to explore locomotion behaviors that increase the accuracy of
estimating physical parameters. For instance, the quadruped robot learns to
swipe its foot against the ground to estimate the friction coefficient
accurately. We show that the visual system trained with a small amount of
real-world traversal data accurately predicts physical parameters. The trained
system is robust and works even with overhead images captured by a drone
despite being trained on data collected by cameras attached to a quadruped
robot walking on the ground.
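Two pieces of this pipeline lend themselves to a short sketch: (1) an ASMP-style reward bonus that favors behaviors under which the physical parameters become identifiable from proprioception, and (2) the self-supervised labeling step in which the simulation-trained estimator tags real terrain images. The snippet below is a minimal illustration in PyTorch, not the authors' implementation; the network sizes, the accuracy weighting in `asmp_reward`, and all variable names are assumptions.

```python
# Hedged sketch (not the paper's code): an ASMP-style accuracy bonus and the
# self-supervised labeling step. Sizes, weights, and names are illustrative.
import torch
import torch.nn as nn

class ParamEstimator(nn.Module):
    """Predicts physical parameters (e.g., friction) from a proprioceptive history."""
    def __init__(self, obs_dim=48, history=50, n_params=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(obs_dim * history, 256), nn.ReLU(),
            nn.Linear(256, n_params),
        )

    def forward(self, proprio_history):            # (B, history, obs_dim)
        return self.net(proprio_history)            # (B, n_params)

def asmp_reward(task_reward, est_params, true_params, accuracy_weight=0.5):
    """Task reward plus a bonus for making physical parameters identifiable.
    In simulation the true parameters are known, so behaviors that reduce the
    estimation error (e.g., swiping a foot on the ground) earn more reward."""
    est_error = (est_params - true_params).abs().mean(dim=-1)
    return task_reward - accuracy_weight * est_error

def label_real_images(estimator, proprio_history, terrain_patches):
    """Self-supervised labeling during real-world traversal: the simulation-trained
    estimator labels the terrain patch the robot has just walked over."""
    with torch.no_grad():
        labels = estimator(proprio_history)         # e.g., friction per patch
    return list(zip(terrain_patches, labels))       # (image, physical-parameter) pairs

# Toy usage with random tensors standing in for logged robot data.
estimator = ParamEstimator()
proprio = torch.randn(8, 50, 48)                    # proprioceptive histories
patches = torch.randn(8, 3, 64, 64)                 # terrain image patches under the robot
dataset = label_real_images(estimator, proprio, patches)
reward = asmp_reward(torch.ones(8), estimator(proprio), torch.full((8, 1), 0.6))
```

The resulting (image, parameter) pairs can then supervise an ordinary vision model, which is how the small amount of real traversal data mentioned above could be turned into dense visual predictions.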
Related papers
- Learning Object Properties Using Robot Proprioception via Differentiable Robot-Object Interaction [52.12746368727368]
Differentiable simulation has become a powerful tool for system identification.
Our approach calibrates object properties by using information from the robot, without relying on data from the object itself.
We demonstrate the effectiveness of our method on a low-cost robotic platform.
arXiv Detail & Related papers (2024-10-04T20:48:38Z)
- Identifying Terrain Physical Parameters from Vision -- Towards Physical-Parameter-Aware Locomotion and Navigation [33.10872127224328]
We propose a cross-modal self-supervised learning framework for vision-based environmental physical parameter estimation.
We train a physical decoder in simulation to predict friction and stiffness from multi-modal input.
The trained network allows the labeling of real-world images with physical parameters in a self-supervised manner to further train a visual network during deployment.
arXiv Detail & Related papers (2024-08-29T14:35:14Z)
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
- Learning Semantics-Aware Locomotion Skills from Human Demonstration [35.996425893483796]
We present a framework that learns semantics-aware locomotion skills from perception for quadrupedal robots.
Our framework learns to adjust the speed and gait of the robot based on perceived terrain semantics, and enables the robot to walk over 6km without failure.
arXiv Detail & Related papers (2022-06-27T21:08:03Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- Learning Perceptual Locomotion on Uneven Terrains using Sparse Visual Observations [75.60524561611008]
This work aims to exploit the use of sparse visual observations to achieve perceptual locomotion over a range of commonly seen bumps, ramps, and stairs in human-centred environments.
We first formulate the selection of minimal visual input that can represent the uneven surfaces of interest, and propose a learning framework that integrates such exteroceptive and proprioceptive data.
We validate the learned policy in tasks that require omnidirectional walking over flat ground and forward locomotion over terrains with obstacles, showing a high success rate.
arXiv Detail & Related papers (2021-09-28T20:25:10Z)
- Physion: Evaluating Physical Prediction from Vision in Humans and Machines [46.19008633309041]
We present a visual and physical prediction benchmark that precisely measures this capability.
We compare an array of algorithms on their ability to make diverse physical predictions.
We find that graph neural networks with access to the physical state best capture human behavior.
arXiv Detail & Related papers (2021-06-15T16:13:39Z)
- Careful with That! Observation of Human Movements to Estimate Objects Properties [106.925705883949]
We focus on the features of human motor actions that communicate insights on the weight of an object.
Our final goal is to enable a robot to autonomously infer the degree of care required in object handling.
arXiv Detail & Related papers (2021-03-02T08:14:56Z)
- Where is my hand? Deep hand segmentation for visual self-recognition in humanoid robots [129.46920552019247]
We propose the use of a Convolutional Neural Network (CNN) to segment the robot hand from an image in an egocentric view.
We fine-tuned the Mask-RCNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
arXiv Detail & Related papers (2021-02-09T10:34:32Z)
- Learning to Identify Physical Parameters from Video Using Differentiable Physics [2.15242029196761]
We propose a differentiable physics engine within an action-conditional video representation network to learn a physical latent representation.
We demonstrate that our network can learn to encode images and identify physical properties like mass and friction from videos and action sequences.
arXiv Detail & Related papers (2020-09-17T13:36:57Z)
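The differentiable-physics idea in the last entry, identifying physical parameters by backpropagating through a simulator, can be illustrated with a toy example: fitting a friction coefficient to an observed sliding trajectory. This is a rough sketch under heavily simplified dynamics, not code from any of the papers listed; the dynamics, constants, and optimizer settings are assumptions.

```python
# Hedged toy sketch of differentiable-physics parameter identification:
# recover a friction coefficient by backpropagating through a simulated trajectory.
import torch

def simulate(mu, v0=2.0, dt=0.05, steps=40, g=9.81):
    """Differentiable 1-D slide: a block decelerates under Coulomb friction."""
    pos, vel, traj = torch.tensor(0.0), torch.tensor(v0), []
    for _ in range(steps):
        vel = torch.clamp(vel - mu * g * dt, min=0.0)   # friction deceleration
        pos = pos + vel * dt
        traj.append(pos)
    return torch.stack(traj)

true_mu = torch.tensor(0.4)
observed = simulate(true_mu)              # stands in for a trajectory tracked from video

mu = torch.tensor(0.1, requires_grad=True)  # initial guess for the friction coefficient
opt = torch.optim.Adam([mu], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = ((simulate(mu) - observed) ** 2).mean()
    loss.backward()
    opt.step()

print(f"recovered friction coefficient: {mu.item():.3f}")  # should end up close to 0.4
```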