Neural Volumetric Memory for Visual Locomotion Control
- URL: http://arxiv.org/abs/2304.01201v1
- Date: Mon, 3 Apr 2023 17:59:56 GMT
- Title: Neural Volumetric Memory for Visual Locomotion Control
- Authors: Ruihan Yang, Ge Yang, Xiaolong Wang
- Abstract summary: In this work, we consider the difficult problem of locomotion on challenging terrains using a single forward-facing depth camera.
To solve this problem, we follow the paradigm in computer vision that explicitly models the 3D geometry of the scene.
We show that our approach, which explicitly introduces geometric priors during training, offers superior performance compared to more naïve methods.
- Score: 11.871849736648237
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Legged robots have the potential to expand the reach of autonomy beyond paved
roads. In this work, we consider the difficult problem of locomotion on
challenging terrains using a single forward-facing depth camera. Due to the
partial observability of the problem, the robot has to rely on past
observations to infer the terrain currently beneath it. To solve this problem,
we follow the paradigm in computer vision that explicitly models the 3D
geometry of the scene and propose Neural Volumetric Memory (NVM), a geometric
memory architecture that explicitly accounts for the SE(3) equivariance of the
3D world. NVM aggregates feature volumes from multiple camera views by first
bringing them back to the ego-centric frame of the robot. We test the learned
visual-locomotion policy on a physical robot and show that our approach, which
explicitly introduces geometric priors during training, offers superior
performance compared to more naïve methods. We also include ablation studies and
show that the representations stored in the neural volumetric memory capture
sufficient geometric information to reconstruct the scene. Our project page
with videos is https://rchalyang.github.io/NVM .
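To make the ego-centric aggregation idea concrete, below is a minimal PyTorch sketch of warping per-view feature volumes into the robot's current frame with an SE(3) transform and fusing them. This is not the authors' implementation: the shapes, the averaging fusion, and the use of normalized grid coordinates are illustrative assumptions; the project page linked above is the authoritative reference.

```python
# Sketch only: illustrative shapes and fusion, not the NVM implementation from the paper.
import torch
import torch.nn.functional as F

def make_base_grid(D, H, W, device):
    """Normalized 3D sampling grid defined in the robot's current (ego-centric) frame."""
    zs, ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, D, device=device),
        torch.linspace(-1, 1, H, device=device),
        torch.linspace(-1, 1, W, device=device),
        indexing="ij",
    )
    return torch.stack([xs, ys, zs], dim=-1).reshape(-1, 3)  # (D*H*W, 3), (x, y, z) order

def warp_volume_to_ego(feat_vol, R, t):
    """Resample one past view's feature volume into the current ego-centric frame.

    feat_vol: (1, C, D, H, W) feature volume built from a past camera view.
    R, t:     rotation (3, 3) and translation (3,) mapping current-frame points
              into that past view's frame (the SE(3) relative pose).
    Note: applying a rigid transform directly to normalized coordinates is a
    simplification; a real model would handle metric scale explicitly.
    """
    _, _, D, H, W = feat_vol.shape
    pts = make_base_grid(D, H, W, feat_vol.device)   # points in the current frame
    pts_src = pts @ R.T + t                          # the same points in the past frame
    grid = pts_src.reshape(1, D, H, W, 3)            # grid_sample expects (x, y, z) order
    return F.grid_sample(feat_vol, grid, align_corners=True)

def aggregate_memory(volumes, transforms):
    """Fuse the warped volumes; plain averaging stands in for a learned fusion module."""
    warped = [warp_volume_to_ego(v, R, t) for v, (R, t) in zip(volumes, transforms)]
    return torch.stack(warped, dim=0).mean(dim=0)

if __name__ == "__main__":
    vols = [torch.randn(1, 16, 8, 16, 16) for _ in range(3)]  # three past views
    identity = (torch.eye(3), torch.zeros(3))                 # no relative motion
    fused = aggregate_memory(vols, [identity] * 3)
    print(fused.shape)  # torch.Size([1, 16, 8, 16, 16])
```

With identity transforms the fused volume simply averages the inputs; with the robot's actual relative poses, each past volume is resampled so that corresponding 3D locations line up before fusion, which is the property an SE(3)-aware memory is built around.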
Related papers
- OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation [30.76201018651464]
Traditional 3D scene understanding approaches rely on expensive labeled 3D datasets to train a model for a single task with supervision.
We propose OpenOcc, a novel framework that unifies 3D scene reconstruction and open-vocabulary understanding with neural radiance fields.
We show that our approach achieves competitive performance in 3D scene understanding tasks, especially for small and long-tail objects.
arXiv Detail & Related papers (2024-03-18T13:53:48Z)
- SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction [77.15924044466976]
We propose SelfOcc to explore a self-supervised way to learn 3D occupancy using only video sequences.
We first transform the images into 3D space (e.g., a bird's eye view) to obtain a 3D representation of the scene.
We can then render 2D images of previous and future frames as self-supervision signals to learn the 3D representations.
arXiv Detail & Related papers (2023-11-21T17:59:14Z)
- BAA-NGP: Bundle-Adjusting Accelerated Neural Graphics Primitives [6.431806897364565]
Implicit neural representations have become pivotal in robotic perception, enabling robots to comprehend 3D environments from 2D images.
We propose a framework called bundle-adjusting accelerated neural graphics primitives (BAA-NGP).
Results demonstrate a 10 to 20x speed improvement over other bundle-adjusting neural radiance field methods.
arXiv Detail & Related papers (2023-06-07T05:36:45Z)
- Visibility Aware Human-Object Interaction Tracking from Single RGB Camera [40.817960406002506]
We propose a novel method to track the 3D human, object, contacts between them, and their relative translation across frames from a single RGB camera.
We condition our neural field reconstructions for human and object on per-frame SMPL model estimates obtained by pre-fitting SMPL to a video sequence.
Human and object motion from visible frames provides valuable information to infer the occluded object.
arXiv Detail & Related papers (2023-03-29T06:23:44Z)
- BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects [89.2314092102403]
We present a near real-time method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence.
Our method works for arbitrary rigid objects, even when visual texture is largely absent.
arXiv Detail & Related papers (2023-03-24T17:13:49Z)
- One-Shot Neural Fields for 3D Object Understanding [112.32255680399399]
We present a unified and compact scene representation for robotics.
Each object in the scene is depicted by a latent code capturing geometry and appearance.
This representation can be decoded for various tasks such as novel view rendering, 3D reconstruction, and stable grasp prediction.
arXiv Detail & Related papers (2022-10-21T17:33:14Z)
- Neural Groundplans: Persistent Neural Scene Representations from a Single Image [90.04272671464238]
We present a method to map 2D image observations of a scene to a persistent 3D scene representation.
We propose conditional neural groundplans as persistent and memory-efficient scene representations.
arXiv Detail & Related papers (2022-07-22T17:41:24Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- 3D Neural Scene Representations for Visuomotor Control [78.79583457239836]
We learn models for dynamic 3D scenes purely from 2D visual observations.
A dynamics model, constructed over the learned representation space, enables visuomotor control for challenging manipulation tasks.
arXiv Detail & Related papers (2021-07-08T17:49:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.