Unsupervised Learning of Depth Estimation and Visual Odometry for Sparse
Light Field Cameras
- URL: http://arxiv.org/abs/2103.11322v1
- Date: Sun, 21 Mar 2021 07:13:14 GMT
- Title: Unsupervised Learning of Depth Estimation and Visual Odometry for Sparse
Light Field Cameras
- Authors: S. Tejaswi Digumarti (1 and 2), Joseph Daniel (1), Ahalya Ravendran (1
and 2), Donald G. Dansereau (1 and 2) ((1) School of Aerospace, Mechanical
and Mechatronic Engineering, The University of Sydney, (2) Sydney Institute
for Robotics and Intelligent Systems)
- Abstract summary: We generalise techniques from unsupervised learning to allow a robot to autonomously interpret new kinds of cameras.
We consider emerging sparse light field (LF) cameras, which capture a subset of the 4D LF function describing the set of light rays passing through a plane.
We introduce a generalised encoding of sparse LFs that allows unsupervised learning of odometry and depth.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While an exciting diversity of new imaging devices is emerging that could
dramatically improve robotic perception, the challenges of calibrating and
interpreting these cameras have limited their uptake in the robotics community.
In this work we generalise techniques from unsupervised learning to allow a
robot to autonomously interpret new kinds of cameras. We consider emerging
sparse light field (LF) cameras, which capture a subset of the 4D LF function
describing the set of light rays passing through a plane. We introduce a
generalised encoding of sparse LFs that allows unsupervised learning of
odometry and depth. We demonstrate the proposed approach outperforming
monocular and conventional techniques for dealing with 4D imagery, yielding
more accurate odometry and depth maps and delivering these with metric scale.
We anticipate our technique to generalise to a broad class of LF and sparse LF
cameras, and to enable unsupervised recalibration for coping with shifts in
camera behaviour over the lifetime of a robot. This work represents a first
step toward streamlining the integration of new kinds of imaging devices in
robotics applications.
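The abstract describes the recipe only at a high level: encode the sparse LF's sub-aperture views, predict depth and camera motion, and supervise both without labels by photometric warping. The minimal PyTorch sketch below illustrates that pattern; the channel-stacking encoder, the pinhole warp, and all function names are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch of unsupervised depth + odometry learning from a sparse light field.
# Assumption: the sparse LF is a small set of sub-aperture views stacked as channels.
import torch
import torch.nn.functional as F

def encode_sparse_lf(views):
    """views: list of (B, 3, H, W) sub-aperture images -> (B, 3 * len(views), H, W)."""
    return torch.cat(views, dim=1)

def warp(src, depth, pose, K):
    """Warp `src` (B, 3, H, W) into the target frame given predicted depth (B, 1, H, W),
    a relative pose (B, 4, 4) and intrinsics K (B, 3, 3)."""
    B, _, H, W = src.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float().view(1, 3, -1).to(src)
    cam = torch.linalg.inv(K) @ pix * depth.view(B, 1, -1)              # back-project
    cam_h = torch.cat([cam, torch.ones(B, 1, H * W, device=src.device)], dim=1)
    proj = K @ (pose @ cam_h)[:, :3]                                     # transform + project
    uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)
    u = 2 * uv[:, 0] / (W - 1) - 1                                       # normalise for grid_sample
    v = 2 * uv[:, 1] / (H - 1) - 1
    grid = torch.stack([u, v], dim=-1).view(B, H, W, 2)
    return F.grid_sample(src, grid, align_corners=True)

def photometric_loss(target, src, depth, pose, K):
    """Self-supervision: the source frame warped by (depth, pose) should match the target.
    Applying this across views with known baselines can also pin down metric scale."""
    return (warp(src, depth, pose, K) - target).abs().mean()
```

In a full pipeline, depth and pose networks would consume the stacked encoding; applying the loss across the LF's sub-aperture views, whose baselines are known, is presumably what supplies the metric scale the abstract reports.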
Related papers
- Homography Estimation in Complex Topological Scenes [6.023710971800605]
Surveillance videos and images are used for a broad set of applications, ranging from traffic analysis to crime detection.
Extrinsic camera calibration data is important for most analysis applications.
We present an automated camera-calibration process leveraging a dictionary-based approach that does not require prior knowledge of any camera settings.
arXiv Detail & Related papers (2023-08-02T11:31:43Z)
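For the homography entry above, the generic building block is estimating a 3x3 plane-to-plane mapping from point correspondences. The sketch below does this with OpenCV on a few hypothetical correspondences; it does not reproduce the paper's dictionary-based, settings-free calibration pipeline.

```python
# Generic homography estimation from point correspondences (illustrative only).
import cv2
import numpy as np

# Hypothetical matches between image pixels and a known ground-plane template (metres).
src_pts = np.array([[100, 200], [400, 210], [390, 600], [120, 580]], dtype=np.float32)
dst_pts = np.array([[0, 0], [30, 0], [30, 20], [0, 20]], dtype=np.float32)

# With noisy automatic matches one would pass cv2.RANSAC and a reprojection threshold.
H, mask = cv2.findHomography(src_pts, dst_pts)

# Map an image point onto the ground plane with the estimated homography.
p = np.array([250.0, 400.0, 1.0])
q = H @ p
print("ground-plane coordinates:", q[:2] / q[2])
```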
- A Flexible Framework for Virtual Omnidirectional Vision to Improve Operator Situation Awareness [2.817412580574242]
We present a flexible framework for virtual projections to increase situation awareness based on a novel method to fuse multiple cameras mounted anywhere on the robot.
We propose a complementary approach to improve scene understanding by fusing camera images and geometric 3D Lidar data to obtain a colorized point cloud.
arXiv Detail & Related papers (2023-02-01T10:40:05Z)
- SPARF: Neural Radiance Fields from Sparse and Noisy Poses [58.528358231885846]
We introduce Sparse Pose Adjusting Radiance Field (SPARF) to address the challenge of novel-view synthesis.
Our approach exploits multi-view geometry constraints in order to jointly learn the NeRF and refine the camera poses.
arXiv Detail & Related papers (2022-11-21T18:57:47Z)
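The joint optimisation pattern SPARF describes can be sketched as learnable se(3) pose corrections optimised together with the radiance-field weights. The example below is deliberately toy-sized so it runs: the "rendering" loss is a placeholder, where a real implementation would render rays through each refined pose and add SPARF's multi-view correspondence loss.

```python
# Sketch: refine noisy camera poses jointly with a (toy) radiance field.
import torch

def se3_to_matrix(xi):
    """First-order exponential map: 6-vector (rotation, translation) -> 4x4 pose."""
    w, v = xi[:3], xi[3:]
    zero = torch.zeros((), dtype=xi.dtype)
    Omega = torch.stack([
        torch.stack([zero, -w[2], w[1]]),
        torch.stack([w[2], zero, -w[0]]),
        torch.stack([-w[1], w[0], zero]),
    ])
    R = torch.eye(3) + Omega                          # small-angle approximation
    top = torch.cat([R, v.view(3, 1)], dim=1)
    return torch.cat([top, torch.tensor([[0.0, 0.0, 0.0, 1.0]])], dim=0)

N = 3
noisy_poses = [torch.eye(4) for _ in range(N)]        # stand-ins for noisy initial estimates
pose_deltas = torch.nn.Parameter(torch.zeros(N, 6))   # learnable se(3) corrections
field = torch.nn.Sequential(                          # toy stand-in for a NeRF MLP
    torch.nn.Linear(3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 3))

optimiser = torch.optim.Adam([pose_deltas, *field.parameters()], lr=1e-3)
for step in range(50):
    loss = torch.zeros(())
    for i in range(N):
        pose = se3_to_matrix(pose_deltas[i]) @ noisy_poses[i]   # refined pose
        # Placeholder loss: a real system renders rays through `pose` and compares
        # against the observed image (plus a multi-view correspondence term).
        loss = loss + (field(pose[:3, 3]) - 0.5).pow(2).mean()
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
```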
- NOCaL: Calibration-Free Semi-Supervised Learning of Odometry and Camera Intrinsics [2.298932494750101]
We present NOCaL, Neural Odometry and Calibration using Light fields, a semi-supervised learning architecture capable of interpreting previously unseen cameras without calibration.
We demonstrate NOCaL on rendered and captured imagery using conventional cameras, demonstrating calibration-free odometry and novel view synthesis.
arXiv Detail & Related papers (2022-10-14T00:34:43Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
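As a geometric point of reference for the terrain entry above, the sketch below fuses posed depth images into a simple 2.5D height map around the robot. It is a naive baseline rather than the paper's learned reconstruction model; intrinsics, grid size, and cell resolution are made-up example values.

```python
# Sketch: accumulate posed depth measurements into a local height map (NumPy).
import numpy as np

def backproject_depth(depth, K, T_world_cam):
    """Depth image (H, W) + intrinsics K + camera-to-world pose -> (N, 3) world points."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T    # (3, N)
    cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
    return (T_world_cam @ cam_h)[:3].T

def update_height_map(height_map, points, origin, cell=0.05):
    """Keep the highest observed z in each grid cell near the robot."""
    ij = np.floor((points[:, :2] - origin) / cell).astype(int)
    H, W = height_map.shape
    ok = (ij[:, 0] >= 0) & (ij[:, 0] < H) & (ij[:, 1] >= 0) & (ij[:, 1] < W)
    for (i, j), z in zip(ij[ok], points[ok, 2]):
        height_map[i, j] = max(height_map[i, j], z)
    return height_map

K = np.array([[200.0, 0, 64], [0, 200.0, 48], [0, 0, 1]])                # example intrinsics
depth = np.full((96, 128), 2.0)                                          # synthetic depth frame
pts = backproject_depth(depth, K, np.eye(4))
hmap = update_height_map(np.full((100, 100), -np.inf), pts, origin=np.array([-2.5, -2.5]))
```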
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves the state-of-the-art performance on the challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
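The cross-view fusion idea can be illustrated with a single attention layer over feature tokens pooled from all surrounding cameras, as in the hypothetical sketch below; dimensions and module choices are assumptions, not the published architecture.

```python
# Sketch: attention-based fusion of features from multiple surround cameras.
import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):
    """Every camera's feature tokens attend to the tokens of all other cameras."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats):
        # feats: (B, num_cams, tokens_per_cam, dim), i.e. flattened per-camera feature maps
        B, C, T, D = feats.shape
        tokens = feats.reshape(B, C * T, D)            # pool all cameras' tokens together
        fused, _ = self.attn(tokens, tokens, tokens)   # each token sees every view
        tokens = self.norm(tokens + fused)             # residual connection
        return tokens.reshape(B, C, T, D)

feats = torch.randn(2, 6, 12 * 20, 256)                # 6 surround cameras, 12x20 feature grid
print(CrossViewFusion()(feats).shape)                  # torch.Size([2, 6, 240, 256])
```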
- Full Surround Monodepth from Multiple Cameras [31.145598985137468]
We extend self-supervised monocular depth and ego-motion estimation to large-baseline multi-camera rigs.
We learn a single network generating dense, consistent, and scale-aware point clouds that cover the same full surround 360 degree field of view as a typical LiDAR scanner.
arXiv Detail & Related papers (2021-03-31T22:52:04Z)
- Goal-Auxiliary Actor-Critic for 6D Robotic Grasping with Point Clouds [62.013872787987054]
We propose a new method for learning closed-loop control policies for 6D grasping.
Our policy takes a segmented point cloud of an object from an egocentric camera as input, and outputs continuous 6D control actions of the robot gripper for grasping the object.
arXiv Detail & Related papers (2020-10-02T07:42:00Z)
- Neural Ray Surfaces for Self-Supervised Learning of Depth and Ego-motion [51.19260542887099]
We show that self-supervision can be used to learn accurate depth and ego-motion estimation without prior knowledge of the camera model.
Inspired by the geometric model of Grossberg and Nayar, we introduce Neural Ray Surfaces (NRS), convolutional networks that represent pixel-wise projection rays.
We demonstrate the use of NRS for self-supervised learning of visual odometry and depth estimation from raw videos obtained using a wide variety of camera systems.
arXiv Detail & Related papers (2020-08-15T02:29:13Z)
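The ray-surface idea can be sketched as a small convolutional network that outputs a unit projection ray per pixel, so depth maps can be lifted to 3D without a parametric camera model; the architecture below is an illustrative stand-in, not the authors' network.

```python
# Sketch: learn a per-pixel projection ray instead of assuming a pinhole model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RaySurfaceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),             # one 3D ray direction per pixel
        )

    def forward(self, image):
        return F.normalize(self.net(image), dim=1)      # unit rays, (B, 3, H, W)

def backproject(depth, rays):
    """Lift a depth map to camera-frame 3D points using the learned rays."""
    return depth * rays

image = torch.randn(1, 3, 64, 64)
depth = torch.rand(1, 1, 64, 64)
points = backproject(depth, RaySurfaceNet()(image))     # (1, 3, 64, 64)
```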
- Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline to identify whether a recalibration of the camera's intrinsic parameters is required or not.
arXiv Detail & Related papers (2020-05-24T10:32:49Z)
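The detection task reduces to a per-image binary decision: do the current intrinsics still fit? A hypothetical minimal classifier is sketched below; the single-image input, architecture, and labels are assumptions rather than the paper's exact pipeline, which builds on a proposed miscalibration metric and semi-synthetic data.

```python
# Sketch: a small CNN that flags whether intrinsic recalibration is required.
import torch
import torch.nn as nn

class MiscalibrationDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)                    # logit: recalibration required?

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = MiscalibrationDetector()
images = torch.randn(8, 3, 224, 224)                    # frames undistorted with current intrinsics
labels = torch.randint(0, 2, (8, 1)).float()            # 1 = intrinsics have drifted
loss = nn.BCEWithLogitsLoss()(model(images), labels)
loss.backward()
```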
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.