Weakly-Supervised Optical Flow Estimation for Time-of-Flight
- URL: http://arxiv.org/abs/2210.05298v1
- Date: Tue, 11 Oct 2022 09:47:23 GMT
- Title: Weakly-Supervised Optical Flow Estimation for Time-of-Flight
- Authors: Michael Schelling, Pedro Hermosilla, Timo Ropinski
- Abstract summary: We propose a training algorithm that allows supervising Optical Flow networks directly on the reconstructed depth.
We demonstrate that this approach enables the training of OF networks to align raw iToF measurements and compensate for motion artifacts in the iToF depth images.
- Score: 11.496094830445054
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Indirect Time-of-Flight (iToF) cameras are a widespread type of 3D sensor,
which perform multiple captures to obtain depth values of the captured scene.
While recent approaches to correct iToF depths achieve high performance when
removing multi-path-interference and sensor noise, little research has been
done to tackle motion artifacts. In this work we propose a training algorithm
that allows supervising Optical Flow (OF) networks directly on the
reconstructed depth, without the need for ground-truth flows. We demonstrate
that this approach enables OF networks to be trained to align raw iToF
measurements and compensate for motion artifacts in the iToF depth images. The
approach is evaluated for both single- and multi-frequency sensors as well as
multi-tap sensors, and is able to outperform other motion compensation
techniques.
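To make the idea concrete, below is a minimal sketch of how such a depth-level supervision signal can be built: the phase-shifted iToF captures are taken sequentially, so a flow network can warp the later captures onto the first before the standard 4-phase depth reconstruction, and a loss on the reconstructed depth then back-propagates into the flow network without any ground-truth flow. The interfaces (`flow_net`, `reference_depth`), the 20 MHz modulation frequency and the 4-phase sign convention are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (assumed interfaces, not the authors' exact pipeline):
# supervise an optical-flow network through a differentiable iToF depth
# reconstruction instead of through ground-truth flow.
import math
import torch
import torch.nn.functional as F

C_LIGHT = 299_792_458.0   # speed of light [m/s]
FREQ = 20e6               # assumed modulation frequency [Hz]

def warp(img, flow):
    """Backward-warp `img` (B,1,H,W) with `flow` (B,2,H,W) via bilinear sampling."""
    _, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(img.device)   # (2,H,W) pixel grid
    grid = base + flow                                            # sampling locations
    gx = 2.0 * grid[:, 0] / (w - 1) - 1.0                         # normalise to [-1,1]
    gy = 2.0 * grid[:, 1] / (h - 1) - 1.0
    return F.grid_sample(img, torch.stack((gx, gy), dim=-1),
                         mode="bilinear", align_corners=True)

def depth_from_correlations(c0, c90, c180, c270):
    """4-phase iToF reconstruction (one common sign convention): phase -> depth."""
    phase = torch.remainder(torch.atan2(c270 - c90, c0 - c180), 2 * math.pi)
    return C_LIGHT * phase / (4 * math.pi * FREQ)

def weak_depth_loss(flow_net, captures, reference_depth):
    """Align captures 2..4 to the first with predicted flows, reconstruct depth,
    and penalise the depth error; no ground-truth flow is required."""
    c0, c90, c180, c270 = captures                    # sequential (B,1,H,W) captures
    aligned = [c0]
    for c in (c90, c180, c270):
        flow = flow_net(torch.cat((c0, c), dim=1))    # flow on c0's grid, pointing into c
        aligned.append(warp(c, flow))
    depth = depth_from_correlations(*aligned)
    return F.l1_loss(depth, reference_depth)
```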
Related papers
- UFD-PRiME: Unsupervised Joint Learning of Optical Flow and Stereo Depth through Pixel-Level Rigid Motion Estimation [4.445751695675388]
Both optical flow and stereo disparities are image correspondences and can therefore benefit from joint training.
We design a first network that estimates flow and disparity jointly and is trained without supervision.
A second network, trained with optical flow from the first as pseudo-labels, takes disparities from the first network, estimates 3D rigid motion at every pixel, and reconstructs optical flow again.
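The rigid-motion-to-flow step reads naturally as a geometric operation: back-project each pixel with its depth, apply the per-pixel rigid motion, and re-project. A minimal sketch of that reconstruction follows; the array shapes and the assumption of known intrinsics are illustrative, not taken from the paper.

```python
# Minimal sketch of reconstructing optical flow from per-pixel rigid motion
# (generic geometry with assumed interfaces, not the paper's code).
import numpy as np

def flow_from_rigid_motion(depth, K, R, t):
    """depth: (H,W) metric depth; K: (3,3) intrinsics;
    R: (H,W,3,3) per-pixel rotations; t: (H,W,3) per-pixel translations."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack((xs, ys, np.ones_like(xs)), axis=-1).astype(np.float64)  # (H,W,3)
    rays = pix @ np.linalg.inv(K).T                  # back-projected viewing rays
    points = rays * depth[..., None]                 # 3D points in the camera frame
    moved = np.einsum("hwij,hwj->hwi", R, points) + t  # apply per-pixel rigid motion
    proj = moved @ K.T
    proj = proj[..., :2] / proj[..., 2:3]            # perspective division
    return proj - pix[..., :2]                       # optical flow (H,W,2)
```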
arXiv Detail & Related papers (2023-10-07T07:08:25Z)
- Multi-Modal Neural Radiance Field for Monocular Dense SLAM with a Light-Weight ToF Sensor [58.305341034419136]
We present the first dense SLAM system with a monocular camera and a light-weight ToF sensor.
We propose a multi-modal implicit scene representation that supports rendering both the signals from the RGB camera and light-weight ToF sensor.
Experiments demonstrate that our system well exploits the signals of light-weight ToF sensors and achieves competitive results.
arXiv Detail & Related papers (2023-08-28T07:56:13Z) - Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized
Photography [54.36608424943729]
We show that in a "long-burst", forty-two 12-megapixel RAW frames captured in a two-second sequence, there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
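A minimal sketch of the test-time-optimisation idea follows; it uses a plain per-pixel inverse-depth map, translation-only camera motion and an L1 photometric loss as simplifying assumptions, whereas the paper fits a neural RGB-D representation and full camera poses.

```python
# Minimal test-time-optimisation sketch (simplified assumptions, not the paper's model):
# jointly fit inverse depth and per-frame camera translation to a burst.
import torch
import torch.nn.functional as F

def fit_burst(frames, K, iters=500, lr=1e-2):
    """frames: (N,3,H,W) burst tensor, frame 0 is the reference; K: (3,3) intrinsics."""
    n, _, h, w = frames.shape
    inv_depth = torch.full((h, w), 0.5, requires_grad=True)   # per-pixel inverse depth
    trans = torch.zeros(n, 3, requires_grad=True)             # per-frame translation (frame 0 stays 0)
    opt = torch.optim.Adam([inv_depth, trans], lr=lr)

    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    pix = torch.stack((xs, ys, torch.ones_like(xs)), 0).float()   # (3,H,W) homogeneous pixels
    rays = torch.linalg.inv(K) @ pix.reshape(3, -1)               # (3,H*W) camera rays

    for _ in range(iters):
        opt.zero_grad()
        pts0 = rays / inv_depth.clamp(min=1e-3).reshape(1, -1)    # 3D points seen from frame 0
        loss = 0.0
        for i in range(1, n):
            pts_i = pts0 + trans[i].reshape(3, 1)                 # move into frame i (no rotation)
            proj = K @ pts_i
            uv = proj[:2] / proj[2:].clamp(min=1e-6)              # projected pixels in frame i
            grid = torch.stack((2 * uv[0] / (w - 1) - 1,
                                2 * uv[1] / (h - 1) - 1), -1).reshape(1, h, w, 2)
            warped = F.grid_sample(frames[i:i+1], grid, align_corners=True)
            loss = loss + (warped - frames[0:1]).abs().mean()     # photometric consistency
        loss.backward()
        opt.step()
    return 1.0 / inv_depth.detach().clamp(min=1e-3), trans.detach()
```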
arXiv Detail & Related papers (2022-12-22T18:54:34Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
- Wild ToFu: Improving Range and Quality of Indirect Time-of-Flight Depth with RGB Fusion in Challenging Environments [56.306567220448684]
We propose a new learning-based end-to-end depth prediction network which takes noisy raw I-ToF signals as well as an RGB image as input.
We show more than 40% RMSE improvement on the final depth map compared to the baseline approach.
arXiv Detail & Related papers (2021-12-07T15:04:14Z)
- TöRF: Time-of-Flight Radiance Fields for Dynamic Scene View Synthesis [32.878225196378374]
We introduce a neural representation based on an image formation model for continuous-wave ToF cameras.
We show that this approach improves robustness of dynamic scene reconstruction to erroneous calibration and large motions.
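For reference, the continuous-wave ToF image formation that such a model builds on is usually written as the homodyne correlation below; the symbols are generic (phase offset θ, amplitude A, ambient offset B, modulation frequency f, depth d, speed of light c) and are not taken from the paper.

```latex
% Standard continuous-wave ToF correlation model (generic notation):
C_\theta(x) = B(x) + A(x)\,\cos\!\bigl(\varphi(x) + \theta\bigr),
\qquad
\varphi(x) = \frac{4\pi f\, d(x)}{c}
\;\Longleftrightarrow\;
d(x) = \frac{c\,\varphi(x)}{4\pi f}.
```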
arXiv Detail & Related papers (2021-09-30T17:12:59Z)
- Thermal Image Processing via Physics-Inspired Deep Networks [21.094006629684376]
DeepIR combines physically accurate sensor modeling with deep network-based image representation.
DeepIR requires neither training data nor periodic ground-truth calibration with a known black body target.
Simulated and real data experiments demonstrate that DeepIR can perform high-quality non-uniformity correction with as few as three images.
arXiv Detail & Related papers (2021-08-18T04:57:48Z)
- Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras report brightness changes as a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data, such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z)
- Self-Attention Dense Depth Estimation Network for Unrectified Video Sequences [6.821598757786515]
LiDAR and radar sensors are common hardware solutions for real-time depth estimation.
Deep learning based self-supervised depth estimation methods have shown promising results.
We propose a self-attention based depth and ego-motion network for unrectified images.
arXiv Detail & Related papers (2020-05-28T21:53:53Z)
- Video Depth Estimation by Fusing Flow-to-Depth Proposals [65.24533384679657]
We present an approach with a differentiable flow-to-depth layer for video depth estimation.
The model consists of a flow-to-depth layer, a camera pose refinement module, and a depth fusion network.
Our approach outperforms state-of-the-art depth estimation methods and has reasonable cross-dataset generalization capability.
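The flow-to-depth conversion itself is classical two-view triangulation; a minimal sketch under the assumption of a known relative pose (R, t) and intrinsics K follows (generic geometry, not the paper's differentiable layer).

```python
# Minimal sketch: convert optical flow to depth by two-view triangulation.
# R, t map points from the source camera into the target camera (assumed known).
import numpy as np

def depth_from_flow(flow, K, R, t):
    """flow: (H,W,2) source->target flow; K: (3,3); R: (3,3); t: (3,)."""
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    x0 = np.stack((xs, ys, np.ones_like(xs)), -1).astype(np.float64)  # (H,W,3)
    x1 = x0.copy()
    x1[..., :2] += flow                                               # matched target pixels
    Kinv = np.linalg.inv(K)
    r0 = x0 @ Kinv.T                                                  # rays in the source camera
    r1 = x1 @ Kinv.T                                                  # rays in the target camera
    # Solve [r1, -R r0] [d1, d0]^T = t per pixel in a least-squares sense.
    A = np.stack((r1, -(r0 @ R.T)), axis=-1)                          # (H,W,3,2)
    AtA = np.einsum("hwik,hwil->hwkl", A, A)                          # (H,W,2,2)
    Atb = np.einsum("hwik,i->hwk", A, t)                              # (H,W,2)
    d = np.linalg.solve(AtA, Atb)                                     # [d1, d0] per pixel
    return d[..., 1]                                                  # depth along the source ray
```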
arXiv Detail & Related papers (2019-12-30T10:45:57Z)