RADU: Ray-Aligned Depth Update Convolutions for ToF Data Denoising
- URL: http://arxiv.org/abs/2111.15513v1
- Date: Tue, 30 Nov 2021 15:53:28 GMT
- Title: RADU: Ray-Aligned Depth Update Convolutions for ToF Data Denoising
- Authors: Michael Schelling, Pedro Hermosilla, Timo Ropinski
- Abstract summary: Time-of-Flight (ToF) cameras are subject to high levels of noise and distortions due to Multi-Path-Interference (MPI).
We propose an iterative denoising approach operating in 3D space, designed to learn on 2.5D data by enabling 3D point convolutions to correct the points' positions along the view direction.
We demonstrate that our method outperforms SOTA methods on several datasets, including two real-world datasets and a new large-scale synthetic dataset introduced in this paper.
- Score: 8.142947808507369
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time-of-Flight (ToF) cameras are subject to high levels of noise and distortions due to Multi-Path-Interference (MPI). While recent research showed that 2D neural networks are able to outperform previous traditional State-of-the-Art (SOTA) methods on denoising ToF data, little research on learning-based approaches has been done to make direct use of the 3D information present in depth images. In this paper, we propose an iterative denoising approach operating in 3D space, designed to learn on 2.5D data by enabling 3D point convolutions to correct the points' positions along the view direction. As labeled real-world data is scarce for this task, we further train our network with a self-training approach on unlabeled real-world data to account for real-world statistics. We demonstrate that our method outperforms SOTA methods on several datasets, including two real-world datasets and a new large-scale synthetic dataset introduced in this paper.
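To make the core idea concrete, below is a minimal sketch of a ray-aligned depth update step: points are lifted from the depth map, a learned per-point network predicts a scalar correction, and each point moves only along its fixed view ray, so the data stays 2.5D across iterations. The paper realizes this with 3D point convolutions; the stand-in MLP, the intrinsics handling, and the iteration count below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a ray-aligned depth update (PyTorch). The MLP is an
# untrained stand-in for the paper's 3D point convolutions; intrinsics
# fx, fy, cx, cy and num_iters are illustrative placeholders.
import torch
import torch.nn as nn

def view_rays(h, w, fx, fy, cx, cy):
    """Unit view directions through every pixel of an (h, w) image."""
    v, u = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    x = (u - cx) / fx                        # normalized image-plane coords
    y = (v - cy) / fy
    rays = torch.stack([x, y, torch.ones_like(x)], dim=-1).reshape(-1, 3)
    return rays / rays.norm(dim=-1, keepdim=True)

class DepthUpdateStep(nn.Module):
    """Predicts one scalar depth correction per point; the point is then
    moved strictly along its view ray, keeping the data 2.5D."""
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(            # stand-in for a point convolution
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, points):               # points: (N, 3)
        return self.mlp(points).squeeze(-1)  # per-point depth residual

def radu_style_denoise(depth, fx, fy, cx, cy, num_iters=3):
    """Iteratively refines a noisy (h, w) float depth map along view rays."""
    step = DepthUpdateStep()                 # untrained here; learned in practice
    rays = view_rays(*depth.shape, fx, fy, cx, cy)
    d = depth.reshape(-1).clone()
    for _ in range(num_iters):
        points = rays * d.unsqueeze(-1)      # current 3D point positions
        d = d + step(points)                 # update only along the ray
    return d.reshape(depth.shape)
```

The constraint the sketch illustrates is that the network never moves a point off its camera ray, so the refined point cloud can always be re-read as a depth image.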
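The abstract further mentions self-training on unlabeled real-world captures to account for real-world statistics. The loop below is a generic teacher-student pseudo-labeling sketch, assuming `denoiser` is an nn.Module mapping a depth map to a refined depth map; the loss, input perturbation, and teacher handling are assumptions here, as the abstract does not specify the actual scheme.

```python
# Hedged sketch of self-training on unlabeled real depth maps. A frozen
# copy of the current model supplies pseudo-labels; the L1 loss and the
# input perturbation are assumptions, not the paper's scheme.
import copy
import torch

def self_train(denoiser, unlabeled_depths, steps=1000, lr=1e-4):
    teacher = copy.deepcopy(denoiser).eval()      # frozen pseudo-labeler
    for p in teacher.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(denoiser.parameters(), lr=lr)
    for step in range(steps):
        depth = unlabeled_depths[step % len(unlabeled_depths)]
        with torch.no_grad():
            pseudo_label = teacher(depth)         # teacher's denoised depth
        # Perturbing the student input keeps the objective non-trivial
        # when student and teacher start from the same weights (assumption).
        noisy = depth + 0.01 * torch.randn_like(depth)
        loss = (denoiser(noisy) - pseudo_label).abs().mean()  # L1 to pseudo-label
        opt.zero_grad()
        loss.backward()
        opt.step()
    return denoiser
```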
Related papers
- Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance [11.090775523892074]
We introduce a novel semi-supervised framework to alleviate the dependency on densely annotated data.
Our approach leverages 2D foundation models to generate essential 3D scene geometric and semantic cues.
Our method achieves up to 85% of the fully-supervised performance using only 10% labeled data.
arXiv Detail & Related papers (2024-08-21T12:13:18Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments consistently demonstrates our method's superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection [46.041193889845474]
Ray Denoising is an innovative method that enhances detection accuracy by strategically sampling along camera rays to construct hard negative examples (see the sketch after this list).
Ray Denoising is designed as a plug-and-play module, compatible with any DETR-style multi-view 3D detectors.
It achieves a 1.9% improvement in mean Average Precision (mAP) over the state-of-the-art StreamPETR method on the NuScenes dataset.
arXiv Detail & Related papers (2024-02-06T02:17:44Z)
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- Leveraging Neural Radiance Fields for Uncertainty-Aware Visual Localization [56.95046107046027]
We propose to leverage Neural Radiance Fields (NeRF) to generate training samples for scene coordinate regression.
Despite NeRF's efficiency in rendering, much of the rendered data is polluted by artifacts or contains only minimal information gain.
arXiv Detail & Related papers (2023-10-10T20:11:13Z)
- Learning-based Point Cloud Registration for 6D Object Pose Estimation in the Real World [55.7340077183072]
We tackle the task of estimating the 6D pose of an object from point cloud data.
Recent learning-based approaches to this task have shown great success on synthetic datasets but fail to generalize to real-world data.
We analyze the causes of these failures, which we trace back to the difference between the feature distributions of the source and target point clouds.
arXiv Detail & Related papers (2022-03-29T07:55:04Z)
- Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
- Exploring Deep 3D Spatial Encodings for Large-Scale 3D Scene Understanding [19.134536179555102]
We propose an alternative approach that overcomes the limitations of CNN-based approaches by encoding the spatial features of raw 3D point clouds into undirected graph models.
The proposed method achieves accuracy on par with the state of the art, with improved training time and model stability, indicating strong potential for further research.
arXiv Detail & Related papers (2020-11-29T12:56:19Z)
- Bridging the Reality Gap for Pose Estimation Networks using Sensor-Based Domain Randomization [1.4290119665435117]
Methods trained on synthetic data use 2D images, as domain randomization in 2D is more developed.
Our method integrates the 3D data into the network to increase the accuracy of the pose estimation.
Experiments on three large pose estimation benchmarks show that the presented method outperforms previous methods trained on synthetic data.
arXiv Detail & Related papers (2020-11-17T09:12:11Z)
- SelfVoxeLO: Self-supervised LiDAR Odometry with Voxel-based Deep Neural Networks [81.64530401885476]
We propose a self-supervised LiDAR odometry method, dubbed SelfVoxeLO, to tackle these two difficulties.
Specifically, we propose a 3D convolution network to process the raw LiDAR data directly, which extracts features that better encode the 3D geometric patterns.
We evaluate our method's performance on two large-scale datasets, i.e., KITTI and Apollo-SouthBay.
arXiv Detail & Related papers (2020-10-19T09:23:39Z)
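As referenced in the Ray Denoising entry above, here is a minimal, hedged sketch of depth-aware hard negative sampling along a camera ray: candidate points share the ray through a ground-truth object center but sit at perturbed depths, so they project to the same pixel and act as hard negatives. The offset range, sample count, and function name are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of depth-perturbed hard negatives along a camera ray.
# The offsets, count, and depth clamp are illustrative assumptions.
import torch

def ray_hard_negatives(gt_center, num_samples=8, max_offset=2.0):
    """gt_center: (3,) object center in camera coordinates.
    Returns (num_samples, 3) candidates on the same camera ray at
    shifted depths; they project to the same pixel as the positive."""
    depth = gt_center.norm()
    direction = gt_center / depth                 # unit view ray
    offsets = torch.linspace(-max_offset, max_offset, num_samples)
    depths = (depth + offsets).clamp(min=0.1)     # keep in front of the camera
    return depths.unsqueeze(-1) * direction
```

In a DETR-style detector, such points would presumably seed extra denoising queries that the model must learn to reject, which is what makes them hard negatives.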
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.