Converting Depth Images and Point Clouds for Feature-based Pose Estimation
- URL: http://arxiv.org/abs/2310.14924v1
- Date: Mon, 23 Oct 2023 13:29:42 GMT
- Title: Converting Depth Images and Point Clouds for Feature-based Pose Estimation
- Authors: Robert Lösch (1), Mark Sastuba (2), Jonas Toth (1), Bernhard Jung (1) ((1) Technical University Bergakademie Freiberg, Germany, (2) German Centre for Rail Traffic Research at the Federal Railway Authority, Germany)
- Abstract summary: This paper presents a method of converting depth data into images that visualize spatial details largely hidden in traditional depth images.
Compared to Bearing Angle images, our method yields brighter, higher-contrast images with more visible contours and more details.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In recent years, depth sensors have become more and more affordable and have found their way into a growing number of robotic systems. However, mono- or multi-modal sensor registration, often a necessary step for further processing, faces many challenges on raw depth images or point clouds. This paper presents a method of converting depth data into images that visualize spatial details which are largely hidden in traditional depth images. After noise removal, two normal vectors are computed from each point's neighborhood, and their difference is encoded in the resulting image. Compared to Bearing Angle images, our method yields brighter, higher-contrast images with more visible contours and more details. We tested feature-based pose estimation with both conversions in a visual odometry task and RGB-D SLAM. For all tested features (AKAZE, ORB, SIFT, and SURF), our new Flexion images yield better results than Bearing Angle images and show great potential to bridge the gap between depth data and classical computer vision. Source code is available at https://rlsch.github.io/depth-flexion-conversion.
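
The abstract sketches the core of the conversion: two local normal vectors are derived from each point's neighborhood, and their difference becomes a pixel intensity. Below is a minimal NumPy sketch of such a flexion-style conversion; the specific neighbor pairs, the absolute-cosine encoding, and the 8-bit scaling are illustrative assumptions, not the authors' exact formulation (their implementation is linked above).

```python
import numpy as np

def flexion_like_image(points: np.ndarray) -> np.ndarray:
    """Convert an organized point cloud (H x W x 3, one 3D point per
    depth pixel) into an 8-bit image encoding the angle between two
    local normal vectors. Illustrative reconstruction only; see
    https://rlsch.github.io/depth-flexion-conversion for the original.
    """
    # Differences between horizontal and vertical neighbors ...
    dx = points[1:-1, 2:] - points[1:-1, :-2]
    dy = points[2:, 1:-1] - points[:-2, 1:-1]
    # ... and between the two diagonal neighbor pairs
    d1 = points[2:, 2:] - points[:-2, :-2]
    d2 = points[2:, :-2] - points[:-2, 2:]
    # One normal vector per neighbor pair via cross products
    n1 = np.cross(dx, dy)
    n2 = np.cross(d1, d2)
    # Absolute cosine of the angle between the two normals
    denom = np.linalg.norm(n1, axis=-1) * np.linalg.norm(n2, axis=-1) + 1e-12
    cos_angle = np.abs(np.einsum('...k,...k', n1, n2)) / denom
    # Map [0, 1] to 8-bit gray values (border pixels are dropped)
    return np.round(cos_angle * 255).astype(np.uint8)
```

The resulting gray image can be consumed by standard detectors. A rough stand-in for the visual odometry evaluation mentioned in the abstract, using OpenCV's AKAZE (any of the tested detectors would slot in) and assuming a known intrinsic matrix K:

```python
import cv2
import numpy as np

def relative_pose(img0: np.ndarray, img1: np.ndarray, K: np.ndarray):
    """Estimate relative camera rotation R and unit-scale translation t
    between two converted (e.g. Flexion) images via feature matching."""
    akaze = cv2.AKAZE_create()
    kp0, des0 = akaze.detectAndCompute(img0, None)
    kp1, des1 = akaze.detectAndCompute(img1, None)
    # Brute-force Hamming matching suits AKAZE's binary descriptors
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des0, des1)
    pts0 = np.float32([kp0[m.queryIdx].pt for m in matches])
    pts1 = np.float32([kp1[m.trainIdx].pt for m in matches])
    # Essential matrix with RANSAC, then cheirality-checked pose recovery
    E, inliers = cv2.findEssentialMat(pts0, pts1, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts0, pts1, K, mask=inliers)
    return R, t
```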
Related papers
- Diff-DOPE: Differentiable Deep Object Pose Estimation [29.703385848843414]
We introduce Diff-DOPE, a 6-DoF pose refiner that takes as input an image, a 3D textured model of an object, and an initial pose of the object.
The method uses differentiable rendering to update the object pose to minimize the visual error between the image and the projection of the model.
We show that this simple yet effective idea achieves state-of-the-art results on pose estimation datasets.
arXiv Detail & Related papers (2023-09-30T18:52:57Z)
- Pyramid Deep Fusion Network for Two-Hand Reconstruction from RGB-D Images [11.100398985633754]
We propose an end-to-end framework for recovering dense meshes for both hands.
Our framework employs ResNet50 and PointNet++ to derive features from the RGB image and the point cloud, respectively.
We also introduce a novel pyramid deep fusion network (PDFNet) to aggregate features at different scales.
arXiv Detail & Related papers (2023-07-12T09:33:21Z)
- DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients [105.25109274550607]
Line segments are increasingly used in vision tasks.
Traditional line detectors based on the image gradient are extremely fast and accurate, but lack robustness in noisy images and challenging conditions.
We propose to combine traditional and learned approaches to get the best of both worlds: an accurate and robust line detector.
arXiv Detail & Related papers (2022-12-15T12:36:49Z)
- Improving Pixel-Level Contrastive Learning by Leveraging Exogenous Depth Information [7.561849435043042]
Self-supervised representation learning based on Contrastive Learning (CL) has been the subject of much attention in recent years.
This paper focuses on depth information, which can be estimated by a depth network or measured from available data.
We show that using this depth information in the contrastive loss leads to improved results and that the learned representations better follow the shapes of objects.
arXiv Detail & Related papers (2022-11-18T11:45:39Z)
- Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image [91.71077190961688]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then exploit 3D point cloud data to predict the depth shift and the camera's focal length, which allow us to recover the 3D scene shape (see the back-projection sketch after this list).
We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation.
arXiv Detail & Related papers (2022-08-28T16:20:14Z)
- DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection [83.18142309597984]
Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving.
We develop a family of generic multi-modal 3D detection models named DeepFusion, which is more accurate than previous methods.
arXiv Detail & Related papers (2022-03-15T18:46:06Z)
- Graph-Based Depth Denoising & Dequantization for Point Cloud Enhancement [47.61748619439693]
A 3D point cloud is typically constructed from depth measurements acquired by sensors at one or more viewpoints.
Previous works denoise a point cloud a posteriori, after projecting the imperfect depth data onto 3D space.
We enhance depth measurements directly on the sensed images a priori, before synthesizing a 3D point cloud.
arXiv Detail & Related papers (2021-11-09T04:17:35Z)
- DeepI2P: Image-to-Point Cloud Registration via Deep Classification [71.3121124994105]
DeepI2P is a novel approach for cross-modality registration between an image and a point cloud.
Our method estimates the relative rigid transformation between the coordinate frames of the camera and Lidar.
We circumvent the difficulty of cross-modality feature matching by converting the registration problem into a classification and inverse camera projection optimization problem.
arXiv Detail & Related papers (2021-04-08T04:27:32Z)
- Back to the Feature: Learning Robust Camera Localization from Pixels to Pose [114.89389528198738]
We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model.
The system can localize in large environments given coarse pose priors but also improve the accuracy of sparse feature matching.
arXiv Detail & Related papers (2021-03-16T17:40:12Z)
- Defocus Blur Detection via Depth Distillation [64.78779830554731]
We introduce depth information into defocus blur detection (DBD) for the first time.
Specifically, we learn defocus blur both from the ground truth and from depth distilled from a well-trained depth estimation network.
Our approach outperforms 11 other state-of-the-art methods on two popular datasets.
arXiv Detail & Related papers (2020-07-16T04:58:09Z)
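
As referenced in the scene-shape entry above, turning a scale- and shift-ambiguous depth prediction into a point cloud is a standard pinhole back-projection once scale, shift, and focal length are known. Here is a minimal sketch under common pinhole assumptions (principal point at the image center); the affine scale/shift correction is illustrative, not that paper's exact model:

```python
import numpy as np

def backproject(depth: np.ndarray, f: float,
                scale: float = 1.0, shift: float = 0.0) -> np.ndarray:
    """Back-project an H x W depth map into an H x W x 3 organized
    point cloud with focal length f (in pixels)."""
    h, w = depth.shape
    z = scale * depth + shift                       # corrected metric depth
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel grid
    x = (u - w / 2.0) * z / f                       # X = (u - cx) * Z / f
    y = (v - h / 2.0) * z / f                       # Y = (v - cy) * Z / f
    return np.stack([x, y, z], axis=-1)
```

An organized point cloud produced this way is also a natural input to depth-to-image conversions such as the Flexion sketch above.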
This list is automatically generated from the titles and abstracts of the papers on this site.