A 3D-Deep-Learning-based Augmented Reality Calibration Method for
Robotic Environments using Depth Sensor Data
- URL: http://arxiv.org/abs/1912.12101v1
- Date: Fri, 27 Dec 2019 13:56:13 GMT
- Title: A 3D-Deep-Learning-based Augmented Reality Calibration Method for
Robotic Environments using Depth Sensor Data
- Authors: Linh Kästner, Vlad Catalin Frasineanu, Jens Lambrecht
- Abstract summary: We propose a novel approach to calibrate the Augmented Reality device using 3D depth sensor data.
We use the depth camera of a cutting-edge Augmented Reality device, the Microsoft HoloLens, for deep-learning-based calibration.
We introduce an open-source 3D point cloud labeling tool, which is, to our knowledge, the first open-source tool for labeling raw point cloud data.
- Score: 5.027571997864707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Augmented Reality and mobile robots are gaining considerable
attention in industry due to their high potential to make processes more cost-
and time-efficient. To facilitate Augmented Reality, a calibration between the
Augmented Reality device and the environment is necessary. This is challenging
when dealing with mobile robots, since the mobility of all entities makes the
environment dynamic. To address this, we propose a novel approach to calibrate
the Augmented Reality device using 3D depth sensor data. We use the depth
camera of a cutting-edge Augmented Reality device, the Microsoft HoloLens, for
deep-learning-based calibration. To this end, we modified a neural network
based on the recently published VoteNet architecture, which works directly on
the point cloud input observed by the HoloLens. We achieve satisfactory
results and eliminate the need for external tools such as markers, thus
enabling a more intuitive and flexible workflow for Augmented Reality
integration. The approach is adaptable to all depth cameras and is promising
for further research. Furthermore, we introduce an open-source 3D point cloud
labeling tool, which is, to our knowledge, the first open-source tool for
labeling raw point cloud data.
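As a rough illustration of the calibration idea the abstract describes, the sketch below regresses a 6-DoF pose directly from a raw depth-sensor point cloud. All class names, layer sizes, and the simple pose head are illustrative assumptions; the paper's actual model is a modified VoteNet (a PointNet++ backbone plus a voting module), not this toy encoder.

```python
# Minimal, hedged sketch: point cloud in, 6-DoF pose out. The architecture
# here is an illustrative stand-in, not the paper's modified VoteNet.
import torch
import torch.nn as nn

class PointCloudPoseNet(nn.Module):
    """Toy VoteNet-style regressor: per-point MLP -> symmetric max-pool ->
    6-DoF pose (3 translation + 3 axis-angle rotation parameters)."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        # Shared per-point MLP, a stand-in for the PointNet++ backbone.
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.pose_head = nn.Linear(feat_dim, 6)

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        feats = self.point_mlp(points)   # (B, N, 3) -> (B, N, feat_dim)
        scene = feats.max(dim=1).values  # order-invariant pooling over points
        return self.pose_head(scene)     # (B, 6) pose estimate

cloud = torch.randn(1, 2048, 3)          # stand-in for one HoloLens depth frame
pose = PointCloudPoseNet()(cloud)
print(pose.shape)                        # torch.Size([1, 6])
```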
Related papers
- Precise Workcell Sketching from Point Clouds Using an AR Toolbox [1.249418440326334]
Capturing real-world 3D spaces as point clouds is efficient and descriptive, but it comes with sensor errors and lacks object parametrization.
Our method for 3D workcell sketching from point clouds allows users to refine raw point clouds using an Augmented Reality interface.
By utilizing a toolbox and an AR-enabled pointing device, users can enhance point cloud accuracy based on the device's position in 3D space.
arXiv Detail & Related papers (2024-10-01T08:07:51Z)
- ClearDepth: Enhanced Stereo Perception of Transparent Objects for Robotic Manipulation [18.140839442955485]
We develop a vision transformer-based algorithm for stereo depth recovery of transparent objects.
Our method incorporates a parameter-aligned, domain-adaptive, and physically realistic Sim2Real simulation for efficient data generation.
Our experimental results demonstrate the model's exceptional Sim2Real generalizability in real-world scenarios.
arXiv Detail & Related papers (2024-09-13T15:44:38Z)
- LLMI3D: MLLM-based 3D Perception from a Single 2D Image [77.13869413871028]
Multimodal large language models (MLLMs) excel in general capabilities but underperform in 3D tasks.
In this paper, we propose solutions for weak 3D local spatial object perception, poor text-based geometric numerical output, and inability to handle camera focal variations.
We employ parameter-efficient fine-tuning for a pre-trained MLLM and develop LLMI3D, a powerful 3D perception MLLM.
arXiv Detail & Related papers (2024-08-14T10:00:16Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- Simple and Effective Synthesis of Indoor 3D Scenes [78.95697556834536]
We study the problem of synthesizing immersive 3D indoor scenes from one or more images.
Our aim is to generate high-resolution images and videos from novel viewpoints.
We propose an image-to-image GAN that maps directly from reprojections of incomplete point clouds to full high-resolution RGB-D images.
arXiv Detail & Related papers (2022-04-06T17:54:46Z)
- Bayesian Imitation Learning for End-to-End Mobile Manipulation [80.47771322489422]
Augmenting policies with additional sensor inputs, such as RGB + depth cameras, is a straightforward approach to improving robot perception capabilities.
We show that using the Variational Information Bottleneck to regularize convolutional neural networks improves generalization to held-out domains.
We demonstrate that our method is able to help close the sim-to-real gap and successfully fuse RGB and depth modalities.
arXiv Detail & Related papers (2022-02-15T17:38:30Z)
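The Bayesian imitation learning entry above mentions regularizing convolutional networks with the Variational Information Bottleneck (VIB). A minimal sketch of the standard VIB ingredients follows; the shapes, function names, and beta weighting are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch of a Variational Information Bottleneck (VIB) regularizer:
# a stochastic bottleneck z ~ N(mu, exp(logvar)) with a KL penalty toward
# the prior N(0, I). Details below are illustrative assumptions.
import torch

def vib_kl(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), averaged over the batch.
    return 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=-1).mean()

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    # Sample z differentiably so gradients flow through mu and logvar.
    return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

# Training objective: total = task_loss + beta * vib_kl(mu, logvar),
# where beta trades off task performance against compression.
```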
- Active 3D Shape Reconstruction from Vision and Touch [66.08432412497443]
Humans build 3D understandings of the world through active object exploration, jointly using their senses of vision and touch.
In 3D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings.
We introduce a system composed of: 1) a haptic simulator leveraging high-spatial-resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile priors to guide the shape exploration; and 3) a set of data-driven solutions with either tactile or visuotactile priors.
arXiv Detail & Related papers (2021-07-20T15:56:52Z)
- Xihe: A 3D Vision-based Lighting Estimation Framework for Mobile Augmented Reality [9.129335351176904]
We design an edge-assisted framework called Xihe to provide mobile AR applications the ability to obtain accurate omnidirectional lighting estimation in real time.
We develop a tailored GPU pipeline for on-device point cloud processing and use an encoding technique that reduces network transmitted bytes.
Our results show that Xihe takes as little as 20.67 ms per lighting estimation and achieves 9.4% better estimation accuracy than a state-of-the-art neural network.
arXiv Detail & Related papers (2021-05-30T13:48:29Z)
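The Xihe entry above credits part of its speed to an encoding technique that reduces network-transmitted bytes. Xihe's actual encoding is not described here, so the sketch below only illustrates the general idea under a simple assumption: quantizing float32 XYZ coordinates to uint16 within a known scene bounding box, halving the payload.

```python
# Illustrative point cloud encoder (an assumption, not Xihe's actual scheme):
# quantize float32 XYZ to uint16 inside a fixed bounding box [lo, hi]^3,
# shrinking each point from 12 bytes to 6 before network transmission.
import numpy as np

def encode(points: np.ndarray, lo: float = -5.0, hi: float = 5.0) -> bytes:
    scaled = (points - lo) / (hi - lo)                  # normalize to [0, 1]
    q = np.clip(scaled * 65535.0, 0, 65535).astype(np.uint16)
    return q.tobytes()

def decode(buf: bytes, lo: float = -5.0, hi: float = 5.0) -> np.ndarray:
    q = np.frombuffer(buf, dtype=np.uint16).reshape(-1, 3)
    return q.astype(np.float32) / 65535.0 * (hi - lo) + lo

cloud = np.random.uniform(-5, 5, size=(2048, 3)).astype(np.float32)
assert np.allclose(decode(encode(cloud)), cloud, atol=1e-3)  # ~0.15 mm grid
```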
- HDR Environment Map Estimation for Real-Time Augmented Reality [7.6146285961466]
We present a method to estimate an HDR environment map from a narrow field-of-view LDR camera image in real time.
This enables perceptually appealing reflections and shading on virtual objects of any material finish, from mirror to diffuse, rendered into a real physical environment using augmented reality.
arXiv Detail & Related papers (2020-11-21T01:01:53Z)
- A Markerless Deep Learning-based 6 Degrees of Freedom Pose Estimation for Mobile Robots using RGB Data [3.4806267677524896]
We propose a method to deploy state-of-the-art neural networks for real-time 3D object localization on augmented reality devices.
We focus on fast 2D detection approaches that extract the 3D pose of the object quickly and accurately using only 2D input.
For the 6D annotation of 2D images, we developed an annotation tool, which is, to our knowledge, the first such open-source tool available.
arXiv Detail & Related papers (2020-01-16T09:13:31Z)