Related papers: Detection and Pose Estimation of flat, Texture-less Industry Objects on HoloLens using synthetic Training

URL: http://arxiv.org/abs/2402.04979v1
Date: Wed, 7 Feb 2024 15:57:28 GMT
Title: Detection and Pose Estimation of flat, Texture-less Industry Objects on HoloLens using synthetic Training
Authors: Thomas Pöllabauer, Fabian Rücker, Andreas Franek, Felix Gorschlüter
Abstract summary: Current state-of-the-art 6D pose estimation is too compute-intensive to be deployed on edge devices. We propose a synthetically trained client-server-based augmented reality application.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Current state-of-the-art 6D pose estimation is too compute-intensive to be deployed on edge devices, such as Microsoft HoloLens (2) or Apple iPad, both used for an increasing number of augmented reality applications. The quality of AR depends greatly on its ability to detect and overlay geometry within the scene. We propose a synthetically trained client-server-based augmented reality application, demonstrating state-of-the-art object pose estimation of metallic and texture-less industry objects on edge devices. Synthetic data enables training without real photographs, i.e. for yet-to-be-manufactured objects. Our qualitative evaluation on an AR-assisted sorting task and our quantitative evaluation on both renderings and real-world data recorded on HoloLens 2 shed light on its real-world applicability.

Related papers

R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation [78.26308457952636]
This paper introduces R3D2, a lightweight, one-step diffusion model designed to overcome limitations in autonomous driving simulation. It enables realistic insertion of complete 3D assets into existing scenes by generating plausible rendering effects, such as shadows and consistent lighting, in real time. We show that R3D2 significantly enhances the realism of inserted assets, enabling use-cases like text-to-3D asset insertion and cross-scene/dataset object transfer.
arXiv Detail & Related papers (2025-06-09T14:50:19Z)
Synthetica: Large Scale Synthetic Data for Robot Perception [21.415878105900187]
We present Synthetica, a method for large-scale synthetic data generation for training robust state estimators. This paper focuses on the task of object detection, an important problem which can serve as the front-end for most state estimation problems. We leverage data from a ray-tracing renderer, generating 2.7 million images, to train highly accurate real-time detection transformers. We demonstrate state-of-the-art performance on the task of object detection with detectors that run at 50-100 Hz, which is 9 times faster than the prior SOTA.
arXiv Detail & Related papers (2024-10-28T15:50:56Z)
Investigation of the Impact of Synthetic Training Data in the Industrial Application of Terminal Strip Object Detection [4.327763441385371]
In this paper, we investigate the sim-to-real generalization performance of standard object detectors on the complex industrial application of terminal strip object detection. We manually annotated 300 real images of terminal strips for the evaluation. The results show that it is crucial for the objects of interest to have the same scale in both domains.
arXiv Detail & Related papers (2024-03-06T18:33:27Z)
Reconstructing Objects in-the-wild for Realistic Sensor Simulation [41.55571880832957]
We present NeuSim, a novel approach that estimates accurate geometry and realistic appearance from sparse in-the-wild data. We model the object appearance with a robust physics-inspired reflectance representation effective for in-the-wild data. Our experiments show that NeuSim has strong view synthesis performance on challenging scenarios with sparse training views.
arXiv Detail & Related papers (2023-11-09T18:58:22Z)
Real-Time Onboard Object Detection for Augmented Reality: Enhancing Head-Mounted Display with YOLOv8 [2.1530718840070784]
This paper introduces a software architecture for real-time object detection using machine learning (ML) in an augmented reality (AR) environment. We show the image processing pipeline for the YOLOv8 model and the techniques used to make it real-time on the resource-limited edge computing platform of the headset.
arXiv Detail & Related papers (2023-06-06T09:35:45Z)
LaMAR: Benchmarking Localization and Mapping for Augmented Reality [80.23361950062302]
We introduce LaMAR, a new benchmark with a comprehensive capture and GT pipeline that co-registers realistic trajectories and sensor streams captured by heterogeneous AR devices. We publish a benchmark dataset of diverse and large-scale scenes recorded with head-mounted and hand-held AR devices.
arXiv Detail & Related papers (2022-10-19T17:58:17Z)
Deep Learning for Real Time Satellite Pose Estimation on Low Power Edge TPU [58.720142291102135]
In this paper we propose a pose estimation software exploiting neural network architectures. We show how low power machine learning accelerators could enable Artificial Intelligence exploitation in space.
arXiv Detail & Related papers (2022-04-07T08:53:18Z)
ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and Tactile Representations [52.226947570070784]
We present ObjectFolder, a dataset of 100 objects that addresses both challenges with two key innovations. First, ObjectFolder encodes the visual, auditory, and tactile sensory data for all objects, enabling a number of multisensory object recognition tasks. Second, ObjectFolder employs a uniform, object-centric, and implicit representation for each object's visual textures, acoustic simulations, and tactile readings, making the dataset flexible to use and easy to share.
arXiv Detail & Related papers (2021-09-16T14:00:59Z)
Xihe: A 3D Vision-based Lighting Estimation Framework for Mobile Augmented Reality [9.129335351176904]
We design an edge-assisted framework called Xihe to provide mobile AR applications the ability to obtain accurate omnidirectional lighting estimation in real time. We develop a tailored GPU pipeline for on-device point cloud processing and use an encoding technique that reduces network transmitted bytes. Our results show that Xihe takes as little as 20.67 ms per lighting estimation and achieves 9.4% better estimation accuracy than a state-of-the-art neural network.
arXiv Detail & Related papers (2021-05-30T13:48:29Z)
Analysis of voxel-based 3D object detection methods efficiency for real-time embedded systems [93.73198973454944]
Two popular voxel-based 3D object detection methods are studied in this paper. Our experiments show that these methods mostly fail to detect distant small objects due to the sparsity of the input point clouds at large distances. Our findings suggest that a considerable part of the computations of existing methods is focused on locations of the scene that do not contribute to successful detection.
arXiv Detail & Related papers (2021-05-21T12:40:59Z)
Object-based Illumination Estimation with Rendering-aware Neural Networks [56.01734918693844]
We present a scheme for fast environment light estimation from the RGBD appearance of individual objects and their local image areas. With the estimated lighting, virtual objects can be rendered in AR scenarios with shading that is consistent with the real scene.
arXiv Detail & Related papers (2020-08-06T08:23:19Z)
OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets [103.54691385842314]
We propose a novel framework for creating large-scale photorealistic datasets of indoor scenes. Our goal is to make the dataset creation process widely accessible. This enables important applications in inverse rendering, scene understanding and robotics.
arXiv Detail & Related papers (2020-07-25T06:48:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.