E3D: Event-Based 3D Shape Reconstruction
- URL: http://arxiv.org/abs/2012.05214v2
- Date: Thu, 10 Dec 2020 12:26:59 GMT
- Title: E3D: Event-Based 3D Shape Reconstruction
- Authors: Alexis Baudron, Zihao W. Wang, Oliver Cossairt and Aggelos K. Katsaggelos
- Abstract summary: 3D shape reconstruction is a primary component of augmented/virtual reality.
Previous solutions based on RGB, RGB-D and Lidar sensors are power and data intensive.
We approach 3D reconstruction with an event camera, a sensor with significantly lower power, latency and data expense.
- Score: 19.823758341937605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D shape reconstruction is a primary component of augmented/virtual reality.
Despite being highly advanced, existing solutions based on RGB, RGB-D and Lidar
sensors are power and data intensive, which introduces challenges for
deployment in edge devices. We approach 3D reconstruction with an event camera,
a sensor with significantly lower power, latency and data expense while
enabling high dynamic range. While previous event-based 3D reconstruction
methods are primarily based on stereo vision, we cast the problem as multi-view
shape from silhouette using a monocular event camera. The output from a moving
event camera is a sparse point set of space-time gradients, largely sketching
scene/object edges and contours. We first introduce an event-to-silhouette
(E2S) neural network module to transform a stack of event frames to the
corresponding silhouettes, with additional neural branches for camera pose
regression. Second, we introduce E3D, which employs a 3D differentiable
renderer (PyTorch3D) to enforce cross-view 3D mesh consistency and fine-tune
the E2S and pose network. Lastly, we introduce a 3D-to-events simulation
pipeline and apply it to publicly available object datasets and generate
synthetic event/silhouette training pairs for supervised learning.
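The cross-view consistency stage described in the abstract can be illustrated with a differentiable silhouette renderer. The sketch below is a minimal example using PyTorch3D (the renderer named in the abstract): it optimizes per-vertex offsets of a template mesh so that its rendered silhouettes match per-view silhouette targets. The random silhouette targets, camera poses, loss weights and optimizer settings are illustrative placeholders, not the paper's actual E3D configuration; in E3D the targets and poses would come from the E2S and pose-regression branches.

```python
import math
import torch
from pytorch3d.utils import ico_sphere
from pytorch3d.renderer import (
    FoVPerspectiveCameras, RasterizationSettings, MeshRasterizer,
    MeshRenderer, SoftSilhouetteShader, BlendParams, look_at_view_transform,
)
from pytorch3d.loss import mesh_laplacian_smoothing, mesh_edge_loss

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Deformable template: an icosphere whose per-vertex offsets are optimized.
src_mesh = ico_sphere(level=3, device=device)
deform = torch.zeros_like(src_mesh.verts_packed(), requires_grad=True)

# Soft silhouette renderer (differentiable w.r.t. mesh vertices).
blend = BlendParams(sigma=1e-4, gamma=1e-4)
raster_settings = RasterizationSettings(
    image_size=128,
    blur_radius=math.log(1.0 / 1e-4 - 1.0) * blend.sigma,
    faces_per_pixel=50,
)
renderer = MeshRenderer(
    rasterizer=MeshRasterizer(raster_settings=raster_settings),
    shader=SoftSilhouetteShader(blend_params=blend),
)

# Placeholder views: stand-ins for E2S silhouette predictions and regressed poses.
num_views = 8
R, T = look_at_view_transform(dist=2.7, elev=10.0,
                              azim=torch.linspace(0, 315, num_views))
cameras = FoVPerspectiveCameras(R=R, T=T, device=device)
target_sil = torch.rand(num_views, 128, 128, device=device)  # placeholder targets

optimizer = torch.optim.Adam([deform], lr=1e-2)
for _ in range(200):
    optimizer.zero_grad()
    mesh = src_mesh.offset_verts(deform)
    # Render the same mesh from all views; the alpha channel is the soft silhouette.
    sil = renderer(mesh.extend(num_views), cameras=cameras)[..., 3]
    # Cross-view silhouette consistency plus standard mesh regularizers.
    loss = torch.nn.functional.binary_cross_entropy(sil.clamp(0.0, 1.0), target_sil)
    loss = loss + 1.0 * mesh_edge_loss(mesh) + 0.1 * mesh_laplacian_smoothing(mesh)
    loss.backward()
    optimizer.step()
```

The abstract does not spell out the 3D-to-events simulation pipeline; a simplified, ESIM-style approximation converts a rendered intensity sequence into signed event frames by thresholding log-intensity changes against a per-pixel reference. The threshold value, the frame-stacking convention and the toy input below are assumptions for illustration only.

```python
import torch

def simulate_event_frames(frames, threshold=0.2):
    """Convert a rendered intensity sequence (T, H, W), values in [0, 1], into
    signed per-pixel event counts between consecutive frames. Events fire when
    the log intensity moves by more than `threshold` from a reference value,
    which is then updated -- a simplified stand-in for a full event simulator."""
    log_i = torch.log(frames.clamp(min=1e-3))
    ref = log_i[0].clone()
    event_frames = []
    for t in range(1, log_i.shape[0]):
        diff = log_i[t] - ref
        # Number of threshold crossings per pixel, signed by polarity.
        counts = torch.where(diff >= 0,
                             torch.floor(diff / threshold),
                             -torch.floor(-diff / threshold))
        ref = ref + counts * threshold  # move reference to the last fired level
        event_frames.append(counts)
    return torch.stack(event_frames)  # (T-1, H, W)

# Toy example: a bright square translating across an otherwise static frame.
T, H, W = 16, 64, 64
frames = torch.full((T, H, W), 0.1)
for t in range(T):
    frames[t, 20:40, 5 + 2 * t:25 + 2 * t] = 0.9
events = simulate_event_frames(frames)
print(events.shape, events.abs().sum().item())
```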
Related papers
- E-3DGS: Gaussian Splatting with Exposure and Motion Events [29.042018288378447]
We propose E-3DGS, a novel event-based approach that partitions events into motion events and exposure events.
We introduce a novel integration of 3DGS with exposure events for high-quality reconstruction of explicit scene representations.
Our method is faster and delivers better reconstruction quality than event-based NeRF while being more cost-effective than NeRF methods.
arXiv Detail & Related papers (2024-10-22T13:17:20Z)
- EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting [76.02450110026747]
Event cameras, inspired by biological vision, record pixel-wise intensity changes asynchronously with high temporal resolution.
We propose Event-Aided Free-Trajectory 3DGS, which seamlessly integrates the advantages of event cameras into 3DGS.
We evaluate our method on the public Tanks and Temples benchmark and a newly collected real-world dataset, RealEv-DAVIS.
arXiv Detail & Related papers (2024-10-20T13:44:24Z)
- IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera [7.515256982860307]
IncEventGS is an incremental 3D Gaussian splatting reconstruction algorithm with a single event camera.
We exploit the tracking and mapping paradigm of conventional SLAM pipelines for IncEventGS.
arXiv Detail & Related papers (2024-10-10T16:54:23Z)
- Event3DGS: Event-Based 3D Gaussian Splatting for High-Speed Robot Egomotion [54.197343533492486]
Event3DGS can reconstruct high-fidelity 3D structure and appearance under high-speed egomotion.
Experiments on multiple synthetic and real-world datasets demonstrate the superiority of Event3DGS compared with existing event-based dense 3D scene reconstruction frameworks.
Our framework also allows one to incorporate a few motion-blurred frame-based measurements into the reconstruction process to further improve appearance fidelity without loss of structural accuracy.
arXiv Detail & Related papers (2024-06-05T06:06:03Z)
- EvGGS: A Collaborative Learning Framework for Event-based Generalizable Gaussian Splatting [5.160735014509357]
We propose the first event-based generalizable 3D reconstruction framework, called EvGGS.
It reconstructs scenes as 3D Gaussians from only event input in a feedforward manner.
Our approach outperforms all baselines in reconstruction quality and in depth/intensity prediction, with satisfactory speed.
arXiv Detail & Related papers (2024-05-23T18:10:26Z)
- EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams [59.77837807004765]
This paper introduces a new problem, i.e., 3D human motion capture from an egocentric monocular event camera with a fisheye lens.
Event streams have high temporal resolution and provide reliable cues for 3D human motion capture under high-speed human motions and rapidly changing illumination.
Our EE3D demonstrates robustness and superior 3D accuracy compared to existing solutions while supporting real-time 3D pose update rates of 140 Hz.
arXiv Detail & Related papers (2024-04-12T17:59:47Z)
- Denoising Diffusion via Image-Based Rendering [54.20828696348574]
We introduce the first diffusion model able to perform fast, detailed reconstruction and generation of real-world 3D scenes.
First, we introduce a new neural scene representation, IB-planes, that can efficiently and accurately represent large 3D scenes.
Second, we propose a denoising-diffusion framework to learn a prior over this novel 3D scene representation, using only 2D images.
arXiv Detail & Related papers (2024-02-05T19:00:45Z)
- PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm [114.47216525866435]
We introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation.
For the first time, PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks, demonstrating its effectiveness.
arXiv Detail & Related papers (2023-10-12T17:59:57Z)
- EvAC3D: From Event-based Apparent Contours to 3D Models via Continuous Visual Hulls [46.94040300725127]
3D reconstruction from multiple views is a well-established computer vision field with many deployed applications.
We study the problem of 3D reconstruction from event-cameras, motivated by the advantages of event-based cameras in terms of low power and latency.
We propose Apparent Contour Events (ACE), a novel event-based representation that defines the geometry of the apparent contour of an object.
arXiv Detail & Related papers (2023-04-11T15:46:16Z)
- 3D-to-2D Distillation for Indoor Scene Parsing [78.36781565047656]
We present a new approach that leverages 3D features extracted from a large-scale 3D data repository to enhance 2D features extracted from RGB images.
First, we distill 3D knowledge from a pretrained 3D network to supervise a 2D network to learn simulated 3D features from 2D features during the training.
Second, we design a two-stage dimension normalization scheme to calibrate the 2D and 3D features for better integration.
Third, we design a semantic-aware adversarial training model to extend our framework for training with unpaired 3D data.
arXiv Detail & Related papers (2021-04-06T02:22:24Z)
- CubifAE-3D: Monocular Camera Space Cubification for Auto-Encoder based 3D Object Detection [8.134961550216618]
We introduce a method for 3D object detection using a single monocular image.
We show that we can pre-train the AE using paired RGB and depth images from simulation data once and subsequently only train the 3DOD network using real data.
Our 3DOD network utilizes a particular 'cubification' of 3D space around the camera, where each cuboid is tasked with predicting N object poses, along with their class and confidence values.
arXiv Detail & Related papers (2020-06-07T08:17:00Z)