Differentiable Event Stream Simulator for Non-Rigid 3D Tracking
- URL: http://arxiv.org/abs/2104.15139v1
- Date: Fri, 30 Apr 2021 17:58:07 GMT
- Title: Differentiable Event Stream Simulator for Non-Rigid 3D Tracking
- Authors: Jalees Nehvi and Vladislav Golyanik and Franziska Mueller and
Hans-Peter Seidel and Mohamed Elgharib and Christian Theobalt
- Abstract summary: Our differentiable simulator enables non-rigid 3D tracking of deformable objects from event streams.
We show the effectiveness of our approach for various types of non-rigid objects and compare to existing methods for non-rigid 3D tracking.
- Score: 82.56690776283428
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces the first differentiable simulator of event streams,
i.e., streams of asynchronous brightness change signals recorded by event
cameras. Our differentiable simulator enables non-rigid 3D tracking of
deformable objects (such as human hands, isometric surfaces and general
watertight meshes) from event streams by leveraging an analysis-by-synthesis
principle. So far, event-based tracking and reconstruction of non-rigid objects
in 3D, like hands and body, has been either tackled using explicit event
trajectories or large-scale datasets. In contrast, our method does not require
any such processing or data, and can be readily applied to incoming event
streams. We show the effectiveness of our approach for various types of
non-rigid objects and compare to existing methods for non-rigid 3D tracking. In
our experiments, the proposed energy-based formulations outperform competing
RGB-based methods in terms of 3D errors. The source code and the new data are
publicly available.
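The analysis-by-synthesis idea in the abstract can be illustrated with a minimal sketch (an assumption-laden reconstruction, not the authors' implementation): an event camera emits an event at a pixel when the log-brightness change exceeds a contrast threshold C; replacing that hard threshold with a sigmoid of sharpness `beta` makes the simulated event map differentiable with respect to the rendered brightness images, and hence with respect to any deformation parameters that produced them, so tracking can proceed by gradient descent on an event-based energy. The function names and parameters below are hypothetical.

```python
import numpy as np

def smooth_event_map(log_I_prev, log_I_curr, C=0.2, beta=20.0):
    """Differentiable surrogate for event generation.

    A real event camera emits +1/-1 events where the log-brightness
    change exceeds the contrast threshold C. The hard step is replaced
    by a sigmoid of sharpness `beta`, so the output varies smoothly
    with the brightness images (and, by the chain rule, with any scene
    parameters that produced them).
    """
    diff = log_I_curr - log_I_prev
    pos = 1.0 / (1.0 + np.exp(-beta * (diff - C)))   # soft "positive event"
    neg = 1.0 / (1.0 + np.exp(-beta * (-diff - C)))  # soft "negative event"
    return pos - neg  # in (-1, 1), approximating signed event polarity

def event_energy(simulated, observed):
    """L2 energy between simulated and observed event frames."""
    return float(np.mean((simulated - observed) ** 2))
```

In an analysis-by-synthesis loop, `log_I_prev` and `log_I_curr` would be renders of the deformable model at consecutive timestamps, and the energy would be minimized over the model's pose or deformation parameters.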
Related papers
- IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera [7.515256982860307]
IncEventGS is an incremental 3D Gaussian splatting reconstruction algorithm with a single event camera.
We exploit the tracking and mapping paradigm of conventional SLAM pipelines for IncEventGS.
arXiv Detail & Related papers (2024-10-10T16:54:23Z)
- Elite-EvGS: Learning Event-based 3D Gaussian Splatting by Distilling Event-to-Video Priors [8.93657924734248]
Event cameras are bio-inspired sensors that output asynchronous and sparse event streams, instead of fixed frames.
We propose a novel event-based 3DGS framework, named Elite-EvGS.
Our key idea is to distill the prior knowledge from the off-the-shelf event-to-video (E2V) models to effectively reconstruct 3D scenes from events.
arXiv Detail & Related papers (2024-09-20T10:47:52Z)
- Inverse Neural Rendering for Explainable Multi-Object Tracking [35.072142773300655]
We recast 3D multi-object tracking from RGB cameras as an Inverse Rendering (IR) problem.
We optimize an image loss over generative latent spaces that inherently disentangle shape and appearance properties.
We validate the generalization and scaling capabilities of our method by learning the generative prior exclusively from synthetic data.
arXiv Detail & Related papers (2024-04-18T17:37:53Z)
- EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams [59.77837807004765]
This paper introduces a new problem, i.e., 3D human motion capture from an egocentric monocular event camera with a fisheye lens.
Event streams have high temporal resolution and provide reliable cues for 3D human motion capture under high-speed human motions and rapidly changing illumination.
Our EE3D demonstrates robustness and superior 3D accuracy compared to existing solutions while supporting real-time 3D pose update rates of 140Hz.
arXiv Detail & Related papers (2024-04-12T17:59:47Z)
- Exploring Event-based Human Pose Estimation with 3D Event Representations [26.34100847541989]
We introduce two 3D event representations: the Rasterized Event Point Cloud (Ras EPC) and the Decoupled Event Voxel (DEV).
The Ras EPC aggregates events within concise temporal slices at identical positions, preserving their 3D attributes along with statistical information, thereby significantly reducing memory and computational demands.
Our methods are tested on the DHP19 public dataset, MMHPSD dataset, and our EV-3DPW dataset, with further qualitative validation via a derived driving scene dataset EV-JAAD and an outdoor collection vehicle.
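The Ras EPC aggregation described in this entry can be sketched as follows (a hypothetical reconstruction from the abstract, not the paper's code): events falling at the same pixel position within a temporal slice are merged into one aggregated point carrying simple statistics, which preserves the 3D (x, y, t) attributes while shrinking the data volume.

```python
import numpy as np

def rasterize_event_point_cloud(events, num_slices, t_min, t_max):
    """Aggregate events at identical (x, y) positions within temporal slices.

    `events` is an (N, 4) array of (x, y, t, p) with polarity p in {-1, +1}.
    Returns a dict mapping (slice, x, y) -> (count, mean_t, net_polarity):
    one aggregated point per occupied cell, far more compact than the raw
    stream yet retaining per-cell statistics.
    """
    slice_len = (t_max - t_min) / num_slices
    cells = {}
    for x, y, t, p in events:
        s = min(int((t - t_min) / slice_len), num_slices - 1)
        key = (s, int(x), int(y))
        count, t_sum, p_sum = cells.get(key, (0, 0.0, 0.0))
        cells[key] = (count + 1, t_sum + t, p_sum + p)
    return {k: (c, t_sum / c, p_sum) for k, (c, t_sum, p_sum) in cells.items()}
```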
arXiv Detail & Related papers (2023-11-08T10:45:09Z) - Decaf: Monocular Deformation Capture for Face and Hand Interactions [77.75726740605748]
This paper introduces the first method that allows tracking human hands interacting with human faces in 3D from single monocular RGB videos.
We model hands as articulated objects inducing non-rigid face deformations during an active interaction.
Our method relies on a new hand-face motion and interaction capture dataset with realistic face deformations acquired with a markerless multi-view camera system.
arXiv Detail & Related papers (2023-09-28T17:59:51Z) - Dual Memory Aggregation Network for Event-Based Object Detection with
Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness change of every pixel in an asynchronous manner.
Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as 3D tensor representation.
Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
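The pillar representation in this entry can be sketched as a simple binning step (an illustrative reconstruction, not the authors' code): events are histogrammed into an x-y-t grid, kept separately for positive and negative polarity, yielding a dense tensor that downstream convolutions can consume.

```python
import numpy as np

def events_to_pillars(events, H, W, T, t_min, t_max):
    """Bin events into a (2, T, H, W) tensor: channel 0 for positive
    polarity, channel 1 for negative. Each (t-bin, y, x) cell counts
    the events falling into it, forming a set of x-y-t 'pillars'."""
    grid = np.zeros((2, T, H, W), dtype=np.float32)
    slice_len = (t_max - t_min) / T
    for x, y, t, p in events:
        ti = min(int((t - t_min) / slice_len), T - 1)
        ch = 0 if p > 0 else 1
        grid[ch, ti, int(y), int(x)] += 1.0
    return grid
```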
arXiv Detail & Related papers (2023-03-17T12:12:41Z) - Lifting Monocular Events to 3D Human Poses [22.699272716854967]
This paper presents a novel 3D human pose estimation approach using a single stream of asynchronous events as input.
We propose the first learning-based method for 3D human pose from a single stream of events.
Experiments demonstrate that our method achieves solid accuracy, narrowing the performance gap between standard RGB and event-based vision.
arXiv Detail & Related papers (2021-04-21T16:07:12Z) - Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z) - EventHands: Real-Time Neural 3D Hand Reconstruction from an Event Stream [80.15360180192175]
3D hand pose estimation from monocular videos is a long-standing and challenging problem.
We address it for the first time using a single event camera, i.e., an asynchronous vision sensor reacting on brightness changes.
Our approach has characteristics previously not demonstrated with a single RGB or depth camera.
arXiv Detail & Related papers (2020-12-11T16:45:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.