Spatiotemporal Bundle Adjustment for Dynamic 3D Human Reconstruction in
the Wild
- URL: http://arxiv.org/abs/2007.12806v1
- Date: Fri, 24 Jul 2020 23:50:46 GMT
- Title: Spatiotemporal Bundle Adjustment for Dynamic 3D Human Reconstruction in
the Wild
- Authors: Minh Vo, Yaser Sheikh, and Srinivasa G. Narasimhan
- Abstract summary: We present a framework that jointly estimates camera temporal alignment and 3D point triangulation.
We reconstruct 3D motion trajectories of human bodies in events captured by multiple uncalibrated and unsynchronized video cameras.
- Score: 49.672487902268706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bundle adjustment jointly optimizes camera intrinsics and extrinsics and 3D
point triangulation to reconstruct a static scene. The triangulation
constraint, however, is invalid for moving points captured in multiple
unsynchronized videos and bundle adjustment is not designed to estimate the
temporal alignment between cameras. We present a spatiotemporal bundle
adjustment framework that jointly optimizes four coupled sub-problems:
estimating camera intrinsics and extrinsics, triangulating static 3D points, as
well as sub-frame temporal alignment between cameras and computing 3D
trajectories of dynamic points. Key to our joint optimization is the careful
integration of physics-based motion priors within the reconstruction pipeline,
validated on a large motion capture corpus of human subjects. We devise an
incremental reconstruction and alignment algorithm to strictly enforce the
motion prior during the spatiotemporal bundle adjustment. This algorithm is
further made more efficient by a divide and conquer scheme while still
maintaining high accuracy. We apply this algorithm to reconstruct 3D motion
trajectories of human bodies in dynamic events captured by multiple
uncalibrated and unsynchronized video cameras in the wild. To make the
reconstruction visually more interpretable, we fit a statistical 3D human body
model to the asynchronous video streams. Compared to the baseline, the fitting
significantly benefits from the proposed spatiotemporal bundle adjustment
procedure. Because the videos are aligned with sub-frame precision, we
reconstruct 3D motion at much higher temporal resolution than the input videos.
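To make the coupling between triangulation and temporal alignment concrete, here is a minimal sketch of a spatiotemporal reprojection residual. It is not the paper's implementation: linear interpolation of the trajectory stands in for the physics-based motion prior, camera intrinsics and extrinsics are held fixed, and all function names are illustrative. The key idea it shows is that each camera's sub-frame time offset shifts *where along the 3D trajectory* a dynamic point is evaluated before projecting it.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of a 3D point X into pixel coordinates."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]

def point_at(traj, t):
    """Linearly interpolate a dynamic point's position at continuous time t.
    traj is a (T, 3) array sampled at integer times 0..T-1; linear
    interpolation here stands in for the paper's physics-based motion prior."""
    i = int(np.clip(np.floor(t), 0, len(traj) - 2))
    a = t - i
    return (1 - a) * traj[i] + a * traj[i + 1]

def spatiotemporal_residuals(params, cams, obs, traj_len):
    """Reprojection residuals coupling a 3D trajectory with per-camera
    sub-frame time offsets. params packs the trajectory (traj_len * 3
    values) followed by one offset per camera after the first; camera 0's
    offset is pinned to 0 to fix the temporal gauge freedom."""
    traj = params[:traj_len * 3].reshape(traj_len, 3)
    offsets = np.concatenate([[0.0], params[traj_len * 3:]])
    res = []
    for cam_id, frame, uv in obs:
        K, R, t = cams[cam_id]
        # Evaluate the trajectory at the camera's locally-aligned time.
        X = point_at(traj, frame + offsets[cam_id])
        res.append(project(K, R, t, X) - uv)
    return np.concatenate(res)
```

In a full pipeline these residuals would be handed to a nonlinear least-squares solver (e.g. `scipy.optimize.least_squares`) alongside the usual static-point and camera-parameter residuals of standard bundle adjustment.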
Related papers
- Spatiotemporal Multi-Camera Calibration using Freely Moving People [32.288669810272864]
We propose a novel method for multi-camera calibration using freely moving people in multiview videos.
We use 3D human poses obtained from an off-the-shelf monocular pose estimator and transform them into 3D points on a unit sphere.
We employ a probabilistic approach that can jointly solve both problems of aligning temporal data and establishing correspondences.
arXiv Detail & Related papers (2025-02-18T05:15:52Z) - VideoLifter: Lifting Videos to 3D with Fast Hierarchical Stereo Alignment [62.6737516863285]
VideoLifter is a novel framework that incrementally optimizes a globally sparse-to-dense 3D representation directly from video sequences.
By tracking and propagating sparse point correspondences across frames and fragments, VideoLifter incrementally refines camera poses and 3D structure.
This approach significantly accelerates the reconstruction process, reducing training time by over 82% while surpassing current state-of-the-art methods in visual fidelity and computational efficiency.
arXiv Detail & Related papers (2025-01-03T18:52:36Z) - DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild [85.03973683867797]
This paper proposes a concise, elegant, and robust pipeline to estimate smooth camera trajectories and obtain dense point clouds for casual videos in the wild.
We show that the proposed method achieves state-of-the-art performance in terms of camera pose estimation even in complex dynamic challenge scenes.
arXiv Detail & Related papers (2024-11-20T13:01:16Z) - CRiM-GS: Continuous Rigid Motion-Aware Gaussian Splatting from Motion-Blurred Images [14.738528284246545]
CRiM-GS is a Continuous Rigid Motion-aware Gaussian Splatting method.
It reconstructs precise 3D scenes from motion-blurred images while maintaining real-time rendering speed.
arXiv Detail & Related papers (2024-07-04T13:37:04Z) - Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion [25.54868552979793]
We present a method that adapts to camera motion and allows high-quality scene reconstruction with handheld video data.
Our results with both synthetic and real data demonstrate superior performance in mitigating camera motion over existing methods.
arXiv Detail & Related papers (2024-03-20T06:19:41Z) - ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving
Cameras in the Wild [57.37891682117178]
We present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence from pairwise optical flow.
A novel neural network architecture is proposed for processing irregular point trajectory data.
Experiments on MPI Sintel dataset show that our system produces significantly more accurate camera trajectories.
arXiv Detail & Related papers (2022-07-19T09:19:45Z) - Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred
Objects in Videos [115.71874459429381]
We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video.
Experiments on benchmark datasets demonstrate that our method outperforms previous methods for fast moving object deblurring and 3D reconstruction.
arXiv Detail & Related papers (2021-11-29T11:25:14Z) - Consistent Depth of Moving Objects in Video [52.72092264848864]
We present a method to estimate depth of a dynamic scene, containing arbitrary moving objects, from an ordinary video captured with a moving camera.
We formulate this objective in a new test-time training framework where a depth-prediction CNN is trained in tandem with an auxiliary scene-flow prediction over the entire input video.
We demonstrate accurate and temporally coherent results on a variety of challenging videos containing diverse moving objects (pets, people, cars) as well as camera motion.
arXiv Detail & Related papers (2021-08-02T20:53:18Z) - Visual Odometry with an Event Camera Using Continuous Ray Warping and
Volumetric Contrast Maximization [31.627936023222052]
We present a new solution to tracking and mapping with an event camera.
The motion of the camera contains both rotation and translation, and the displacements happen in an arbitrarily structured environment.
We introduce a new solution to this problem by performing contrast maximization in 3D.
The practical validity of our approach is supported by an application to AGV motion estimation and 3D reconstruction with a single vehicle-mounted event camera.
arXiv Detail & Related papers (2021-07-07T04:32:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.