Spatiotemporal Bundle Adjustment for Dynamic 3D Human Reconstruction in
the Wild
- URL: http://arxiv.org/abs/2007.12806v1
- Date: Fri, 24 Jul 2020 23:50:46 GMT
- Title: Spatiotemporal Bundle Adjustment for Dynamic 3D Human Reconstruction in
the Wild
- Authors: Minh Vo, Yaser Sheikh, and Srinivasa G. Narasimhan
- Abstract summary: We present a framework that jointly estimates camera temporal alignment and 3D point triangulation.
We reconstruct 3D motion trajectories of human bodies in dynamic events captured by multiple uncalibrated and unsynchronized video cameras.
- Score: 49.672487902268706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bundle adjustment jointly optimizes camera intrinsics and extrinsics and 3D
point triangulation to reconstruct a static scene. The triangulation
constraint, however, is invalid for moving points captured in multiple
unsynchronized videos and bundle adjustment is not designed to estimate the
temporal alignment between cameras. We present a spatiotemporal bundle
adjustment framework that jointly optimizes four coupled sub-problems:
estimating camera intrinsics and extrinsics, triangulating static 3D points, as
well as sub-frame temporal alignment between cameras and computing 3D
trajectories of dynamic points. Key to our joint optimization is the careful
integration of physics-based motion priors within the reconstruction pipeline,
validated on a large motion capture corpus of human subjects. We devise an
incremental reconstruction and alignment algorithm to strictly enforce the
motion prior during the spatiotemporal bundle adjustment. This algorithm is
further made more efficient by a divide and conquer scheme while still
maintaining high accuracy. We apply this algorithm to reconstruct 3D motion
trajectories of human bodies in dynamic events captured by multiple
uncalibrated and unsynchronized video cameras in the wild. To make the
reconstruction visually more interpretable, we fit a statistical 3D human body
model to the asynchronous video streams. Compared to the baseline, the fitting
significantly benefits from the proposed spatiotemporal bundle adjustment
procedure. Because the videos are aligned with sub-frame precision, we
reconstruct 3D motion at much higher temporal resolution than the input videos.
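To make the joint optimization concrete, the sketch below illustrates one plausible form of a spatiotemporal bundle adjustment residual: each camera carries a continuous sub-frame time offset, the dynamic point is a time-interpolated trajectory, and a motion-prior term regularizes it. All names (project, sample_trajectory, residuals), the linear interpolation, and the discrete least-acceleration prior are illustrative assumptions, not the authors' implementation; the paper selects its physics-based prior empirically from a motion capture corpus, and it also optimizes intrinsics and extrinsics, which are held fixed here for brevity.

```python
# Hedged sketch of a spatiotemporal bundle adjustment residual: per-camera
# sub-frame time offsets, an interpolated 3D trajectory for one dynamic point,
# and a discrete motion-prior penalty. Illustrative only, not the authors' code.
import numpy as np
from scipy.optimize import least_squares

def project(K, R, t, X):
    """Pinhole projection of 3D points X (N, 3) to pixel coordinates (N, 2)."""
    x = K @ (R @ X.T + t[:, None])
    return (x[:2] / x[2]).T

def sample_trajectory(ctrl_t, ctrl_X, query_t):
    """Linearly interpolate trajectory control points (T, 3) at arbitrary times."""
    return np.stack([np.interp(query_t, ctrl_t, ctrl_X[:, d]) for d in range(3)],
                    axis=-1)

def residuals(params, cams, obs, ctrl_t, lam):
    """Stack reprojection residuals (with temporal alignment) and a
    least-acceleration motion prior on the trajectory control points."""
    n_cam = len(cams)
    deltas = params[:n_cam]                     # per-camera sub-frame offsets (s)
    ctrl_X = params[n_cam:].reshape(-1, 3)      # dynamic-point trajectory samples
    reproj = []
    for cam_id, frame, uv in obs:               # obs: (camera, frame index, 2D point)
        K, R, t, fps = cams[cam_id]
        t_world = frame / fps + deltas[cam_id]  # observation time after alignment
        X = sample_trajectory(ctrl_t, ctrl_X, np.array([t_world]))
        reproj.append(project(K, R, t, X)[0] - uv)
    accel = ctrl_X[2:] - 2.0 * ctrl_X[1:-1] + ctrl_X[:-2]  # discrete acceleration
    return np.concatenate([np.concatenate(reproj), lam * accel.ravel()])

# Usage: pack offsets and control points into one vector and hand the residual
# to a robust nonlinear least-squares solver, e.g.
# sol = least_squares(residuals, x0, args=(cams, obs, ctrl_t, 1e-2), loss="huber")
```

Because the offsets are continuous rather than integer frame shifts, the recovered trajectory can be sampled more densely in time than any single input video, which is what makes the higher-than-input temporal resolution mentioned in the abstract possible.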
Related papers
- DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild [85.03973683867797]
This paper proposes a concise, elegant, and robust pipeline to estimate smooth camera trajectories and obtain dense point clouds for casual videos in the wild.
We show that the proposed method achieves state-of-the-art performance in camera pose estimation, even in complex and challenging dynamic scenes.
arXiv Detail & Related papers (2024-11-20T13:01:16Z) - CRiM-GS: Continuous Rigid Motion-Aware Gaussian Splatting from Motion Blur Images [12.603775893040972]
We propose continuous rigid motion-aware gaussian splatting (CRiM-GS) to reconstruct an accurate 3D scene from blurry images with real-time rendering speed.
We leverage rigid body transformations to model the camera motion with proper regularization, preserving the shape and size of the object.
Furthermore, we introduce a continuous deformable 3D transformation in the SE(3) field to adapt the rigid body transformation to real-world problems.
arXiv Detail & Related papers (2024-07-04T13:37:04Z) - Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion [25.54868552979793]
We present a method that adapts to camera motion and allows high-quality scene reconstruction with handheld video data.
Our results with both synthetic and real data demonstrate superior performance in mitigating camera motion over existing methods.
arXiv Detail & Related papers (2024-03-20T06:19:41Z) - SceNeRFlow: Time-Consistent Reconstruction of General Dynamic Scenes [75.9110646062442]
We propose SceNeRFlow to reconstruct a general, non-rigid scene in a time-consistent manner.
Our method takes multi-view RGB videos and background images from static cameras with known camera parameters as input.
We show experimentally that, unlike prior work that only handles small motion, our method enables the reconstruction of studio-scale motions.
arXiv Detail & Related papers (2023-08-16T09:50:35Z) - ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving
Cameras in the Wild [57.37891682117178]
We present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence from pairwise optical flow.
A novel neural network architecture is proposed for processing irregular point trajectory data.
Experiments on the MPI Sintel dataset show that our system produces significantly more accurate camera trajectories.
arXiv Detail & Related papers (2022-07-19T09:19:45Z) - Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred
Objects in Videos [115.71874459429381]
We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video.
Experiments on benchmark datasets demonstrate that our method outperforms previous methods for fast moving object deblurring and 3D reconstruction.
arXiv Detail & Related papers (2021-11-29T11:25:14Z) - Consistent Depth of Moving Objects in Video [52.72092264848864]
We present a method to estimate depth of a dynamic scene, containing arbitrary moving objects, from an ordinary video captured with a moving camera.
We formulate this objective in a new test-time training framework where a depth-prediction CNN is trained in tandem with an auxiliary scene-flow prediction over the entire input video.
We demonstrate accurate and temporally coherent results on a variety of challenging videos containing diverse moving objects (pets, people, cars) as well as camera motion.
arXiv Detail & Related papers (2021-08-02T20:53:18Z) - Visual Odometry with an Event Camera Using Continuous Ray Warping and
Volumetric Contrast Maximization [31.627936023222052]
We present a new solution to tracking and mapping with an event camera.
The motion of the camera contains both rotation and translation, and the displacements happen in an arbitrarily structured environment.
We introduce a new solution to this problem by performing contrast maximization in 3D.
The practical validity of our approach is supported by an application to AGV motion estimation and 3D reconstruction with a single vehicle-mounted event camera.
arXiv Detail & Related papers (2021-07-07T04:32:57Z) - A Graph Attention Spatio-temporal Convolutional Network for 3D Human
Pose Estimation in Video [7.647599484103065]
We improve the learning of constraints in the human skeleton by modeling local and global spatial information via attention mechanisms.
Our approach effectively mitigates depth ambiguity and self-occlusion, generalizes to half upper body estimation, and achieves competitive performance on 2D-to-3D video pose estimation.
arXiv Detail & Related papers (2020-03-11T14:54:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.