Online Adaptation for Implicit Object Tracking and Shape Reconstruction in the Wild
- URL: http://arxiv.org/abs/2111.12728v1
- Date: Wed, 24 Nov 2021 19:00:05 GMT
- Title: Online Adaptation for Implicit Object Tracking and Shape Reconstruction in the Wild
- Authors: Jianglong Ye, Yuntao Chen, Naiyan Wang, Xiaolong Wang
- Abstract summary: We introduce a novel and unified framework which utilizes a DeepSDF model to simultaneously track and reconstruct 3D objects in the wild.
We show significant improvements over state-of-the-art methods for both tracking and shape reconstruction.
- Score: 22.19769576901151
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tracking and reconstructing 3D objects from cluttered scenes are key components of computer vision, robotics, and autonomous driving systems. While recent progress on implicit functions (e.g., DeepSDF) has shown encouraging results for high-quality 3D shape reconstruction, generalizing to cluttered and partially observable LiDAR data remains very challenging. In this paper, we propose to leverage the temporal continuity of video data. We introduce a novel, unified framework that uses a DeepSDF model to simultaneously track and reconstruct 3D objects in the wild. We adapt the DeepSDF model online over the video, iteratively improving the shape reconstruction, which in turn improves the tracking, and vice versa. We experiment on both the Waymo and KITTI datasets and show significant improvements over state-of-the-art methods for both tracking and shape reconstruction.
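The iterative coupling described in the abstract can be pictured concretely. Below is a minimal, hypothetical PyTorch sketch of such an online adaptation loop, not the authors' released implementation: a DeepSDF-style decoder maps a latent shape code and object-frame points to signed distances, and the shape code and per-frame object poses (simplified here to translation plus yaw) are optimized jointly so that observed LiDAR points fall on the zero level set. All names (SDFDecoder, track_and_reconstruct) and the loss are illustrative assumptions.

```python
# Minimal sketch of a joint tracking/reconstruction loop in the spirit of the
# abstract. All names and the loss are illustrative assumptions, not the
# authors' released code.
import torch

class SDFDecoder(torch.nn.Module):
    """DeepSDF-style MLP: (latent shape code, 3D point) -> signed distance."""
    def __init__(self, latent_dim=256, hidden=512):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(latent_dim + 3, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 1),
        )

    def forward(self, latent, points):
        # points: (N, 3) object-frame coordinates; latent: (latent_dim,)
        z = latent.expand(points.shape[0], -1)
        return self.net(torch.cat([z, points], dim=-1)).squeeze(-1)

def track_and_reconstruct(frames, decoder, latent_dim=256, steps=20, lr=1e-3):
    """frames: list of (N_t, 3) LiDAR point tensors in world coordinates."""
    latent = torch.zeros(latent_dim, requires_grad=True)
    # One pose per frame, parameterized as translation + yaw for brevity.
    poses = [torch.zeros(4, requires_grad=True) for _ in frames]
    opt = torch.optim.Adam([latent] + poses, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = 0.0
        for pts, pose in zip(frames, poses):
            t, yaw = pose[:3], pose[3]
            c, s = torch.cos(yaw), torch.sin(yaw)
            R = torch.stack([
                torch.stack([c, -s, torch.zeros(())]),
                torch.stack([s, c, torch.zeros(())]),
                torch.stack([torch.zeros(()), torch.zeros(()), torch.ones(())]),
            ])
            obj_pts = (pts - t) @ R  # world -> object frame (R^T applied)
            # Observed surface points should have zero signed distance.
            loss = loss + decoder(latent, obj_pts).abs().mean()
        loss.backward()
        opt.step()
    return latent, poses
```

Jointly optimizing the latent code and the poses is what realizes the coupling the abstract describes: a better shape estimate constrains the pose fit (tracking), and better poses align the points used to refine the shape. A practical system would also regularize the latent code and process frames incrementally rather than in one batch.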
Related papers
- UVRM: A Scalable 3D Reconstruction Model from Unposed Videos [69.89526627921612]
Training 3D reconstruction models with 2D visual data traditionally requires prior knowledge of camera poses for the training samples.
We introduce UVRM, a novel 3D reconstruction model that can be trained and evaluated on monocular videos without requiring any pose information.
arXiv Detail & Related papers (2025-01-16T08:00:17Z)
- Street Gaussians without 3D Object Tracker [86.62329193275916]
Existing methods rely on labor-intensive manual labeling of object poses to reconstruct dynamic objects in canonical space and move them based on these poses during rendering.
We propose a stable object tracking module by leveraging associations from 2D deep trackers within a 3D object fusion strategy.
We address inevitable tracking errors by further introducing a motion learning strategy in an implicit feature space that autonomously corrects trajectory errors and recovers missed detections.
arXiv Detail & Related papers (2024-12-07T05:49:42Z)
- AutoDecoding Latent 3D Diffusion Models [95.7279510847827]
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
The 3D autodecoder framework embeds properties learned from the target dataset in the latent space.
We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations.
arXiv Detail & Related papers (2023-07-07T17:59:14Z)
- AutoRecon: Automated 3D Object Discovery and Reconstruction [41.60050228813979]
We propose a novel framework named AutoRecon for the automated discovery and reconstruction of an object from multi-view images.
We demonstrate that foreground objects can be robustly located and segmented from SfM point clouds by leveraging self-supervised 2D vision transformer features.
Experiments on the DTU, BlendedMVS and CO3D-V2 datasets demonstrate the effectiveness and robustness of AutoRecon.
arXiv Detail & Related papers (2023-05-15T17:16:46Z)
- gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction [94.46581592405066]
We exploit the hand structure and use it as guidance for SDF-based shape reconstruction.
We predict kinematic chains of pose transformations and align SDFs with highly-articulated hand poses.
arXiv Detail & Related papers (2023-04-24T10:05:48Z)
- SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data [124.2624568006391]
We present SAIL-VOS 3D: a synthetic video dataset with frame-by-frame mesh annotations.
We also develop the first baselines for reconstruction of 3D meshes from video data via temporal models.
arXiv Detail & Related papers (2021-05-18T15:42:37Z)
- Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z)
- Learning monocular 3D reconstruction of articulated categories from motion [39.811816510186475]
Video self-supervision enforces consistency between consecutive 3D reconstructions via a motion-based cycle loss.
We introduce an interpretable model of 3D template deformations that controls a 3D surface through the displacement of a small number of local, learnable handles.
We obtain state-of-the-art reconstructions with diverse shapes, viewpoints and textures for multiple articulated object categories.
arXiv Detail & Related papers (2021-03-30T13:50:27Z)
- SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images [44.78174845839193]
Recent efforts have turned to learning 3D reconstruction without 3D supervision from RGB images with annotated 2D silhouettes.
However, these techniques still require multi-view annotations of the same object instance during training.
We propose SDF-SRN, an approach that requires only a single view of objects at training time.
arXiv Detail & Related papers (2020-10-20T17:59:47Z)