Multi-object Monocular SLAM for Dynamic Environments
- URL: http://arxiv.org/abs/2002.03528v2
- Date: Mon, 11 May 2020 11:42:42 GMT
- Title: Multi-object Monocular SLAM for Dynamic Environments
- Authors: Gokul B. Nair, Swapnil Daga, Rahul Sajnani, Anirudha Ramesh, Junaid
Ahmed Ansari, Krishna Murthy Jatavallabhula, K. Madhava Krishna
- Abstract summary: The term multibody implies that we track the motion of the camera as well as that of other dynamic participants in the scene.
Existing approaches solve restricted variants of the problem, but the solutions suffer from relative scale ambiguity.
We propose a multi pose-graph optimization formulation to resolve the relative and absolute scale factor ambiguities involved.
- Score: 12.537311048732017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we tackle the problem of multibody SLAM from a monocular
camera. The term multibody implies that we track the motion of the camera, as
well as that of other dynamic participants in the scene. The quintessential
challenge in dynamic scenes is unobservability: it is not possible to
unambiguously triangulate a moving object from a moving monocular camera.
Existing approaches solve restricted variants of the problem, but the solutions
suffer relative scale ambiguity (i.e., a family of infinitely many solutions
exist for each pair of motions in the scene). We solve this rather intractable
problem by leveraging single-view metrology, advances in deep learning, and
category-level shape estimation. We propose a multi pose-graph optimization
formulation to resolve the relative and absolute scale factor ambiguities
involved. This optimization helps us reduce the average error in trajectories
of multiple bodies over real-world datasets, such as KITTI. To the best of our
knowledge, our method is the first practical monocular multi-body SLAM system
to perform dynamic multi-object and ego localization in a unified framework in
metric scale.
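To illustrate the idea of recovering metric scale from single-view metrology, here is a minimal sketch (not the authors' code; all function names and the category-height prior are illustrative assumptions): a monocular trajectory is known only up to scale, but a category-level height prior on a tracked object, combined with the pinhole relation Z = f * H / h, yields metric depths from which a least-squares scale factor can be solved.

```python
import numpy as np

# Hypothetical illustration of absolute-scale recovery via single-view
# metrology, in the spirit of the abstract. Assumptions: a pinhole camera
# with focal length f_px (pixels), a tracked object of known category-level
# height H (metres), and its observed pixel height h in each frame.

def metric_depth(f_px, height_m, height_px):
    """Depth of the object from the pinhole relation Z = f * H / h."""
    return f_px * height_m / height_px

def estimate_scale(up_to_scale_depths, f_px, height_m, heights_px):
    """Least-squares scale s minimising ||s * d_sfm - d_metric||^2."""
    d_metric = np.array([metric_depth(f_px, height_m, h) for h in heights_px])
    d_sfm = np.asarray(up_to_scale_depths)
    # Closed-form solution of the one-dimensional least-squares problem.
    return float(d_sfm @ d_metric / (d_sfm @ d_sfm))

# Toy data: the up-to-scale depths are the true metric depths divided by 4,
# so the recovered scale factor should be ~4.
f_px, H = 720.0, 1.5                      # focal length (px), height prior (m)
true_depths = np.array([10.0, 12.0, 15.0])
heights_px = f_px * H / true_depths       # simulated pixel-height observations
s = estimate_scale(true_depths / 4.0, f_px, H, heights_px)
print(s)  # ~4.0: the factor that lifts the trajectory to metric scale
```

In the paper's setting this single-view constraint is one term in a larger multi pose-graph optimization; the sketch above only shows how a metric prior pins down the otherwise free scale of a monocular reconstruction.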
Related papers
- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion [118.74385965694694]
We present Motion DUSt3R (MonST3R), a novel geometry-first approach that directly estimates per-timestep geometry from dynamic scenes.
By simply estimating a pointmap for each timestep, we can effectively adapt DUSt3R's representation, previously only used for static scenes, to dynamic scenes.
We show that by posing the problem as a fine-tuning task, identifying several suitable datasets, and strategically training the model on this limited data, we can surprisingly enable the model to handle dynamics.
arXiv Detail & Related papers (2024-10-04T18:00:07Z) - MultiViPerFrOG: A Globally Optimized Multi-Viewpoint Perception Framework for Camera Motion and Tissue Deformation [18.261678529996104]
We propose a framework that can flexibly integrate the output of low-level perception modules with kinematic and scene-modeling priors.
Overall, our method shows robustness to combined noisy input measures and can process hundreds of points in a few milliseconds.
arXiv Detail & Related papers (2024-08-08T10:55:55Z) - Shape of Motion: 4D Reconstruction from a Single Video [51.04575075620677]
We introduce a method capable of reconstructing generic dynamic scenes, featuring explicit, full-sequence-long 3D motion.
We exploit the low-dimensional structure of 3D motion by representing scene motion with a compact set of SE(3) motion bases.
Our method achieves state-of-the-art performance for both long-range 3D/2D motion estimation and novel view synthesis on dynamic scenes.
arXiv Detail & Related papers (2024-07-18T17:59:08Z) - Learning to Fuse Monocular and Multi-view Cues for Multi-frame Depth
Estimation in Dynamic Scenes [51.20150148066458]
We propose a novel method to learn to fuse the multi-view and monocular cues encoded as volumes without needing heuristically crafted masks.
Experiments on real-world datasets demonstrate the significant effectiveness of the proposed method.
arXiv Detail & Related papers (2023-04-18T13:55:24Z) - Progressive Multi-view Human Mesh Recovery with Self-Supervision [68.60019434498703]
Existing solutions typically suffer from poor generalization performance to new settings.
We propose a novel simulation-based training pipeline for multi-view human mesh recovery.
arXiv Detail & Related papers (2022-12-10T06:28:29Z) - Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance [83.25826307000717]
We study the challenging problem of recovering detailed motion from a single motion-blurred image.
Existing solutions to this problem estimate a single image sequence without considering the motion ambiguity for each region.
In this paper, we explicitly account for such motion ambiguity, allowing us to generate multiple plausible solutions all in sharp detail.
arXiv Detail & Related papers (2022-07-20T18:05:53Z) - Disentangling Object Motion and Occlusion for Unsupervised Multi-frame
Monocular Depth [37.021579239596164]
Existing dynamic-object-focused methods only partially solved the mismatch problem at the training loss level.
We propose a novel multi-frame monocular depth prediction method to solve these problems at both the prediction and supervision loss levels.
Our method, called DynamicDepth, is a new framework trained via a self-supervised cycle consistent learning scheme.
arXiv Detail & Related papers (2022-03-29T01:36:11Z) - DyGLIP: A Dynamic Graph Model with Link Prediction for Accurate
Multi-Camera Multiple Object Tracking [25.98400206361454]
Multi-Camera Multiple Object Tracking (MC-MOT) is a significant computer vision problem due to its emerging applicability in several real-world applications.
This work proposes a new Dynamic Graph Model with Link Prediction approach to solve the data association task.
Experimental results show that we outperform existing MC-MOT algorithms by a large margin on several practical datasets.
arXiv Detail & Related papers (2021-06-12T20:22:30Z) - MoCo-Flow: Neural Motion Consensus Flow for Dynamic Humans in Stationary
Monocular Cameras [98.40768911788854]
We introduce MoCo-Flow, a representation that models the dynamic scene using a 4D continuous time-variant function.
At the heart of our work lies a novel optimization formulation, which is constrained by a motion consensus regularization on the motion flow.
We extensively evaluate MoCo-Flow on several datasets that contain human motions of varying complexity.
arXiv Detail & Related papers (2021-06-08T16:03:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.