ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving
Cameras in the Wild
- URL: http://arxiv.org/abs/2207.09137v1
- Date: Tue, 19 Jul 2022 09:19:45 GMT
- Title: ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving
Cameras in the Wild
- Authors: Wang Zhao, Shaohui Liu, Hengkai Guo, Wenping Wang, Yong-Jin Liu
- Abstract summary: We present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence from pairwise optical flow.
A novel neural network architecture is proposed for processing irregular point trajectory data.
Experiments on the MPI Sintel dataset show that our system produces significantly more accurate camera trajectories.
- Score: 57.37891682117178
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Estimating the pose of a moving camera from monocular video is a challenging
problem, especially in dynamic environments with moving objects, where existing
camera pose estimation methods are susceptible to pixels that are not
geometrically consistent. To tackle this
challenge, we present a robust dense indirect structure-from-motion method for
videos that is based on dense correspondence initialized from pairwise optical
flow. Our key idea is to optimize long-range video correspondences as dense
point trajectories and use them to learn robust motion
segmentation. A novel neural network architecture is proposed for processing
irregular point trajectory data. Camera poses are then estimated and optimized
with global bundle adjustment over the portion of long-range point trajectories
that are classified as static. Experiments on the MPI Sintel dataset show that our
system produces significantly more accurate camera trajectories than
existing state-of-the-art methods. In addition, our method retains
reasonable camera pose accuracy on fully static scenes and consistently
outperforms strong state-of-the-art end-to-end dense-correspondence-based
methods, demonstrating the potential of dense indirect methods
based on optical flow and point trajectories. As the point trajectory
representation is general, we further present results and comparisons on
in-the-wild monocular videos with complex motion of dynamic objects. Code is
available at https://github.com/bytedance/particle-sfm.
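
The core representation described above, dense point trajectories chained from pairwise optical flow, can be illustrated with the minimal NumPy sketch below. This is not the code from the repository above: the grid seeding, bilinear flow sampling, the out-of-bounds validity test, and names such as `chain_trajectories` are assumptions made for this example, and the trajectory motion-segmentation network and global bundle adjustment stages are omitted.

```python
# Minimal sketch (assumptions, not the released ParticleSfM code): chain pairwise
# optical flow into dense long-range point trajectories, the representation that
# is later fed to trajectory-based motion segmentation and bundle adjustment.
import numpy as np


def bilinear_sample(flow, pts):
    """Sample an (H, W, 2) flow field at sub-pixel locations pts (N, 2), given as (x, y)."""
    h, w, _ = flow.shape
    x, y = pts[:, 0], pts[:, 1]
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    dx, dy = (x - x0)[:, None], (y - y0)[:, None]
    top = flow[y0, x0] * (1 - dx) + flow[y0, x0 + 1] * dx
    bot = flow[y0 + 1, x0] * (1 - dx) + flow[y0 + 1, x0 + 1] * dx
    return top * (1 - dy) + bot * dy


def chain_trajectories(flows, step=8):
    """Chain consecutive forward flows (list of (H, W, 2) arrays) into trajectories
    seeded on a regular pixel grid in the first frame.

    Returns positions of shape (T, N, 2) and a validity mask of shape (T, N);
    a trajectory is invalidated once it leaves the image (a crude occlusion proxy).
    """
    h, w, _ = flows[0].shape
    ys, xs = np.mgrid[0:h:step, 0:w:step]
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float64)
    positions, valid = [pts.copy()], [np.ones(pts.shape[0], dtype=bool)]
    for flow in flows:
        pts = pts + bilinear_sample(flow, pts)
        inside = (pts[:, 0] >= 0) & (pts[:, 0] <= w - 1) & \
                 (pts[:, 1] >= 0) & (pts[:, 1] <= h - 1)
        valid.append(valid[-1] & inside)
        pts = np.clip(pts, [0.0, 0.0], [w - 1.0, h - 1.0])
        positions.append(pts.copy())
    return np.stack(positions), np.stack(valid)


if __name__ == "__main__":
    # Two synthetic flow fields: every pixel moves one pixel to the right per frame.
    flows = [np.tile(np.array([1.0, 0.0]), (120, 160, 1)) for _ in range(2)]
    traj, mask = chain_trajectories(flows)
    print(traj.shape, mask.all())  # -> (3, 300, 2) True
```

In the full system, trajectories of this kind would then be classified as static or dynamic by the proposed network, and only the static portion passed to global bundle adjustment.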
Related papers
- DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild [85.03973683867797]
This paper proposes a concise, elegant, and robust pipeline to estimate smooth camera trajectories and obtain dense point clouds for casual videos in the wild.
We show that the proposed method achieves state-of-the-art performance in terms of camera pose estimation even in complex dynamic challenge scenes.
arXiv Detail & Related papers (2024-11-20T13:01:16Z)
- ESVO2: Direct Visual-Inertial Odometry with Stereo Event Cameras [33.81592783496106]
Event-based visual odometry aims at solving tracking and mapping sub-problems in parallel.
We build an event-based stereo visual-inertial odometry system on top of our previous direct pipeline Event-based Stereo Visual Odometry.
arXiv Detail & Related papers (2024-10-12T05:35:27Z)
- Decomposition Betters Tracking Everything Everywhere [8.199205242808592]
We propose a new test-time optimization method, named DecoMotion, for estimating per-pixel and long-range motion.
Our method boosts the point-tracking accuracy by a large margin and performs on par with some state-of-the-art dedicated point-tracking solutions.
arXiv Detail & Related papers (2024-07-09T04:01:23Z)
- Tracking Everything Everywhere All at Once [111.00807055441028]
We present a new test-time optimization method for estimating dense and long-range motion from a video sequence.
We propose a complete and globally consistent motion representation, dubbed OmniMotion.
Our approach outperforms prior state-of-the-art methods by a large margin both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-06-08T17:59:29Z)
- HVC-Net: Unifying Homography, Visibility, and Confidence Learning for Planar Object Tracking [5.236567998857959]
We present a unified convolutional neural network (CNN) model that jointly considers homography, visibility, and confidence.
Our approach outperforms the state-of-the-art methods on public POT and TMT datasets.
arXiv Detail & Related papers (2022-09-19T11:11:56Z)
- Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z)
- Attentive and Contrastive Learning for Joint Depth and Motion Field Estimation [76.58256020932312]
Estimating the motion of the camera together with the 3D structure of the scene from a monocular vision system is a complex task.
We present a self-supervised learning framework for 3D object motion field estimation from monocular videos.
arXiv Detail & Related papers (2021-10-13T16:45:01Z)
- Event-based Motion Segmentation with Spatio-Temporal Graph Cuts [51.17064599766138]
We have developed a method to identify independently moving objects acquired with an event-based camera.
The method performs on par or better than the state of the art without having to predetermine the number of expected moving objects.
arXiv Detail & Related papers (2020-12-16T04:06:02Z)