Related papers: Exploiting Motion Prior for Accurate Pose Estimation of Dashboard Cameras

URL: http://arxiv.org/abs/2409.18673v1
Date: Fri, 27 Sep 2024 11:59:00 GMT
Title: Exploiting Motion Prior for Accurate Pose Estimation of Dashboard Cameras
Authors: Yipeng Lu, Yifan Zhao, Haiping Wang, Zhiwei Ruan, Yuan Liu, Zhen Dong, Bisheng Yang
Abstract summary: We propose a precise pose estimation method for dashcam images, leveraging the inherent camera motion prior. Our method is 22% better than the baseline for pose estimation in AUC5°, and it can estimate poses for 19% more images with less reprojection error.
Score: 17.010390107028275
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dashboard cameras (dashcams) record millions of driving videos daily, offering a valuable potential data source for various applications, including driving map production and updates. A necessary step in using this dashcam data is estimating the camera poses. However, the low-quality images captured by dashcams, characterized by motion blur and dynamic objects, make it hard for existing image-matching methods to estimate camera poses accurately. In this study, we propose a precise pose estimation method for dashcam images, leveraging the inherent camera motion prior. Typically, image sequences captured by dash cameras exhibit a pronounced motion prior, such as forward movement or lateral turns, which serves as an essential cue for correspondence estimation. Building upon this observation, we devise a pose regression module that learns the camera motion prior, and we then integrate this prior into both the correspondence and pose estimation processes. Experiments show that, on a real dashcam dataset, our method is 22% better than the baseline for pose estimation in AUC5°, and it can estimate poses for 19% more images with less reprojection error in Structure from Motion (SfM).
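The abstract does not define AUC5°. In image-matching work, a pose AUC at a threshold is commonly computed as the area under the cumulative pose-error curve up to that angular threshold, normalized by the threshold. The following is a minimal sketch under that assumption, not the authors' evaluation code; the function name `pose_auc` and its parameters are hypothetical.

```python
import numpy as np


def pose_auc(errors_deg, threshold_deg=5.0):
    """Normalized area under the recall-vs-error curve up to threshold_deg.

    Assumed definition (not taken from the paper): recall at error e is the
    fraction of image pairs whose pose error is at most e degrees.
    """
    errors = np.sort(np.asarray(errors_deg, dtype=float))
    n = len(errors)
    if n == 0:
        return 0.0
    recall = np.arange(1, n + 1) / n
    # Prepend the origin so the curve starts at (0, 0).
    errors = np.concatenate(([0.0], errors))
    recall = np.concatenate(([0.0], recall))
    # Keep the points strictly below the threshold, then extend the curve
    # horizontally to the threshold itself.
    last = np.searchsorted(errors, threshold_deg)
    x = np.concatenate((errors[:last], [threshold_deg]))
    y = np.concatenate((recall[:last], [recall[last - 1]]))
    # Trapezoidal rule, written out to avoid depending on a NumPy version.
    area = np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0)
    return float(area / threshold_deg)


if __name__ == "__main__":
    # Hypothetical per-pair pose errors in degrees, for illustration only.
    print(pose_auc([0.5, 1.2, 3.0, 7.5, 20.0], threshold_deg=5.0))
```

Under this definition the score is 1.0 when every error is zero and 0.0 when every error exceeds the threshold, so a relative gain in AUC5° reflects more pairs with small pose errors.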

Related papers

AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos [52.726585508669686]
We propose AnyCam, a fast transformer model that directly estimates camera poses and intrinsics from a dynamic video sequence. We test AnyCam on established datasets, where it delivers accurate camera poses and intrinsics both qualitatively and quantitatively. By combining camera information, uncertainty, and depth, our model can produce high-quality 4D pointclouds.
arXiv Detail & Related papers (2025-03-30T02:22:11Z)
Image as an IMU: Estimating Camera Motion from a Single Motion-Blurred Image [14.485182089870928]
We propose a novel framework that leverages motion blur as a rich cue for motion estimation. Our approach works by predicting a dense motion flow field and a monocular depth map directly from a single motion-blurred image. Our method produces an IMU-like measurement that robustly captures fast and aggressive camera movements.
arXiv Detail & Related papers (2025-03-21T17:58:56Z)
An object detection approach for lane change and overtake detection from motion profiles [3.545178658731506]
In this paper, we address the identification of overtake and lane change maneuvers with a novel object detection approach applied to motion profiles. To train and test our model we created an internal dataset of motion profile images obtained from a heterogeneous set of dashcam videos. In addition to a standard object-detection approach, we show how the inclusion of CoordConvolution layers further improves the model performance.
arXiv Detail & Related papers (2025-02-06T17:36:35Z)
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers [66.29824750770389]
We analyze camera motion from a first principles perspective, uncovering insights that enable precise 3D camera manipulation. We compound these findings to design the Advanced 3D Camera Control (AC3D) architecture.
arXiv Detail & Related papers (2024-11-27T18:49:13Z)
CamI2V: Camera-Controlled Image-to-Video Diffusion Model [11.762824216082508]
In this paper, we emphasize the necessity of integrating explicit physical constraints into model design. Epipolar attention is proposed for modeling all cross-frame relationships from a novel perspective of noised condition. We achieve a 25.5% improvement in camera controllability on RealEstate10K while maintaining strong generalization to out-of-domain images.
arXiv Detail & Related papers (2024-10-21T12:36:27Z)
KRONC: Keypoint-based Robust Camera Optimization for 3D Car Reconstruction [58.04846444985808]
This paper introduces KRONC, a novel approach aimed at inferring view poses by leveraging prior knowledge about the object to reconstruct and its representation through semantic keypoints. With a focus on vehicle scenes, KRONC is able to estimate the position of the views as a solution to a light optimization problem targeting the convergence of keypoints' back-projections to a singular point.
arXiv Detail & Related papers (2024-09-09T08:08:05Z)
Line-based 6-DoF Object Pose Estimation and Tracking With an Event Camera [19.204896246140155]
Event cameras possess remarkable attributes such as high dynamic range, low latency, and resilience against motion blur. We propose a line-based robust pose estimation and tracking method for planar or non-planar objects using an event camera.
arXiv Detail & Related papers (2024-08-06T14:36:43Z)
VICAN: Very Efficient Calibration Algorithm for Large Camera Networks [49.17165360280794]
We introduce a novel methodology that extends Pose Graph Optimization techniques. We consider the bipartite graph encompassing cameras, object poses evolving dynamically, and camera-object relative transformations at each time step. Our framework retains compatibility with traditional PGO solvers, but its efficacy benefits from a custom-tailored optimization scheme.
arXiv Detail & Related papers (2024-03-25T17:47:03Z)
Continuous Pose for Monocular Cameras in Neural Implicit Representation [65.40527279809474]
In this paper, we showcase the effectiveness of optimizing monocular camera poses as a continuous function of time. We exploit the proposed method in four diverse experimental settings. Under the assumption of continuous motion, changes in pose may actually live in a manifold that has fewer than 6 degrees of freedom (DOF). We call this low-DOF motion representation the intrinsic motion and use the approach in vSLAM settings, showing impressive camera tracking performance.
arXiv Detail & Related papers (2023-11-28T13:14:58Z)
Extrinsic Camera Calibration with Semantic Segmentation [60.330549990863624]
We present an extrinsic camera calibration approach that automates the parameter estimation by utilizing semantic segmentation information. Our approach relies on a coarse initial measurement of the camera pose and builds on lidar sensors mounted on a vehicle. We evaluate our method on simulated and real-world data to demonstrate low error measurements in the calibration results.
arXiv Detail & Related papers (2022-08-08T07:25:03Z)
Towards view-invariant vehicle speed detection from driving simulator images [0.31498833540989407]
We address the question of whether complex 3D-CNN architectures are capable of implicitly learning view-invariant speeds using a single model. The results are very promising as they show that a single model with data from multiple views reports even better accuracy than camera-specific models.
arXiv Detail & Related papers (2022-06-01T09:14:45Z)
Attentive and Contrastive Learning for Joint Depth and Motion Field Estimation [76.58256020932312]
Estimating the motion of the camera together with the 3D structure of the scene from a monocular vision system is a complex task. We present a self-supervised learning framework for 3D object motion field estimation from monocular videos.
arXiv Detail & Related papers (2021-10-13T16:45:01Z)
Towards Accurate Human Pose Estimation in Videos of Crowded Scenes [134.60638597115872]
We focus on improving human pose estimation in videos of crowded scenes from the perspectives of exploiting temporal context and collecting new data. For each frame, we propagate the historical poses forward from the previous frames and the future poses backward from the subsequent frames to the current frame, leading to stable and accurate human pose estimation in videos. In this way, our model achieves the best performance on 7 out of 13 videos and 56.33 average w_AP on the test dataset of the HIE challenge.
arXiv Detail & Related papers (2020-10-16T13:19:11Z)
Vehicle-Human Interactive Behaviors in Emergency: Data Extraction from Traffic Accident Videos [0.0]
Currently, studying vehicle-human interactive behavior in emergencies requires large datasets from actual emergency situations, which are almost unavailable. This paper provides a new and convenient way to extract the interactive behavior data (i.e., the trajectories of vehicles and humans) from actual accident videos. The main challenge for data extraction from real-time accident video lies in the fact that the recording cameras are uncalibrated and the angles of surveillance are unknown.
arXiv Detail & Related papers (2020-03-02T22:17:46Z)
Unsupervised Learning of Camera Pose with Compositional Re-estimation [10.251550038802343]
Given an input video sequence, our goal is to estimate the camera pose (i.e. the camera motion) between consecutive frames. We propose an alternative approach that utilizes a compositional re-estimation process for camera pose estimation. Our approach significantly improves the predicted camera motion both quantitatively and visually.
arXiv Detail & Related papers (2020-01-17T18:59:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.