MotionHint: Self-Supervised Monocular Visual Odometry with Motion Constraints
- URL: http://arxiv.org/abs/2109.06768v2
- Date: Wed, 15 Sep 2021 07:58:20 GMT
- Title: MotionHint: Self-Supervised Monocular Visual Odometry with Motion Constraints
- Authors: Cong Wang, Yu-Ping Wang, Dinesh Manocha
- Abstract summary: We present a novel self-supervised algorithm named MotionHint for monocular visual odometry (VO).
Our MotionHint algorithm can be easily applied to existing open-source state-of-the-art SSM-VO systems.
- Score: 70.76761166614511
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel self-supervised algorithm named MotionHint for monocular
visual odometry (VO) that takes motion constraints into account. A key aspect
of our approach is to use an appropriate motion model that can help existing
self-supervised monocular VO (SSM-VO) algorithms to overcome issues related to
the local minima within their self-supervised loss functions. The motion model
is expressed with a neural network named PPnet. It is trained to coarsely
predict the next pose of the camera and the uncertainty of this prediction. Our
self-supervised approach combines the original loss and the motion loss, which
is the weighted difference between the prediction and the generated ego-motion.
Taking two existing SSM-VO systems as our baseline, we evaluate our MotionHint
algorithm on the standard KITTI benchmark. Experimental results show that our
MotionHint algorithm can be easily applied to existing open-source
state-of-the-art SSM-VO systems to greatly improve their performance, reducing
the resulting Absolute Trajectory Error (ATE) by up to 28.73%.
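As a concrete reading of that combined objective, here is a minimal PyTorch sketch of a PPnet-style predictor and an uncertainty-weighted motion loss. It is an illustration under stated assumptions, not the authors' code: the network shape, the Gaussian negative-log-likelihood form of the weighting, and the names PPNetSketch, motion_loss, and lambda_m are all made up.
```python
import torch
import torch.nn as nn

class PPNetSketch(nn.Module):
    """Toy stand-in for PPnet: maps a short window of past poses to a
    coarse prediction of the next pose (6-DoF) plus a per-dimension
    log-variance as the uncertainty of that prediction."""
    def __init__(self, window=3, pose_dim=6, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(window * pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * pose_dim),  # [predicted pose | log-variance]
        )

    def forward(self, past_poses):            # past_poses: (B, window, 6)
        pred, log_var = self.net(past_poses.flatten(1)).chunk(2, dim=-1)
        return pred, log_var

def motion_loss(pred, log_var, ego_motion):
    """Uncertainty-weighted difference between the PPnet prediction and the
    ego-motion generated by the SSM-VO pose network (Gaussian NLL form)."""
    return (torch.exp(-log_var) * (pred - ego_motion) ** 2 + log_var).mean()

# Combined objective: the baseline's original self-supervised loss plus the
# weighted motion loss (lambda_m is a made-up balancing weight).
def total_loss(original_loss, pred, log_var, ego_motion, lambda_m=0.1):
    return original_loss + lambda_m * motion_loss(pred, log_var, ego_motion)
```
In a training step, original_loss would be the baseline SSM-VO self-supervised loss and ego_motion the relative pose produced by its pose network, per the abstract above.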
Related papers
- Initialization of Monocular Visual Navigation for Autonomous Agents Using Modified Structure from Small Motion [13.69678622755871]
We propose a standalone monocular visual Simultaneous Localization and Mapping (vSLAM) pipeline for autonomous space robots.
Our method, a state-of-the-art factor graph optimization pipeline, extends Structure from Small Motion to robustly initialize a monocular agent in spacecraft inspection trajectories.
We validate our approach on realistic, simulated satellite inspection image sequences with a tumbling spacecraft and demonstrate the method's effectiveness.
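For readers who want a feel for the factor-graph optimization this entry is built on, below is a minimal pose-graph example using GTSAM's Python bindings: a prior on the first camera pose plus one small relative-motion factor, optimized with Levenberg-Marquardt. It is a generic sketch, not the paper's pipeline; all noise values and poses are invented.
```python
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 0.1))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 0.05))

# Anchor the first pose, then add one relative-pose (small-motion) factor.
graph.add(gtsam.PriorFactorPose3(0, gtsam.Pose3(), prior_noise))
small_motion = gtsam.Pose3(gtsam.Rot3(), gtsam.Point3(0.05, 0.0, 0.0))
graph.add(gtsam.BetweenFactorPose3(0, 1, small_motion, odom_noise))

# Initial guesses for both poses, then nonlinear optimization.
initial = gtsam.Values()
initial.insert(0, gtsam.Pose3())
initial.insert(1, gtsam.Pose3())
result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
print(result.atPose3(1))
```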
arXiv Detail & Related papers (2024-09-24T21:33:14Z)
- COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation [98.05046790227561]
COIN is a control-inpainting motion diffusion prior that enables fine-grained control to disentangle human and camera motions.
COIN outperforms the state-of-the-art methods in terms of global human motion estimation and camera motion estimation.
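To make the "control-inpainting" idea less abstract, the toy loop below shows the RePaint-style mechanism that inpainting-based diffusion control commonly relies on: at each reverse step, the controlled coordinates are overwritten with a forward-noised copy of the observation. Everything here (the stand-in denoiser, the 6-D motion state, the mask) is illustrative and is not COIN's actual sampler.
```python
import torch

T = 50
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def denoise_step(x, t):
    # Stand-in for one learned reverse-diffusion step.
    return x * 0.99

def q_sample(x0, t):
    # Forward diffusion: noise a clean signal to step t.
    return alphas_bar[t].sqrt() * x0 + (1 - alphas_bar[t]).sqrt() * torch.randn_like(x0)

x = torch.randn(1, 6)            # toy joint human+camera motion state
obs = torch.zeros(1, 6)          # observed / controlled coordinates
mask = torch.tensor([[1, 1, 1, 0, 0, 0]], dtype=torch.bool)

for t in reversed(range(T)):
    x = denoise_step(x, t)
    x = torch.where(mask, q_sample(obs, t), x)  # inpaint the controlled dims
```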
arXiv Detail & Related papers (2024-08-29T10:36:29Z)
- Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring [71.60457491155451]
Eliminating image blur produced by various kinds of motion has been a challenging problem.
We propose a novel real-world deblurring filtering model called the Motion-adaptive Separable Collaborative Filter.
Our method provides an effective solution for real-world motion blur removal and achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-04-19T19:44:24Z)
- Self-Supervised Bird's Eye View Motion Prediction with Cross-Modality Signals [38.20643428486824]
Learning the dense bird's eye view (BEV) motion flow in a self-supervised manner is an emerging research topic for robotics and autonomous driving.
Current self-supervised methods mainly rely on point correspondences between point clouds.
We introduce a novel cross-modality self-supervised training framework that addresses the limitations of this reliance by leveraging multi-modality data, as sketched below.
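As a reference point, the snippet below shows the kind of point-correspondence supervision the previous methods rely on: a one-directional Chamfer loss built from nearest-neighbor matches between consecutive point clouds. It is a generic illustration, not this paper's framework.
```python
import torch

def nn_correspondence_loss(src, dst):
    """src, dst: (N, 3) and (M, 3) point clouds from consecutive frames.
    Penalize each source point's distance to its nearest destination point
    (one-directional Chamfer distance)."""
    d = torch.cdist(src, dst)          # (N, M) pairwise distances
    return d.min(dim=1).values.mean()

src = torch.randn(1024, 3)
dst = src + 0.01 * torch.randn(1024, 3)   # slightly displaced cloud
print(nn_correspondence_loss(src, dst))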
arXiv Detail & Related papers (2024-01-21T14:09:49Z)
- EMR-MSF: Self-Supervised Recurrent Monocular Scene Flow Exploiting Ego-Motion Rigidity [13.02735046166494]
Self-supervised monocular scene flow estimation has received increasing attention for its simple and economical sensor setup.
We propose an improved model, EMR-MSF, which borrows network-architecture design advantages from supervised learning.
On the KITTI scene flow benchmark, our approach improves the SF-all metric of the state-of-the-art self-supervised monocular method by 44%.
arXiv Detail & Related papers (2023-09-04T00:30:06Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as Dancetrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- Data-Driven Stochastic Motion Evaluation and Optimization with Image by Spatially-Aligned Temporal Encoding [8.104557130048407]
This paper proposes a probabilistic motion prediction method for long motions: the motion is predicted so that it accomplishes a task from the initial state observed in the given image.
Our method seamlessly integrates the image and motion data into the image feature domain by spatially-aligned temporal encoding.
The effectiveness of the proposed method is demonstrated through a variety of experiments against comparable state-of-the-art methods.
arXiv Detail & Related papers (2023-02-10T04:06:00Z)
- Dyna-DepthFormer: Multi-frame Transformer for Self-Supervised Depth Estimation in Dynamic Scenes [19.810725397641406]
We propose a novel Dyna-Depthformer framework, which predicts scene depth and 3D motion field jointly.
Our contributions are two-fold. First, we leverage multi-view correlation through a series of self- and cross-attention layers in order to obtain enhanced depth feature representation.
Second, we propose a warping-based Motion Network to estimate the motion field of dynamic objects without using semantic prior.
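The cross-attention ingredient in the first contribution can be pictured in a few lines of PyTorch; the sketch below lets reference-frame features attend to source-frame features with nn.MultiheadAttention. All shapes and sizes are invented for illustration and do not come from the paper.
```python
import torch
import torch.nn as nn

B, N, C = 2, 196, 256                 # batch, tokens (flattened H*W), channels
ref = torch.randn(B, N, C)            # reference-frame features
src = torch.randn(B, N, C)            # source-frame features

# Cross-attention: queries from the reference frame, keys/values from a source frame.
cross_attn = nn.MultiheadAttention(embed_dim=C, num_heads=8, batch_first=True)
fused, _ = cross_attn(query=ref, key=src, value=src)  # multi-view correlation
print(fused.shape)                    # torch.Size([2, 196, 256])
```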
arXiv Detail & Related papers (2023-01-14T09:43:23Z)
- Improving Unsupervised Video Object Segmentation with Motion-Appearance Synergy [52.03068246508119]
We present IMAS, a method that segments the primary objects in videos without manual annotation in training or inference.
IMAS achieves Improved UVOS (unsupervised video object segmentation) with Motion-Appearance Synergy.
We demonstrate its effectiveness in tuning critical hyperparameters previously tuned with human annotation or hand-crafted, hyperparameter-specific metrics.
arXiv Detail & Related papers (2022-12-17T06:47:30Z)
- Self-Supervised Learning of Perceptually Optimized Block Motion Estimates for Video Compression [50.48504867843605]
We propose a search-free block motion estimation framework using a multi-stage convolutional neural network.
We deploy the multi-scale structural similarity (MS-SSIM) loss function to optimize the perceptual quality of the motion compensated predicted frames.
arXiv Detail & Related papers (2021-10-05T03:38:43Z)
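As a rough illustration of the MS-SSIM objective named in the last entry, the snippet below computes an MS-SSIM loss between a motion-compensated prediction and the target frame using the pytorch-msssim package (pip install pytorch-msssim). The tensors are placeholders; this is not the paper's training code.
```python
import torch
from pytorch_msssim import ms_ssim

target = torch.rand(1, 3, 256, 256)        # current frame, values in [0, 1]
predicted = torch.rand(1, 3, 256, 256, requires_grad=True)  # motion-compensated prediction

# MS-SSIM is a similarity in [0, 1]; subtract from 1 to get a minimizable loss.
loss = 1.0 - ms_ssim(predicted, target, data_range=1.0)
loss.backward()
```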
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.