VM-MODNet: Vehicle Motion aware Moving Object Detection for Autonomous
Driving
- URL: http://arxiv.org/abs/2104.10985v1
- Date: Thu, 22 Apr 2021 10:46:55 GMT
- Title: VM-MODNet: Vehicle Motion aware Moving Object Detection for Autonomous
Driving
- Authors: Hazem Rashed, Ahmad El Sallab and Senthil Yogamani
- Abstract summary: Moving Object Detection (MOD) is a critical task in autonomous driving.
We leverage vehicle motion information and feed it into the model to provide an adaptation mechanism based on ego-motion.
The proposed model using a Vehicle Motion Tensor (VMT) achieves an absolute improvement of 5.6% mIoU over the baseline architecture.
- Score: 3.6550372593827887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Moving Object Detection (MOD) is a critical task in autonomous driving, as moving agents around the ego-vehicle must be detected accurately for safe trajectory planning. It also enables appearance-agnostic detection of objects based on motion cues. Geometric challenges such as motion-parallax ambiguity make it a difficult problem. In this work, we leverage vehicle motion information and feed it into the model to provide an adaptation mechanism based on ego-motion. The motivation is to enable the model to implicitly perform ego-motion compensation and thereby improve performance. We convert the six degrees of freedom of vehicle motion into a pixel-wise tensor that can be fed as input to the CNN model. The proposed model using a Vehicle Motion Tensor (VMT) achieves an absolute improvement of 5.6% mIoU over the baseline architecture. We also achieve state-of-the-art results on the public KITTI_MoSeg_Extended dataset, even compared to methods that use LiDAR and additional input frames. Our model is also lightweight and runs at 85 fps on a TitanX GPU. Qualitative results are provided at https://youtu.be/ezbfjti-kTk.
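As a concrete illustration of the VMT input scheme described above, here is a minimal sketch (not the authors' released code) of how a six-degrees-of-freedom ego-motion vector can be broadcast into a pixel-wise tensor and concatenated with the RGB input of a CNN. The `VMTSegNet` wrapper, the channel layout, and the toy one-layer backbone are assumptions for illustration.

```python
import torch
import torch.nn as nn

class VMTSegNet(nn.Module):
    """Sketch of the Vehicle Motion Tensor (VMT) input scheme.

    The 6-DoF ego-motion (assumed here to be vx, vy, vz and roll, pitch,
    yaw rates) is broadcast to every pixel and concatenated with the
    image, so the encoder can condition its features on the ego-motion.
    """

    def __init__(self, backbone: nn.Module, motion_dims: int = 6):
        super().__init__()
        self.backbone = backbone  # any CNN expecting 3 + motion_dims input channels
        self.motion_dims = motion_dims

    def forward(self, image: torch.Tensor, motion: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W); motion: (B, 6)
        b, _, h, w = image.shape
        # Broadcast the 6 scalars over the full spatial grid: (B, 6, H, W).
        vmt = motion.view(b, self.motion_dims, 1, 1).expand(b, self.motion_dims, h, w)
        return self.backbone(torch.cat([image, vmt], dim=1))

# Toy usage: a single conv layer stands in for the real encoder-decoder.
net = VMTSegNet(nn.Conv2d(3 + 6, 1, kernel_size=3, padding=1))
mask = net(torch.rand(2, 3, 64, 128), torch.rand(2, 6))  # -> (2, 1, 64, 128)
```

Broadcasting keeps the extra input cheap: the six scalars add only six input channels, which is consistent with the model remaining lightweight.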
Related papers
- MotionFix: Text-Driven 3D Human Motion Editing [52.11745508960547]
Key challenges include the scarcity of training data and the need to design a model that accurately edits the source motion.
We propose a methodology to semi-automatically collect a dataset of triplets comprising (i) a source motion, (ii) a target motion, and (iii) an edit text.
Access to this data allows us to train a conditional diffusion model, TMED, that takes both the source motion and the edit text as input.
arXiv Detail & Related papers (2024-08-01T16:58:50Z) - MM-Tracker: Motion Mamba with Margin Loss for UAV-platform Multiple Object Tracking [12.326023523101806]
Multiple object tracking (MOT) from unmanned aerial vehicle platforms requires efficient motion modeling.
We propose the Motion Mamba Module, which explores both local and global motion features.
We also design motion margin loss to effectively improve the detection accuracy of motion blurred objects.
Based on the Motion Mamba Module and the motion margin loss, our proposed MM-Tracker surpasses the state of the art on two widely used open-source UAV-MOT datasets.
arXiv Detail & Related papers (2024-07-15T07:13:27Z) - Ego-Motion Aware Target Prediction Module for Robust Multi-Object Tracking [2.7898966850590625]
We introduce a novel KF-based prediction module called Ego-motion Aware Target Prediction (EMAP); a schematic sketch of the idea appears after this list.
Our method decouples the impact of camera rotational and translational velocity from the object trajectories by reformulating the Kalman Filter.
EMAP reduces the number of identity switches (IDSW) of OC-SORT and Deep OC-SORT by 73% and 21%, respectively.
arXiv Detail & Related papers (2024-04-03T23:24:25Z) - DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and
Depth from Monocular Videos [76.01906393673897]
We propose a self-supervised method to jointly learn 3D motion and depth from monocular videos.
Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.
Our model delivers superior performance in all evaluated settings.
arXiv Detail & Related papers (2024-03-09T12:22:46Z) - BootsTAP: Bootstrapped Training for Tracking-Any-Point [62.585297341343505]
Tracking-Any-Point (TAP) can be formalized as an algorithm to track any point on solid surfaces in a video.
We show how large-scale, unlabeled, uncurated real-world data can improve a TAP model with minimal architectural changes.
We demonstrate state-of-the-art performance on the TAP-Vid benchmark, surpassing previous results by a wide margin.
arXiv Detail & Related papers (2024-02-01T18:38:55Z) - Trajeglish: Traffic Modeling as Next-Token Prediction [67.28197954427638]
A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs.
We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios.
Our model tops the Sim Agents Benchmark, surpassing prior work on the realism meta-metric by 3.3% and on the interaction metric by 9.9%.
arXiv Detail & Related papers (2023-12-07T18:53:27Z) - MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z) - Observation-Centric SORT: Rethinking SORT for Robust Multi-Object
Tracking [32.32109475782992]
We show that a simple motion model can obtain state-of-the-art tracking performance without other cues like appearance.
We thus name the proposed method Observation-Centric SORT, or OC-SORT for short.
arXiv Detail & Related papers (2022-03-27T17:57:08Z) - Optical Flow Based Motion Detection for Autonomous Driving [0.0]
We train a neural network model to classify motion status using the optical flow field as input; a toy sketch of this setup also appears after this list.
Experiments show high accuracy, indicating that the idea is viable and promising.
arXiv Detail & Related papers (2022-03-03T03:24:14Z) - Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z) - Object Tracking by Detection with Visual and Motion Cues [1.7818230914983044]
Self-driving cars need to detect and track objects in camera images.
We present a simple online tracking algorithm based on a constant-velocity motion model with a Kalman filter.
We evaluate our approach on the challenging BDD100K dataset.
arXiv Detail & Related papers (2021-01-19T10:29:16Z)
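As referenced in the Ego-Motion Aware Target Prediction (EMAP) entry above, the following is a schematic sketch of an ego-motion-compensated Kalman prediction step. It illustrates only the general idea of removing the camera-induced image displacement from a constant-velocity prediction; the state layout, the `predict_compensated` helper, and the source of the ego shift are assumptions, not the EMAP reformulation itself.

```python
import numpy as np

# State: [x, y, vx, vy] in image coordinates, constant-velocity transition.
F = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

def predict_compensated(state, cov, Q, ego_dx, ego_dy):
    """Kalman prediction with the ego-induced pixel shift removed.

    ego_dx / ego_dy approximate the image displacement caused by camera
    rotation and translation between frames (e.g. derived from odometry).
    Subtracting it keeps the tracked velocity tied to the object's own
    motion rather than to the camera's.
    """
    state = F @ state
    state[0] -= ego_dx  # undo apparent horizontal motion from the camera
    state[1] -= ego_dy  # undo apparent vertical motion from the camera
    cov = F @ cov @ F.T + Q
    return state, cov

# Toy usage: an object at (100, 50) drifting right while the camera pans.
x0 = np.array([100.0, 50.0, 2.0, 0.0])
P0 = np.eye(4)
Q = 0.01 * np.eye(4)
x1, P1 = predict_compensated(x0, P0, Q, ego_dx=5.0, ego_dy=0.0)
```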
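And as referenced in the optical-flow entry above, here is a minimal sketch of a motion-status classifier that takes a dense optical-flow field as input. The two-channel flow input, the tiny `FlowMotionNet`, and the per-pixel moving/static output are illustrative assumptions, not the cited paper's architecture.

```python
import torch
import torch.nn as nn

class FlowMotionNet(nn.Module):
    """Toy classifier: dense optical flow (u, v) -> per-pixel moving/static logit."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1),  # one logit per pixel
        )

    def forward(self, flow: torch.Tensor) -> torch.Tensor:
        # flow: (B, 2, H, W) horizontal/vertical displacement per pixel
        return self.net(flow)

model = FlowMotionNet()
logits = model(torch.rand(1, 2, 64, 128))
moving_mask = logits.sigmoid() > 0.5  # (1, 1, 64, 128) boolean moving mask
```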
This list is automatically generated from the titles and abstracts of the papers on this site.