MM-Tracker: Motion Mamba with Margin Loss for UAV-platform Multiple Object Tracking
- URL: http://arxiv.org/abs/2407.10485v2
- Date: Sat, 17 Aug 2024 15:42:14 GMT
- Title: MM-Tracker: Motion Mamba with Margin Loss for UAV-platform Multiple Object Tracking
- Authors: Mufeng Yao, Jinlong Peng, Qingdong He, Bo Peng, Hao Chen, Mingmin Chi, Chao Liu, Jon Atli Benediktsson
- Abstract summary: Multiple object tracking (MOT) from unmanned aerial vehicle platforms requires efficient motion modeling.
We propose the Motion Mamba Module, which explores both local and global motion features.
We also design a motion margin loss to effectively improve the detection accuracy of motion-blurred objects.
Based on the Motion Mamba Module and motion margin loss, our proposed MM-Tracker surpasses the state of the art on two widely used open-source UAV-MOT datasets.
- Score: 12.326023523101806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multiple object tracking (MOT) from unmanned aerial vehicle (UAV) platforms requires efficient motion modeling, because UAV-MOT faces both local object motion and global camera motion. Motion blur also increases the difficulty of detecting large moving objects. Previous UAV motion modeling approaches either focus only on local motion or ignore motion blurring effects, thus limiting their tracking performance and speed. To address these issues, we propose the Motion Mamba Module, which explores both local and global motion features through cross-correlation and bi-directional Mamba modules for better motion modeling. To address the detection difficulties caused by motion blur, we also design a motion margin loss to effectively improve the detection accuracy of motion-blurred objects. Based on the Motion Mamba Module and motion margin loss, our proposed MM-Tracker surpasses the state of the art on two widely used open-source UAV-MOT datasets. Code will be available.
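To make the abstract's two ideas concrete, below is a minimal PyTorch sketch, not the authors' released code (the abstract only says "Code will be available"). The local cross-correlation cost volume, the simplified gated recurrence standing in for the bi-directional Mamba modules, and the margin-shifted detection loss are illustrative assumptions about how such components could look; all shapes, names, and the exact margin formulation are guesses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def local_cross_correlation(feat_prev, feat_curr, radius=3):
    """Correlate each location of the current frame's features with a
    (2r+1)^2 neighborhood in the previous frame's features, yielding a
    local motion cost volume of shape (B, (2r+1)^2, H, W)."""
    _, C, H, W = feat_curr.shape
    padded = F.pad(feat_prev, [radius] * 4)
    vols = []
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = padded[:, :, dy:dy + H, dx:dx + W]
            vols.append((feat_curr * shifted).sum(1, keepdim=True) / C ** 0.5)
    return torch.cat(vols, dim=1)

class BiDirectionalScan(nn.Module):
    """Toy bidirectional state-space scan over a token sequence. A real
    Mamba block uses input-dependent (selective) SSM parameters and a
    parallel scan kernel; a fixed gated linear recurrence stands in here."""
    def __init__(self, dim):
        super().__init__()
        self.in_proj = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(2 * dim, dim)

    def _scan(self, x):                       # x: (B, L, D)
        decay = torch.sigmoid(self.gate(x))   # per-step state decay
        u = self.in_proj(x)
        h = torch.zeros_like(x[:, 0])
        states = []
        for t in range(x.size(1)):            # sequential scan, illustrative
            h = decay[:, t] * h + (1 - decay[:, t]) * u[:, t]
            states.append(h)
        return torch.stack(states, dim=1)

    def forward(self, x):
        fwd = self._scan(x)                   # forward direction
        bwd = self._scan(x.flip(1)).flip(1)   # backward direction
        return self.out_proj(torch.cat([fwd, bwd], dim=-1))

def motion_margin_loss(logits, targets, blur_score, margin=0.2):
    """Hypothetical margin loss: shift positive logits down in proportion
    to a per-object blur score, so blurred objects must be detected with
    a larger confidence margin. The paper's actual formulation may differ."""
    shifted = logits - margin * blur_score * targets
    return F.binary_cross_entropy_with_logits(shifted, targets)
```

A production selective-scan block would replace the Python loop with a parallel scan and input-dependent state-space parameters; the loop is kept only for readability.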
Related papers
- MambaTrack: A Simple Baseline for Multiple Object Tracking with State Space Model [18.607106274732885]
We introduce a Mamba-based motion model named Mamba moTion Predictor (MTP).
MTP takes the spatial-temporal location dynamics of objects as input, captures the motion pattern using a bi-Mamba encoding layer, and predicts the next motion.
Our proposed tracker, MambaTrack, demonstrates strong performance on benchmarks such as DanceTrack and SportsMOT (see the sketch below).
arXiv Detail & Related papers (2024-08-17T11:58:47Z)
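As a rough, trajectory-level illustration of the MTP idea described above (not the authors' code), the sketch below encodes a track's past box dynamics with a bidirectional recurrent layer, a stand-in for the bi-Mamba encoding layer, and regresses the next-frame motion; the (cx, cy, w, h) convention and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class MotionPredictor(nn.Module):
    """Takes the last T box states (cx, cy, w, h) of one track and
    predicts the next-frame offset (dcx, dcy, dw, dh)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.embed = nn.Linear(4, hidden)
        # Bidirectional recurrent encoder as a stand-in for bi-Mamba.
        self.encoder = nn.GRU(hidden, hidden, batch_first=True,
                              bidirectional=True)
        self.head = nn.Linear(2 * hidden, 4)

    def forward(self, boxes):                    # boxes: (B, T, 4)
        deltas = boxes[:, 1:] - boxes[:, :-1]    # location dynamics, (B, T-1, 4)
        h, _ = self.encoder(self.embed(deltas))
        return self.head(h[:, -1])               # predicted next offset, (B, 4)

# Usage: predict where each track's box should be matched next frame.
tracks = torch.randn(8, 10, 4)                   # 8 tracks, 10 past frames
next_box = tracks[:, -1] + MotionPredictor()(tracks)
```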
- MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion [94.66090422753126]
MotionFollower is a lightweight score-guided diffusion model for video motion editing.
It delivers superior motion editing performance while supporting large camera movements and actions.
Compared with MotionEditor, the most advanced motion editing model, MotionFollower achieves an approximately 80% reduction in GPU memory.
arXiv Detail & Related papers (2024-05-30T17:57:30Z)
- ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking [4.250337979548885]
We propose a motion-based MOT approach with an enhanced temporal motion predictor, ETTrack.
Specifically, the motion predictor integrates a transformer model and a Temporal Convolutional Network (TCN) to capture short-term and long-term motion patterns.
We show that ETTrack achieves competitive performance compared with state-of-the-art trackers on DanceTrack and SportsMOT (see the sketch below).
arXiv Detail & Related papers (2024-05-24T17:51:33Z)
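A hedged sketch of the short-term/long-term split the ETTrack summary describes: dilated temporal convolutions (the TCN) capture local motion, a transformer encoder captures long-range dependencies. Layer sizes and the additive fusion are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class TCNTransformerPredictor(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.embed = nn.Linear(4, dim)
        # Dilated 1-D convolutions capture short-term local motion.
        self.tcn = nn.Sequential(
            nn.Conv1d(dim, dim, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(dim, dim, 3, padding=2, dilation=2), nn.ReLU(),
        )
        # Self-attention captures long-term dependencies across the window.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.attn = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, 4)

    def forward(self, boxes):                    # (B, T, 4) past boxes
        x = self.embed(boxes)                    # (B, T, dim)
        x = x + self.tcn(x.transpose(1, 2)).transpose(1, 2)  # short-term
        x = self.attn(x)                         # long-term
        return self.head(x[:, -1])               # next-frame box offset
```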
- Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring [71.60457491155451]
Eliminating image blur produced by various kinds of motion has been a challenging problem.
We propose a novel real-world deblurring filtering model called the Motion-adaptive Separable Collaborative Filter.
Our method provides an effective solution for real-world motion blur removal and achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-04-19T19:44:24Z)
- Spectral Motion Alignment for Video Motion Transfer using Diffusion Models [54.32923808964701]
Spectral Motion Alignment (SMA) is a framework that refines and aligns motion vectors using Fourier and wavelet transforms.
SMA learns motion patterns by incorporating frequency-domain regularization, facilitating the learning of whole-frame global motion dynamics.
Extensive experiments demonstrate SMA's efficacy in improving motion transfer while maintaining computational efficiency and compatibility across various video customization frameworks (see the sketch below).
arXiv Detail & Related papers (2024-03-22T14:47:18Z)
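As a toy version of frequency-domain motion regularization in the spirit of SMA (not the paper's method), one can compare only the low temporal frequencies of reference and generated motion vectors, i.e. the smooth whole-frame component; the FFT cutoff and squared-error form are assumptions.

```python
import torch

def spectral_motion_loss(motion_ref, motion_gen, keep=8):
    """motion_*: (B, T, 2) per-frame global motion vectors. Compare only
    the first `keep` temporal frequencies, so the penalty acts on the
    smooth, whole-frame component of the motion."""
    F_ref = torch.fft.rfft(motion_ref, dim=1)[:, :keep]
    F_gen = torch.fft.rfft(motion_gen, dim=1)[:, :keep]
    return (F_ref - F_gen).abs().pow(2).mean()

# Usage: add as a regularizer alongside the main motion-transfer loss.
loss = spectral_motion_loss(torch.randn(4, 16, 2), torch.randn(4, 16, 2))
```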
- Delving into Motion-Aware Matching for Monocular 3D Object Tracking [81.68608983602581]
We find that the motion cue of objects along different time frames is critical in 3D multi-object tracking.
We propose MoMA-M3T, a framework that mainly consists of three motion-aware components.
We conduct extensive experiments on the nuScenes and KITTI datasets to demonstrate our MoMA-M3T achieves competitive performance against state-of-the-art methods.
arXiv Detail & Related papers (2023-08-22T17:53:58Z)
- FOLT: Fast Multiple Object Tracking from UAV-captured Videos Based on Optical Flow [27.621524657473945]
Multiple object tracking (MOT) has been successfully investigated in computer vision.
However, MOT for the videos captured by unmanned aerial vehicles (UAV) is still challenging due to small object size, blurred object appearance, and very large and/or irregular motion.
We propose FOLT to mitigate these problems and achieve fast and accurate MOT from the UAV view.
arXiv Detail & Related papers (2023-08-14T15:24:44Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- Spatio-Temporal Action Detection Under Large Motion [86.3220533375967]
We study the performance of cuboid-aware feature aggregation in action detection under large motion.
We propose to enhance actor representation under large motion by tracking actors and performing temporal feature aggregation along the respective tracks.
We find that track-aware feature aggregation consistently achieves a large improvement in action detection performance compared to the cuboid-aware baseline.
arXiv Detail & Related papers (2022-09-06T06:55:26Z)
- VM-MODNet: Vehicle Motion aware Moving Object Detection for Autonomous Driving [3.6550372593827887]
Moving Object Detection (MOD) is a critical task in autonomous driving.
We leverage vehicle motion information and feed it into the model to provide an ego-motion-based adaptation mechanism.
The proposed model using a Vehicle Motion Tensor (VMT) achieves an absolute improvement of 5.6% in mIoU over the baseline architecture (see the sketch below).
arXiv Detail & Related papers (2021-04-22T10:46:55Z)
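A minimal sketch of ego-motion conditioning as the VM-MODNet summary describes it: the vehicle motion signal is broadcast to a spatial map and fused with image features before the moving/static classification head. Treating the motion input as a 6-DoF vector and fusing by 1x1-convolution over concatenated channels are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class EgoMotionConditionedHead(nn.Module):
    def __init__(self, feat_ch=64, motion_dim=6):    # 6-DoF ego-motion
        super().__init__()
        self.fuse = nn.Conv2d(feat_ch + motion_dim, feat_ch, 1)
        self.classify = nn.Conv2d(feat_ch, 2, 1)     # static vs. moving

    def forward(self, feats, ego_motion):
        # Broadcast the ego-motion vector over the spatial grid.
        B, _, H, W = feats.shape
        m = ego_motion[:, :, None, None].expand(B, -1, H, W)
        return self.classify(torch.relu(self.fuse(torch.cat([feats, m], 1))))

# Usage: condition backbone features on the current ego-motion estimate.
logits = EgoMotionConditionedHead()(torch.randn(2, 64, 32, 64),
                                    torch.randn(2, 6))
```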