Traffic Video Object Detection using Motion Prior
- URL: http://arxiv.org/abs/2311.10092v1
- Date: Thu, 16 Nov 2023 18:59:46 GMT
- Title: Traffic Video Object Detection using Motion Prior
- Authors: Lihao Liu, Yanqi Cheng, Dongdong Chen, Jing He, Pietro Liò,
Carola-Bibiane Schönlieb, Angelica I Aviles-Rivero
- Abstract summary: We propose two innovative methods to exploit the motion prior and boost the performance of traffic video object detection.
Firstly, we introduce a new self-attention module that leverages the motion prior to guide temporal information integration.
Secondly, we utilise a pseudo-labelling mechanism to eliminate noisy pseudo labels for the semi-supervised setting.
- Score: 16.63738085066699
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traffic videos inherently differ from generic videos in their stationary
camera setup, which provides a strong motion prior: objects tend to move in a
specific direction over a short time interval. Existing works predominantly
employ generic video object detection frameworks for traffic video object
detection, which yields certain advantages such as broad applicability and
robustness to diverse scenarios. However, they fail to harness the motion
prior to enhance detection accuracy. In this work, we propose two innovative
methods that exploit the motion prior to boost the performance of both
fully-supervised and semi-supervised traffic video object detection. Firstly,
we introduce a new self-attention module that leverages the motion prior to
guide temporal information integration in the fully-supervised setting.
Secondly, we utilise the motion prior to develop a pseudo-labelling mechanism
that eliminates noisy pseudo labels in the semi-supervised setting. Both
motion-prior-centred methods consistently demonstrate superior performance,
outperforming existing state-of-the-art approaches by a margin of 2% in terms
of mAP.
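The abstract gives no implementation details for the pseudo-label filtering, but the underlying idea (a stationary camera implies a dominant motion direction, so pseudo labels whose tracks move against it are likely noise) can be sketched. The following is a minimal illustrative sketch, not the paper's actual method; the function names, the track representation (lists of box centres), and the angular threshold are all assumptions.

```python
import math

def dominant_direction(tracks):
    """Estimate the dominant motion direction (radians) from box-centre
    displacements across all tracks, assuming a stationary camera."""
    dxs, dys = [], []
    for track in tracks:
        for (x0, y0), (x1, y1) in zip(track, track[1:]):
            dxs.append(x1 - x0)
            dys.append(y1 - y0)
    return math.atan2(sum(dys), sum(dxs))

def filter_pseudo_labels(tracks, max_dev=math.pi / 6):
    """Keep only tracks whose net displacement stays within max_dev radians
    of the dominant direction; the rest are treated as noisy pseudo labels."""
    ref = dominant_direction(tracks)
    kept = []
    for track in tracks:
        dx = track[-1][0] - track[0][0]
        dy = track[-1][1] - track[0][1]
        ang = math.atan2(dy, dx)
        # smallest signed angular difference between ang and ref
        dev = abs(math.atan2(math.sin(ang - ref), math.cos(ang - ref)))
        if dev <= max_dev:
            kept.append(track)
    return kept
```

For example, with two tracks moving rightward and one moving straight upward, the upward track deviates far from the dominant (roughly rightward) direction and would be discarded as a noisy pseudo label.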
Related papers
- MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent [58.09607975296408]
We propose MotionAgent, enabling fine-grained motion control for text-guided image-to-video generation.
The key technique is the motion field agent that converts motion information in text prompts into explicit motion fields.
We construct a subset of VBench to evaluate the alignment of motion information in the text and the generated video, outperforming other advanced models on motion generation accuracy.
arXiv Detail & Related papers (2025-02-05T14:26:07Z)
- MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation [55.238542326124545]
Image-to-video (I2V) generation is conditioned on the static image, which has been enhanced recently by the motion intensity as an additional control signal.
These motion-aware models are appealing for generating diverse motion patterns, yet a reliable motion estimator for training such models on large-scale in-the-wild video sets has been lacking.
This paper addresses the challenge with a new motion estimator, capable of measuring the decoupled motion intensities of objects and cameras in video.
arXiv Detail & Related papers (2024-12-08T08:12:37Z) - Trajectory Attention for Fine-grained Video Motion Control [20.998809534747767]
This paper introduces trajectory attention, a novel approach that performs attention along available pixel trajectories for fine-grained camera motion control.
We show that our approach can be extended to other video motion control tasks, such as first-frame-guided video editing.
arXiv Detail & Related papers (2024-11-28T18:59:51Z)
- ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking [4.250337979548885]
We propose a motion-based MOT approach with an enhanced temporal motion predictor, ETTrack.
Specifically, the motion predictor integrates a transformer model and a Temporal Convolutional Network (TCN) to capture short-term and long-term motion patterns.
We show ETTrack achieves a competitive performance compared with state-of-the-art trackers on DanceTrack and SportsMOT.
arXiv Detail & Related papers (2024-05-24T17:51:33Z)
- MotionZero: Exploiting Motion Priors for Zero-shot Text-to-Video Generation [131.1446077627191]
Zero-shot Text-to-Video synthesis generates videos based on prompts without any videos.
We propose a prompt-adaptive and disentangled motion control strategy coined as MotionZero.
Our strategy could correctly control motion of different objects and support versatile applications including zero-shot video edit.
arXiv Detail & Related papers (2023-11-28T09:38:45Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- Learning Variational Motion Prior for Video-based Motion Capture [31.79649766268877]
We present a novel variational motion prior (VMP) learning approach for video-based motion capture.
Our framework can effectively reduce temporal jittering and failure modes in frame-wise pose estimation.
Experiments over both public datasets and in-the-wild videos have demonstrated the efficacy and generalization capability of our framework.
arXiv Detail & Related papers (2022-10-27T02:45:48Z)
- Treating Motion as Option to Reduce Motion Dependency in Unsupervised Video Object Segmentation [5.231219025536678]
Unsupervised video object segmentation (VOS) aims to detect the most salient object in a video sequence at the pixel level.
Most state-of-the-art methods leverage motion cues obtained from optical flow maps in addition to appearance cues to exploit the property that salient objects usually have distinctive movements compared to the background.
arXiv Detail & Related papers (2022-09-04T18:05:52Z)
- E^2TAD: An Energy-Efficient Tracking-based Action Detector [78.90585878925545]
This paper presents a tracking-based solution to accurately and efficiently localize predefined key actions.
It won first place in the UAV-Video Track of the 2021 Low-Power Computer Vision Challenge (LPCVC).
arXiv Detail & Related papers (2022-04-09T07:52:11Z)
- Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.