MOR-UAV: A Benchmark Dataset and Baselines for Moving Object Recognition in UAV Videos
- URL: http://arxiv.org/abs/2008.01699v2
- Date: Sat, 8 Aug 2020 04:28:59 GMT
- Title: MOR-UAV: A Benchmark Dataset and Baselines for Moving Object Recognition in UAV Videos
- Authors: Murari Mandal, Lav Kush Kumar, Santosh Kumar Vipparthi
- Abstract summary: We introduce MOR-UAV, a large-scale video dataset for moving object recognition in aerial videos.
We annotate 89,783 moving object instances collected from 30 UAV videos, consisting of 10,948 frames in various scenarios.
We propose a deep unified framework MOR-UAVNet for MOR in UAV videos.
- Score: 12.25388945174071
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual data collected from Unmanned Aerial Vehicles (UAVs) has opened a new frontier of computer vision that requires automated analysis of aerial images/videos. However, existing UAV datasets primarily focus on object detection. An object detector does not differentiate between moving and non-moving objects. Given a real-time UAV video stream, how can we both localize and classify the moving objects, i.e., perform moving object recognition (MOR)? MOR is an essential task for various UAV vision-based applications, including aerial surveillance, search and rescue, event recognition, and urban and rural scene understanding. To the best of our knowledge, no labeled dataset is available for MOR evaluation in UAV videos. Therefore, in this paper we introduce MOR-UAV, a large-scale video dataset for MOR in aerial videos. We achieve this by labeling moving objects with axis-aligned bounding boxes, which requires fewer computational resources than producing pixel-level estimates. We annotate 89,783 moving object instances collected from 30 UAV videos, comprising 10,948 frames in varied scenarios such as different weather conditions, occlusion, changing flight altitudes, and multiple camera views. We assign labels for two vehicle categories (car and heavy vehicle). Furthermore, we propose MOR-UAVNet, a deep unified framework for MOR in UAV videos. Since this is the first attempt at MOR in UAV videos, we present 16 baseline results based on the proposed framework on the MOR-UAV dataset through quantitative and qualitative experiments. We also analyze the motion-salient regions in the network through visualizations of multiple layers. MOR-UAVNet works online at inference because it requires only a few past frames. Moreover, it does not require predefined target initialization from the user. Experiments also demonstrate that the MOR-UAV dataset is quite challenging.
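To make the task setup concrete, below is a minimal sketch of how MOR-UAV-style annotations (axis-aligned boxes for moving vehicles in two classes) and the online, few-past-frames inference regime described in the abstract might look in code. The `MORBox` record, the `run_online_mor` loop, and the `model.predict` interface are illustrative assumptions, not the authors' released format or API.

```python
from collections import deque
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical annotation record for MOR: one axis-aligned box per moving
# object per frame, using the paper's two vehicle categories.
CLASS_NAMES = {0: "car", 1: "heavy_vehicle"}

@dataclass
class MORBox:
    video_id: str
    frame_idx: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    class_id: int  # 0 = car, 1 = heavy vehicle

def run_online_mor(model, frames: Iterable, history: int = 4) -> List[List[MORBox]]:
    """Online MOR over a video stream: only a short buffer of past frames is
    kept, so no future frames and no user-supplied target initialization are
    needed. `model.predict` is an assumed interface that maps the buffered
    frames to (box, class_id) pairs for the most recent frame."""
    buffer = deque(maxlen=history)
    per_frame_results: List[List[MORBox]] = []
    for idx, frame in enumerate(frames):
        buffer.append(frame)
        if len(buffer) < history:
            per_frame_results.append([])  # not enough temporal context yet
            continue
        detections = model.predict(list(buffer))  # hypothetical call
        per_frame_results.append(
            [MORBox("stream", idx, *box, class_id) for box, class_id in detections]
        )
    return per_frame_results
```

The bounded `deque` mirrors the online constraint stated in the abstract: each prediction depends only on a few past frames, so the loop can run on a live stream.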
Related papers
- UAV3D: A Large-scale 3D Perception Benchmark for Unmanned Aerial Vehicles [12.278437831053985]
Unmanned Aerial Vehicles (UAVs) are employed in numerous applications, including aerial photography, surveillance, and agriculture.
Existing benchmarks for UAV applications are mainly designed for traditional 2D perception tasks.
UAV3D comprises 1,000 scenes, each of which has 20 frames with fully annotated 3D bounding boxes on vehicles.
arXiv Detail & Related papers (2024-10-14T22:24:11Z)
- EVD4UAV: An Altitude-Sensitive Benchmark to Evade Vehicle Detection in UAV [19.07281015014683]
Vehicle detection in Unmanned Aerial Vehicle (UAV) captured images has wide applications in aerial photography and remote sensing.
Recent studies show that adding an adversarial patch to objects can fool well-trained deep neural network based object detectors.
We propose a new dataset named EVD4UAV as an altitude-sensitive benchmark to evade vehicle detection in UAV.
arXiv Detail & Related papers (2024-03-08T16:19:39Z)
- Real Time Human Detection by Unmanned Aerial Vehicles [0.0]
Thermal infrared (TIR) remote sensing photos and videos produced by unmanned aerial vehicles (UAVs) are two crucial data sources for public security.
Due to the small scale of targets, complex scene information, low resolution relative to visible-spectrum videos, and the dearth of publicly available labeled datasets and training models, object detection in such imagery remains difficult.
A UAV TIR object detection framework for images and videos is proposed in this study.
arXiv Detail & Related papers (2024-01-06T18:28:01Z)
- Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? [57.77643186237265]
We present Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record synchronized scenes from different perspectives.
MAVREC consists of around 2.5 hours of industry-standard 2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million annotated bounding boxes.
This makes MAVREC the largest ground and aerial-view dataset, and the fourth largest among all drone-based datasets.
arXiv Detail & Related papers (2023-12-07T18:59:14Z)
- Evidential Detection and Tracking Collaboration: New Problem, Benchmark and Algorithm for Robust Anti-UAV System [56.51247807483176]
Unmanned Aerial Vehicles (UAVs) have been widely used in many areas, including transportation, surveillance, and military.
Previous works have simplified the anti-UAV task to a tracking problem, where prior information about the UAV is always provided.
In this paper, we first formulate a new and practical anti-UAV problem featuring UAV perception in complex scenes without prior UAV information.
arXiv Detail & Related papers (2023-06-27T19:30:23Z)
- Learning to Compress Unmanned Aerial Vehicle (UAV) Captured Video: Benchmark and Analysis [54.07535860237662]
We propose a novel task for learned UAV video coding and construct a comprehensive and systematic benchmark for such a task.
It is expected that the benchmark will accelerate the research and development in video coding on drone platforms.
arXiv Detail & Related papers (2023-01-15T15:18:02Z)
- BEV-MODNet: Monocular Camera based Bird's Eye View Moving Object Detection for Autonomous Driving [2.9769485817170387]
CNNs can leverage the global context of the scene to project objects into bird's eye view more accurately.
We create an extended KITTI-raw dataset consisting of 12.9k images with annotations of moving object masks in BEV space for five classes.
We observe a significant improvement of 13% in mIoU using the simple baseline implementation.
arXiv Detail & Related papers (2021-07-11T01:11:58Z)
- Few-Shot Learning for Video Object Detection in a Transfer-Learning Scheme [70.45901040613015]
We study the new problem of few-shot learning for video object detection.
We employ a transfer-learning framework to effectively train the video object detector on a large number of base-class objects and a few video clips of novel-class objects.
arXiv Detail & Related papers (2021-03-26T20:37:55Z)
- Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking [59.06167734555191]
Unmanned Aerial Vehicles (UAVs) offer many applications in both commerce and recreation.
We consider the task of tracking UAVs, providing rich information such as location and trajectory.
We propose a dataset, Anti-UAV, with more than 300 video pairs containing over 580k manually annotated bounding boxes.
arXiv Detail & Related papers (2021-01-21T07:00:15Z)
- Perceiving Traffic from Aerial Images [86.994032967469]
We propose an object detection method called Butterfly Detector that is tailored to detect objects in aerial images.
We evaluate our Butterfly Detector on two publicly available UAV datasets (UAVDT and VisDrone 2019) and show that it outperforms previous state-of-the-art methods while remaining real-time.
arXiv Detail & Related papers (2020-09-16T11:37:43Z)
- AU-AIR: A Multi-modal Unmanned Aerial Vehicle Dataset for Low Altitude Traffic Surveillance [20.318367304051176]
Unmanned aerial vehicles (UAVs) with mounted cameras have the advantage of capturing aerial (bird's-eye view) images.
Several aerial datasets have been introduced, including visual data with object annotations.
We propose a multi-purpose aerial dataset (AU-AIR) that has multi-modal sensor data collected in real-world outdoor environments.
arXiv Detail & Related papers (2020-01-31T09:45:12Z)