Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve
Aerial Visual Perception?
- URL: http://arxiv.org/abs/2312.04548v1
- Date: Thu, 7 Dec 2023 18:59:14 GMT
- Title: Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve
Aerial Visual Perception?
- Authors: Aritra Dutta, Srijan Das, Jacob Nielsen, Rajatsubhra Chakraborty,
Mubarak Shah
- Abstract summary: We present Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record synchronized scenes from different perspectives.
MAVREC consists of around 2.5 hours of industry-standard 2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million annotated bounding boxes.
This makes MAVREC the largest ground and aerial-view dataset, and the fourth largest among all drone-based datasets.
- Score: 57.77643186237265
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the commercial abundance of UAVs, aerial data acquisition remains
challenging, and the existing Asia and North America-centric open-source UAV
datasets are small-scale or low-resolution and lack diversity in scene
contextuality. Additionally, the color content of the scenes, solar-zenith
angle, and population density of different geographies influence the data
diversity. These two factors conjointly render suboptimal aerial-visual
perception of the deep neural network (DNN) models trained primarily on the
ground-view data, including the open-world foundational models.
To pave the way for a transformative era of aerial detection, we present
Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record
synchronized scenes from different perspectives -- ground camera and
drone-mounted camera. MAVREC consists of around 2.5 hours of industry-standard
2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million
annotated bounding boxes. This makes MAVREC the largest ground and aerial-view
dataset, and the fourth largest among all drone-based datasets across all
modalities and tasks. Through our extensive benchmarking on MAVREC, we
recognize that augmenting object detectors with ground-view images from the
corresponding geographical location is a superior pre-training strategy for
aerial detection. Building on this strategy, we benchmark MAVREC with a
curriculum-based semi-supervised object detection approach that leverages
labeled (ground and aerial) and unlabeled (only aerial) images to enhance the
aerial detection. We publicly release the MAVREC dataset:
https://mavrec.github.io.
Related papers
- Game4Loc: A UAV Geo-Localization Benchmark from Game Data [0.0]
We introduce a more practical UAV geo-localization task including partial matches of cross-view paired data.
Experiments demonstrate the effectiveness of our data and training method for UAV geo-localization.
(2024-09-25T13:33:28Z)
- UAV-VisLoc: A Large-scale Dataset for UAV Visual Localization [20.37586403749362]
We present a large-scale dataset, UAV-VisLoc, to facilitate the UAV visual localization task.
Our dataset includes 6,742 drone images and 11 satellite maps, with metadata such as latitude, longitude, altitude, and capture date.
(2024-05-20T10:24:10Z)
- TK-Planes: Tiered K-Planes with High Dimensional Feature Vectors for Dynamic UAV-based Scenes [58.180556221044235]
We present a new approach to bridge the domain gap between synthetic and real-world data for unmanned aerial vehicle (UAV)-based perception.
Our formulation is designed for dynamic scenes, consisting of small moving objects or human actions.
We evaluate its performance on challenging datasets, including Okutama Action and UG2.
(2024-05-04T21:55:33Z)
- Towards Viewpoint Robustness in Bird's Eye View Segmentation [85.99907496019972]
We study how AV perception models are affected by changes in camera viewpoint.
Small changes to pitch, yaw, depth, or height of the camera at inference time lead to large drops in performance.
We introduce a technique for novel view synthesis and use it to transform collected data to the viewpoint of target rigs.
(2023-09-11T02:10:07Z)
- Ground-to-Aerial Person Search: Benchmark Dataset and Approach [42.54151390290665]
We construct a large-scale dataset for Ground-to-Aerial Person Search, named G2APS.
G2APS contains 31,770 images with 260,559 annotated bounding boxes for 2,644 identities appearing in both UAV and ground surveillance cameras.
(2023-08-24T11:11:26Z)
- VDD: Varied Drone Dataset for Semantic Segmentation [9.581655974280217]
We release a large-scale, densely labeled collection of 400 high-resolution images spanning 7 classes.
This dataset features various scenes in urban, industrial, rural, and natural areas, captured from different camera angles and under diverse lighting conditions.
We train seven state-of-the-art models on drone datasets as baselines.
(2023-05-23T02:16:14Z)
- Perceiving Traffic from Aerial Images [86.994032967469]
We propose an object detection method called Butterfly Detector that is tailored to detect objects in aerial images.
We evaluate our Butterfly Detector on two publicly available UAV datasets (UAVDT and VisDrone 2019) and show that it outperforms previous state-of-the-art methods while remaining real-time.
(2020-09-16T11:37:43Z)
- Multiview Detection with Feature Perspective Transformation [59.34619548026885]
We propose a novel multiview detection system, MVDet.
We take an anchor-free approach to aggregate multiview information by projecting feature maps onto the ground plane.
Our entire model is end-to-end learnable and achieves 88.2% MODA on the standard Wildtrack dataset.
(2020-07-14T17:58:30Z)
- AU-AIR: A Multi-modal Unmanned Aerial Vehicle Dataset for Low Altitude Traffic Surveillance [20.318367304051176]
Unmanned aerial vehicles (UAVs) with mounted cameras have the advantage of capturing aerial (bird-view) images.
Several aerial datasets have been introduced, including visual data with object annotations.
We propose a multi-purpose aerial dataset (AU-AIR) that has multi-modal sensor data collected in real-world outdoor environments.
(2020-01-31T09:45:12Z)
- Detection and Tracking Meet Drones Challenge [131.31749447313197]
This paper presents a review of object detection and tracking datasets and benchmarks, and discusses the challenges of collecting large-scale drone-based object detection and tracking datasets with manual annotations.
We describe our VisDrone dataset, which is captured over various urban/suburban areas of 14 different cities across China from North to South.
We provide a detailed analysis of the current state of the field of large-scale object detection and tracking on drones, and conclude the challenge as well as propose future directions.
(2020-01-16T00:11:56Z)