Deep Multimodality Learning for UAV Video Aesthetic Quality Assessment
- URL: http://arxiv.org/abs/2011.02356v1
- Date: Wed, 4 Nov 2020 15:37:49 GMT
- Title: Deep Multimodality Learning for UAV Video Aesthetic Quality Assessment
- Authors: Qi Kuang, Xin Jin, Qinping Zhao, Bin Zhou
- Abstract summary: We present a method of deep multimodality learning for UAV video aesthetic quality assessment.
A novel motion stream network is specially designed for this multistream framework.
We present three application examples: UAV video grading, professional segment detection, and aesthetic-based UAV path planning.
- Score: 22.277636020333198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the growing number of unmanned aerial vehicles (UAVs) and aerial
videos, there is a paucity of studies focusing on the aesthetics of aerial
videos that can provide valuable information for improving the aesthetic
quality of aerial photography. In this article, we present a method of deep
multimodality learning for UAV video aesthetic quality assessment. More
specifically, a multistream framework is designed to exploit aesthetic
attributes from multiple modalities, including spatial appearance, drone camera
motion, and scene structure. A novel specially designed motion stream network
is proposed for this new multistream framework. We construct a dataset with
6,000 UAV video shots captured by drone cameras. Our model judges whether a
UAV video was shot by a professional photographer or an amateur and
simultaneously classifies the scene type. Experimental results show that our
method outperforms general video classification methods and traditional
SVM-based methods for video aesthetics. In addition, we present three
application examples using the proposed method: UAV video grading,
professional segment detection, and aesthetic-based UAV path planning.
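The multistream design described in the abstract can be illustrated with a minimal late-fusion sketch. This is a hypothetical illustration, not the authors' code: the per-stream scores, fusion weights, and class labels below are stand-ins, since the paper's actual fusion strategy is not detailed here.

```python
# Hypothetical late-fusion sketch for a multistream aesthetic classifier.
# Each stream (spatial appearance, camera motion, scene structure) is
# assumed to output class probabilities; a weighted average fuses them
# into a professional-vs-amateur decision.

def fuse_streams(stream_probs, weights=None):
    """Weighted late fusion of per-stream class probability vectors."""
    n_streams = len(stream_probs)
    n_classes = len(stream_probs[0])
    if weights is None:
        # Equal weighting by default; a real system might learn these.
        weights = [1.0 / n_streams] * n_streams
    fused = [0.0] * n_classes
    for w, probs in zip(weights, stream_probs):
        for i, p in enumerate(probs):
            fused[i] += w * p
    return fused

# Stand-in per-stream scores for [professional, amateur]:
spatial = [0.7, 0.3]    # spatial appearance stream
motion = [0.9, 0.1]     # drone camera motion stream
structure = [0.6, 0.4]  # scene structure stream

scores = fuse_streams([spatial, motion, structure])
label = "professional" if scores[0] > scores[1] else "amateur"
print(label, scores)
```

Late fusion is only one plausible design choice; mid-level feature concatenation before a shared classifier is an equally common alternative in multistream video models.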
Related papers
- Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? [57.77643186237265]
We present Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record synchronized scenes from different perspectives.
MAVREC consists of around 2.5 hours of industry-standard 2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million annotated bounding boxes.
This makes MAVREC the largest ground and aerial-view dataset, and the fourth largest among all drone-based datasets.
arXiv Detail & Related papers (2023-12-07T18:59:14Z)
- Learning to Compress Unmanned Aerial Vehicle (UAV) Captured Video: Benchmark and Analysis [54.07535860237662]
We propose a novel task for learned UAV video coding and construct a comprehensive and systematic benchmark for such a task.
It is expected that the benchmark will accelerate the research and development in video coding on drone platforms.
arXiv Detail & Related papers (2023-01-15T15:18:02Z)
- Deep Learning Computer Vision Algorithms for Real-time UAVs On-board Camera Image Processing [77.34726150561087]
This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs.
All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
arXiv Detail & Related papers (2022-11-02T11:10:42Z)
- Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos [15.244418294614857]
We design a UAV system with a Panoramic Annular Lens (PAL), which has the characteristics of small size, low weight, and a 360-degree annular FoV.
A lightweight panoramic annular semantic segmentation neural network model is designed to achieve high-accuracy and real-time scene parsing.
A comprehensive set of experiments shows that the designed system performs satisfactorily in aerial panoramic scene parsing.
arXiv Detail & Related papers (2021-05-15T12:01:16Z)
- Real-time dense 3D Reconstruction from monocular video data captured by low-cost UAVs [0.3867363075280543]
Real-time 3D reconstruction enables fast dense mapping of the environment which benefits numerous applications, such as navigation or live evaluation of an emergency.
In contrast to most real-time capable approaches, our approach does not need an explicit depth sensor.
By exploiting the self-motion of the unmanned aerial vehicle (UAV) flying with oblique view around buildings, we estimate both camera trajectory and depth for selected images with enough novel content.
arXiv Detail & Related papers (2021-04-21T13:12:17Z)
- Broaden Your Views for Self-Supervised Video Learning [97.52216510672251]
We introduce BraVe, a self-supervised learning framework for video.
In BraVe, one view has access to a narrow temporal window of the video while the other has broad access to the video content.
We demonstrate that BraVe achieves state-of-the-art results in self-supervised representation learning on standard video and audio classification benchmarks.
arXiv Detail & Related papers (2021-03-30T17:58:46Z)
- Few-Shot Learning for Video Object Detection in a Transfer-Learning Scheme [70.45901040613015]
We study the new problem of few-shot learning for video object detection.
We employ a transfer-learning framework to effectively train the video object detector on a large number of base-class objects and a few video clips of novel-class objects.
arXiv Detail & Related papers (2021-03-26T20:37:55Z)
- Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking [59.06167734555191]
Unmanned aerial vehicles (UAVs) offer many applications in both commerce and recreation.
We consider the task of tracking UAVs, providing rich information such as location and trajectory.
We propose a dataset, Anti-UAV, with more than 300 video pairs containing over 580k manually annotated bounding boxes.
arXiv Detail & Related papers (2021-01-21T07:00:15Z)
- Self-supervised monocular depth estimation from oblique UAV videos [8.876469413317341]
This paper aims to estimate depth from a single UAV aerial image using deep learning.
We propose a novel architecture with two 2D CNN encoders and a 3D CNN decoder for extracting information from consecutive temporal frames.
arXiv Detail & Related papers (2020-12-19T14:53:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.