Aerial Image Object Detection With Vision Transformer Detector (ViTDet)
- URL: http://arxiv.org/abs/2301.12058v1
- Date: Sat, 28 Jan 2023 02:25:30 GMT
- Title: Aerial Image Object Detection With Vision Transformer Detector (ViTDet)
- Authors: Liya Wang, Alex Tien
- Abstract summary: Vision Transformer Detector (ViTDet) was proposed to extract multi-scale features for object detection.
ViTDet's simple design achieves good performance on natural scene images and can be easily embedded into any detector architecture.
Our results show that ViTDet can consistently outperform its convolutional neural network counterparts on horizontal bounding box (HBB) object detection.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The past few years have seen an increased interest in aerial image object
detection due to its critical value to large-scale geo-scientific research like
environmental studies, urban planning, and intelligence monitoring. However,
the task is very challenging due to the bird's-eye-view perspective, complex
backgrounds, large and varied image sizes, diverse object appearances,
and the scarcity of well-annotated datasets. Recent advances in computer vision
have shown promise in tackling the challenge. Specifically, Vision Transformer
Detector (ViTDet) was proposed to extract multi-scale features for object
detection. The empirical study shows that ViTDet's simple design achieves good
performance on natural scene images and can be easily embedded into any
detector architecture. To date, ViTDet's potential benefit to challenging
aerial image object detection has not been explored. Therefore, in our study,
25 experiments were carried out to evaluate the effectiveness of ViTDet for
aerial image object detection on three well-known datasets: Airbus Aircraft,
RarePlanes, and Dataset of Object DeTection in Aerial images (DOTA). Our
results show that ViTDet can consistently outperform its convolutional neural
network counterparts on horizontal bounding box (HBB) object detection by a
large margin (up to 17% in average precision) and that it achieves
competitive performance for oriented bounding box (OBB) object detection. Our
results also establish a baseline for future research.
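The abstract distinguishes horizontal bounding boxes (HBB) from oriented bounding boxes (OBB). As a minimal sketch of how the two relate, and not the authors' method, the snippet below derives the tightest enclosing HBB from an OBB. It assumes the OBB is given as four (x, y) corner points; the function name `obb_to_hbb` and the sample coordinates are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
HBB = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def obb_to_hbb(corners: Sequence[Point]) -> HBB:
    """Return the tightest axis-aligned box enclosing an oriented box.

    Assumption: the oriented box is described by its four corner points.
    """
    if len(corners) != 4:
        raise ValueError("expected four (x, y) corner points")
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    # The enclosing horizontal box spans the extreme coordinates on each axis.
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical example: a square rotated by 45 degrees.
diamond = [(2.0, 0.0), (4.0, 2.0), (2.0, 4.0), (0.0, 2.0)]
print(obb_to_hbb(diamond))
```

The conversion is lossy: the orientation is discarded and the enclosing box also covers background around a rotated object, which is one reason HBB and OBB detection are evaluated as separate tasks.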
Related papers
- Visible and Clear: Finding Tiny Objects in Difference Map [50.54061010335082]
We introduce a self-reconstruction mechanism in the detection model, and discover the strong correlation between it and the tiny objects.
Specifically, we impose a reconstruction head in-between the neck of a detector, constructing a difference map of the reconstructed image and the input, which shows high sensitivity to tiny objects.
We further develop a Difference Map Guided Feature Enhancement (DGFE) module to make the tiny feature representation clearer.
arXiv Detail & Related papers (2024-05-18T12:22:26Z)
- FlightScope: A Deep Comprehensive Review of Aircraft Detection Algorithms in Satellite Imagery [2.9687381456164004]
This paper critically evaluates and compares a suite of advanced object detection algorithms customized for the task of identifying aircraft within satellite imagery.
This research encompasses an array of methodologies including YOLO versions 5 and 8, Faster RCNN, CenterNet, RetinaNet, RTMDet, and DETR, all trained from scratch.
YOLOv5 emerges as a robust solution for aerial object detection, underlining its importance through superior mean average precision, recall, and intersection-over-union scores.
arXiv Detail & Related papers (2024-04-03T17:24:27Z)
- Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning [93.71280187657831]
The camera-based bird's-eye-view (BEV) perception paradigm has made significant progress in the autonomous driving field.
We propose IA-BEV, which integrates image-plane instance awareness into the depth estimation process within a BEV-based detector.
arXiv Detail & Related papers (2023-12-13T09:24:42Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- On the Robustness of Object Detection Models in Aerial Images [37.50307094643692]
We introduce two novel benchmarks based on DOTA-v1.0.
The first benchmark encompasses 19 prevalent corruptions, while the second focuses on cloud-corrupted images.
We find that enhanced model architectures, larger networks, well-crafted modules, and judicious data augmentation strategies collectively enhance the robustness of aerial object detection models.
arXiv Detail & Related papers (2023-08-29T15:16:51Z)
- Object Detection in Aerial Images with Uncertainty-Aware Graph Network [61.02591506040606]
We propose a novel uncertainty-aware object detection framework with a structured-graph, where nodes and edges are denoted by objects.
We refer to our model as Uncertainty-Aware Graph network for object DETection (UAGDet).
arXiv Detail & Related papers (2022-08-23T07:29:03Z)
- Object Detection in Aerial Images: What Improves the Accuracy? [9.857292888257144]
Deep learning-based object detection approaches have been actively explored for the problem of object detection in aerial images.
In this work, we investigate the impact of Faster R-CNN for aerial object detection and explore numerous strategies to improve its performance for aerial images.
arXiv Detail & Related papers (2022-01-21T16:22:48Z)
- Artificial and beneficial -- Exploiting artificial images for aerial vehicle detection [1.4528189330418975]
We propose a generative approach that generates top-down images by overlaying artificial vehicles created from 2D CAD drawings on artificial or real backgrounds.
Our experiments with a modified RetinaNet object detection network show that adding these images to small real-world datasets significantly improves detection performance.
arXiv Detail & Related papers (2021-04-07T11:06:15Z)
- Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges [124.48654341780431]
We present a large-scale dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI.
The proposed DOTA dataset contains 1,793,658 object instances of 18 categories of oriented-bounding-box annotations collected from 11,268 aerial images.
We build baselines covering 10 state-of-the-art algorithms with over 70 configurations, where the speed and accuracy performances of each model have been evaluated.
arXiv Detail & Related papers (2021-02-24T11:20:55Z)
- Perceiving Traffic from Aerial Images [86.994032967469]
We propose an object detection method called Butterfly Detector that is tailored to detect objects in aerial images.
We evaluate our Butterfly Detector on two publicly available UAV datasets (UAVDT and VisDrone 2019) and show that it outperforms previous state-of-the-art methods while remaining real-time.
arXiv Detail & Related papers (2020-09-16T11:37:43Z)
- EAGLE: Large-scale Vehicle Detection Dataset in Real-World Scenarios using Aerial Imagery [3.8902657229395894]
We introduce a large-scale dataset for multi-class vehicle detection with object orientation information in aerial imagery.
It features high-resolution aerial images covering different real-world situations, with a wide variety of camera sensors, resolutions, flight altitudes, weather, illumination, haze, shadow, time, city, country, occlusion, and camera angles.
It contains 215,986 instances annotated with oriented bounding boxes defined by four points and orientation, making it by far the largest dataset to date for this task.
It also supports research on haze and shadow removal as well as super-resolution and in-painting applications.
arXiv Detail & Related papers (2020-07-12T23:00:30Z)