Aerial Image Object Detection With Vision Transformer Detector (ViTDet)
- URL: http://arxiv.org/abs/2301.12058v1
- Date: Sat, 28 Jan 2023 02:25:30 GMT
- Title: Aerial Image Object Detection With Vision Transformer Detector (ViTDet)
- Authors: Liya Wang, Alex Tien
- Abstract summary: Vision Transformer Detector (ViTDet) was proposed to extract multi-scale features for object detection.
ViTDet's simple design achieves good performance on natural scene images and can be easily embedded into any detector architecture.
Our results show that ViTDet can consistently outperform its convolutional neural network counterparts on horizontal bounding box (HBB) object detection.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The past few years have seen an increased interest in aerial image object
detection due to its critical value to large-scale geo-scientific research like
environmental studies, urban planning, and intelligence monitoring. However,
the task is very challenging due to the bird's-eye view perspective, complex
backgrounds, large and varied image sizes, diverse object appearances,
and the scarcity of well-annotated datasets. Recent advances in computer vision
have shown promise in tackling these challenges. Specifically, Vision Transformer
Detector (ViTDet) was proposed to extract multi-scale features for object
detection. The empirical study shows that ViTDet's simple design achieves good
performance on natural scene images and can be easily embedded into any
detector architecture. To date, ViTDet's potential benefit to challenging
aerial image object detection has not been explored. Therefore, in our study,
25 experiments were carried out to evaluate the effectiveness of ViTDet for
aerial image object detection on three well-known datasets: Airbus Aircraft,
RarePlanes, and Dataset of Object DeTection in Aerial images (DOTA). Our
results show that ViTDet can consistently outperform its convolutional neural
network counterparts on horizontal bounding box (HBB) object detection by a
large margin (up to 17% in average precision) and achieves competitive
performance for oriented bounding box (OBB) object detection. Our
results also establish a baseline for future research.
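The multi-scale feature extraction the abstract attributes to ViTDet comes from its "simple feature pyramid": coarser and finer levels are derived directly from the single-scale feature map of a plain ViT backbone, rather than from a hierarchical backbone. The sketch below is a minimal NumPy illustration of that idea only; the actual ViTDet uses learned deconvolutions and strided convolutions, and the function name and scales here are illustrative assumptions, not the paper's API.

```python
import numpy as np

def simple_feature_pyramid(feat, scales=(2.0, 1.0, 0.5)):
    """Derive multi-scale maps from one single-scale ViT feature map.

    ViTDet's key idea: take the final plain-ViT feature map (stride 16)
    and build pyramid levels from it alone. Real implementations use
    learned (de)convolutions; nearest-neighbor resampling stands in here.
    """
    c, h, w = feat.shape
    pyramid = []
    for s in scales:
        if s >= 1.0:
            r = int(s)
            # upsample by an integer factor r (nearest neighbor)
            level = feat.repeat(r, axis=1).repeat(r, axis=2)
        else:
            step = int(round(1.0 / s))
            # downsample by striding (pooling would also work)
            level = feat[:, ::step, ::step]
        pyramid.append(level)
    return pyramid

# One 256-channel 14x14 token grid (ViT-B/16 on a 224x224 image)
feat = np.random.rand(256, 14, 14)
levels = simple_feature_pyramid(feat)
print([lvl.shape for lvl in levels])
# [(256, 28, 28), (256, 14, 14), (256, 7, 7)]
```

Each pyramid level can then feed a standard detector head, which is why the abstract notes ViTDet "can be easily embedded into any detector architecture."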
Related papers
- Analysis of Object Detection Models for Tiny Object in Satellite Imagery: A Dataset-Centric Approach
This paper delves into the domain of Small-Object-Detection (SOD) within satellite imagery.
Traditional object detection models face difficulties in detecting small objects due to limited contextual information and class imbalances.
Our study aims to provide valuable insights into small object detection in satellite imagery by empirically evaluating state-of-the-art models.
arXiv Detail & Related papers (2024-12-12T07:06:22Z)
- Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection
Occupancy has emerged as a promising alternative for 3D scene perception.
We introduce object-centric occupancy as a supplement to object bounding boxes.
We show that our occupancy features significantly enhance the detection results of state-of-the-art 3D object detectors.
arXiv Detail & Related papers (2024-12-06T16:12:38Z)
- NBBOX: Noisy Bounding Box Improves Remote Sensing Object Detection
This letter presents a thorough investigation of bounding box transformation in terms of scaling, rotation, and translation for remote sensing object detection.
We conduct extensive experiments on DOTA and DIOR-R, both well-known datasets that include a variety of rotated generic objects in aerial images.
Experimental results show that our approach significantly improves remote sensing object detection without bells and whistles.
arXiv Detail & Related papers (2024-09-14T12:25:14Z)
- FlightScope: An Experimental Comparative Review of Aircraft Detection Algorithms in Satellite Imagery
This paper critically evaluates and compares a suite of advanced object detection algorithms customized for the task of identifying aircraft within satellite imagery.
This research encompasses an array of methodologies including YOLO versions 5 and 8, Faster RCNN, CenterNet, RetinaNet, RTMDet, and DETR, all trained from scratch.
YOLOv5 emerges as a robust solution for aerial object detection, underlining its importance through superior mean average precision, recall, and intersection-over-union scores.
arXiv Detail & Related papers (2024-04-03T17:24:27Z)
- Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning
The camera-based bird's-eye-view (BEV) perception paradigm has made significant progress in the autonomous driving field.
We propose IA-BEV, which integrates image-plane instance awareness into the depth estimation process within a BEV-based detector.
arXiv Detail & Related papers (2023-12-13T09:24:42Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Aerial Monocular 3D Object Detection
DVDET is proposed to achieve aerial monocular 3D object detection in both the 2D image space and the 3D physical space.
To address the severe view deformation issue, we propose a novel trainable geo-deformable transformation module.
To encourage more researchers to investigate this area, we will release the dataset and related code.
arXiv Detail & Related papers (2022-08-08T08:32:56Z)
- Object Detection in Aerial Images: What Improves the Accuracy?
Deep learning-based object detection approaches have been actively explored for the problem of object detection in aerial images.
In this work, we investigate the impact of Faster R-CNN for aerial object detection and explore numerous strategies to improve its performance for aerial images.
arXiv Detail & Related papers (2022-01-21T16:22:48Z)
- Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges
We present a large-scale dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI.
The proposed DOTA dataset contains 1,793,658 object instances across 18 categories, annotated with oriented bounding boxes, collected from 11,268 aerial images.
We build baselines covering 10 state-of-the-art algorithms with over 70 configurations, where the speed and accuracy performances of each model have been evaluated.
arXiv Detail & Related papers (2021-02-24T11:20:55Z)
- Perceiving Traffic from Aerial Images
We propose an object detection method called Butterfly Detector that is tailored to detect objects in aerial images.
We evaluate our Butterfly Detector on two publicly available UAV datasets (UAVDT and VisDrone 2019) and show that it outperforms previous state-of-the-art methods while remaining real-time.
arXiv Detail & Related papers (2020-09-16T11:37:43Z)
- EAGLE: Large-scale Vehicle Detection Dataset in Real-World Scenarios using Aerial Imagery
We introduce a large-scale dataset for multi-class vehicle detection with object orientation information in aerial imagery.
It features high-resolution aerial images covering a wide variety of real-world conditions: camera sensors, resolutions, flight altitudes, weather, illumination, haze, shadow, time of day, cities, countries, occlusion, and camera angles.
It contains 215,986 instances annotated with oriented bounding boxes defined by four points and an orientation, making it by far the largest dataset to date for this task.
It also supports research on haze and shadow removal, as well as super-resolution and inpainting applications.
arXiv Detail & Related papers (2020-07-12T23:00:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.