SuperYOLO: Super Resolution Assisted Object Detection in Multimodal
Remote Sensing Imagery
- URL: http://arxiv.org/abs/2209.13351v2
- Date: Sat, 8 Apr 2023 09:50:26 GMT
- Title: SuperYOLO: Super Resolution Assisted Object Detection in Multimodal
Remote Sensing Imagery
- Authors: Jiaqing Zhang, Jie Lei, Weiying Xie, Zhenman Fang, Yunsong Li, Qian Du
- Abstract summary: We propose SuperYOLO, which fuses multimodal data and performs high-resolution (HR) object detection on multiscale objects.
Our proposed model shows a favorable accuracy and speed tradeoff compared to the state-of-the-art models.
- Score: 36.216230299131404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate and timely detection of multiscale small objects that contain tens of
pixels in remote sensing images (RSI) remains challenging. Most of the
existing solutions primarily design complex deep neural networks to learn
strong feature representations for objects separated from the background, which
often results in a heavy computation burden. In this article, we propose an
accurate yet fast object detection method for RSI, named SuperYOLO, which fuses
multimodal data and performs high-resolution (HR) object detection on
multiscale objects by utilizing assisted super-resolution (SR) learning and
considering both the detection accuracy and computation cost. First, we utilize
a symmetric compact multimodal fusion (MF) to extract supplementary information
from various data for improving small object detection in RSI. Furthermore, we
design a simple and flexible SR branch to learn HR feature representations that
can discriminate small objects from vast backgrounds with low-resolution (LR)
input, thus further improving the detection accuracy. Moreover, to avoid
introducing additional computation, the SR branch is discarded in the inference
stage, and the computation of the network model is reduced due to the LR input.
Experimental results show that, on the widely used VEDAI RS dataset, SuperYOLO
achieves an accuracy of 75.09% (in terms of mAP50), which is more than 10%
higher than the SOTA large models, such as YOLOv5l, YOLOv5x, and the RS-designed
YOLOrs. Meanwhile, the parameter size and GFLOPs of SuperYOLO are about 18
times and 3.8 times less than those of YOLOv5x, respectively. Our proposed model shows a favorable
accuracy and speed tradeoff compared to the state-of-the-art models. The code
will be open-sourced at https://github.com/icey-zhang/SuperYOLO.
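The abstract's point that low-resolution input reduces computation follows from how convolution cost scales with spatial size. The snippet below is a minimal sketch of that arithmetic, assuming a single stride-1, same-padded convolution layer; the function name `conv2d_macs` and all layer and image sizes are hypothetical and are not taken from SuperYOLO.

```python
def conv2d_macs(height, width, in_channels, out_channels, kernel_size=3):
    """Multiply-accumulate count for one stride-1, same-padded conv layer.

    Each of the height * width output positions computes out_channels
    values, and each value sums over in_channels * kernel_size**2 products.
    """
    return height * width * in_channels * out_channels * kernel_size ** 2


# Hypothetical sizes, chosen only for illustration (not from the paper).
hr_macs = conv2d_macs(1024, 1024, in_channels=64, out_channels=64)
lr_macs = conv2d_macs(512, 512, in_channels=64, out_channels=64)

# Halving both height and width quarters the per-layer cost.
ratio = hr_macs / lr_macs
print(f"HR MACs: {hr_macs:,}")
print(f"LR MACs: {lr_macs:,}")
print(f"HR / LR cost ratio: {ratio:.1f}")
```

Under these assumptions the per-layer cost grows with the number of pixels, so a detector fed half-resolution input does roughly a quarter of the convolution work per layer. The overall GFLOPs figure quoted in the abstract also depends on architecture choices that this sketch does not model.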
Related papers
- SOD-YOLOv8 -- Enhancing YOLOv8 for Small Object Detection in Traffic Scenes [1.3812010983144802]
Small Object Detection YOLOv8 (SOD-YOLOv8) is designed for scenarios involving numerous small objects.
SOD-YOLOv8 significantly improves small object detection, surpassing widely used models in various metrics.
In dynamic real-world traffic scenes, SOD-YOLOv8 demonstrated notable improvements in diverse conditions.
arXiv Detail & Related papers (2024-08-08T23:05:25Z)
- YOLO-World: Real-Time Open-Vocabulary Object Detection [87.08732047660058]
We introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities.
Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency.
YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-01-30T18:59:38Z)
- From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution [4.107182710549721]
We present an innovative approach that combines super-resolution and an adapted lightweight YOLOv5 architecture.
Our experimental results demonstrate the model's superior performance in detecting small and densely clustered objects.
arXiv Detail & Related papers (2024-01-26T05:50:58Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
- EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework.
We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects.
Our baseline model reaches 50.6% AP50:95 and 69.8% AP50 on the MS COCO 2017 dataset and 26.4% AP50:95 and 44.8% AP50 on the VisDrone 2019-DET dataset, and it meets real-time requirements (FPS>=30) on edge-computing device Nvidia
arXiv Detail & Related papers (2023-02-15T06:05:14Z)
- Pyramid Grafting Network for One-Stage High Resolution Saliency Detection [29.013012579688347]
We propose a one-stage framework called Pyramid Grafting Network (PGNet) to extract features from different resolution images independently.
An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable the CNN branch to combine broken detailed information more holistically.
We contribute a new Ultra-High-Resolution Saliency Detection dataset UHRSD, containing 5,920 images at 4K-8K resolutions.
arXiv Detail & Related papers (2022-04-11T12:22:21Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has 87% fewer parameters and almost half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
- Remote Sensing Image Super-resolution and Object Detection: Benchmark and State of the Art [7.74389937337756]
This paper reviews current datasets and object detection methods (deep learning-based) for remote sensing images.
We propose a large-scale, publicly available benchmark Remote Sensing Super-resolution Object Detection dataset.
We also propose a novel Multi-class Cyclic super-resolution Generative adversarial network with Residual feature aggregation (MCGR) and auxiliary YOLOv5 detector to benchmark image super-resolution-based object detection.
arXiv Detail & Related papers (2021-11-05T04:56:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.