A lightweight and accurate YOLO-like network for small target detection
in Aerial Imagery
- URL: http://arxiv.org/abs/2204.02325v1
- Date: Tue, 5 Apr 2022 16:29:49 GMT
- Title: A lightweight and accurate YOLO-like network for small target detection
in Aerial Imagery
- Authors: Alessandro Betti
- Abstract summary: We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connections, via both bypass and concatenation.
YOLO-S has an 87% smaller parameter size and almost half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
- Score: 94.78943497436492
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the breakthrough deep learning performances achieved for automatic
object detection, small target detection is still a challenging problem,
especially when looking at fast and accurate solutions suitable for mobile or
edge applications. In this work we present YOLO-S, a simple, fast and efficient
network for small target detection. The architecture exploits a small feature
extractor based on Darknet20, as well as skip connections, via both bypass and
concatenation, and a reshape-passthrough layer to alleviate the vanishing
gradient problem, promote feature reuse across the network and combine low-level
positional information with more meaningful high-level information. To verify
the performance of YOLO-S, we build "AIRES", a novel dataset for cAr detectIon
fRom hElicopter imageS acquired in Europe, and set up experiments on both AIRES
and VEDAI datasets, benchmarking this architecture with four baseline
detectors. Furthermore, to handle efficiently the issues of data
insufficiency and domain gap that arise with a transfer learning strategy, we
introduce a transitional learning task over a combined dataset based on DOTAv2
and VEDAI and demonstrate that it can enhance the overall accuracy with respect to
more general features transferred from COCO data. YOLO-S is 25% to 50%
faster than YOLOv3 and only 15-25% slower than Tiny-YOLOv3, while also outperforming
YOLOv3 in accuracy across a wide range of experiments. Further simulations
performed on the SARD dataset also demonstrate its applicability to different
scenarios, such as search and rescue operations. In addition, YOLO-S has an 87%
smaller parameter size and almost half the FLOPs of YOLOv3, making
deployment practical for low-power industrial applications.
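The reshape-passthrough idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a YOLOv2-style space-to-depth rearrangement with stride 2, followed by channel concatenation with a coarser map. The function names `space_to_depth` and `concat_channels`, the nested-list tensor layout (channels, height, width) and the output channel ordering are illustrative assumptions, not details taken from the paper.

```python
def space_to_depth(feature_map, stride=2):
    """Rearrange a (C, H, W) nested list into (C*stride*stride, H//stride, W//stride).

    Each stride x stride spatial block is moved into the channel axis, so no
    value is discarded. The channel ordering here is an assumption.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if height % stride or width % stride:
        raise ValueError("height and width must be divisible by stride")
    out = []
    for dy in range(stride):
        for dx in range(stride):
            for c in range(channels):
                # Keep every stride-th pixel, starting at offset (dy, dx).
                plane = [
                    [feature_map[c][y][x] for x in range(dx, width, stride)]
                    for y in range(dy, height, stride)
                ]
                out.append(plane)
    return out


def concat_channels(a, b):
    """Concatenate two (C, H, W) maps along the channel axis."""
    if len(a[0]) != len(b[0]) or len(a[0][0]) != len(b[0][0]):
        raise ValueError("spatial sizes must match")
    return a + b


# A 1-channel 4x4 fine map becomes a 4-channel 2x2 map ...
fine = [[[row * 4 + col for col in range(4)] for row in range(4)]]
packed = space_to_depth(fine)
# ... which can then be concatenated with a 2x2 coarse map.
coarse = [[[0, 0], [0, 0]]]
merged = concat_channels(packed, coarse)
```

The rearrangement halves the spatial resolution without pooling, which is why such a layer lets fine positional detail reach a coarser detection head.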
Related papers
- What is YOLOv9: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector [0.0]
This study examines the YOLOv9 object detection model, focusing on its architectural innovations, training methodologies, and performance improvements.
Key advancements, such as the Generalized Efficient Layer Aggregation Network (GELAN) and Programmable Gradient Information (PGI), significantly enhance feature extraction and gradient flow.
This paper provides the first in-depth exploration of YOLOv9's internal features and their real-world applicability, establishing it as a state-of-the-art solution for real-time object detection.
arXiv Detail & Related papers (2024-09-12T07:46:58Z)
- YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection.
The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs.
We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
- MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO [10.183459286746196]
We introduce YOLO Phantom, one of the smallest YOLO models ever conceived.
YOLO Phantom achieves comparable accuracy to the latest YOLOv8n model while simultaneously reducing both parameters and model size.
Its real-world efficacy is demonstrated on an IoT platform with advanced low-light and RGB cameras, seamlessly connecting to an AWS-based notification endpoint.
arXiv Detail & Related papers (2024-02-12T18:56:53Z)
- YOLO-World: Real-Time Open-Vocabulary Object Detection [87.08732047660058]
We introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities.
Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency.
YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-01-30T18:59:38Z)
- Active search and coverage using point-cloud reinforcement learning [50.741409008225766]
This paper presents an end-to-end deep reinforcement learning solution for target search and coverage.
We show that deep hierarchical feature learning works for RL and that by using farthest point sampling (FPS) we can reduce the number of points.
We also show that multi-head attention for point clouds helps the agent learn faster but converges to the same outcome.
arXiv Detail & Related papers (2023-12-18T18:16:30Z)
- YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
- Scaling Data Generation in Vision-and-Language Navigation [116.95534559103788]
We propose an effective paradigm for generating large-scale data for learning.
We use 1200+ photo-realistic environments from the HM3D and Gibson datasets and synthesize 4.9 million instruction-trajectory pairs.
Thanks to our large-scale dataset, the performance of an existing agent can be pushed up (+11% absolute with regard to previous SoTA) to a significantly new best of 80% single-run success rate on the R2R test split by simple imitation learning.
arXiv Detail & Related papers (2023-07-28T16:03:28Z)
- Real-time object detection method based on improved YOLOv4-tiny [0.0]
YOLOv4-tiny was proposed, based on YOLOv4, to simplify the network structure and reduce parameters, making it suitable for deployment on mobile and embedded devices.
The improved method first uses two ResBlock-D modules from the ResNet-D network in place of two CSPBlock modules in YOLOv4-tiny, which reduces the computational complexity.
In the design of auxiliary network, two consecutive 3x3 convolutions are used to obtain 5x5 receptive fields to extract global features, and channel attention and spatial attention are also used to extract more effective information.
arXiv Detail & Related papers (2020-11-09T08:26:28Z)
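The remark in the last entry that two consecutive 3x3 convolutions yield a 5x5 receptive field can be checked with the standard receptive-field recurrence. The sketch below is a generic illustration rather than code from any of the listed papers; the helper name `receptive_field` and the (kernel_size, stride) layer encoding are assumptions made for the example.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) of a stack of conv layers.

    Each layer is a (kernel_size, stride) pair, listed from input to output.
    """
    field = 1
    jump = 1  # distance in input pixels between adjacent output positions
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


# Two stacked 3x3, stride-1 convolutions see the same 5x5 window
# as a single 5x5 convolution.
stacked = receptive_field([(3, 1), (3, 1)])
single = receptive_field([(5, 1)])
```

For a fixed number of input and output channels, two 3x3 kernels use 18 weights per channel pair against 25 for one 5x5 kernel, which is the usual reason for preferring the stacked form.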
This list is automatically generated from the titles and abstracts of the papers in this site.