Real-Time Flying Object Detection with YOLOv8
- URL: http://arxiv.org/abs/2305.09972v2
- Date: Wed, 22 May 2024 05:05:38 GMT
- Title: Real-Time Flying Object Detection with YOLOv8
- Authors: Dillon Reis, Jordan Kupec, Jacqueline Hong, Ahmad Daoudi
- Abstract summary: This paper presents a generalized model for real-time detection of flying objects.
We also present a refined model that achieves state-of-the-art results for flying object detection.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a generalized model for real-time detection of flying objects that can be used for transfer learning and further research, as well as a refined model that achieves state-of-the-art results for flying object detection. We achieve this by training our first (generalized) model on a data set containing 40 different classes of flying objects, forcing the model to extract abstract feature representations. We then perform transfer learning with these learned parameters on a data set more representative of real-world environments (i.e. higher frequency of occlusion, very small spatial sizes, rotations, etc.) to generate our refined model. Object detection of flying objects remains challenging due to large variances in object spatial sizes/aspect ratios, rates of speed, occlusion, and cluttered backgrounds. To address some of these challenges while simultaneously maximizing performance, we utilize the current state-of-the-art single-shot detector, YOLOv8, in an attempt to find the best trade-off between inference speed and mean average precision (mAP). While YOLOv8 is regarded as the new state-of-the-art, an official paper has not yet been released. Thus, we provide an in-depth explanation of the new architecture and functionality that YOLOv8 has adopted. Our final generalized model achieves a mAP50 of 79.2%, a mAP50-95 of 68.5%, and an average inference speed of 50 frames per second (fps) on 1080p videos. Our final refined model maintains this inference speed and achieves an improved mAP50 of 99.1% and mAP50-95 of 83.5%.
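The abstract above describes a two-stage workflow: train a generalized detector on a 40-class flying-object dataset, then transfer those weights to a harder, more realistic dataset. The sketch below illustrates that workflow using the Ultralytics YOLOv8 Python API; the dataset YAML names, model size, epoch counts, and file paths are illustrative assumptions, not the authors' actual configuration.

```python
# Minimal sketch of the two-stage training described in the abstract, using the
# Ultralytics YOLOv8 API. Dataset YAMLs, model size, and epoch counts are
# placeholder assumptions, not the authors' settings.
from ultralytics import YOLO

# Stage 1: train a generalized detector on a 40-class flying-object dataset,
# starting from COCO-pretrained weights.
generalized = YOLO("yolov8l.pt")
generalized.train(data="flying_objects_40_classes.yaml", epochs=100, imgsz=640)

# Stage 2: transfer the learned weights to a dataset closer to real-world
# conditions (heavier occlusion, very small objects) and fine-tune.
refined = YOLO("runs/detect/train/weights/best.pt")
refined.train(data="refined_real_world.yaml", epochs=100, imgsz=640)

# Evaluate (mAP50, mAP50-95) and run on video; timing the prediction loop over a
# 1080p clip would yield an fps figure comparable to the one quoted above.
metrics = refined.val()
results = refined.predict(source="test_video_1080p.mp4", stream=True)
```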
Related papers
- Optimizing YOLO Architectures for Optimal Road Damage Detection and Classification: A Comparative Study from YOLOv7 to YOLOv10 [0.0]
This paper presents a comprehensive workflow for road damage detection using deep learning models.
To accommodate hardware limitations, large images are cropped, and lightweight models are utilized.
The proposed approach employs multiple model architectures, including a custom YOLOv7 model with Coordinate Attention layers and a Tiny YOLOv7 model.
arXiv Detail & Related papers (2024-10-10T22:55:12Z) - BootsTAP: Bootstrapped Training for Tracking-Any-Point [62.585297341343505]
Tracking-Any-Point (TAP) can be formalized as the problem of tracking any point on solid surfaces in a video.
We show how large-scale, unlabeled, uncurated real-world data can improve a TAP model with minimal architectural changes.
We demonstrate state-of-the-art performance on the TAP-Vid benchmark surpassing previous results by a wide margin.
arXiv Detail & Related papers (2024-02-01T18:38:55Z) - From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution [4.107182710549721]
We present an innovative approach that combines super-resolution and an adapted lightweight YOLOv5 architecture.
Our experimental results demonstrate the model's superior performance in detecting small and densely clustered objects.
arXiv Detail & Related papers (2024-01-26T05:50:58Z) - YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z) - Exploring the Effectiveness of Dataset Synthesis: An application of Apple Detection in Orchards [68.95806641664713]
We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data slightly underperforms a baseline model trained on real-world images.
arXiv Detail & Related papers (2023-06-20T09:46:01Z) - Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection [20.161887223481994]
We propose a long-sequence modeling framework, named StreamPETR, for multi-view 3D object detection.
StreamPETR achieves significant performance improvements at negligible cost compared to the single-frame baseline.
The lightweight version realizes 45.0% mAP and 31.7 FPS, outperforming the state-of-the-art method (SOLOFusion) by 2.3% mAP and 1.8x faster FPS.
arXiv Detail & Related papers (2023-03-21T15:19:20Z) - EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework.
We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects.
Our baseline model reaches 50.6% AP50:95 and 69.8% AP50 on the MS COCO 2017 dataset and 26.4% AP50:95 and 44.8% AP50 on the VisDrone 2019-DET dataset, and it meets real-time requirements (FPS >= 30) on an Nvidia edge-computing device.
arXiv Detail & Related papers (2023-02-15T06:05:14Z) - Could Giant Pretrained Image Models Extract Universal Representations? [94.97056702288317]
We present a study of frozen pretrained models when applied to diverse and representative computer vision tasks.
Our work answers the questions of what pretraining task fits best with this frozen setting, how to make the frozen setting more flexible to various downstream tasks, and the effect of larger model sizes.
arXiv Detail & Related papers (2022-11-03T17:57:10Z) - Analysis and Adaptation of YOLOv4 for Object Detection in Aerial Images [0.0]
Our work adapts the popular YOLOv4 framework to predict objects and their locations in aerial images.
The trained model resulted in a mean average precision (mAP) of 45.64% with an inference speed reaching 8.7 FPS on the Tesla K80 GPU.
A comparative study with several contemporary aerial object detectors proved that YOLOv4 performed better, implying a more suitable detection algorithm to incorporate on aerial platforms.
arXiv Detail & Related papers (2022-03-18T23:51:09Z) - Evaluation of YOLO Models with Sliced Inference for Small Object Detection [0.0]
This work aims to benchmark the YOLOv5 and YOLOX models for small object detection.
Combining sliced fine-tuning with sliced inference produced substantial improvements for all models; a generic sketch of sliced inference appears after this list.
arXiv Detail & Related papers (2022-03-09T15:24:30Z) - Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge [57.647371468876116]
We introduce our real-time 2D object detection system for the realistic autonomous driving scenario.
Our detector is built on a newly designed YOLO model, called YOLOX.
On the Argoverse-HD dataset, our system achieves 41.0 streaming AP, surpassing the second-place entry by 7.8 and 6.1 points on the detection-only and full tracks, respectively.
arXiv Detail & Related papers (2021-07-27T06:36:06Z)
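As referenced in the sliced-inference entry above, the following is a generic sketch of how sliced (tiled) inference for small-object detection can be wrapped around a YOLO detector. The tile size, overlap, weights file, and helper names are illustrative assumptions, not details taken from that paper.

```python
# Generic sketch of sliced (tiled) inference for small-object detection.
# Tile size, overlap, weights, and helper names are illustrative assumptions.
import cv2
from ultralytics import YOLO

def tile_starts(length, tile, step):
    """Start offsets covering the full axis, with a final tile flush against the edge."""
    starts = list(range(0, max(length - tile, 0) + 1, step))
    last = max(length - tile, 0)
    if starts[-1] != last:
        starts.append(last)
    return starts

def sliced_predict(model, image, tile=640, overlap=0.2):
    """Run the detector on overlapping tiles and map boxes back to full-image coordinates."""
    h, w = image.shape[:2]
    step = max(1, int(tile * (1 - overlap)))
    boxes = []
    for y in tile_starts(h, tile, step):
        for x in tile_starts(w, tile, step):
            crop = image[y:y + tile, x:x + tile]
            result = model.predict(crop, verbose=False)[0]
            for x1, y1, x2, y2 in result.boxes.xyxy.tolist():
                boxes.append([x1 + x, y1 + y, x2 + x, y2 + y])
    return boxes  # in practice, merge duplicates from overlapping tiles with NMS

model = YOLO("yolov8s.pt")
image = cv2.imread("aerial_scene.jpg")
detections = sliced_predict(model, image)
```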
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.