Lightweight Object Detection: A Study Based on YOLOv7 Integrated with
ShuffleNetv2 and Vision Transformer
- URL: http://arxiv.org/abs/2403.01736v1
- Date: Mon, 4 Mar 2024 05:29:32 GMT
- Title: Lightweight Object Detection: A Study Based on YOLOv7 Integrated with
ShuffleNetv2 and Vision Transformer
- Authors: Wenkai Gong
- Abstract summary: This study zeroes in on optimizing the YOLOv7 algorithm to boost its operational efficiency and speed on mobile platforms.
The experimental outcomes reveal that the refined YOLO model demonstrates exceptional performance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As mobile computing technology rapidly evolves, deploying efficient object
detection algorithms on mobile devices emerges as a pivotal research area in
computer vision. This study zeroes in on optimizing the YOLOv7 algorithm to
boost its operational efficiency and speed on mobile platforms while ensuring
high accuracy. Leveraging a synergy of advanced techniques such as Group
Convolution, ShuffleNetV2, and Vision Transformer, this research has
effectively minimized the model's parameter count and memory usage, streamlined
the network architecture, and fortified the real-time object detection
proficiency on resource-constrained devices. The experimental outcomes reveal
that the refined YOLO model demonstrates exceptional performance, markedly
enhancing processing velocity while sustaining superior detection accuracy.
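The channel-shuffle operation at the heart of ShuffleNetV2 is simple enough to sketch. After a group convolution, each output group has only seen its own slice of the input channels; shuffling interleaves the groups so information mixes across them at negligible cost. A minimal pure-Python sketch for illustration (real implementations do this with a tensor reshape and transpose; the list-of-channels representation here is an assumption for clarity, not the paper's code):

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups, ShuffleNetV2-style.

    channels: a list standing in for per-channel feature maps;
    its length must be divisible by `groups`.
    """
    assert len(channels) % groups == 0
    per_group = len(channels) // groups
    # Take the i-th channel of each group in turn, so every output
    # group receives channels originating from every input group.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]

# With 4 channels in 2 groups, [0, 1, 2, 3] -> [0, 2, 1, 3]
```

In a tensor framework the same effect is obtained by reshaping (N, C, H, W) to (N, groups, C/groups, H, W), transposing the two channel axes, and flattening back.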
Related papers
- YOLOv12: A Breakdown of the Key Architectural Features [0.5639904484784127]
YOLOv12 is a significant advancement in single-stage, real-time object detection.
It incorporates an optimised backbone (R-ELAN), 7x7 separable convolutions, and FlashAttention-driven area-based attention.
It offers scalable solutions for both latency-sensitive and high-accuracy applications.
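To see why separable convolutions of the kind cited above reduce model size, it helps to compare weight counts directly. A short sketch (the 256-channel layer is an illustrative assumption, not a figure from the YOLOv12 paper):

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1x1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

# Example: a 7x7 layer with 256 input and 256 output channels.
standard = conv_params(256, 256, 7)           # 3,211,264 weights
separable = separable_conv_params(256, 256, 7)  # 12,544 + 65,536 = 78,080
```

Here the separable form needs roughly 41x fewer weights than the standard convolution, which is why large-kernel layers are commonly factored this way in lightweight detectors.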
arXiv Detail & Related papers (2025-02-20T17:08:43Z)
- Fast-COS: A Fast One-Stage Object Detector Based on Reparameterized Attention Vision Transformer for Autonomous Driving [3.617580194719686]
This paper introduces Fast-COS, a novel single-stage object detection framework crafted specifically for driving scenes.
Its RAViT backbone achieves 81.4% Top-1 accuracy on the ImageNet-1K dataset.
It surpasses leading models in efficiency, delivering up to 75.9% faster GPU inference and 1.38x higher throughput on edge devices.
arXiv Detail & Related papers (2025-02-11T09:54:09Z)
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Cutting-Edge Detection of Fatigue in Drivers: A Comparative Study of Object Detection Models [0.0]
This research delves into the development of a fatigue detection system based on modern object detection algorithms, including YOLOv5, YOLOv6, YOLOv7, and YOLOv8.
By comparing the performance of these models, we evaluate their effectiveness in real-time detection of fatigue-related behavior in drivers.
The study addresses challenges like environmental variability and detection accuracy and suggests a roadmap for enhancing real-time detection.
arXiv Detail & Related papers (2024-10-19T08:06:43Z)
- What is YOLOv9: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector [0.0]
This study examines the YOLOv9 object detection model, focusing on its architectural innovations, training methodologies, and performance improvements.
Key advancements, such as the Generalized Efficient Layer Aggregation Network (GELAN) and Programmable Gradient Information (PGI), significantly enhance feature extraction and gradient flow.
This paper provides the first in-depth exploration of YOLOv9's internal features and their real-world applicability, establishing it as a state-of-the-art solution for real-time object detection.
arXiv Detail & Related papers (2024-09-12T07:46:58Z)
- Cross-Cluster Shifting for Efficient and Effective 3D Object Detection in Autonomous Driving [69.20604395205248]
We present a new 3D point-based detector model, named Shift-SSD, for precise 3D object detection in autonomous driving.
We introduce an intriguing Cross-Cluster Shifting operation to unleash the representation capacity of the point-based detector.
We conduct extensive experiments on the KITTI, Waymo, and nuScenes datasets, and the results demonstrate the state-of-the-art performance of Shift-SSD.
arXiv Detail & Related papers (2024-03-10T10:36:32Z)
- Leveraging the Power of Data Augmentation for Transformer-based Tracking [64.46371987827312]
We propose two data augmentation methods customized for tracking.
First, we optimize existing random cropping via a dynamic search radius mechanism and simulation for boundary samples.
Second, we propose a token-level feature mixing augmentation strategy, which improves the model's robustness to challenges like background interference.
arXiv Detail & Related papers (2023-09-15T09:18:54Z)
- StreamYOLO: Real-time Object Detection for Streaming Perception [84.2559631820007]
We endow the models with the capacity of predicting the future, significantly improving the results for streaming perception.
We consider driving scenes at multiple velocities and propose a velocity-aware streaming AP (VsAP) to jointly evaluate accuracy across them.
Our simple method achieves the state-of-the-art performance on Argoverse-HD dataset and improves the sAP and VsAP by 4.7% and 8.2% respectively.
arXiv Detail & Related papers (2022-07-21T12:03:02Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connections via both bypass and concatenation.
YOLO-S reduces parameter count by 87% and nearly halves the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
- Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device [53.323878851563414]
We propose a compiler-aware unified framework incorporating network enhancement and pruning search with the reinforcement learning techniques.
Specifically, a generator Recurrent Neural Network (RNN) is employed to provide the unified scheme for both network enhancement and pruning search automatically.
The proposed framework achieves real-time 3D object detection on mobile devices with competitive detection performance.
arXiv Detail & Related papers (2020-12-26T19:41:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.