Related papers: LeYOLO, New Embedded Architecture for Object Detection

LeYOLO, New Embedded Architecture for Object Detection

URL: http://arxiv.org/abs/2406.14239v2
Date: Tue, 03 Jun 2025 11:48:17 GMT
Title: LeYOLO, New Embedded Architecture for Object Detection
Authors: Lilian Hollard, Lucas Mohimont, Nathalie Gaveau, Luiz Angelo Steffenel,
Abstract summary: We introduce two key contributions to object detection models using MSCOCO as a base validation set.<n>First, we propose LeNeck, a general-purpose detection framework that maintains inference speed comparable to SSDLite.<n>Second, we present LeYOLO, an efficient object detection model designed to enhance computational efficiency in YOLO-based architectures.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Efficient computation in deep neural networks is crucial for real-time object detection. However, recent advancements primarily result from improved high-performing hardware rather than improving parameters and FLOP efficiency. This is especially evident in the latest YOLO architectures, where speed is prioritized over lightweight design. As a result, object detection models optimized for low-resource environments like microcontrollers have received less attention. For devices with limited computing power, existing solutions primarily rely on SSDLite or combinations of low-parameter classifiers, creating a noticeable gap between YOLO-like architectures and truly efficient lightweight detectors. This raises a key question: Can a model optimized for parameter and FLOP efficiency achieve accuracy levels comparable to mainstream YOLO models? To address this, we introduce two key contributions to object detection models using MSCOCO as a base validation set. First, we propose LeNeck, a general-purpose detection framework that maintains inference speed comparable to SSDLite while significantly improving accuracy and reducing parameter count. Second, we present LeYOLO, an efficient object detection model designed to enhance computational efficiency in YOLO-based architectures. LeYOLO effectively bridges the gap between SSDLite-based detectors and YOLO models, offering high accuracy in a model as compact as MobileNets. Both contributions are particularly well-suited for mobile, embedded, and ultra-low-power devices, including microcontrollers, where computational efficiency is critical.

Related papers

A lightweight model FDM-YOLO for small target improvement based on YOLOv8 [0.0]
Small targets are difficult to detect due to their low pixel count, complex backgrounds, and varying shooting angles. This paper focuses on small target detection and explores methods for object detection under low computational constraints.
arXiv Detail & Related papers (2025-03-06T14:06:35Z)
YOLOv12: A Breakdown of the Key Architectural Features [0.5639904484784127]
YOLOv12 is a significant advancement in single-stage, real-time object detection.<n>It incorporates an optimised backbone (R-ELAN), 7x7 separable convolutions, and FlashAttention-driven area-based attention.<n>It offers scalable solutions for both latency-sensitive and high-accuracy applications.
arXiv Detail & Related papers (2025-02-20T17:08:43Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
What is YOLOv9: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector [0.0]
This study focuses on the YOLOv9 object detection model, focusing on its architectural innovations, training methodologies, and performance improvements. Key advancements, such as the Generalized Efficient Layer Aggregation Network GELAN and Programmable Gradient Information PGI, significantly enhance feature extraction and gradient flow. This paper provides the first in depth exploration of YOLOv9s internal features and their real world applicability, establishing it as a state of the art solution for real time object detection.
arXiv Detail & Related papers (2024-09-12T07:46:58Z)
YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection. The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs. We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
YOLO-TLA: An Efficient and Lightweight Small Object Detection Model based on YOLOv5 [19.388112026410045]
YOLO-TLA is an advanced object detection model building on YOLOv5. We first introduce an additional detection layer for small objects in the neck network pyramid architecture. This module uses sliding window feature extraction, which effectively minimizes both computational demand and the number of parameters.
arXiv Detail & Related papers (2024-02-22T05:55:17Z)
MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO [10.183459286746196]
We introduce YOLO Phantom, one of the smallest YOLO models ever conceived. YOLO Phantom achieves comparable accuracy to the latest YOLOv8n model while simultaneously reducing both parameters and model size. Its real-world efficacy is demonstrated on an IoT platform with advanced low-light and RGB cameras, seamlessly connecting to an AWS-based notification endpoint.
arXiv Detail & Related papers (2024-02-12T18:56:53Z)
SATAY: A Streaming Architecture Toolflow for Accelerating YOLO Models on FPGA Devices [48.47320494918925]
This work tackles the challenges of deploying stateof-the-art object detection models onto FPGA devices for ultralow latency applications. We employ a streaming architecture design for our YOLO accelerators, implementing the complete model on-chip in a deeply pipelined fashion. We introduce novel hardware components to support the operations of YOLO models in a dataflow manner, and off-chip memory buffering to address the limited on-chip memory resources.
arXiv Detail & Related papers (2023-09-04T13:15:01Z)
EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework. We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects. Our baseline model can reach the accuracy of 50.6% AP50:95 and 69.8% AP50 in MS 2017 dataset, 26.4% AP50:95 and 44.8% AP50 in VisDrone 2019-DET dataset, and it meets real-time requirements (FPS>=30) on edge-computing device Nvidia
arXiv Detail & Related papers (2023-02-15T06:05:14Z)
CNN-transformer mixed model for object detection [3.5897534810405403]
In this paper, I propose a convolutional module with a transformer. It aims to improve the recognition accuracy of the model by fusing the detailed features extracted by CNN with the global features extracted by a transformer. After 100 rounds of training on the Pascal VOC dataset, the accuracy of the results reached 81%, which is 4.6 better than the faster RCNN[4] using resnet101[5] as the backbone.
arXiv Detail & Related papers (2022-12-13T16:35:35Z)
DAMO-YOLO : A Report on Real-Time Object Detection Design [19.06518351354291]
We present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series. We use MAE-NAS, a method guided by the principle of maximum entropy, to search our detection backbone. In the design of necks and heads, we follow the rule of large neck, small head''
arXiv Detail & Related papers (2022-11-23T17:59:12Z)
FlowNAS: Neural Architecture Search for Optical Flow Estimation [65.44079917247369]
We propose a neural architecture search method named FlowNAS to automatically find the better encoder architecture for flow estimation task. Experimental results show that the discovered architecture with the weights inherited from the super-network achieves 4.67% F1-all error on KITTI.
arXiv Detail & Related papers (2022-07-04T09:05:25Z)
A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection. YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation. YOLO-S has an 87% decrease of parameter size and almost one half FLOPs of YOLOv3, making practical the deployment for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
NAS-FCOS: Efficient Search for Object Detection Architectures [113.47766862146389]
We propose an efficient method to obtain better object detectors by searching for the feature pyramid network (FPN) and the prediction head of a simple anchor-free object detector. With carefully designed search space, search algorithms, and strategies for evaluating network quality, we are able to find top-performing detection architectures within 4 days using 8 V100 GPUs.
arXiv Detail & Related papers (2021-10-24T12:20:04Z)
Mitigating severe over-parameterization in deep convolutional neural networks through forced feature abstraction and compression with an entropy-based heuristic [7.503338065129185]
We propose an Entropy-Based Convolutional Layer Estimation (EBCLE) which is robust and simple. We present empirical evidence to emphasize the relative effectiveness of broader, yet shallower models trained using the EBCLE.
arXiv Detail & Related papers (2021-06-27T10:34:39Z)
ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware. The proposed methodology extracts a set of models from micro- kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
EfficientPose: Efficient Human Pose Estimation with Neural Architecture Search [47.30243595690131]
We propose an efficient framework targeted at human pose estimation including two parts, the efficient backbone and the efficient head. Our smallest model has only 0.65 GFLOPs with 88.1% PCKh@0.5 on MPII and our large model has only 2 GFLOPs while its accuracy is competitive with the state-of-the-art large model.
arXiv Detail & Related papers (2020-12-13T15:38:38Z)
Real-time object detection method based on improved YOLOv4-tiny [0.0]
YOLOv4-tiny is proposed based on YOLOv4 to simple the network structure and reduce parameters, which makes it be suitable for developing on the mobile and embedded devices. It firstly uses two ResBlock-D modules in ResNet-D network instead of two CSPBlock modules in Yolov4-tiny, which reduces the computation complexity. In the design of auxiliary network, two consecutive 3x3 convolutions are used to obtain 5x5 receptive fields to extract global features, and channel attention and spatial attention are also used to extract more effective information.
arXiv Detail & Related papers (2020-11-09T08:26:28Z)
FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking. We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints. FBNetV3 makes up a family of state-of-the-art compact neural networks that outperform both automatically and manually-designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, namely pattern-based sparsity that comprises pattern and connectivity sparsity, and becoming both highly accurate and hardware friendly. Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.